Tags (166 posts): AI Risk, Goodhart's Law, World Optimization, Threat Models, Instrumental Convergence, Corrigibility, Existential Risk, Coordination / Cooperation, Academic Papers, AI Safety Camp, Ethics & Morality, Treacherous Turn
Tags (689 posts): Newsletters, Logical Induction, Epistemology, SERI MATS, Logical Uncertainty, Intellectual Progress (Society-Level), Practice & Philosophy of Science, AI Alignment Fieldbuilding, Distillation & Pedagogy, Bayes' Theorem, Postmortems & Retrospectives, Radical Probabilism
Top posts (first tag group):

Karma | Title | Author | Posted | Comments
462 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
256 | Six Dimensions of Operational Adequacy in AGI Projects | Eliezer Yudkowsky | 6mo | 65
255 | It Looks Like You're Trying To Take Over The World | gwern | 9mo | 125
243 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
222 | What failure looks like | paulfchristiano | 3y | 49
215 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
214 | A central AI alignment problem: capabilities generalization, and the sharp left turn | So8res | 6mo | 48
191 | Some AI research areas and their relevance to existential safety | Andrew_Critch | 2y | 40
183 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
180 | Another (outer) alignment failure story | paulfchristiano | 1y | 38
173 | Morality is Scary | Wei_Dai | 1y | 125
159 | Possible takeaways from the coronavirus pandemic for slow AI takeoff | Vika | 2y | 36
156 | Goodhart Taxonomy | Scott Garrabrant | 4y | 33
154 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew_Critch | 1y | 60
Top posts (second tag group):

Karma | Title | Author | Posted | Comments
169 | Alignment Research Field Guide | abramdemski | 3y | 9
164 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
164 | Radical Probabilism | abramdemski | 2y | 47
163 | Call For Distillers | johnswentworth | 8mo | 42
149 | Lessons learned from talking to >100 academics about AI safety | Marius Hobbhahn | 2mo | 16
139 | The Fusion Power Generator Scenario | johnswentworth | 2y | 29
133 | An Intuitive Guide to Garrabrant Induction | Mark Xu | 1y | 18
126 | Your posts should be on arXiv | JanBrauner | 3mo | 39
116 | Logical induction for software engineers | Alex Flint | 17d | 2
104 | Alignment Newsletter One Year Retrospective | Rohin Shah | 3y | 31
95 | On Solving Problems Before They Appear: The Weird Epistemologies of Alignment | adamShimi | 1y | 11
95 | Bayesian Probability is for things that are Space-like Separated from You | Scott Garrabrant | 4y | 22
95 | How to do theoretical research, a personal perspective | Mark Xu | 4mo | 4
92 | Quintin's alignment papers roundup - week 1 | Quintin Pope | 3mo | 5