Tags (166 posts): AI Risk, Goodhart's Law, World Optimization, Threat Models, Instrumental Convergence, Corrigibility, Existential Risk, Coordination / Cooperation, Academic Papers, AI Safety Camp, Ethics & Morality, Treacherous Turn
Tags (689 posts): Newsletters, Logical Induction, Epistemology, SERI MATS, Logical Uncertainty, Intellectual Progress (Society-Level), Practice & Philosophy of Science, AI Alignment Fieldbuilding, Distillation & Pedagogy, Bayes' Theorem, Postmortems & Retrospectives, Radical Probabilism
Score | Title | Author | Posted | Comments
986 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
517 | It Looks Like You're Trying To Take Over The World | gwern | 9mo | 125
429 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
416 | What failure looks like | paulfchristiano | 3y | 49
413 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
292 | A central AI alignment problem: capabilities generalization, and the sharp left turn | So8res | 6mo | 48
284 | Six Dimensions of Operational Adequacy in AGI Projects | Eliezer Yudkowsky | 6mo | 65
252 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew_Critch | 1y | 60
240 | Another (outer) alignment failure story | paulfchristiano | 1y | 38
227 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
207 | Some AI research areas and their relevance to existential safety | Andrew_Critch | 2y | 40
204 | Goodhart Taxonomy | Scott Garrabrant | 4y | 33
201 | Reshaping the AI Industry | Thane Ruthenis | 6mo | 34
189 | The next decades might be wild | Marius Hobbhahn | 5d | 21
Score | Title | Author | Posted | Comments
305 | Alignment Research Field Guide | abramdemski | 3y | 9
265 | Lessons learned from talking to >100 academics about AI safety | Marius Hobbhahn | 2mo | 16
221 | Call For Distillers | johnswentworth | 8mo | 42
167 | Conjecture: Internal Infohazard Policy | Connor Leahy | 4mo | 6
158 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
154 | Radical Probabilism | abramdemski | 2y | 47
146 | Quintin's alignment papers roundup - week 1 | Quintin Pope | 3mo | 5
144 | Your posts should be on arXiv | JanBrauner | 3mo | 39
133 | The Fusion Power Generator Scenario | johnswentworth | 2y | 29
132 | Logical induction for software engineers | Alex Flint | 17d | 2
100 | Productive Mistakes, Not Perfect Answers | adamShimi | 8mo | 11
99 | On Solving Problems Before They Appear: The Weird Epistemologies of Alignment | adamShimi | 1y | 11
97 | Intuitions about solving hard problems | Richard_Ngo | 7mo | 23
97 | An Intuitive Guide to Garrabrant Induction | Mark Xu | 1y | 18