83 posts

Tags: AI Risk · Goodhart's Law · Corrigibility · Instrumental Convergence · Treacherous Turn · Programming · 2017-2019 AI Alignment Prize · Satisficer · LessWrong Event Transcripts · Modeling People · Petrov Day · World Optimization · Threat Models · Existential Risk · Coordination / Cooperation · Academic Papers · AI Safety Camp · Practical · Ethics & Morality · Symbol Grounding · Security Mindset · Sharp Left Turn · Fiction
Karma · Title · Author · Age · Comments

58 · You can still fetch the coffee today if you're dead tomorrow · davidad · 11d · 15 comments
336 · Counterarguments to the basic AI x-risk case · KatjaGrace · 2mo · 122 comments
103 · AI will change the world, but won’t take it over by playing “3-dimensional chess”. · boazbarak · 28d · 86 comments
724 · AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 6mo · 653 comments
98 · Niceness is unnatural · So8res · 2mo · 18 comments
144 · Worlds Where Iterative Design Fails · johnswentworth · 3mo · 26 comments
72 · What does it mean for an AGI to be 'safe'? · So8res · 2mo · 32 comments
148 · AGI ruin scenarios are likely (and disjunctive) · So8res · 4mo · 37 comments
58 · Eli's review of "Is power-seeking AI an existential risk?" · elifland · 2mo · 0 comments
93 · The alignment problem from a deep learning perspective · Richard_Ngo · 4mo · 13 comments
36 · Empowerment is (almost) All We Need · jacob_cannell · 1mo · 43 comments
85 · Oversight Misses 100% of Thoughts The AI Does Not Think · johnswentworth · 4mo · 49 comments
13 · Corrigibility Via Thought-Process Deference · Thane Ruthenis · 26d · 5 comments
109 · Let's See You Write That Corrigibility Tag · Eliezer Yudkowsky · 6mo · 67 comments
155 · The next decades might be wild · Marius Hobbhahn · 5d · 21 comments
39 · AI Neorealism: a threat model & success criterion for existential safety · davidad · 5d · 0 comments
94 · Thoughts on AGI organizations and capabilities work · Rob Bensinger · 13d · 17 comments
48 · Deconfusing Direct vs Amortised Optimization · beren · 18d · 6 comments
36 · Refining the Sharp Left Turn threat model, part 2: applying alignment techniques · Vika · 25d · 4 comments
93 · Don't leave your fingerprints on the future · So8res · 2mo · 32 comments
253 · A central AI alignment problem: capabilities generalization, and the sharp left turn · So8res · 6mo · 48 comments
134 · AI coordination needs clear wins · evhub · 3mo · 15 comments
270 · Six Dimensions of Operational Adequacy in AGI Projects · Eliezer Yudkowsky · 6mo · 65 comments
386 · It Looks Like You're Trying To Take Over The World · gwern · 9mo · 125 comments
41 · Some advice on independent research · Marius Hobbhahn · 1mo · 4 comments
118 · An Update on Academia vs. Industry (one year into my faculty job) · David Scott Krueger (formerly: capybaralet) · 3mo · 18 comments
78 · Nearcast-based "deployment problem" analysis · HoldenKarnofsky · 3mo · 2 comments
88 · Linkpost: Github Copilot productivity experiment · Daniel Kokotajlo · 3mo · 4 comments