Tags (29 posts): Corrigibility · Instrumental Convergence · Treacherous Turn · Programming · 2017-2019 AI Alignment Prize · LessWrong Event Transcripts · Satisficer · Petrov Day

AI Risk (35 posts)
Karma | Title | Author | Posted | Comments
227 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
183 | Seeking Power is Often Convergently Instrumental in MDPs | TurnTrout | 3y | 38
87 | Let's See You Write That Corrigibility Tag | Eliezer Yudkowsky | 6mo | 67
85 | Announcement: AI alignment prize round 3 winners and next round | cousin_it | 4y | 7
76 | A Gym Gridworld Environment for the Treacherous Turn | Michaël Trazzi | 4y | 9
60 | Environmental Structure Can Cause Instrumental Convergence | TurnTrout | 1y | 44
58 | Announcement: AI alignment prize round 4 winners | cousin_it | 3y | 41
58 | Satisficers Tend To Seek Power: Instrumental Convergence Via Retargetability | TurnTrout | 1y | 8
56 | You can still fetch the coffee today if you're dead tomorrow | davidad | 11d | 15
49 | Corrigibility | paulfchristiano | 4y | 7
39 | Solve Corrigibility Week | Logan Riggs | 1y | 21
33 | Clarifying Power-Seeking and Instrumental Convergence | TurnTrout | 3y | 7
32 | Empowerment is (almost) All We Need | jacob_cannell | 1mo | 43
30 | Do what we mean vs. do what we say | Rohin Shah | 4y | 14
Karma | Title | Author | Posted | Comments
986 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
429 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
161 | AGI ruin scenarios are likely (and disjunctive) | So8res | 4mo | 37
141 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
108 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86
102 | The Main Sources of AI Risk? | Daniel Kokotajlo | 3y | 25
98 | AI Safety "Success Stories" | Wei_Dai | 3y | 27
97 | What can the principal-agent literature tell us about AI risk? | Alexis Carlier | 2y | 31
93 | The alignment problem from a deep learning perspective | Richard_Ngo | 4mo | 13
91 | Oversight Misses 100% of Thoughts The AI Does Not Think | johnswentworth | 4mo | 49
83 | Niceness is unnatural | So8res | 2mo | 18
78 | Clarifying some key hypotheses in AI alignment | Ben Cottier | 3y | 12
72 | Complex Systems for AI Safety [Pragmatic AI Safety #3] | Dan H | 7mo | 2
66 | Some abstract, non-technical reasons to be non-maximally-pessimistic about AI alignment | Rob Bensinger | 1y | 37