15 posts
Corrigibility
Petrov Day
2 posts
2017-2019 AI Alignment Prize
131 | Let's See You Write That Corrigibility Tag | Eliezer Yudkowsky | 6mo | 67
55 | Corrigibility | paulfchristiano | 4y | 7
41 | Can corrigibility be learned safely? | Wei_Dai | 4y | 115
39 | Solve Corrigibility Week | Logan Riggs | 1y | 21
38 | Do what we mean vs. do what we say | Rohin Shah | 4y | 14
29 | Addressing three problems with counterfactual corrigibility: bad bets, defending against backstops, and overconfidence. | RyanCarey | 4y | 1
28 | Corrigibility doesn't always have a good action to take | Stuart_Armstrong | 4y | 0
27 | Petrov corrigibility | Stuart_Armstrong | 4y | 10
26 | Formalizing Policy-Modification Corrigibility | TurnTrout | 1y | 6
17 | On corrigibility and its basin | Donald Hobson | 6mo | 3
17 | Corrigibility Via Thought-Process Deference | Thane Ruthenis | 26d | 5
15 | A first look at the hard problem of corrigibility | jessicata | 7y | 0
15 | Corrigibility as Constrained Optimisation | Henrik Åslund | 3y | 3
8 | An Idea For Corrigible, Recursively Improving Math Oracles | jimrandomh | 7y | 0
101 | Announcement: AI alignment prize round 3 winners and next round | cousin_it | 4y | 7
90 | Announcement: AI alignment prize round 4 winners | cousin_it | 3y | 41