17 posts
Corrigibility
2017-2019 AI Alignment Prize
Petrov Day
3 posts
Treacherous Turn
Programming
17
Corrigibility Via Thought-Process Deference
Thane Ruthenis
26d
5
17
On corrigibility and its basin
Donald Hobson
6mo
3
27
Petrov corrigibility
Stuart_Armstrong
4y
10
7
Corrigible omniscient AI capable of making clones
Kaj_Sotala
7y
0
8
An Idea For Corrigible, Recursively Improving Math Oracles
jimrandomh
7y
0
15
A first look at the hard problem of corrigibility
jessicata
7y
0
38
Do what we mean vs. do what we say
Rohin Shah
4y
14
131
Let's See You Write That Corrigibility Tag
Eliezer Yudkowsky
6mo
67
39
Solve Corrigibility Week
Logan Riggs
1y
21
26
Formalizing Policy-Modification Corrigibility
TurnTrout
1y
6
15
Corrigibility as Constrained Optimisation
Henrik Åslund
3y
3
55
Corrigibility
paulfchristiano
4y
7
101
Announcement: AI alignment prize round 3 winners and next round
cousin_it
4y
7
90
Announcement: AI alignment prize round 4 winners
cousin_it
3y
41
40
[Linkpost] Treacherous turns in the wild
Mark Xu
1y
6
27
[AN #165]: When large models are more likely to lie
Rohin Shah
1y
0
70
A Gym Gridworld Environment for the Treacherous Turn
Michaël Trazzi
4y
9