48 posts
Corrigibility
Treacherous Turn
Tripwire
18 posts
Mild Optimization
Quantilization
Satisficer
26
People care about each other even though they have imperfect motivational pointers?
TurnTrout
1mo
25
26
Dumb and ill-posed question: Is conceptual research like this MIRI paper on the shutdown problem/Corrigibility "real"
joraine
26d
11
9
Corrigibility Via Thought-Process Deference
Thane Ruthenis
26d
5
4
Contrary to List of Lethality's point 22, alignment's door number 2
False Name, Esq.
6d
1
114
Interpretability/Tool-ness/Alignment/Corrigibility are not Composable
johnswentworth
4mo
8
14
What is wrong with this approach to corrigibility?
Rafael Cosman
5mo
8
91
Let's See You Write That Corrigibility Tag
Eliezer Yudkowsky
6mo
67
102
Soares, Tallinn, and Yudkowsky discuss AGI cognition
So8res
1y
35
52
Corrigibility
paulfchristiano
4y
7
5
Simple question about corrigibility and values in AI.
jmh
1mo
1
25
CHAI, Assistance Games, And Fully-Updated Deference [Scott Alexander]
berglund
2mo
1
9
Superintelligence 13: Capability control methods
KatjaGrace
8y
48
16
On corrigibility and its basin
Donald Hobson
6mo
3
13
Petrov corrigibility
Stuart_Armstrong
4y
10
21
Quantilizers and Generative Models
Adam Jermyn
5mo
5
62
Steam
abramdemski
6mo
9
33
Exploring Mild Behaviour in Embedded Agents
Megan Kinniment
5mo
3
25
Satisficers want to become maximisers
Stuart_Armstrong
11y
68
32
In Praise of Maximizing – With Some Caveats
David Althaus
7y
19
19
Quantilizers maximize expected utility subject to a conservative cost constraint
jessicata
7y
0
15
Another view of quantilizers: avoiding Goodhart's Law
jessicata
6y
1
2
Thoughts on Quantilizers
Stuart_Armstrong
5y
0
10
Quantilal control for finite MDPs
Vanessa Kosoy
4y
0
7
Defining a limited satisficer
Stuart_Armstrong
7y
11
4
Anti-Pascaline satisficer
Stuart_Armstrong
7y
7
61
When to use quantilization
RyanCarey
3y
5
7
Is 'satificing' optimisation?
Riccardo Volpato
2y
3
8
Quantilizer ≡ Optimizer with a Bounded Amount of Output
itaibn0
1y
4