Tags: The Pointers Problem (4 posts) · Value Learning (59 posts)
Score | Title | Author | Posted | Comments
93 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43
23 | Stable Pointers to Value II: Environmental Goals | abramdemski | 4y | 2
19 | Stable Pointers to Value III: Recursive Quantilization | abramdemski | 4y | 4
20 | Stable Pointers to Value: An Agent Embedded in Its Own Utility Function | abramdemski | 5y | 9
21 | Character alignment | p.b. | 3mo | 0
48 | Different perspectives on concept extrapolation | Stuart_Armstrong | 8mo | 7
23 | Value extrapolation vs Wireheading | Stuart_Armstrong | 6mo | 1
29 | How an alien theory of mind might be unlearnable | Stuart_Armstrong | 11mo | 35
20 | Value extrapolation, concept extrapolation, model splintering | Stuart_Armstrong | 9mo | 1
20 | Morally underdefined situations can be deadly | Stuart_Armstrong | 1y | 8
13 | An Open Philanthropy grant proposal: Causal representation learning of human preferences | PabloAMC | 11mo | 6
9 | AIs should learn human preferences, not biases | Stuart_Armstrong | 8mo | 1
68 | Clarifying "AI Alignment" | paulfchristiano | 4y | 82
7 | The Pointers Problem - Distilled | NinaR | 6mo | 0
64 | Why we need a *theory* of human values | Stuart_Armstrong | 4y | 15
42 | Since figuring out human values is hard, what about, say, monkey values? | shminux | 2y | 13
58 | The E-Coli Test for AI Alignment | johnswentworth | 4y | 24
32 | Learning Values in Practice | Stuart_Armstrong | 2y | 0