4 posts
The Pointers Problem
59 posts
Value Learning
104
The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables
johnswentworth
2y
43
19
Stable Pointers to Value III: Recursive Quantilization
abramdemski
4y
4
18
Stable Pointers to Value II: Environmental Goals
abramdemski
4y
2
15
Stable Pointers to Value: An Agent Embedded in Its Own Utility Function
abramdemski
5y
9
22
Character alignment
p.b.
3mo
0
42
Different perspectives on concept extrapolation
Stuart_Armstrong
8mo
7
16
Value extrapolation vs Wireheading
Stuart_Armstrong
6mo
1
26
How an alien theory of mind might be unlearnable
Stuart_Armstrong
11mo
35
19
An Open Philanthropy grant proposal: Causal representation learning of human preferences
PabloAMC
11mo
6
14
Value extrapolation, concept extrapolation, model splintering
Stuart_Armstrong
9mo
1
9
The Pointers Problem - Distilled
NinaR
6mo
0
17
Morally underdefined situations can be deadly
Stuart_Armstrong
1y
8
10
AIs should learn human preferences, not biases
Stuart_Armstrong
8mo
1
69
The E-Coli Test for AI Alignment
johnswentworth
4y
24
68
Preface to the sequence on value learning
Rohin Shah
4y
6
65
Why we need a *theory* of human values
Stuart_Armstrong
4y
15
64
Clarifying "AI Alignment"
paulfchristiano
4y
82
41
Using vector fields to visualise preferences and make them consistent
MichaelA
2y
32