Value Learning (37 posts) · Kolmogorov Complexity (5 posts)
The Pointers Problem
Score | Title | Author | Posted | Comments
80 | Beyond Kolmogorov and Shannon | Alexander Gietelink Oldenziel | 1mo | 14
58 | Humans can be assigned any values whatsoever… | Stuart_Armstrong | 4y | 26
10 | AIs should learn human preferences, not biases | Stuart_Armstrong | 8mo | 1
31 | Human-AI Interaction | Rohin Shah | 3y | 10
1 | Humans can be assigned any values whatsoever... | Stuart_Armstrong | 5y | 0
0 | Kolmogorov complexity makes reward learning worse | Stuart_Armstrong | 5y | 0
44 | What is ambitious value learning? | Rohin Shah | 4y | 28
68 | Parsing Chris Mingard on Neural Networks | Alex Flint | 1y | 27
13 | Morally underdefined situations can be deadly | Stuart_Armstrong | 1y | 8
21 | Thoughts on implementing corrigible robust alignment | Steven Byrnes | 3y | 2
22 | Learning human preferences: black-box, white-box, and structured white-box access | Stuart_Armstrong | 2y | 9
8 | Values, Valence, and Alignment | Gordon Seidoh Worley | 3y | 4
7 | What's the dream for giving natural language commands to AI? | Charlie Steiner | 3y | 8
51 | Intuitions about goal-directed behavior | Rohin Shah | 4y | 15
67 | Don't design agents which exploit adversarial inputs | TurnTrout | 1mo | 61
25 | People care about each other even though they have imperfect motivational pointers? | TurnTrout | 1mo | 25
10 | Stable Pointers to Value: An Agent Embedded in Its Own Utility Function | abramdemski | 5y | 9
109 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43
13 | Stable Pointers to Value II: Environmental Goals | abramdemski | 4y | 2