101 posts
Reinforcement Learning
AI Capabilities
Inverse Reinforcement Learning
Wireheading
Definitions
Reward Functions
Stag Hunt
Road To AI Safety Excellence
Goals
Prompt Engineering
EfficientZero
PaLM
63 posts
Value Learning
The Pointers Problem
352
EfficientZero: How It Works
1a3orn
1y
42
286
Reward is not the optimization target
TurnTrout
4mo
97
271
Is AI Progress Impossible To Predict?
alyssavance
7mo
38
176
Are wireheads happy?
Scott Alexander
12y
107
129
EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised
gwern
1y
52
109
Will we run out of ML data? Evidence from projecting dataset size trends
Pablo Villalobos
1mo
12
91
When AI solves a game, focus on the game's mechanics, not its theme.
Cleo Nardo
27d
7
80
Jitters No Evidence of Stupidity in RL
1a3orn
1y
18
75
Seriously, what goes wrong with "reward the agent when it makes you smile"?
TurnTrout
4mo
41
71
RAISE is launching their MVP
3y
1
70
Thoughts on "Human-Compatible"
TurnTrout
3y
35
65
Competitive programming with AlphaCode
Algon
10mo
37
62
Book Review: Human Compatible
Scott Alexander
2y
6
55
My take on Michael Littman on "The HCI of HAI"
Alex Flint
1y
4
115
The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables
johnswentworth
2y
43
80
The E-Coli Test for AI Alignment
johnswentworth
4y
24
77
Preface to the sequence on value learning
Rohin Shah
4y
6
66
Why we need a *theory* of human values
Stuart_Armstrong
4y
15
61
Humans can be assigned any values whatsoever…
Stuart_Armstrong
4y
26
60
Clarifying "AI Alignment"
paulfchristiano
4y
82
55
The Urgent Meta-Ethics of Friendly Artificial Intelligence
lukeprog
11y
252
53
Intuitions about goal-directed behavior
Rohin Shah
4y
15
52
AI Alignment Problem: “Human Values” don’t Actually Exist
avturchin
3y
29
50
Future directions for ambitious value learning
Rohin Shah
4y
9
50
The easy goal inference problem is still hard
paulfchristiano
4y
19
47
Using vector fields to visualise preferences and make them consistent
MichaelA
2y
32
46
What is ambitious value learning?
Rohin Shah
4y
28
44
Conclusion to the sequence on value learning
Rohin Shah
3y
20