50 posts
Reinforcement Learning
Inverse Reinforcement Learning
Road To AI Safety Excellence
23 posts
Wireheading
Reward Functions
218
Reward is not the optimization target
TurnTrout
4mo
97
92
Book Review: Human Compatible
Scott Alexander
2y
6
84
Jitters No Evidence of Stupidity in RL
1a3orn
1y
18
63
RAISE is launching their MVP
3y
1
63
My take on Michael Littman on "The HCI of HAI"
Alex Flint
1y
4
56
Thoughts on "Human-Compatible"
TurnTrout
3y
35
49
Book review: Human Compatible
PeterMcCluskey
2y
2
46
Learning biases and rewards simultaneously
Rohin Shah
3y
3
45
Reinforcement Learning: A Non-Standard Introduction (Part 1)
royf
10y
19
38
Model Mis-specification and Inverse Reinforcement Learning
Owain_Evans
4y
3
32
Making a Difference Tempore: Insights from 'Reinforcement Learning: An Introduction'
TurnTrout
4y
6
26
Reinforcement learning with imperceptible rewards
Vanessa Kosoy
3y
1
26
"Human-level control through deep reinforcement learning" - computer learns 49 different games
skeptical_lurker
7y
19
25
Reinforcement Learning in the Iterated Amplification Framework
William_S
3y
12
158
Are wireheads happy?
Scott Alexander
12y
107
77
Seriously, what goes wrong with "reward the agent when it makes you smile"?
TurnTrout
4mo
41
66
A definition of wireheading
Anja
10y
80
58
The Stamp Collector
So8res
7y
14
51
Draft papers for REALab and Decoupled Approval on tampering
Jonathan Uesato
2y
2
39
You cannot be mistaken about (not) wanting to wirehead
Kaj_Sotala
12y
79
36
Thoughts on reward engineering
paulfchristiano
3y
30
35
A Short Dialogue on the Meaning of Reward Functions
Leon Lang
1mo
0
34
Wireheading is in the eye of the beholder
Stuart_Armstrong
3y
10
33
Wireheading as a potential problem with the new impact measure
Stuart_Armstrong
4y
20
29
Defining AI wireheading
Stuart_Armstrong
3y
9
27
$100/$50 rewards for good references
Stuart_Armstrong
1y
5
22
Wireheading and discontinuity
Michele Campolo
2y
4
17
Why we want unbiased learning processes
Stuart_Armstrong
4y
3