Wireheading (12 posts)

Karma | Title | Author | Age | Comments
7 | Note on algorithms with multiple trained components | Steven Byrnes | 6h | 1
26 | generalized wireheading | carado | 1mo | 7
20 | Reinforcement Learner Wireheading | Nate Showell | 5mo | 2
32 | The Stamp Collector | So8res | 7y | 14
49 | You cannot be mistaken about (not) wanting to wirehead | Kaj_Sotala | 12y | 79
17 | Wireheading as a potential problem with the new impact measure | Stuart_Armstrong | 4y | 20
18 | Wireheading is in the eye of the beholder | Stuart_Armstrong | 3y | 10
176 | Are wireheads happy? | Scott Alexander | 12y | 107
20 | Wireheading and discontinuity | Michele Campolo | 2y | 4
38 | A definition of wireheading | Anja | 10y | 80
14 | Wireheading Done Right: Stay Positive Without Going Insane | 9eB1 | 6y | 2
15 | Defining AI wireheading | Stuart_Armstrong | 3y | 9

Reward Functions (11 posts)

Karma | Title | Author | Age | Comments
75 | Seriously, what goes wrong with "reward the agent when it makes you smile"? | TurnTrout | 4mo | 41
8 | Reward IS the Optimization Target | Carn | 2mo | 3
7 | Reward function learning: the value function | Stuart_Armstrong | 4y | 0
45 | A Short Dialogue on the Meaning of Reward Functions | Leon Lang | 1mo | 0
9 | Why we want unbiased learning processes | Stuart_Armstrong | 4y | 3
24 | Thoughts on reward engineering | paulfchristiano | 3y | 30
43 | Draft papers for REALab and Decoupled Approval on tampering | Jonathan Uesato | 2y | 2
19 | An investigation into when agents may be incentivized to manipulate our beliefs. | Felix Hofstätter | 3mo | 0
13 | $100/$50 rewards for good references | Stuart_Armstrong | 1y | 5
3 | Reward function learning: the learning process | Stuart_Armstrong | 4y | 11
47 | Reward model hacking as a challenge for reward learning | Erik Jenner | 8mo | 1