12 posts
Wireheading
11 posts
Reward Functions
score | title | author | age | comments
13 | Note on algorithms with multiple trained components | Steven Byrnes | 6h | 1
16 | generalized wireheading | carado | 1mo | 7
158 | Are wireheads happy? | Scott Alexander | 12y | 107
29 | Defining AI wireheading | Stuart_Armstrong | 3y | 9
34 | Wireheading is in the eye of the beholder | Stuart_Armstrong | 3y | 10
22 | Wireheading and discontinuity | Michele Campolo | 2y | 4
33 | Wireheading as a potential problem with the new impact measure | Stuart_Armstrong | 4y | 20
58 | The Stamp Collector | So8res | 7y | 14
66 | A definition of wireheading | Anja | 10y | 80
39 | You cannot be mistaken about (not) wanting to wirehead | Kaj_Sotala | 12y | 79
0 | Wireheading Done Right: Stay Positive Without Going Insane | 9eB1 | 6y | 2
-4 | Reinforcement Learner Wireheading | Nate Showell | 5mo | 2
35 | A Short Dialogue on the Meaning of Reward Functions | Leon Lang | 1mo | 0
77 | Seriously, what goes wrong with "reward the agent when it makes you smile"? | TurnTrout | 4mo | 41
11 | An investigation into when agents may be incentivized to manipulate our beliefs. | Felix Hofstätter | 3mo | 0
27 | $100/$50 rewards for good references | Stuart_Armstrong | 1y | 5
51 | Draft papers for REALab and Decoupled Approval on tampering | Jonathan Uesato | 2y | 2
36 | Thoughts on reward engineering | paulfchristiano | 3y | 30
3 | Reward model hacking as a challenge for reward learning | Erik Jenner | 8mo | 1
17 | Why we want unbiased learning processes | Stuart_Armstrong | 4y | 3
11 | Reward function learning: the value function | Stuart_Armstrong | 4y | 0
9 | Reward function learning: the learning process | Stuart_Armstrong | 4y | 11
-10 | Reward IS the Optimization Target | Carn | 2mo | 3