Tree of Tags

50 posts: Reinforcement Learning, Inverse Reinforcement Learning, Road To AI Safety Excellence

23 posts: Wireheading, Reward Functions

218 Reward is not the optimization target

TurnTrout

4mo

97

92 Book Review: Human Compatible

Scott Alexander

2y

6

84 Jitters No Evidence of Stupidity in RL

1a3orn

1y

18

63 RAISE is launching their MVP

3y

1

63 My take on Michael Littman on "The HCI of HAI"

Alex Flint

1y

4

56 Thoughts on "Human-Compatible"

TurnTrout

3y

35

49 Book review: Human Compatible

PeterMcCluskey

2y

2

46 Learning biases and rewards simultaneously

Rohin Shah

3y

3

45 Reinforcement Learning: A Non-Standard Introduction (Part 1)

royf

10y

19

38 Model Mis-specification and Inverse Reinforcement Learning

Owain_Evans

4y

3

32 Making a Difference Tempore: Insights from 'Reinforcement Learning: An Introduction'

TurnTrout

4y

6

26 Reinforcement learning with imperceptible rewards

Vanessa Kosoy

3y

1

26 "Human-level control through deep reinforcement learning" - computer learns 49 different games

skeptical_lurker

7y

19

25 Reinforcement Learning in the Iterated Amplification Framework

William_S

3y

12

158 Are wireheads happy?

Scott Alexander

12y

107

77 Seriously, what goes wrong with "reward the agent when it makes you smile"?

TurnTrout

4mo

41

66 A definition of wireheading

Anja

10y

80

58 The Stamp Collector

So8res

7y

14

51 Draft papers for REALab and Decoupled Approval on tampering

Jonathan Uesato

2y

2

39 You cannot be mistaken about (not) wanting to wirehead

Kaj_Sotala

12y

79

36 Thoughts on reward engineering

paulfchristiano

3y

30

35 A Short Dialogue on the Meaning of Reward Functions

Leon Lang

1mo

0

34 Wireheading is in the eye of the beholder

Stuart_Armstrong

3y

10

33 Wireheading as a potential problem with the new impact measure

Stuart_Armstrong

4y

20

29 Defining AI wireheading

Stuart_Armstrong

3y

9

27 $100/$50 rewards for good references

Stuart_Armstrong

1y

5

22 Wireheading and discontinuity

Michele Campolo

2y

4

17 Why we want unbiased learning processes

Stuart_Armstrong

4y

3