Tree of Tags

50 posts Reinforcement Learning Inverse Reinforcement Learning Road To AI Safety Excellence

23 posts Wireheading Reward Functions

286 Reward is not the optimization target

TurnTrout

4mo

97

11 AGIs may value intrinsic rewards more than extrinsic ones

catubc

1mo

6

26 Is CIRL a promising agenda?

Chris_Leong

6mo

12

8 What messy problems do you see Deep Reinforcement Learning applicable to?

Riccardo Volpato

2y

0

0 Inverse reinforcement learning on self, pre-ontology-change

Stuart_Armstrong

7y

0

2 Some work on connecting UDT and Reinforcement Learning

IAFF-User-111

7y

0

11 Cooperative Inverse Reinforcement Learning vs. Irrational Human Preferences

orthonormal

6y

0

2 Modeling the capabilities of advanced AI systems as episodic reinforcement learning

jessicata

6y

0

1 (C)IRL is not solely a learning process

Stuart_Armstrong

6y

0

2 Vector-Valued Reinforcement Learning

orthonormal

6y

0

0 Reward/value learning for reinforcement learning

Stuart_Armstrong

5y

0

2 CIRL Wireheading

tom4everitt

5y

0

13 Delegative Inverse Reinforcement Learning

Vanessa Kosoy

5y

0

1 Delegative Reinforcement Learning with a Merely Sane Advisor

Vanessa Kosoy

5y

2

7 Note on algorithms with multiple trained components

Steven Byrnes

6h

1

75 Seriously, what goes wrong with "reward the agent when it makes you smile"?

TurnTrout

4mo

41

26 generalized wireheading

carado

1mo

7

8 Reward IS the Optimization Target

Carn

2mo

3

20 Reinforcement Learner Wireheading

Nate Showell

5mo

2

32 The Stamp Collector

So8res

7y

14

49 You cannot be mistaken about (not) wanting to wirehead

Kaj_Sotala

12y

79

7 Reward function learning: the value function

Stuart_Armstrong

4y

0

17 Wireheading as a potential problem with the new impact measure

Stuart_Armstrong

4y

20

18 Wireheading is in the eye of the beholder

Stuart_Armstrong

3y

10

176 Are wireheads happy?

Scott Alexander

12y

107

45 A Short Dialogue on the Meaning of Reward Functions

Leon Lang

1mo

0

20 Wireheading and discontinuity

Michele Campolo

2y

4

9 Why we want unbiased learning processes

Stuart_Armstrong

4y

3