Reinforcement Learning · Wireheading · Reward Functions (30 posts; 14 listed below)
Score | Title | Author | Age | Comments
271 | Reward is not the optimization target | TurnTrout | 4mo | 97
89 | Scaling Laws for Reward Model Overoptimization | leogao | 2mo | 11
80 | Big picture of phasic dopamine | Steven Byrnes | 1y | 18
77 | Jitters No Evidence of Stupidity in RL | 1a3orn | 1y | 18
72 | Seriously, what goes wrong with "reward the agent when it makes you smile"? | TurnTrout | 4mo | 41
61 | Towards deconfusing wireheading and reward maximization | leogao | 3mo | 7
52 | My take on Michael Littman on "The HCI of HAI" | Alex Flint | 1y | 4
44 | Reward model hacking as a challenge for reward learning | Erik Jenner | 8mo | 1
43 | A Short Dialogue on the Meaning of Reward Functions | Leon Lang | 1mo | 0
42 | Four usages of "loss" in AI | TurnTrout | 2mo | 18
41 | Draft papers for REALab and Decoupled Approval on tampering | Jonathan Uesato | 2y | 2
28 | Scalar reward is not enough for aligned AGI | Peter Vamplew | 11mo | 3
26 | Conditioning, Prompts, and Fine-Tuning | Adam Jermyn | 4mo | 9
25 | Reinforcement learning with imperceptible rewards | Vanessa Kosoy | 3y | 1
AI Capabilities · EfficientZero · Tradeoffs (11 posts)

Score | Title | Author | Age | Comments
334 | EfficientZero: How It Works | 1a3orn | 1y | 42
124 | EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised | gwern | 1y | 52
120 | We have achieved Noob Gains in AI | phdead | 7mo | 21
107 | Evaluations project @ ARC is hiring a researcher and a webdev/engineer | Beth Barnes | 3mo | 7
104 | Will we run out of ML data? Evidence from projecting dataset size trends | Pablo Villalobos | 1mo | 12
77 | OpenAI Solves (Some) Formal Math Olympiad Problems | Michaël Trazzi | 10mo | 26
76 | The alignment problem in different capability regimes | Buck | 1y | 12
43 | Remaking EfficientZero (as best I can) | Hoagy | 5mo | 9
32 | It matters when the first sharp left turn happens | Adam Jermyn | 2mo | 9
32 | Misc. questions about EfficientZero | Daniel Kokotajlo | 1y | 17
2 | Epistemic Strategies of Safety-Capabilities Tradeoffs | adamShimi | 1y | 0