Tree of Tags

101 posts: Reinforcement Learning, AI Capabilities, Inverse Reinforcement Learning, Wireheading, Definitions, Reward Functions, Stag Hunt, Road To AI Safety Excellence, Goals, Prompt Engineering, EfficientZero, PaLM

63 posts: Value Learning, The Pointers Problem

276 Is AI Progress Impossible To Predict?

alyssavance

7mo

38

273 EfficientZero: How It Works

1a3orn

1y

42

252 Reward is not the optimization target

TurnTrout

4mo

97

167 Are wireheads happy?

Scott Alexander

12y

107

134 EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised

gwern

1y

52

82 Jitters No Evidence of Stupidity in RL

1a3orn

1y

18

81 When AI solves a game, focus on the game's mechanics, not its theme.

Cleo Nardo

27d

7

77 Book Review: Human Compatible

Scott Alexander

2y

6

76 Seriously, what goes wrong with "reward the agent when it makes you smile"?

TurnTrout

4mo

41

74 Will we run out of ML data? Evidence from projecting dataset size trends

Pablo Villalobos

1mo

12

67 RAISE is launching their MVP

3y

1

63 Thoughts on "Human-Compatible"

TurnTrout

3y

35

59 My take on Michael Littman on "The HCI of HAI"

Alex Flint

1y

4

58 Competitive programming with AlphaCode

Algon

10mo

37

104 The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables

johnswentworth

2y

43

76 The Urgent Meta-Ethics of Friendly Artificial Intelligence

lukeprog

11y

252

69 The E-Coli Test for AI Alignment

johnswentworth

4y

24

68 Preface to the sequence on value learning

Rohin Shah

4y

6

65 Why we need a *theory* of human values

Stuart_Armstrong

4y

15

64 Clarifying "AI Alignment"

paulfchristiano

4y

82

58 Where do selfish values come from?

Wei_Dai

11y

62

56 Humans can be assigned any values whatsoever…

Stuart_Armstrong

4y

26

52 Intuitions about goal-directed behavior

Rohin Shah

4y

15

50 The easy goal inference problem is still hard

paulfchristiano

4y

19

49 What is ambitious value learning?

Rohin Shah

4y

28

49 Conclusion to the sequence on value learning

Rohin Shah

3y

20

46 Future directions for ambitious value learning

Rohin Shah

4y

9

42 Different perspectives on concept extrapolation

Stuart_Armstrong

8mo

7