Tree of Tags

344 posts: Research Agendas, Value Learning, Reinforcement Learning, Embedded Agency, Suffering, AI Capabilities, Agency, Animal Welfare, Inverse Reinforcement Learning, Risks of Astronomical Suffering (S-risks), Wireheading, Robust Agents

14230 posts: Decision Theory, Utility Functions, Counterfactuals, Goal-Directedness, Nutrition, Newcomb's Problem, VNM Theorem, Updateless Decision Theory, Timeless Decision Theory, Literature Reviews, Functional Decision Theory, Counterfactual Mugging

286 Reward is not the optimization target

TurnTrout

4mo

97

7 Note on algorithms with multiple trained components

Steven Byrnes

6h

1

34 My AGI safety research—2022 review, ’23 plans

Steven Byrnes

6d

6

300 On how various plans miss the hard bits of the alignment challenge

So8res

5mo

81

109 Will we run out of ML data? Evidence from projecting dataset size trends

Pablo Villalobos

1mo

12

91 When AI solves a game, focus on the game's mechanics, not its theme.

Cleo Nardo

27d

7

75 Seriously, what goes wrong with "reward the agent when it makes you smile"?

TurnTrout

4mo

41

23 Should you refrain from having children because of the risk posed by artificial intelligence?

Mientras

3mo

28

26 generalized wireheading

carado

1mo

7

35 What's the Most Impressive Thing That GPT-4 Could Plausibly Do?

bayesed

3mo

24

11 AGIs may value intrinsic rewards more than extrinsic ones

catubc

1mo

6

25 Latent Variables and Model Mis-Specification

jsteinhardt

4y

7

190 Some conceptual alignment research projects

Richard_Ngo

3mo

14

10 Stable Pointers to Value: An Agent Embedded in Its Own Utility Function

abramdemski

5y

9

28 K-complexity is silly; use cross-entropy instead

So8res

1h

4

170 Can you control the past?

Joe Carlsmith

1y

93

-6 Ponzi schemes can be highly profitable if your timing is good

GeneSmith

8d

18

36 Take 7: You should talk about "the human's utility function" less.

Charlie Steiner

12d

22

96 wrapper-minds are the enemy

nostalgebraist

6mo

36

46 What videos should Rational Animations make?

Writer

24d

23

146 Decision theory does not imply that we get to have nice things

So8res

2mo

53

23 "Attention Passengers": not for Signs

jefftk

13d

10

48 Notes on "Can you control the past"

So8res

2mo

40

36 Humans do acausal coordination all the time

Adam Jermyn

1mo

36

19 Decision Theory but also Ghosts

eva_

1mo

21

7 Cerebras Systems unveils a record 1.2 trillion transistor chip for AI

avturchin

3y

4

-31 ChatGPT's new novel rationality technique of fact checking

ChristianKl

9d

5

24 Two New Newcomb Variants

eva_

1mo

22