Tags (344 posts): Research Agendas, Value Learning, Reinforcement Learning, Embedded Agency, Suffering, AI Capabilities, Agency, Animal Welfare, Inverse Reinforcement Learning, Risks of Astronomical Suffering (S-risks), Wireheading, Robust Agents
Tags (14,230 posts): Decision Theory, Utility Functions, Counterfactuals, Goal-Directedness, Nutrition, Newcomb's Problem, VNM Theorem, Updateless Decision Theory, Timeless Decision Theory, Literature Reviews, Functional Decision Theory, Counterfactual Mugging
Posts

Karma | Title | Author | Posted | Comments
----- | ----- | ------ | ------ | --------
7 | Note on algorithms with multiple trained components | Steven Byrnes | 6h | 1
34 | My AGI safety research—2022 review, '23 plans | Steven Byrnes | 6d | 6
91 | When AI solves a game, focus on the game's mechanics, not its theme. | Cleo Nardo | 27d | 7
109 | Will we run out of ML data? Evidence from projecting dataset size trends | Pablo Villalobos | 1mo | 12
124 | New book on s-risks | Tobias_Baumann | 1mo | 1
286 | Reward is not the optimization target | TurnTrout | 4mo | 97
300 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81
45 | A Short Dialogue on the Meaning of Reward Functions | Leon Lang | 1mo | 0
190 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14
15 | Riffing on the agent type | Quinn | 12d | 0
247 | Humans are very reliable agents | alyssavance | 6mo | 35
271 | Is AI Progress Impossible To Predict? | alyssavance | 7mo | 38
26 | generalized wireheading | carado | 1mo | 7
26 | LLMs may capture key components of human agency | catubc | 1mo | 0
28 | K-complexity is silly; use cross-entropy instead | So8res | 1h | 4
21 | How can one literally buy time (from x-risk) with money? | Alex_Altair | 7d | 3
36 | Take 7: You should talk about "the human's utility function" less. | Charlie Steiner | 12d | 22
22 | Using Obsidian if you're used to using Roam | Solenoid_Entity | 9d | 4
146 | Decision theory does not imply that we get to have nice things | So8res | 2mo | 53
14 | Join the AI Testing Hackathon this Friday | Esben Kran | 8d | 0
46 | What videos should Rational Animations make? | Writer | 24d | 23
23 | "Attention Passengers": not for Signs | jefftk | 13d | 10
12 | EA & LW Forums Weekly Summary (28th Nov - 4th Dec 22') | Zoe Williams | 14d | 1
48 | Notes on "Can you control the past" | So8res | 2mo | 40
36 | Humans do acausal coordination all the time | Adam Jermyn | 1mo | 36
24 | Two New Newcomb Variants | eva_ | 1mo | 22
19 | Decision Theory but also Ghosts | eva_ | 1mo | 21
155 | why assume AGIs will optimize for fixed goals? | nostalgebraist | 6mo | 52