Tags (344 posts): Research Agendas, Value Learning, Reinforcement Learning, Embedded Agency, Suffering, AI Capabilities, Agency, Animal Welfare, Inverse Reinforcement Learning, Risks of Astronomical Suffering (S-risks), Wireheading, Robust Agents
Tags (14230 posts): Decision Theory, Utility Functions, Counterfactuals, Goal-Directedness, Nutrition, Newcomb's Problem, VNM Theorem, Updateless Decision Theory, Timeless Decision Theory, Literature Reviews, Functional Decision Theory, Counterfactual Mugging
| Score | Title | Author | Age | Comments |
|---|---|---|---|---|
| 276 | Is AI Progress Impossible To Predict? | alyssavance | 7mo | 38 |
| 273 | EfficientZero: How It Works | 1a3orn | 1y | 42 |
| 258 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81 |
| 252 | Reward is not the optimization target | TurnTrout | 4mo | 97 |
| 248 | Humans are very reliable agents | alyssavance | 6mo | 35 |
| 198 | Embedded Agents | abramdemski | 4y | 41 |
| 168 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14 |
| 167 | Are wireheads happy? | Scott Alexander | 12y | 107 |
| 145 | Introduction to Cartesian Frames | Scott Garrabrant | 2y | 29 |
| 143 | Embedded Agency (full-text version) | Scott Garrabrant | 4y | 15 |
| 134 | EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised | gwern | 1y | 52 |
| 131 | Demand offsetting | paulfchristiano | 1y | 38 |
| 130 | Being a Robust Agent | Raemon | 4y | 32 |
| 112 | Our take on CHAI’s research agenda in under 1500 words | Alex Flint | 2y | 19 |
| Score | Title | Author | Age | Comments |
|---|---|---|---|---|
| 157 | Impossibility results for unbounded utilities | paulfchristiano | 10mo | 104 |
| 147 | Can you control the past? | Joe Carlsmith | 1y | 93 |
| 142 | Decision theory does not imply that we get to have nice things | So8res | 2mo | 53 |
| 139 | Coherent decisions imply consistent utilities | Eliezer Yudkowsky | 3y | 81 |
| 137 | 2020 AI Alignment Literature Review and Charity Comparison | Larks | 1y | 14 |
| 130 | Saving Time | Scott Garrabrant | 1y | 19 |
| 128 | Newcomb's Problem and Regret of Rationality | Eliezer Yudkowsky | 14y | 614 |
| 128 | An Orthodox Case Against Utility Functions | abramdemski | 2y | 53 |
| 122 | How I Lost 100 Pounds Using TDT | Zvi | 11y | 244 |
| 119 | why assume AGIs will optimize for fixed goals? | nostalgebraist | 6mo | 52 |
| 114 | Decision Theory | abramdemski | 4y | 46 |
| 112 | Decision Theory FAQ | lukeprog | 9y | 484 |
| 108 | Humans are utility monsters | PhilGoetz | 9y | 217 |
| 106 | Decision Theories: A Less Wrong Primer | orthonormal | 10y | 174 |