Tree of Tags

177 posts: Rationality, Decision Theory, Abstraction, Goal-Directedness, Utility Functions, Finite Factored Sets, Causality, Literature Reviews, Quantilization, Mild Optimization, Open Problems, Filtered Evidence

172 posts: World Modeling, Impact Regularization, Human Values, Shard Theory, Anthropics, Complexity of Value, Exercises / Problem-Sets, Gradient Hacking, Evolution, Fixed Point Theorems, Heuristics & Biases, Modularity

206 Realism about rationality

Richard_Ngo

4y

145

175 2021 AI Alignment Literature Review and Charity Comparison

Larks

12mo

26

171 Finite Factored Sets in Pictures

Magdalena Wache

9d

29

160 Can you control the past?

Joe Carlsmith

1y

93

146 why assume AGIs will optimize for fixed goals?

nostalgebraist

6mo

52

141 Finite Factored Sets

Scott Garrabrant

1y

94

137 What's Up With Confusingly Pervasive Consequentialism?

Raemon

11mo

88

137 2020 AI Alignment Literature Review and Charity Comparison

Larks

1y

14

133 Decision Theory

abramdemski

4y

46

130 An Orthodox Case Against Utility Functions

abramdemski

2y

53

121 Problem relaxation as a tactic

TurnTrout

2y

8

114 Saving Time

Scott Garrabrant

1y

19

108 Principles for Alignment/Agency Projects

johnswentworth

5mo

20

99 Utility ≠ Reward

vlad_m

3y

25

981 Where I agree and disagree with Eliezer

paulfchristiano

6mo

205

381 Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra

5mo

89

249 The shard theory of human values

Quintin Pope

3mo

57

196 Utility Maximization = Description Length Minimization

johnswentworth

1y

40

191 Humans provide an untapped wealth of evidence about alignment

TurnTrout

5mo

92

163 Evolution of Modularity

johnswentworth

3y

12

137 My research methodology

paulfchristiano

1y

36

134 Towards a New Impact Measure

TurnTrout

4y

159

131 Testing The Natural Abstraction Hypothesis: Project Intro

johnswentworth

1y

34

120 Shard Theory: An Overview

David Udell

4mo

34

106 Selection Theorems: A Program For Understanding Agents

johnswentworth

1y

23

101 Fixing The Good Regulator Theorem

johnswentworth

1y

25

97 Reframing Impact

TurnTrout

3y

15

93 Frequent arguments about alignment

John Schulman

1y

16