AI Risk (83 posts)
Tags: Goodhart's Law, Corrigibility, Instrumental Convergence, Treacherous Turn, Programming, 2017-2019 AI Alignment Prize, Satisficer, LessWrong Event Transcripts, Modeling People, Petrov Day
World Optimization (83 posts)
Tags: Threat Models, Existential Risk, Coordination / Cooperation, Academic Papers, AI Safety Camp, Practical, Ethics & Morality, Symbol Grounding, Security Mindset, Sharp Left Turn, Fiction
Score | Title | Author | Posted | Comments
462 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
243 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
183 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
156 | Goodhart Taxonomy | Scott Garrabrant | 4y | 33
147 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
135 | AGI ruin scenarios are likely (and disjunctive) | So8res | 4mo | 37
131 | Let's See You Write That Corrigibility Tag | Eliezer Yudkowsky | 6mo | 67
124 | AI Safety "Success Stories" | Wei_Dai | 3y | 27
123 | Seeking Power is Often Convergently Instrumental in MDPs | TurnTrout | 3y | 38
113 | Niceness is unnatural | So8res | 2mo | 18
108 | The Main Sources of AI Risk? | Daniel Kokotajlo | 3y | 25
107 | What can the principal-agent literature tell us about AI risk? | Alexis Carlier | 2y | 31
101 | Announcement: AI alignment prize round 3 winners and next round | cousin_it | 4y | 7
98 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86

256 | Six Dimensions of Operational Adequacy in AGI Projects | Eliezer Yudkowsky | 6mo | 65
255 | It Looks Like You're Trying To Take Over The World | gwern | 9mo | 125
222 | What failure looks like | paulfchristiano | 3y | 49
215 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
214 | A central AI alignment problem: capabilities generalization, and the sharp left turn | So8res | 6mo | 48
191 | Some AI research areas and their relevance to existential safety | Andrew_Critch | 2y | 40
180 | Another (outer) alignment failure story | paulfchristiano | 1y | 38
173 | Morality is Scary | Wei_Dai | 1y | 125
159 | Possible takeaways from the coronavirus pandemic for slow AI takeoff | Vika | 2y | 36
154 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew_Critch | 1y | 60
132 | AI coordination needs clear wins | evhub | 3mo | 15
121 | The next decades might be wild | Marius Hobbhahn | 5d | 21
119 | List of resolved confusions about IDA | Wei_Dai | 3y | 18
118 | Late 2021 MIRI Conversations: AMA / Discussion | Rob Bensinger | 9mo | 208