83 posts
Tags: AI Risk, Goodhart's Law, Corrigibility, Instrumental Convergence, Treacherous Turn, Programming, 2017-2019 AI Alignment Prize, Satisficer, LessWrong Event Transcripts, Modeling People, Petrov Day
83 posts
Tags: World Optimization, Threat Models, Existential Risk, Coordination / Cooperation, Academic Papers, AI Safety Camp, Practical, Ethics & Morality, Symbol Grounding, Security Mindset, Sharp Left Turn, Fiction
Karma | Title | Author | Age | Comments
724 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
336 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
205 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
180 | Goodhart Taxonomy | Scott Garrabrant | 4y | 33
153 | Seeking Power is Often Convergently Instrumental in MDPs | TurnTrout | 3y | 38
148 | AGI ruin scenarios are likely (and disjunctive) | So8res | 4mo | 37
144 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
111 | AI Safety "Success Stories" | Wei_Dai | 3y | 27
109 | Let's See You Write That Corrigibility Tag | Eliezer Yudkowsky | 6mo | 67
105 | The Main Sources of AI Risk? | Daniel Kokotajlo | 3y | 25
103 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86
102 | What can the principal-agent literature tell us about AI risk? | Alexis Carlier | 2y | 31
98 | Niceness is unnatural | So8res | 2mo | 18
93 | The alignment problem from a deep learning perspective | Richard_Ngo | 4mo | 13
Karma | Title | Author | Age | Comments
386 | It Looks Like You're Trying To Take Over The World | gwern | 9mo | 125
319 | What failure looks like | paulfchristiano | 3y | 49
314 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
270 | Six Dimensions of Operational Adequacy in AGI Projects | Eliezer Yudkowsky | 6mo | 65
253 | A central AI alignment problem: capabilities generalization, and the sharp left turn | So8res | 6mo | 48
210 | Another (outer) alignment failure story | paulfchristiano | 1y | 38
203 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew_Critch | 1y | 60
199 | Some AI research areas and their relevance to existential safety | Andrew_Critch | 2y | 40
175 | Morality is Scary | Wei_Dai | 1y | 125
155 | The next decades might be wild | Marius Hobbhahn | 5d | 21
143 | Reshaping the AI Industry | Thane Ruthenis | 6mo | 34
135 | Possible takeaways from the coronavirus pandemic for slow AI takeoff | Vika | 2y | 36
134 | AI coordination needs clear wins | evhub | 3mo | 15
119 | Late 2021 MIRI Conversations: AMA / Discussion | Rob Bensinger | 9mo | 208