Tags (166 posts): AI Risk, Goodhart's Law, World Optimization, Threat Models, Instrumental Convergence, Corrigibility, Existential Risk, Coordination / Cooperation, Academic Papers, AI Safety Camp, Ethics & Morality, Treacherous Turn
Tags (689 posts): Newsletters, Logical Induction, Epistemology, SERI MATS, Logical Uncertainty, Intellectual Progress (Society-Level), Practice & Philosophy of Science, AI Alignment Fieldbuilding, Distillation & Pedagogy, Bayes' Theorem, Postmortems & Retrospectives, Radical Probabilism
Top posts (first tag group):

Karma | Title | Author | Posted | Comments
462 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
256 | Six Dimensions of Operational Adequacy in AGI Projects | Eliezer Yudkowsky | 6mo | 65
255 | It Looks Like You're Trying To Take Over The World | gwern | 9mo | 125
243 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
222 | What failure looks like | paulfchristiano | 3y | 49
215 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
214 | A central AI alignment problem: capabilities generalization, and the sharp left turn | So8res | 6mo | 48
191 | Some AI research areas and their relevance to existential safety | Andrew_Critch | 2y | 40
183 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
180 | Another (outer) alignment failure story | paulfchristiano | 1y | 38
173 | Morality is Scary | Wei_Dai | 1y | 125
159 | Possible takeaways from the coronavirus pandemic for slow AI takeoff | Vika | 2y | 36
156 | Goodhart Taxonomy | Scott Garrabrant | 4y | 33
154 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew_Critch | 1y | 60
Top posts (second tag group):

Karma | Title | Author | Posted | Comments
169 | Alignment Research Field Guide | abramdemski | 3y | 9
164 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
164 | Radical Probabilism | abramdemski | 2y | 47
163 | Call For Distillers | johnswentworth | 8mo | 42
149 | Lessons learned from talking to >100 academics about AI safety | Marius Hobbhahn | 2mo | 16
139 | The Fusion Power Generator Scenario | johnswentworth | 2y | 29
133 | An Intuitive Guide to Garrabrant Induction | Mark Xu | 1y | 18
126 | Your posts should be on arXiv | JanBrauner | 3mo | 39
116 | Logical induction for software engineers | Alex Flint | 17d | 2
104 | Alignment Newsletter One Year Retrospective | Rohin Shah | 3y | 31
95 | On Solving Problems Before They Appear: The Weird Epistemologies of Alignment | adamShimi | 1y | 11
95 | Bayesian Probability is for things that are Space-like Separated from You | Scott Garrabrant | 4y | 22
95 | How to do theoretical research, a personal perspective | Mark Xu | 4mo | 4
92 | Quintin's alignment papers roundup - week 1 | Quintin Pope | 3mo | 5