Tags (166 posts): AI Risk, Goodhart's Law, World Optimization, Threat Models, Instrumental Convergence, Corrigibility, Existential Risk, Coordination / Cooperation, Academic Papers, AI Safety Camp, Ethics & Morality, Treacherous Turn
Tags (689 posts): Newsletters, Logical Induction, Epistemology, SERI MATS, Logical Uncertainty, Intellectual Progress (Society-Level), Practice & Philosophy of Science, AI Alignment Fieldbuilding, Distillation & Pedagogy, Bayes' Theorem, Postmortems & Retrospectives, Radical Probabilism
Score | Title | Author | Posted | Comments
986 | AGI Ruin: A List of Lethalities | Eliezer Yudkowsky | 6mo | 653
517 | It Looks Like You're Trying To Take Over The World | gwern | 9mo | 125
429 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
416 | What failure looks like | paulfchristiano | 3y | 49
413 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
292 | A central AI alignment problem: capabilities generalization, and the sharp left turn | So8res | 6mo | 48
284 | Six Dimensions of Operational Adequacy in AGI Projects | Eliezer Yudkowsky | 6mo | 65
252 | What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs) | Andrew_Critch | 1y | 60
240 | Another (outer) alignment failure story | paulfchristiano | 1y | 38
227 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60
207 | Some AI research areas and their relevance to existential safety | Andrew_Critch | 2y | 40
204 | Goodhart Taxonomy | Scott Garrabrant | 4y | 33
201 | Reshaping the AI Industry | Thane Ruthenis | 6mo | 34
189 | The next decades might be wild | Marius Hobbhahn | 5d | 21
Score | Title | Author | Posted | Comments
305 | Alignment Research Field Guide | abramdemski | 3y | 9
265 | Lessons learned from talking to >100 academics about AI safety | Marius Hobbhahn | 2mo | 16
221 | Call For Distillers | johnswentworth | 8mo | 42
167 | Conjecture: Internal Infohazard Policy | Connor Leahy | 4mo | 6
158 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
154 | Radical Probabilism | abramdemski | 2y | 47
146 | Quintin's alignment papers roundup - week 1 | Quintin Pope | 3mo | 5
144 | Your posts should be on arXiv | JanBrauner | 3mo | 39
133 | The Fusion Power Generator Scenario | johnswentworth | 2y | 29
132 | Logical induction for software engineers | Alex Flint | 17d | 2
100 | Productive Mistakes, Not Perfect Answers | adamShimi | 8mo | 11
99 | On Solving Problems Before They Appear: The Weird Epistemologies of Alignment | adamShimi | 1y | 11
97 | Intuitions about solving hard problems | Richard_Ngo | 7mo | 23
97 | An Intuitive Guide to Garrabrant Induction | Mark Xu | 1y | 18