Tree of Tags

1913 posts: AI, World Modeling, Inner Alignment, Rationality, Interpretability (ML & AI), AI Timelines, Decision Theory, GPT, Research Agendas, Abstraction, Value Learning, Impact Regularization

855 posts: Logical Induction, Threat Models, Goodhart's Law, Practice & Philosophy of Science, Logical Uncertainty, Intellectual Progress (Society-Level), Radical Probabilism, Epistemology, Ethics & Morality, Software Tools, Fiction, Bayes' Theorem

777 Where I agree and disagree with Eliezer

paulfchristiano

6mo

205

472 Simulators

janus

3mo

103

369 What 2026 looks like

Daniel Kokotajlo

1y

98

364 chinchilla's wild implications

nostalgebraist

4mo

114

364 DeepMind alignment team opinions on AGI ruin arguments

Vika

4mo

34

344 (My understanding of) What Everyone in Technical Alignment is Doing and Why

Thomas Larsen

3mo

83

338 A Mechanistic Interpretability Analysis of Grokking

Neel Nanda

4mo

39

325 Discussion with Eliezer Yudkowsky on AGI interventions

Rob Bensinger

1y

257

310 Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra

5mo

89

291 The Parable of Predict-O-Matic

abramdemski

3y

42

287 Two-year update on my personal AI timelines

Ajeya Cotra

4mo

60

273 EfficientZero: How It Works

1a3orn

1y

42

265 A challenge for AGI organizations, and a challenge for readers

Rob Bensinger

19d

30

258 On how various plans miss the hard bits of the alignment challenge

So8res

5mo

81

724 AGI Ruin: A List of Lethalities

Eliezer Yudkowsky

6mo

653

386 It Looks Like You're Trying To Take Over The World

gwern

9mo

125

336 Counterarguments to the basic AI x-risk case

KatjaGrace

2mo

122

319 What failure looks like

paulfchristiano

3y

49

314 How To Get Into Independent Research On Alignment/Agency

johnswentworth

1y

33

270 Six Dimensions of Operational Adequacy in AGI Projects

Eliezer Yudkowsky

6mo

65

253 A central AI alignment problem: capabilities generalization, and the sharp left turn

So8res

6mo

48

237 Alignment Research Field Guide

abramdemski

3y

9

210 Another (outer) alignment failure story

paulfchristiano

1y

38

207 Lessons learned from talking to >100 academics about AI safety

Marius Hobbhahn

2mo

16

205 Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More

Ben Pace

3y

60

203 What Multipolar Failure Looks Like, and Robust Agent-Agnostic Processes (RAAPs)

Andrew_Critch

1y

60

199 Some AI research areas and their relevance to existential safety

Andrew_Critch

2y

40

192 Call For Distillers

johnswentworth

8mo

42