1913 posts: AI · World Modeling · Inner Alignment · Rationality · Interpretability (ML & AI) · AI Timelines · Decision Theory · GPT · Research Agendas · Abstraction · Value Learning · Impact Regularization

855 posts: Logical Induction · Threat Models · Goodhart's Law · Practice & Philosophy of Science · Logical Uncertainty · Intellectual Progress (Society-Level) · Radical Probabilism · Epistemology · Ethics & Morality · Software Tools · Fiction · Bayes' Theorem
573 · Where I agree and disagree with Eliezer · paulfchristiano · 6mo · 205 comments
318 · DeepMind alignment team opinions on AGI ruin arguments · Vika · 4mo · 34 comments
259 · Humans are very reliable agents · alyssavance · 6mo · 35 comments
259 · Two-year update on my personal AI timelines · Ajeya Cotra · 4mo · 60 comments
258 · The Parable of Predict-O-Matic · abramdemski · 3y · 42 comments
254 · A Mechanistic Interpretability Analysis of Grokking · Neel Nanda · 4mo · 39 comments
252 · What 2026 looks like · Daniel Kokotajlo · 1y · 98 comments
242 · Visible Thoughts Project and Bounty Announcement · So8res · 1y · 104 comments
241 · Discussion with Eliezer Yudkowsky on AGI interventions · Rob Bensinger · 1y · 257 comments
239 · Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · Ajeya Cotra · 5mo · 89 comments
234 · chinchilla's wild implications · nostalgebraist · 4mo · 114 comments
233 · Reward is not the optimization target · TurnTrout · 4mo · 97 comments
231 · On how various plans miss the hard bits of the alignment challenge · So8res · 5mo · 81 comments
231 · DeepMind: Generally capable agents emerge from open-ended play · Daniel Kokotajlo · 1y · 53 comments

462 · AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 6mo · 653 comments
256 · Six Dimensions of Operational Adequacy in AGI Projects · Eliezer Yudkowsky · 6mo · 65 comments
255 · It Looks Like You're Trying To Take Over The World · gwern · 9mo · 125 comments
243 · Counterarguments to the basic AI x-risk case · KatjaGrace · 2mo · 122 comments
222 · What failure looks like · paulfchristiano · 3y · 49 comments
215 · How To Get Into Independent Research On Alignment/Agency · johnswentworth · 1y · 33 comments
214 · A central AI alignment problem: capabilities generalization, and the sharp left turn · So8res · 6mo · 48 comments
191 · Some AI research areas and their relevance to existential safety · Andrew_Critch · 2y · 40 comments
183 · Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More · Ben Pace · 3y · 60 comments
180 · Another (outer) alignment failure story · paulfchristiano · 1y · 38 comments
173 · Morality is Scary · Wei_Dai · 1y · 125 comments
169 · Alignment Research Field Guide · abramdemski · 3y · 9 comments
164 · Most People Start With The Same Few Bad Ideas · johnswentworth · 3mo · 30 comments
164 · Radical Probabilism · abramdemski · 2y · 47 comments