AI (1125 posts)
Tags: Research Agendas · AI Timelines · Value Learning · AI Takeoff · Embedded Agency · Eliciting Latent Knowledge (ELK) · Community · Reinforcement Learning · Iterated Amplification · Debate (AI safety technique) · Game Theory
321 posts
Tags: Conjecture (org) · GPT · Oracle AI · Interpretability (ML & AI) · Myopia · Language Models · OpenAI · AI Boxing (Containment) · Machine Learning (ML) · DeepMind · Acausal Trade · Scaling Laws
Karma | Title | Author | Age | Comments
503 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83
486 | What 2026 looks like | Daniel Kokotajlo | 1y | 98
409 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257
334 | EfficientZero: How It Works | 1a3orn | 1y | 42
315 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60
297 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54
296 | Are we in an AI overhang? | Andy Jones | 2y | 109
285 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81
276 | Fun with +12 OOMs of Compute | Daniel Kokotajlo | 1y | 78
271 | Reward is not the optimization target | TurnTrout | 4mo | 97
265 | The Plan | johnswentworth | 1y | 77
265 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143
265 | An overview of 11 proposals for building safe advanced AI | evhub | 2y | 36
263 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53
Karma | Title | Author | Age | Comments
759 | Simulators | janus | 3mo | 103
494 | chinchilla's wild implications | nostalgebraist | 4mo | 114
422 | A Mechanistic Interpretability Analysis of Grokking | Neel Nanda | 4mo | 39
410 | DeepMind alignment team opinions on AGI ruin arguments | Vika | 4mo | 34
324 | The Parable of Predict-O-Matic | abramdemski | 3y | 42
307 | A challenge for AGI organizations, and a challenge for readers | Rob Bensinger | 19d | 30
255 | New Scaling Laws for Large Language Models | 1a3orn | 8mo | 21
254 | We Are Conjecture, A New Alignment Research Startup | Connor Leahy | 8mo | 24
253 | Common misconceptions about OpenAI | Jacob_Hilton | 3mo | 138
248 | Mysteries of mode collapse | janus | 1mo | 35
239 | The Plan - 2022 Update | johnswentworth | 19d | 33
227 | Chris Olah’s views on AGI safety | evhub | 3y | 38
223 | Conjecture: a retrospective after 8 months of work | Connor Leahy | 27d | 9
222 | The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable | beren | 22d | 27