Tags: AI (1446 posts), Interpretability (ML & AI), AI Timelines, GPT, Research Agendas, Value Learning, AI Takeoff, Conjecture (org), Embedded Agency, Machine Learning (ML), Eliciting Latent Knowledge (ELK), Community (118 posts), Inner Alignment, Optimization, Solomonoff Induction, Predictive Processing, Selection vs Control, Neocortex, Mesa-Optimization, Neuroscience, Priors, AI Services (CAIS), Occam's Razor, General Intelligence
Karma | Title | Author | Posted | Comments
759 | Simulators | janus | 3mo | 103
503 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83
494 | chinchilla's wild implications | nostalgebraist | 4mo | 114
486 | What 2026 looks like | Daniel Kokotajlo | 1y | 98
422 | A Mechanistic Interpretability Analysis of Grokking | Neel Nanda | 4mo | 39
410 | DeepMind alignment team opinions on AGI ruin arguments | Vika | 4mo | 34
409 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257
334 | EfficientZero: How It Works | 1a3orn | 1y | 42
324 | The Parable of Predict-O-Matic | abramdemski | 3y | 42
315 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60
307 | A challenge for AGI organizations, and a challenge for readers | Rob Bensinger | 19d | 30
297 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54
296 | Are we in an AI overhang? | Andy Jones | 2y | 109
285 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81
228 | The ground of optimization | Alex Flint | 2y | 74
184 | Risks from Learned Optimization: Introduction | evhub | 3y | 42
175 | Inner Alignment: Explain like I'm 12 Edition | Rafael Harth | 2y | 46
174 | My computational framework for the brain | Steven Byrnes | 2y | 26
143 | Matt Botvinick on the spontaneous emergence of learning algorithms | Adam Scholl | 2y | 87
141 | Selection vs Control | abramdemski | 3y | 25
140 | Reframing Superintelligence: Comprehensive AI Services as General Intelligence | Rohin Shah | 3y | 75
134 | The Solomonoff Prior is Malign | Mark Xu | 2y | 52
132 | A Semitechnical Introductory Dialogue on Solomonoff Induction | Eliezer Yudkowsky | 1y | 34
131 | Inner Alignment in Salt-Starved Rats | Steven Byrnes | 2y | 39
127 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
111 | What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems? | johnswentworth | 4mo | 15
111 | Theoretical Neuroscience For Alignment Theory | Cameron Berg | 1y | 19
102 | Inner and outer alignment decompose one hard problem into two extremely hard problems | TurnTrout | 18d | 18