Tags (1014 posts): AI, AI Timelines, Value Learning, AI Takeoff, Embedded Agency, Community, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, Interviews
Tags (111 posts): Iterated Amplification, Game Theory, Factored Cognition, Humans Consulting HCH, Research Agendas, Ought, Debate (AI safety technique), Risks of Astronomical Suffering (S-risks), Center on Long-Term Risk (CLR), Mechanism Design, Fairness, Group Rationality
| Karma | Title | Author | Age | Comments |
|---|---|---|---|---|
| 369 | What 2026 looks like | Daniel Kokotajlo | 1y | 98 |
| 344 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83 |
| 325 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257 |
| 287 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60 |
| 273 | EfficientZero: How It Works | 1a3orn | 1y | 42 |
| 255 | Are we in an AI overhang? | Andy Jones | 2y | 109 |
| 252 | Reward is not the optimization target | TurnTrout | 4mo | 97 |
| 248 | Humans are very reliable agents | alyssavance | 6mo | 35 |
| 247 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54 |
| 247 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 245 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 237 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 235 | The Plan | johnswentworth | 1y | 77 |
| 235 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143 |
| 258 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81 |
| 189 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38 |
| 168 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14 |
| 140 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26 |
| 125 | Paul's research agenda FAQ | zhukeepa | 4y | 73 |
| 125 | Debate update: Obfuscated arguments problem | Beth Barnes | 1y | 21 |
| 124 | Thoughts on Human Models | Ramana Kumar | 3y | 32 |
| 122 | The Commitment Races problem | Daniel Kokotajlo | 3y | 39 |
| 119 | My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda | Chi Nguyen | 2y | 21 |
| 118 | Supervise Process, not Outcomes | stuhlmueller | 8mo | 8 |
| 112 | Our take on CHAI's research agenda in under 1500 words | Alex Flint | 2y | 19 |
| 106 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35 |
| 102 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11 |
| 94 | Writeup: Progress on AI Safety via Debate | Beth Barnes | 2y | 18 |