Tags (1014 posts): AI, AI Timelines, Value Learning, AI Takeoff, Embedded Agency, Community, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, Interviews
Tags (111 posts): Iterated Amplification, Game Theory, Factored Cognition, Humans Consulting HCH, Research Agendas, Ought, Debate (AI safety technique), Risks of Astronomical Suffering (S-risks), Center on Long-Term Risk (CLR), Mechanism Design, Fairness, Group Rationality
| Karma | Title | Author | Age | Comments |
|---|---|---|---|---|
| 369 | What 2026 looks like | Daniel Kokotajlo | 1y | 98 |
| 344 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83 |
| 325 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257 |
| 287 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60 |
| 273 | EfficientZero: How It Works | 1a3orn | 1y | 42 |
| 255 | Are we in an AI overhang? | Andy Jones | 2y | 109 |
| 252 | Reward is not the optimization target | TurnTrout | 4mo | 97 |
| 248 | Humans are very reliable agents | alyssavance | 6mo | 35 |
| 247 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54 |
| 247 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 245 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 237 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 235 | The Plan | johnswentworth | 1y | 77 |
| 235 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143 |
| 258 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81 |
| 189 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38 |
| 168 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14 |
| 140 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26 |
| 125 | Paul's research agenda FAQ | zhukeepa | 4y | 73 |
| 125 | Debate update: Obfuscated arguments problem | Beth Barnes | 1y | 21 |
| 124 | Thoughts on Human Models | Ramana Kumar | 3y | 32 |
| 122 | The Commitment Races problem | Daniel Kokotajlo | 3y | 39 |
| 119 | My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda | Chi Nguyen | 2y | 21 |
| 118 | Supervise Process, not Outcomes | stuhlmueller | 8mo | 8 |
| 112 | Our take on CHAI's research agenda in under 1500 words | Alex Flint | 2y | 19 |
| 106 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35 |
| 102 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11 |
| 94 | Writeup: Progress on AI Safety via Debate | Beth Barnes | 2y | 18 |