Game Theory (26 posts)
Related tags: Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group

Research Agendas (20 posts)
Related tags: Goal Factoring, Mind Crime
Top posts: Game Theory (karma · title · author · age · comment count)

174 · Unifying Bargaining Notions (1/2) · Diffractor · 4mo · 38
143 · The Commitment Races problem · Daniel Kokotajlo · 3y · 39
125 · «Boundaries», Part 1: a key missing concept from utility theory · Andrew_Critch · 4mo · 26
84 · Unifying Bargaining Notions (2/2) · Diffractor · 4mo · 11
81 · "Zero Sum" is a misnomer. · abramdemski · 2y · 35
80 · Threat-Resistant Bargaining Megapost: Introducing the ROSE Value · Diffractor · 2mo · 11
77 · Preface to CLR's Research Agenda on Cooperation, Conflict, and TAI · JesseClifton · 3y · 10
73 · CLR's recent work on multi-agent systems · JesseClifton · 1y · 1
67 · Siren worlds and the perils of over-optimised search · Stuart_Armstrong · 8y · 417
60 · What counts as defection? · TurnTrout · 2y · 21
58 · Reducing collective rationality to individual optimization in common-payoff games using MCMC · jessicata · 4y · 12
56 · When Hindsight Isn't 20/20: Incentive Design With Imperfect Credit Allocation · johnswentworth · 2y · 24
44 · «Boundaries», Part 3b: Alignment problems in terms of boundaries · Andrew_Critch · 6d · 2
41 · Sections 1 & 2: Introduction, Strategy and Governance · JesseClifton · 3y · 5
Top posts: Research Agendas (karma · title · author · age · comment count)

285 · On how various plans miss the hard bits of the alignment challenge · So8res · 5mo · 81
181 · Some conceptual alignment research projects · Richard_Ngo · 3mo · 14
121 · Thoughts on Human Models · Ramana Kumar · 3y · 32
102 · Our take on CHAI’s research agenda in under 1500 words · Alex Flint · 2y · 19
75 · The Learning-Theoretic AI Alignment Research Agenda · Vanessa Kosoy · 4y · 39
52 · Resources for AI Alignment Cartography · Gyrodiot · 2y · 8
52 · Research Agenda v0.9: Synthesising a human's preferences into a utility function · Stuart_Armstrong · 3y · 25
42 · Technical AGI safety research outside AI · Richard_Ngo · 3y · 3
39 · Research agenda update · Steven Byrnes · 1y · 40
33 · My AGI safety research—2022 review, ’23 plans · Steven Byrnes · 6d · 6
28 · Ultra-simplified research agenda · Stuart_Armstrong · 3y · 4
26 · New safety research agenda: scalable agent alignment via reward modeling · Vika · 4y · 13
24 · Research Agenda in reverse: what *would* a solution look like? · Stuart_Armstrong · 3y · 25
24 · New year, new research agenda post · Charlie Steiner · 11mo · 4