Game Theory — 26 posts
Related tags: Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group
Research Agendas — 20 posts
Related tags: Goal Factoring, Mind Crime
Posts tagged "Game Theory":

| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 44 | «Boundaries», Part 3b: Alignment problems in terms of boundaries | Andrew_Critch | 6d | 2 |
| 174 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38 |
| 80 | Threat-Resistant Bargaining Megapost: Introducing the ROSE Value | Diffractor | 2mo | 11 |
| 125 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26 |
| 84 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11 |
| 33 | Announcing: Mechanism Design for AI Safety - Reading Group | Rubi J. Hudson | 4mo | 3 |
| 73 | CLR's recent work on multi-agent systems | JesseClifton | 1y | 1 |
| 143 | The Commitment Races problem | Daniel Kokotajlo | 3y | 39 |
| 81 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35 |
| 32 | My take on higher-order game theory | Nisan | 1y | 6 |
| 56 | When Hindsight Isn't 20/20: Incentive Design With Imperfect Credit Allocation | johnswentworth | 2y | 24 |
| 77 | Preface to CLR's Research Agenda on Cooperation, Conflict, and TAI | JesseClifton | 3y | 10 |
| 60 | What counts as defection? | TurnTrout | 2y | 21 |
| 41 | Sections 1 & 2: Introduction, Strategy and Governance | JesseClifton | 3y | 5 |
Posts tagged "Research Agendas":

| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 33 | My AGI safety research—2022 review, ’23 plans | Steven Byrnes | 6d | 6 |
| 285 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81 |
| 181 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14 |
| 16 | Theories of impact for Science of Deep Learning | Marius Hobbhahn | 19d | 0 |
| 20 | Distilled Representations Research Agenda | Hoagy | 2mo | 2 |
| 102 | Our take on CHAI’s research agenda in under 1500 words | Alex Flint | 2y | 19 |
| 39 | Research agenda update | Steven Byrnes | 1y | 40 |
| 24 | New year, new research agenda post | Charlie Steiner | 11mo | 4 |
| 121 | Thoughts on Human Models | Ramana Kumar | 3y | 32 |
| 52 | Resources for AI Alignment Cartography | Gyrodiot | 2y | 8 |
| 19 | Immobile AI makes a move: anti-wireheading, ontology change, and model splintering | Stuart_Armstrong | 1y | 3 |
| 75 | The Learning-Theoretic AI Alignment Research Agenda | Vanessa Kosoy | 4y | 39 |
| 52 | Research Agenda v0.9: Synthesising a human's preferences into a utility function | Stuart_Armstrong | 3y | 25 |
| 42 | Technical AGI safety research outside AI | Richard_Ngo | 3y | 3 |