Game Theory (26 posts)
Tags: Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group

Posts (karma | title | author | posted | comments):
 54 | «Boundaries», Part 3b: Alignment problems in terms of boundaries | Andrew_Critch | 6d | 2
155 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26
 98 | Threat-Resistant Bargaining Megapost: Introducing the ROSE Value | Diffractor | 2mo | 11
204 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38
120 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11
  1 | Announcing: Mechanism Design for AI Safety - Reading Group | Rubi J. Hudson | 4mo | 3
 26 | Sections 5 & 6: Contemporary Architectures, Humans in the Loop | JesseClifton | 3y | 4
 41 | Formal Open Problem in Decision Theory | Scott Garrabrant | 4y | 11
 24 | The Ubiquitous Converse Lawvere Problem | Scott Garrabrant | 4y | 0
 28 | Hyperreal Brouwer | Scott Garrabrant | 4y | 0
102 | What counts as defection? | TurnTrout | 2y | 21
 15 | Sections 3 & 4: Credibility, Peaceful Bargaining Mechanisms | JesseClifton | 3y | 2
 44 | My take on higher-order game theory | Nisan | 1y | 6
131 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35
Research Agendas (20 posts)
Tags: Goal Factoring, Mind Crime

Posts (karma | title | author | posted | comments):
 35 | My AGI safety research—2022 review, ’23 plans | Steven Byrnes | 6d | 6
231 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81
155 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14
 10 | Distilled Representations Research Agenda | Hoagy | 2mo | 2
 38 | Resources for AI Alignment Cartography | Gyrodiot | 2y | 8
 44 | Technical AGI safety research outside AI | Richard_Ngo | 3y | 3
 36 | Why I am not currently working on the AAMLS agenda | jessicata | 5y | 1
 77 | The Learning-Theoretic AI Alignment Research Agenda | Vanessa Kosoy | 4y | 39
127 | Thoughts on Human Models | Ramana Kumar | 3y | 32
 82 | Research Agenda v0.9: Synthesising a human's preferences into a utility function | Stuart_Armstrong | 3y | 25
 69 | Research agenda update | Steven Byrnes | 1y | 40
 42 | New safety research agenda: scalable agent alignment via reward modeling | Vika | 4y | 13
 44 | Research Agenda in reverse: what *would* a solution look like? | Stuart_Armstrong | 3y | 25
  4 | Acknowledgements & References | JesseClifton | 3y | 0