46 posts. Tags: Research Agendas, Game Theory, Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group, Mind Crime
65 posts. Tags: Iterated Amplification, Debate (AI safety technique), Factored Cognition, Humans Consulting HCH, Ought, Adversarial Collaboration, Delegation
Karma | Title | Author | Posted | Comments
54 | «Boundaries», Part 3b: Alignment problems in terms of boundaries | Andrew_Critch | 6d | 2
35 | My AGI safety research—2022 review, ’23 plans | Steven Byrnes | 6d | 6
231 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81
155 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14
204 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38
98 | Threat-Resistant Bargaining Megapost: Introducing the ROSE Value | Diffractor | 2mo | 11
16 | Theories of impact for Science of Deep Learning | Marius Hobbhahn | 19d | 0
155 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26
120 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11
10 | Distilled Representations Research Agenda | Hoagy | 2mo | 2
131 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35
69 | Research agenda update | Steven Byrnes | 1y | 40
122 | Our take on CHAI’s research agenda in under 1500 words | Alex Flint | 2y | 19
44 | My take on higher-order game theory | Nisan | 1y | 6
47 | Take 9: No, RLHF/IDA/debate doesn't solve outer alignment. | Charlie Steiner | 8d | 14
52 | Notes on OpenAI’s alignment plan | Alex Flint | 12d | 5
85 | Rant on Problem Factorization for Alignment | johnswentworth | 4mo | 48
122 | Supervise Process, not Outcomes | stuhlmueller | 8mo | 8
31 | A Library and Tutorial for Factored Cognition with Language Models | stuhlmueller | 2mo | 0
21 | Ought will host a factored cognition “Lab Meeting” | jungofthewon | 3mo | 1
139 | Debate update: Obfuscated arguments problem | Beth Barnes | 1y | 21
35 | A Small Negative Result on Debate | Sam Bowman | 8mo | 11
106 | Imitative Generalisation (AKA 'Learning the Prior') | Beth Barnes | 1y | 14
70 | Garrabrant and Shah on human modeling in AGI | Rob Bensinger | 1y | 10
124 | My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda | Chi Nguyen | 2y | 21
80 | Why I'm excited about Debate | Richard_Ngo | 1y | 12
88 | Model splintering: moving from one imperfect model to another | Stuart_Armstrong | 2y | 10
97 | Writeup: Progress on AI Safety via Debate | Beth Barnes | 2y | 18