Iterated Amplification (26 posts)

Score · Title · Author · Age · Comments
132 · Debate update: Obfuscated arguments problem · Beth Barnes · 1y · 21
117 · My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda · Chi Nguyen · 2y · 21
111 · Paul's research agenda FAQ · zhukeepa · 4y · 73
85 · Model splintering: moving from one imperfect model to another · Stuart_Armstrong · 2y · 10
61 · Relaxed adversarial training for inner alignment · evhub · 3y · 28
56 · Directions and desiderata for AI alignment · paulfchristiano · 3y · 1
50 · Notes on OpenAI’s alignment plan · Alex Flint · 12d · 5
47 · My confusions with Paul's Agenda · Vaniver · 4y · 1
45 · Understanding Iterated Distillation and Amplification: Claims and Oversight · William_S · 4y · 30
44 · Iterated Distillation and Amplification · Ajeya Cotra · 4y · 13
39 · Preface to the sequence on iterated amplification · paulfchristiano · 4y · 8
37 · Synthesizing amplification and debate · evhub · 2y · 10
36 · Disagreement with Paul: alignment induction · Stuart_Armstrong · 4y · 6
32 · Capability amplification · paulfchristiano · 3y · 8

Humans Consulting HCH (17 posts)
Score · Title · Author · Age · Comments
68 · Garrabrant and Shah on human modeling in AGI · Rob Bensinger · 1y · 10
60 · Relating HCH and Logical Induction · abramdemski · 2y · 4
48 · HCH Speculation Post #2A · Charlie Steiner · 1y · 7
45 · HCH is not just Mechanical Turk · William_S · 3y · 6
43 · What's wrong with these analogies for understanding Informed Oversight and IDA? · Wei_Dai · 3y · 3
34 · Towards formalizing universality · paulfchristiano · 3y · 19
33 · What are the differences between all the iterative/recursive approaches to AI alignment? · riceissa · 3y · 14
32 · Humans Consulting HCH · paulfchristiano · 4y · 10
32 · Universality Unwrapped · adamShimi · 2y · 2
31 · Can HCH epistemically dominate Ramanujan? · zhukeepa · 3y · 4
27 · Meta-execution · paulfchristiano · 4y · 1
20 · Epistemology of HCH · adamShimi · 1y · 2
15 · Mapping the Conceptual Territory in AI Existential Safety and Alignment · jbkjr · 1y · 0
12 · HCH and Adversarial Questions · David Udell · 10mo · 7