46 posts
Factored Cognition
Experiments
Ought
AI-assisted Alignment
Memory and Mnemonics
Air Conditioning
23 posts
Debate (AI safety technique)
151
Godzilla Strategies
johnswentworth
6mo
65
118
Supervise Process, not Outcomes
stuhlmueller
8mo
8
109
Preregistration: Air Conditioner Test
johnswentworth
8mo
64
98
Solving Math Problems by Relay
bgold
2y
26
92
Beliefs and Disagreements about Automating Alignment Research
Ian McKenzie
3mo
4
87
Ought: why it matters and ways to help
paulfchristiano
3y
7
80
Air Conditioner Test Results & Discussion
johnswentworth
6mo
38
73
Rant on Problem Factorization for Alignment
johnswentworth
4mo
48
50
Experiment: a good researcher is hard to find
gwern
10y
21
48
Vaniver's View on Factored Cognition
Vaniver
3y
4
47
A Library and Tutorial for Factored Cognition with Language Models
stuhlmueller
2mo
0
45
Factored Cognition
stuhlmueller
4y
6
42
Scientific Wrestling: Beyond Passive Hypothesis-Testing
adamShimi
9mo
6
42
The Majority Is Always Wrong
Eliezer Yudkowsky
15y
54
94
Writeup: Progress on AI Safety via Debate
Beth Barnes
2y
18
92
Imitative Generalisation (AKA 'Learning the Prior')
Beth Barnes
1y
14
73
Why I'm excited about Debate
Richard_Ngo
1y
12
68
A guide to Iterated Amplification & Debate
Rafael Harth
2y
10
55
Three mental images from thinking about AGI debate & corrigibility
Steven Byrnes
2y
35
52
Looking for adversarial collaborators to test our Debate protocol
Beth Barnes
2y
5
49
How should AI debate be judged?
abramdemski
2y
27
42
A Small Negative Result on Debate
Sam Bowman
8mo
11
37
Debate Minus Factored Cognition
abramdemski
1y
42
36
AI Safety Debate and Its Applications
VojtaKovarik
3y
5
36
Take 9: No, RLHF/IDA/debate doesn't solve outer alignment.
Charlie Steiner
8d
14
35
Thoughts on AI Safety via Debate
Vaniver
4y
12
35
Can there be an indescribable hellworld?
Stuart_Armstrong
3y
19
32
New paper: (When) is Truth-telling Favored in AI debate?
VojtaKovarik
2y
7