46 posts
Factored Cognition
Experiments
Ought
AI-assisted Alignment
Memory and Mnemonics
Air Conditioning
23 posts
Debate (AI safety technique)
11
Alignment with argument-networks and assessment-predictions
Tor Økland Barstad
7d
3
21
Research request (alignment strategy): Deep dive on "making AI solve alignment for us"
JanBrauner
19d
3
184
Godzilla Strategies
johnswentworth
6mo
65
105
Beliefs and Disagreements about Automating Alignment Research
Ian McKenzie
3mo
4
66
A Library and Tutorial for Factored Cognition with Language Models
stuhlmueller
2mo
0
52
Ought will host a factored cognition “Lab Meeting”
jungofthewon
3mo
1
62
Rant on Problem Factorization for Alignment
johnswentworth
4mo
48
17
Provably Honest - A First Step
Srijanak De
1mo
2
80
Air Conditioner Test Results & Discussion
johnswentworth
6mo
38
112
Preregistration: Air Conditioner Test
johnswentworth
8mo
64
120
Supervise Process, not Outcomes
stuhlmueller
8mo
8
21
Discussion on utilizing AI for alignment
elifland
3mo
3
28
Making it harder for an AGI to "trick" us, with STVs
Tor Økland Barstad
5mo
5
46
A bicycle for your memory
sortega
8mo
8
26
Take 9: No, RLHF/IDA/debate doesn't solve outer alignment.
Charlie Steiner
8d
14
52
A Small Negative Result on Debate
Sam Bowman
8mo
11
13
Briefly thinking through some analogs of debate
Eli Tyre
3mo
3
32
Learning the smooth prior
Geoffrey Irving
7mo
0
82
Imitative Generalisation (AKA 'Learning the Prior')
Beth Barnes
1y
14
24
An AI-in-a-box success model
azsantosk
8mo
1
69
Why I'm excited about Debate
Richard_Ngo
1y
12
74
A guide to Iterated Amplification & Debate
Rafael Harth
2y
10
97
Writeup: Progress on AI Safety via Debate
Beth Barnes
2y
18
44
Three mental images from thinking about AGI debate & corrigibility
Steven Byrnes
2y
35
41
Looking for adversarial collaborators to test our Debate protocol
Beth Barnes
2y
5
39
How should AI debate be judged?
abramdemski
2y
27
28
Debate Minus Factored Cognition
abramdemski
1y
42
46
AI Safety Debate and Its Applications
VojtaKovarik
3y
5