AI (2237 posts): AI Timelines, AI Takeoff, Careers, Audio, Infra-Bayesianism, DeepMind, Interviews, SERI MATS, Dialogue (format), Agent Foundations, Redwood Research

Iterated Amplification (358 posts): Myopia, Factored Cognition, Humans Consulting HCH, Corrigibility, Interpretability (ML & AI), Debate (AI safety technique), Experiments, Self Fulfilling/Refuting Prophecies, Ought, Orthogonality Thesis, Instrumental Convergence
| Score | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 296 | DeepMind alignment team opinions on AGI ruin arguments | Vika | 4mo | 34 |
| 275 | What should you change in response to an "emergency"? And AI risk | AnnaSalamon | 5mo | 60 |
| 242 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60 |
| 228 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 218 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 217 | Contra Hofstadter on GPT-3 Nonsense | rictic | 6mo | 22 |
| 216 | Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover | Ajeya Cotra | 5mo | 89 |
| 207 | A challenge for AGI organizations, and a challenge for readers | Rob Bensinger | 19d | 30 |
| 203 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 202 | Safetywashing | Adam Scholl | 5mo | 17 |
| 200 | Are we in an AI overhang? | Andy Jones | 2y | 109 |
| 199 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5 |
| 199 | Don't die with dignity; instead play to your outs | Jeffrey Ladish | 8mo | 58 |
| 195 | What do ML researchers think about AI in 2022? | KatjaGrace | 4mo | 33 |
| 230 | A Mechanistic Interpretability Analysis of Grokking | Neel Nanda | 4mo | 39 |
| 174 | Self-fulfilling correlations | PhilGoetz | 12y | 50 |
| 172 | The Plan - 2022 Update | johnswentworth | 19d | 33 |
| 171 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60 |
| 158 | MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" | Rob Bensinger | 1y | 13 |
| 149 | A transparency and interpretability tech tree | evhub | 6mo | 10 |
| 138 | Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers | lifelonglearner | 1y | 16 |
| 134 | Soares, Tallinn, and Yudkowsky discuss AGI cognition | So8res | 1y | 35 |
| 132 | Debate update: Obfuscated arguments problem | Beth Barnes | 1y | 21 |
| 132 | Sorting Pebbles Into Correct Heaps | Eliezer Yudkowsky | 14y | 109 |
| 130 | Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc | johnswentworth | 6mo | 52 |
| 127 | Let's See You Write That Corrigibility Tag | Eliezer Yudkowsky | 6mo | 67 |
| 125 | Goal retention discussion with Eliezer | MaxTegmark | 8y | 26 |
| 118 | Godzilla Strategies | johnswentworth | 6mo | 65 |