2595 posts
AI
AI Timelines
AI Takeoff
Interpretability (ML & AI)
Careers
Instrumental Convergence
Iterated Amplification
Corrigibility
Audio
Debate (AI safety technique)
Infra-Bayesianism
DeepMind
488 posts
GPT
Conjecture (org)
Art
Music
Machine Learning (ML)
Bounties & Prizes (active)
OpenAI
QURI
Language Models
Project Announcement
DALL-E
Meta-Humor
296
DeepMind alignment team opinions on AGI ruin arguments
Vika
4mo
34
275
What should you change in response to an "emergency"? And AI risk
AnnaSalamon
5mo
60
242
Two-year update on my personal AI timelines
Ajeya Cotra
4mo
60
230
A Mechanistic Interpretability Analysis of Grokking
Neel Nanda
4mo
39
228
Visible Thoughts Project and Bounty Announcement
So8res
1y
104
218
DeepMind: Generally capable agents emerge from open-ended play
Daniel Kokotajlo
1y
53
217
Contra Hofstadter on GPT-3 Nonsense
rictic
6mo
22
216
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra
5mo
89
207
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger
19d
30
203
larger language models may disappoint you [or, an eternally unfinished draft]
nostalgebraist
1y
29
202
Safetywashing
Adam Scholl
5mo
17
200
Are we in an AI overhang?
Andy Jones
2y
109
199
AI alignment is distinct from its near-term applications
paulfchristiano
7d
5
199
Don't die with dignity; instead play to your outs
Jeffrey Ladish
8mo
58
287
What DALL-E 2 can and cannot do
Swimmer963
7mo
305
207
chinchilla's wild implications
nostalgebraist
4mo
114
202
Hiring engineers and researchers to help align GPT-3
paulfchristiano
2y
14
195
The case for aligning narrowly superhuman models
Ajeya Cotra
1y
74
188
Common misconceptions about OpenAI
Jacob_Hilton
3mo
138
179
New Scaling Laws for Large Language Models
1a3orn
8mo
21
169
dalle2 comments
nostalgebraist
7mo
13
164
Mysteries of mode collapse
janus
1mo
35
160
Jailbreaking ChatGPT on Release Day
Zvi
18d
74
155
Language models seem to be much better than humans at next-token prediction
Buck
4mo
56
151
Transformer Circuits
evhub
12mo
4
142
Humans Who Are Not Concentrating Are Not General Intelligences
sarahconstantin
3y
35
136
interpreting GPT: the logit lens
nostalgebraist
2y
32
136
Simulators
janus
3mo
103