Tree of Tags

2595 posts: AI, AI Timelines, AI Takeoff, Interpretability (ML & AI), Careers, Instrumental Convergence, Iterated Amplification, Corrigibility, Audio, Debate (AI safety technique), Infra-Bayesianism, DeepMind

488 posts: GPT, Conjecture (org), Art, Music, Machine Learning (ML), Bounties & Prizes (active), OpenAI, QURI, Language Models, Project Announcement, DALL-E, Meta-Humor

296 DeepMind alignment team opinions on AGI ruin arguments

Vika

4mo

34

275 What should you change in response to an "emergency"? And AI risk

AnnaSalamon

5mo

60

242 Two-year update on my personal AI timelines

Ajeya Cotra

4mo

60

230 A Mechanistic Interpretability Analysis of Grokking

Neel Nanda

4mo

39

228 Visible Thoughts Project and Bounty Announcement

So8res

1y

104

218 DeepMind: Generally capable agents emerge from open-ended play

Daniel Kokotajlo

1y

53

217 Contra Hofstadter on GPT-3 Nonsense

rictic

6mo

22

216 Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra

5mo

89

207 A challenge for AGI organizations, and a challenge for readers

Rob Bensinger

19d

30

203 larger language models may disappoint you [or, an eternally unfinished draft]

nostalgebraist

1y

29

202 Safetywashing

Adam Scholl

5mo

17

200 Are we in an AI overhang?

Andy Jones

2y

109

199 AI alignment is distinct from its near-term applications

paulfchristiano

7d

5

199 Don't die with dignity; instead play to your outs

Jeffrey Ladish

8mo

58

287 What DALL-E 2 can and cannot do

Swimmer963

7mo

305

207 chinchilla's wild implications

nostalgebraist

4mo

114

202 Hiring engineers and researchers to help align GPT-3

paulfchristiano

2y

14

195 The case for aligning narrowly superhuman models

Ajeya Cotra

1y

74

188 Common misconceptions about OpenAI

Jacob_Hilton

3mo

138

179 New Scaling Laws for Large Language Models

1a3orn

8mo

21

169 dalle2 comments

nostalgebraist

7mo

13

164 Mysteries of mode collapse

janus

1mo

35

160 Jailbreaking ChatGPT on Release Day

Zvi

18d

74

155 Language models seem to be much better than humans at next-token prediction

Buck

4mo

56

151 Transformer Circuits

evhub

12mo

4

142 Humans Who Are Not Concentrating Are Not General Intelligences

sarahconstantin

3y

35

136 interpreting GPT: the logit lens

nostalgebraist

2y

32

136 Simulators

janus

3mo

103