AI (2237 posts): AI Timelines, AI Takeoff, Careers, Audio, Infra-Bayesianism, DeepMind, Interviews, SERI MATS, Dialogue (format), Agent Foundations, Redwood Research

Iterated Amplification (358 posts): Myopia, Factored Cognition, Humans Consulting HCH, Corrigibility, Interpretability (ML & AI), Debate (AI safety technique), Experiments, Self Fulfilling/Refuting Prophecies, Ought, Orthogonality Thesis, Instrumental Convergence
| Score | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 296 | DeepMind alignment team opinions on AGI ruin arguments | Vika | 4mo | 34 |
| 275 | What should you change in response to an "emergency"? And AI risk | AnnaSalamon | 5mo | 60 |
| 242 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60 |
| 228 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 218 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 217 | Contra Hofstadter on GPT-3 Nonsense | rictic | 6mo | 22 |
| 216 | Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover | Ajeya Cotra | 5mo | 89 |
| 207 | A challenge for AGI organizations, and a challenge for readers | Rob Bensinger | 19d | 30 |
| 203 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 202 | Safetywashing | Adam Scholl | 5mo | 17 |
| 200 | Are we in an AI overhang? | Andy Jones | 2y | 109 |
| 199 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5 |
| 199 | Don't die with dignity; instead play to your outs | Jeffrey Ladish | 8mo | 58 |
| 195 | What do ML researchers think about AI in 2022? | KatjaGrace | 4mo | 33 |
| 230 | A Mechanistic Interpretability Analysis of Grokking | Neel Nanda | 4mo | 39 |
| 174 | Self-fulfilling correlations | PhilGoetz | 12y | 50 |
| 172 | The Plan - 2022 Update | johnswentworth | 19d | 33 |
| 171 | Debate on Instrumental Convergence between LeCun, Russell, Bengio, Zador, and More | Ben Pace | 3y | 60 |
| 158 | MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models" | Rob Bensinger | 1y | 13 |
| 149 | A transparency and interpretability tech tree | evhub | 6mo | 10 |
| 138 | Opinions on Interpretable Machine Learning and 70 Summaries of Recent Papers | lifelonglearner | 1y | 16 |
| 134 | Soares, Tallinn, and Yudkowsky discuss AGI cognition | So8res | 1y | 35 |
| 132 | Debate update: Obfuscated arguments problem | Beth Barnes | 1y | 21 |
| 132 | Sorting Pebbles Into Correct Heaps | Eliezer Yudkowsky | 14y | 109 |
| 130 | Deep Learning Systems Are Not Less Interpretable Than Logic/Probability/Etc | johnswentworth | 6mo | 52 |
| 127 | Let's See You Write That Corrigibility Tag | Eliezer Yudkowsky | 6mo | 67 |
| 125 | Goal retention discussion with Eliezer | MaxTegmark | 8y | 26 |
| 118 | Godzilla Strategies | johnswentworth | 6mo | 65 |