83 posts

Tags: AI Risk · Goodhart's Law · Corrigibility · Instrumental Convergence · Treacherous Turn · Programming · 2017-2019 AI Alignment Prize · Satisficer · LessWrong Event Transcripts · Modeling People · Petrov Day · World Optimization · Threat Models · Existential Risk · Coordination / Cooperation · Academic Papers · AI Safety Camp · Practical · Ethics & Morality · Symbol Grounding · Security Mindset · Sharp Left Turn · Fiction
Karma · Title · Author · Age · Comments

58 · You can still fetch the coffee today if you're dead tomorrow · davidad · 11d · 15 comments
336 · Counterarguments to the basic AI x-risk case · KatjaGrace · 2mo · 122 comments
103 · AI will change the world, but won’t take it over by playing “3-dimensional chess”. · boazbarak · 28d · 86 comments
724 · AGI Ruin: A List of Lethalities · Eliezer Yudkowsky · 6mo · 653 comments
98 · Niceness is unnatural · So8res · 2mo · 18 comments
144 · Worlds Where Iterative Design Fails · johnswentworth · 3mo · 26 comments
72 · What does it mean for an AGI to be 'safe'? · So8res · 2mo · 32 comments
148 · AGI ruin scenarios are likely (and disjunctive) · So8res · 4mo · 37 comments
58 · Eli's review of "Is power-seeking AI an existential risk?" · elifland · 2mo · 0 comments
93 · The alignment problem from a deep learning perspective · Richard_Ngo · 4mo · 13 comments
36 · Empowerment is (almost) All We Need · jacob_cannell · 1mo · 43 comments
85 · Oversight Misses 100% of Thoughts The AI Does Not Think · johnswentworth · 4mo · 49 comments
13 · Corrigibility Via Thought-Process Deference · Thane Ruthenis · 26d · 5 comments
109 · Let's See You Write That Corrigibility Tag · Eliezer Yudkowsky · 6mo · 67 comments
155 · The next decades might be wild · Marius Hobbhahn · 5d · 21 comments
39 · AI Neorealism: a threat model & success criterion for existential safety · davidad · 5d · 0 comments
94 · Thoughts on AGI organizations and capabilities work · Rob Bensinger · 13d · 17 comments
48 · Deconfusing Direct vs Amortised Optimization · beren · 18d · 6 comments
36 · Refining the Sharp Left Turn threat model, part 2: applying alignment techniques · Vika · 25d · 4 comments
93 · Don't leave your fingerprints on the future · So8res · 2mo · 32 comments
253 · A central AI alignment problem: capabilities generalization, and the sharp left turn · So8res · 6mo · 48 comments
134 · AI coordination needs clear wins · evhub · 3mo · 15 comments
270 · Six Dimensions of Operational Adequacy in AGI Projects · Eliezer Yudkowsky · 6mo · 65 comments
386 · It Looks Like You're Trying To Take Over The World · gwern · 9mo · 125 comments
41 · Some advice on independent research · Marius Hobbhahn · 1mo · 4 comments
118 · An Update on Academia vs. Industry (one year into my faculty job) · David Scott Krueger (formerly: capybaralet) · 3mo · 18 comments
78 · Nearcast-based "deployment problem" analysis · HoldenKarnofsky · 3mo · 2 comments
88 · Linkpost: Github Copilot productivity experiment · Daniel Kokotajlo · 3mo · 4 comments