166 posts
AI Risk
Goodhart's Law
World Optimization
Threat Models
Instrumental Convergence
Corrigibility
Existential Risk
Coordination / Cooperation
Academic Papers
AI Safety Camp
Ethics & Morality
Treacherous Turn
689 posts
Newsletters
Logical Induction
Epistemology
SERI MATS
Logical Uncertainty
Intellectual Progress (Society-Level)
Practice & Philosophy of Science
AI Alignment Fieldbuilding
Distillation & Pedagogy
Bayes' Theorem
Postmortems & Retrospectives
Radical Probabilism
189 | The next decades might be wild | Marius Hobbhahn | 5d | 21
56 | You can still fetch the coffee today if you're dead tomorrow | davidad | 11d | 15
141 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
108 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86
82 | Thoughts on AGI organizations and capabilities work | Rob Bensinger | 13d | 17
429 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
44 | AI X-risk >35% mostly based on a recent peer-reviewed argument | michaelcohen | 1mo | 31
69 | Deconfusing Direct vs Amortised Optimization | beren | 18d | 6
9 | Corrigibility Via Thought-Process Deference | Thane Ruthenis | 26d | 5
67 | We may be able to see sharp left turns coming | Ethan Perez | 3mo | 26
80 | Don't leave your fingerprints on the future | So8res | 2mo | 32
91 | Oversight Misses 100% of Thoughts The AI Does Not Think | johnswentworth | 4mo | 49
30 | Refining the Sharp Left Turn threat model, part 2: applying alignment techniques | Vika | 25d | 4
144 | An Update on Academia vs. Industry (one year into my faculty job) | David Scott Krueger (formerly: capybaralet) | 3mo | 18
55 | Methodological Therapy: An Agenda For Tackling Research Bottlenecks | adamShimi | 2mo | 6
144 | Your posts should be on arXiv | JanBrauner | 3mo | 39
132 | Logical induction for software engineers | Alex Flint | 17d | 2
54 | Principles of Privacy for Alignment Research | johnswentworth | 4mo | 30
40 | Behaviour Manifolds and the Hessian of the Total Loss - Notes and Criticism | Spencer Becker-Kahn | 3mo | 4
25 | What are concrete examples of potential "lock-in" in AI research? | Grue_Slinky | 3y | 6
158 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
28 | [AN #112]: Engineering a Safer World | Rohin Shah | 2y | 2
167 | Conjecture: Internal Infohazard Policy | Connor Leahy | 4mo | 6
29 | Attempts at Forwarding Speed Priors | james.lucassen | 2mo | 2
13 | Rob B's Shortform Feed | Rob Bensinger | 3y | 79
73 | How to do theoretical research, a personal perspective | Mark Xu | 4mo | 4
97 | Intuitions about solving hard problems | Richard_Ngo | 7mo | 23
15 | Abram Demski's ELK thoughts and proposal - distillation | Rubi J. Hudson | 5mo | 4