AI (3083 posts): GPT, AI Timelines, Machine Learning (ML), AI Takeoff, Interpretability (ML & AI), Language Models, Conjecture (org), Careers, Instrumental Convergence, Iterated Amplification, Art

Anthropics (763 posts): Existential Risk, Whole Brain Emulation, Sleeping Beauty Paradox, Threat Models, Academic Papers, Space Exploration & Colonization, Great Filter, Paradoxes, Extraterrestrial Life, Pascal's Mugging, Longtermism
Karma | Title | Author | Posted | Comments
27 | Discovering Language Model Behaviors with Model-Written Evaluations | evhub | 4h | 3
40 | Towards Hodge-podge Alignment | Cleo Nardo | 1d | 20
10 | An Open Agency Architecture for Safe Transformative AI | davidad | 11h | 11
108 | The next decades might be wild | Marius Hobbhahn | 5d | 21
0 | I believe some AI doomers are overconfident | FTPickle | 6h | 4
33 | The "Minimal Latents" Approach to Natural Abstractions | johnswentworth | 22h | 6
57 | Reframing inner alignment | davidad | 9d | 13
3 | Will research in AI risk jinx it? Consequences of training AI on AI risk arguments | Yann Dubois | 1d | 6
70 | Bad at Arithmetic, Promising at Math | cohenmacaulay | 2d | 17
22 | Existential AI Safety is NOT separate from near-term applications | scasper | 7d | 15
43 | Next Level Seinfeld | Zvi | 1d | 6
46 | Take 9: No, RLHF/IDA/debate doesn't solve outer alignment. | Charlie Steiner | 8d | 14
13 | Will Machines Ever Rule the World? MLAISU W50 | Esben Kran | 4d | 4
106 | How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme | Collin | 5d | 18
58 | Who are some prominent reasonable people who are confident that AI won't kill everyone? | Optimization Process | 15d | 40
93 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86
87 | Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue) | Jacy Reese Anthis | 28d | 64
34 | All AGI Safety questions welcome (especially basic ones) [~monthly thread] | Robert Miles | 1mo | 100
217 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
140 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
63 | Could a single alien message destroy us? | Writer | 25d | 23
23 | Three Fables of Magical Girls and Longtermism | Ulisse Mini | 18d | 11
-3 | AGI Impossible due to Energy Constrains | TheKlaus | 20d | 13
5 | Introducing The Logical Foundation, A Plan to End Poverty With Guaranteed Income | Michael Simm | 1mo | 23
38 | All AGI safety questions welcome (especially basic ones) [July 2022] | plex | 5mo | 130
25 | AI X-risk >35% mostly based on a recent peer-reviewed argument | michaelcohen | 1mo | 31
72 | Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) | Davidmanheim | 1mo | 27
20 | Quantifying anthropic effects on the Fermi paradox | Lanrian | 3y | 5