3083 posts · Tags: AI, GPT, AI Timelines, Machine Learning (ML), AI Takeoff, Interpretability (ML & AI), Language Models, Conjecture (org), Careers, Instrumental Convergence, Iterated Amplification, Art
763 posts · Tags: Anthropics, Existential Risk, Whole Brain Emulation, Sleeping Beauty Paradox, Threat Models, Academic Papers, Space Exploration & Colonization, Great Filter, Paradoxes, Extraterrestrial Life, Pascal's Mugging, Longtermism
Karma | Title | Author | Posted | Comments
27 | Discovering Language Model Behaviors with Model-Written Evaluations | evhub | 4h | 3
84 | Towards Hodge-podge Alignment | Cleo Nardo | 1d | 20
16 | An Open Agency Architecture for Safe Transformative AI | davidad | 11h | 11
198 | The next decades might be wild | Marius Hobbhahn | 5d | 21
6 | I believe some AI doomers are overconfident | FTPickle | 6h | 4
41 | The "Minimal Latents" Approach to Natural Abstractions | johnswentworth | 22h | 6
37 | Reframing inner alignment | davidad | 9d | 13
7 | Will research in AI risk jinx it? Consequences of training AI on AI risk arguments | Yann Dubois | 1d | 6
112 | Bad at Arithmetic, Promising at Math | cohenmacaulay | 2d | 17
52 | Existential AI Safety is NOT separate from near-term applications | scasper | 7d | 15
47 | Next Level Seinfeld | Zvi | 1d | 6
26 | Take 9: No, RLHF/IDA/debate doesn't solve outer alignment. | Charlie Steiner | 8d | 14
11 | Will Machines Ever Rule the World? MLAISU W50 | Esben Kran | 4d | 4
140 | How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme | Collin | 5d | 18
64 | Who are some prominent reasonable people who are confident that AI won't kill everyone? | Optimization Process | 15d | 40
113 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86
103 | Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue) | Jacy Reese Anthis | 28d | 64
100 | All AGI Safety questions welcome (especially basic ones) [~monthly thread] | Robert Miles | 1mo | 100
455 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
148 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
55 | Could a single alien message destroy us? | Writer | 25d | 23
35 | Three Fables of Magical Girls and Longtermism | Ulisse Mini | 18d | 11
-13 | AGI Impossible due to Energy Constrains | TheKlaus | 20d | 13
13 | Introducing The Logical Foundation, A Plan to End Poverty With Guaranteed Income | Michael Simm | 1mo | 23
130 | All AGI safety questions welcome (especially basic ones) [July 2022] | plex | 5mo | 130
47 | AI X-risk >35% mostly based on a recent peer-reviewed argument | michaelcohen | 1mo | 31
68 | Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) | Davidmanheim | 1mo | 27
30 | Quantifying anthropic effects on the Fermi paradox | Lanrian | 3y | 5