13671 posts
Rationality
World Modeling
Practical
World Optimization
Covid-19
Community
Fiction
Site Meta
Scholarship & Learning
Politics
Book Reviews
Open Threads
18722 posts
AI
AI Risk
GPT
AI Timelines
Decision Theory
Interpretability (ML & AI)
Machine Learning (ML)
AI Takeoff
Inner Alignment
Anthropics
Research Agendas
Language Models
65
Shard Theory in Nine Theses: a Distillation and Critical Appraisal
LawrenceC
1d
9
61
AGI Timelines in Governance: Different Strategies for Different Timeframes
simeon_c
1d
14
35
Notice when you stop reading right before you understand
just_browsing
18h
4
150
How to Convince my Son that Drugs are Bad
concerned_dad
3d
77
56
Results from a survey on tool use and workflows in alignment research
jacquesthibs
1d
2
30
our deepest wishes
carado
23h
0
51
The True Spirit of Solstice?
Raemon
1d
23
8
Properties of current AIs and some predictions of the evolution of AI from the perspective of scale-free theories of agency and regulative development
Roman Leventov
6h
0
12
Under-Appreciated Ways to Use Flashcards
Florence Hinder
11h
0
22
More notes from raising a late-talking kid
Steven Byrnes
21h
1
30
Avoiding Psychopathic AI
Cameron Berg
1d
2
17
[Fiction] Unspoken Stone
Gordon Seidoh Worley
18h
0
4
Proliferating Education
Haris Rashid
4h
0
4
Reflections: Bureaucratic Hell
Haris Rashid
4h
1
28
K-complexity is silly; use cross-entropy instead
So8res
1h
4
27
Discovering Language Model Behaviors with Model-Written Evaluations
evhub
4h
3
84
Towards Hodge-podge Alignment
Cleo Nardo
1d
20
41
The "Minimal Latents" Approach to Natural Abstractions
johnswentworth
22h
6
5
Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic
Akash
2h
0
112
Bad at Arithmetic, Promising at Math
cohenmacaulay
2d
17
16
An Open Agency Architecture for Safe Transformative AI
davidad
11h
11
47
Next Level Seinfeld
Zvi
1d
6
198
The next decades might be wild
Marius Hobbhahn
5d
21
265
AI alignment is distinct from its near-term applications
paulfchristiano
7d
5
7
Note on algorithms with multiple trained components
Steven Byrnes
6h
1
140
How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme
Collin
5d
18
6
I believe some AI doomers are overconfident
FTPickle
6h
4
5
Career Scouting: Housing Coordination
koratkar
5h
0