934 posts: AI, Value Learning, Embedded Agency, Community, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, Interviews, AI Capabilities, Inverse Reinforcement Learning
80 posts: AI Timelines, AI Takeoff, AI Persuasion, History, Forecasting & Prediction, Dialogue (format), Technological Forecasting, Forecasts (Specific Predictions), Industrial Revolution, Effective Altruism, Transformative AI, Progress Studies
79 | Towards Hodge-podge Alignment | Cleo Nardo | 1d | 20
39 | The "Minimal Latents" Approach to Natural Abstractions | johnswentworth | 22h | 6
251 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5
7 | Note on algorithms with multiple trained components | Steven Byrnes | 7h | 1
12 | Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems. | Charlie Steiner | 19h | 0
53 | Can we efficiently explain model behaviors? | paulfchristiano | 4d | 0
85 | Trying to disambiguate different questions about whether RLHF is “good” | Buck | 6d | 39
182 | Using GPT-Eliezer against ChatGPT Jailbreaking | Stuart_Armstrong | 14d | 77
13 | Event [Berkeley]: Alignment Collaborator Speed-Meeting | AlexMennen | 1d | 2
154 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
49 | Existential AI Safety is NOT separate from near-term applications | scasper | 7d | 15
29 | High-level hopes for AI alignment | HoldenKarnofsky | 5d | 3
13 | Looking for an alignment tutor | JanBrauner | 3d | 2
129 | Mechanistic anomaly detection and ELK | paulfchristiano | 25d | 17
182 | What does it take to defend the world against out-of-control AGIs? | Steven Byrnes | 1mo | 31
84 | Disagreement with bio anchors that lead to shorter timelines | Marius Hobbhahn | 1mo | 16
104 | Applying superintelligence without collusion | Eric Drexler | 1mo | 56
315 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60
13 | How promising are legal avenues to restrict AI training data? | thehalliard | 10d | 2
110 | How might we align transformative AI if it’s developed very soon? | HoldenKarnofsky | 3mo | 17
29 | The economy as an analogy for advanced AI systems | rosehadshar | 1mo | 0
62 | my current outlook on AI risk mitigation | carado | 2mo | 4
206 | Deepmind's Gato: Generalist Agent | Daniel Kokotajlo | 7mo | 61
62 | A review of the Bio-Anchors report | jylin04 | 2mo | 4
486 | What 2026 looks like | Daniel Kokotajlo | 1y | 98
84 | AI strategy nearcasting | HoldenKarnofsky | 3mo | 3
110 | Announcing Epoch: A research organization investigating the road to Transformative AI | Jsevillamol | 5mo | 2
118 | Everything I Need To Know About Takeoff Speeds I Learned From Air Conditioner Ratings On Amazon | johnswentworth | 8mo | 130