934 posts
AI
Value Learning
Embedded Agency
Community
Eliciting Latent Knowledge (ELK)
Reinforcement Learning
Infra-Bayesianism
Counterfactuals
Logic & Mathematics
Interviews
AI Capabilities
Inverse Reinforcement Learning
80 posts
AI Timelines
AI Takeoff
AI Persuasion
History
Forecasting & Prediction
Dialogue (format)
Technological Forecasting
Forecasts (Specific Predictions)
Industrial Revolution
Effective Altruism
Transformative AI
Progress Studies
13 | Note on algorithms with multiple trained components | Steven Byrnes | 7h | 1
45 | Towards Hodge-podge Alignment | Cleo Nardo | 1d | 20
30 | Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems. | Charlie Steiner | 19h | 0
35 | The "Minimal Latents" Approach to Natural Abstractions | johnswentworth | 22h | 6
213 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5
73 | Can we efficiently explain model behaviors? | paulfchristiano | 4d | 0
99 | Trying to disambiguate different questions about whether RLHF is “good” | Buck | 6d | 39
23 | Event [Berkeley]: Alignment Collaborator Speed-Meeting | AlexMennen | 1d | 2
55 | High-level hopes for AI alignment | HoldenKarnofsky | 5d | 3
136 | Using GPT-Eliezer against ChatGPT Jailbreaking | Stuart_Armstrong | 14d | 77
17 | Looking for an alignment tutor | JanBrauner | 3d | 2
106 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
106 | Finding gliders in the game of life | paulfchristiano | 19d | 7
37 | Take 10: Fine-tuning with RLHF is aesthetically unsatisfying. | Charlie Steiner | 7d | 3
60 | Disagreement with bio anchors that lead to shorter timelines | Marius Hobbhahn | 1mo | 16
100 | What does it take to defend the world against out-of-control AGIs? | Steven Byrnes | 1mo | 31
70 | Applying superintelligence without collusion | Eric Drexler | 1mo | 56
259 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60
104 | How might we align transformative AI if it’s developed very soon? | HoldenKarnofsky | 3mo | 17
23 | The economy as an analogy for advanced AI systems | rosehadshar | 1mo | 0
54 | my current outlook on AI risk mitigation | carado | 2mo | 4
5 | How promising are legal avenues to restrict AI training data? | thehalliard | 10d | 2
180 | Everything I Need To Know About Takeoff Speeds I Learned From Air Conditioner Ratings On Amazon | johnswentworth | 8mo | 130
74 | AI strategy nearcasting | HoldenKarnofsky | 3mo | 3
122 | Deepmind's Gato: Generalist Agent | Daniel Kokotajlo | 7mo | 61
52 | Replacement for PONR concept | Daniel Kokotajlo | 3mo | 6
124 | Takeoff speeds have a huge effect on what it means to work on AI x-risk | Buck | 8mo | 25
51 | AGI Timelines Are Mostly Not Strategically Relevant To Alignment | johnswentworth | 3mo | 35