AI (808 posts)
Embedded Agency, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, AI Capabilities, Interviews, Audio, Subagents, Wireheading

Value Learning (126 posts)
Inverse Reinforcement Learning, Machine Intelligence Research Institute (MIRI), Agent Foundations, Meta-Philosophy, Metaethics, Community, Philosophy, The Pointers Problem, Moral Uncertainty, Cognitive Reduction, Center for Human-Compatible AI (CHAI)
| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 503 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83 |
| 409 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257 |
| 334 | EfficientZero: How It Works | 1a3orn | 1y | 42 |
| 271 | Reward is not the optimization target | TurnTrout | 4mo | 97 |
| 265 | The Plan | johnswentworth | 1y | 77 |
| 265 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143 |
| 265 | An overview of 11 proposals for building safe advanced AI | evhub | 2y | 36 |
| 263 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 257 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 253 | Embedded Agents | abramdemski | 4y | 41 |
| 251 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5 |
| 248 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 247 | Optimality is the tiger, and agents are its teeth | Veedrac | 8mo | 31 |
| 237 | Humans are very reliable agents | alyssavance | 6mo | 35 |
| 297 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54 |
| 250 | The Rocket Alignment Problem | Eliezer Yudkowsky | 4y | 42 |
| 203 | 2018 AI Alignment Literature Review and Charity Comparison | Larks | 4y | 26 |
| 127 | 2019 AI Alignment Literature Review and Charity Comparison | Larks | 3y | 18 |
| 109 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43 |
| 100 | What I’ll be doing at MIRI | evhub | 3y | 6 |
| 99 | Announcing the Introduction to ML Safety course | Dan H | 4mo | 6 |
| 94 | Call for research on evaluating alignment (funding + advice available) | Beth Barnes | 1y | 11 |
| 94 | Introducing the ML Safety Scholars Program | Dan H | 7mo | 2 |
| 93 | Full-time AGI Safety! | Steven Byrnes | 1y | 3 |
| 91 | AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022 | Sam Bowman | 3mo | 2 |
| 87 | Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility | Akash | 28d | 20 |
| 85 | Prize and fast track to alignment research at ALTER | Vanessa Kosoy | 3mo | 4 |
| 85 | Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] | habryka | 1y | 4 |