67 posts. Tags: Value Learning, Inverse Reinforcement Learning, The Pointers Problem, Meta-Philosophy, Metaethics, Kolmogorov Complexity, Philosophy, Book Reviews, Perceptual Control Theory
59 posts. Tags: Community, Agent Foundations, Machine Intelligence Research Institute (MIRI), Cognitive Reduction, Center for Human-Compatible AI (CHAI), Regulation and AI Risk, Grants & Fundraising Opportunities, Future of Humanity Institute (FHI), Population Ethics, Utilitarianism, Moral Uncertainty, The SF Bay Area
Karma | Title | Author | Age | Comments
109 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43
80 | Beyond Kolmogorov and Shannon | Alexander Gietelink Oldenziel | 1mo | 14
74 | Preface to the sequence on value learning | Rohin Shah | 4y | 6
71 | [Book Review] "The Alignment Problem" by Brian Christian | lsusr | 1y | 16
68 | Parsing Chris Mingard on Neural Networks | Alex Flint | 1y | 27
67 | Thoughts on "Human-Compatible" | TurnTrout | 3y | 35
67 | Don't design agents which exploit adversarial inputs | TurnTrout | 1mo | 61
58 | Humans can be assigned any values whatsoever… | Stuart_Armstrong | 4y | 26
57 | Clarifying "AI Alignment" | paulfchristiano | 4y | 82
51 | Intuitions about goal-directed behavior | Rohin Shah | 4y | 15
48 | Some Thoughts on Metaphilosophy | Wei_Dai | 3y | 27
48 | The easy goal inference problem is still hard | paulfchristiano | 4y | 19
47 | Future directions for ambitious value learning | Rohin Shah | 4y | 9
46 | Policy Alignment | abramdemski | 4y | 25
Karma | Title | Author | Age | Comments
297 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54
250 | The Rocket Alignment Problem | Eliezer Yudkowsky | 4y | 42
203 | 2018 AI Alignment Literature Review and Charity Comparison | Larks | 4y | 26
127 | 2019 AI Alignment Literature Review and Charity Comparison | Larks | 3y | 18
100 | What I’ll be doing at MIRI | evhub | 3y | 6
99 | Announcing the Introduction to ML Safety course | Dan H | 4mo | 6
94 | Call for research on evaluating alignment (funding + advice available) | Beth Barnes | 1y | 11
94 | Introducing the ML Safety Scholars Program | Dan H | 7mo | 2
93 | Full-time AGI Safety! | Steven Byrnes | 1y | 3
91 | AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022 | Sam Bowman | 3mo | 2
87 | Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility | Akash | 28d | 20
85 | Prize and fast track to alignment research at ALTER | Vanessa Kosoy | 3mo | 4
85 | Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] | habryka | 1y | 4
83 | Challenges with Breaking into MIRI-Style Research | Chris_Leong | 11mo | 15