177 posts
Rationality
Decision Theory
Abstraction
Goal-Directedness
Utility Functions
Finite Factored Sets
Causality
Literature Reviews
Quantilization
Mild Optimization
Open Problems
Filtered Evidence
172 posts
World Modeling
Impact Regularization
Human Values
Shard Theory
Anthropics
Complexity of Value
Exercises / Problem-Sets
Gradient Hacking
Evolution
Fixed Point Theorems
Heuristics & Biases
Modularity
206
Realism about rationality
Richard_Ngo
4y
145
175
2021 AI Alignment Literature Review and Charity Comparison
Larks
12mo
26
171
Finite Factored Sets in Pictures
Magdalena Wache
9d
29
160
Can you control the past?
Joe Carlsmith
1y
93
146
why assume AGIs will optimize for fixed goals?
nostalgebraist
6mo
52
141
Finite Factored Sets
Scott Garrabrant
1y
94
137
What's Up With Confusingly Pervasive Consequentialism?
Raemon
11mo
88
137
2020 AI Alignment Literature Review and Charity Comparison
Larks
1y
14
133
Decision Theory
abramdemski
4y
46
130
An Orthodox Case Against Utility Functions
abramdemski
2y
53
121
Problem relaxation as a tactic
TurnTrout
2y
8
114
Saving Time
Scott Garrabrant
1y
19
108
Principles for Alignment/Agency Projects
johnswentworth
5mo
20
99
Utility ≠ Reward
vlad_m
3y
25
981
Where I agree and disagree with Eliezer
paulfchristiano
6mo
205
381
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra
5mo
89
249
The shard theory of human values
Quintin Pope
3mo
57
196
Utility Maximization = Description Length Minimization
johnswentworth
1y
40
191
Humans provide an untapped wealth of evidence about alignment
TurnTrout
5mo
92
163
Evolution of Modularity
johnswentworth
3y
12
137
My research methodology
paulfchristiano
1y
36
134
Towards a New Impact Measure
TurnTrout
4y
159
131
Testing The Natural Abstraction Hypothesis: Project Intro
johnswentworth
1y
34
120
Shard Theory: An Overview
David Udell
4mo
34
106
Selection Theorems: A Program For Understanding Agents
johnswentworth
1y
23
101
Fixing The Good Regulator Theorem
johnswentworth
1y
25
97
Reframing Impact
TurnTrout
3y
15
93
Frequent arguments about alignment
John Schulman
1y
16