Rationality (177 posts)
Tags: Decision Theory, Abstraction, Goal-Directedness, Utility Functions, Finite Factored Sets, Causality, Literature Reviews, Quantilization, Mild Optimization, Open Problems, Filtered Evidence
World Modeling (172 posts)
Tags: Impact Regularization, Human Values, Shard Theory, Anthropics, Complexity of Value, Exercises / Problem-Sets, Gradient Hacking, Evolution, Fixed Point Theorems, Heuristics & Biases, Modularity
| Karma | Title | Author | Posted | Comments |
|------:|-------|--------|--------|---------:|
| 180 | Realism about rationality | Richard_Ngo | 4y | 145 |
| 169 | What's Up With Confusingly Pervasive Consequentialism? | Raemon | 11mo | 88 |
| 164 | 2021 AI Alignment Literature Review and Charity Comparison | Larks | 12mo | 26 |
| 148 | Finite Factored Sets in Pictures | Magdalena Wache | 9d | 29 |
| 147 | Can you control the past? | Joe Carlsmith | 1y | 93 |
| 137 | Finite Factored Sets | Scott Garrabrant | 1y | 94 |
| 137 | 2020 AI Alignment Literature Review and Charity Comparison | Larks | 1y | 14 |
| 130 | Saving Time | Scott Garrabrant | 1y | 19 |
| 128 | An Orthodox Case Against Utility Functions | abramdemski | 2y | 53 |
| 119 | why assume AGIs will optimize for fixed goals? | nostalgebraist | 6mo | 52 |
| 115 | Principles for Alignment/Agency Projects | johnswentworth | 5mo | 20 |
| 114 | Decision Theory | abramdemski | 4y | 46 |
| 113 | Problem relaxation as a tactic | TurnTrout | 2y | 8 |
| 102 | Utility ≠ Reward | vlad_m | 3y | 25 |
| Karma | Title | Author | Posted | Comments |
|------:|-------|--------|--------|---------:|
| 777 | Where I agree and disagree with Eliezer | paulfchristiano | 6mo | 205 |
| 310 | Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover | Ajeya Cotra | 5mo | 89 |
| 202 | The shard theory of human values | Quintin Pope | 3mo | 57 |
| 183 | Utility Maximization = Description Length Minimization | johnswentworth | 1y | 40 |
| 175 | Humans provide an untapped wealth of evidence about alignment | TurnTrout | 5mo | 92 |
| 159 | Evolution of Modularity | johnswentworth | 3y | 12 |
| 148 | My research methodology | paulfchristiano | 1y | 36 |
| 145 | Testing The Natural Abstraction Hypothesis: Project Intro | johnswentworth | 1y | 34 |
| 130 | Shard Theory: An Overview | David Udell | 4mo | 34 |
| 123 | Fixing The Good Regulator Theorem | johnswentworth | 1y | 25 |
| 105 | A broad basin of attraction around human values? | Wei_Dai | 8mo | 16 |
| 103 | Selection Theorems: A Program For Understanding Agents | johnswentworth | 1y | 23 |
| 100 | Towards a New Impact Measure | TurnTrout | 4y | 159 |
| 95 | Frequent arguments about alignment | John Schulman | 1y | 16 |