94 posts: Rationality, Abstraction, Finite Factored Sets, Causality, Open Problems, Consequentialism, Filtered Evidence, Techniques, Consciousness, Intuition, Free Will, Adding Up to Normality
83 posts: Decision Theory, Goal-Directedness, Utility Functions, Literature Reviews, Quantilization, Mild Optimization, Coherence Arguments, Bounded Rationality, Orthogonality Thesis, Law-Thinking, Coherent Extrapolated Volition, Indexical Information
Karma | Title | Author | Posted | Comments
148 | Finite Factored Sets in Pictures | Magdalena Wache | 9d | 29
36 | Counterfactability | Scott Garrabrant | 1mo | 4
70 | Builder/Breaker for Deconfusion | abramdemski | 2mo | 9
115 | Principles for Alignment/Agency Projects | johnswentworth | 5mo | 20
169 | What's Up With Confusingly Pervasive Consequentialism? | Raemon | 11mo | 88
52 | All the posts I will never write | Alexander Gietelink Oldenziel | 4mo | 8
81 | [Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA | Steven Byrnes | 7mo | 11
65 | Distributed Decisions | johnswentworth | 6mo | 4
50 | Open Problems in AI X-Risk [PAIS #5] | Dan H | 6mo | 3
137 | Finite Factored Sets | Scott Garrabrant | 1y | 94
130 | Saving Time | Scott Garrabrant | 1y | 19
83 | Testing The Natural Abstraction Hypothesis: Project Update | johnswentworth | 1y | 17
34 | Exploring Finite Factored Sets with some toy examples | Thomas Kehrenberg | 9mo | 1
70 | Search-in-Territory vs Search-in-Map | johnswentworth | 1y | 13
47 | Take 7: You should talk about "the human's utility function" less. | Charlie Steiner | 12d | 22
55 | Notes on "Can you control the past" | So8res | 2mo | 40
119 | why assume AGIs will optimize for fixed goals? | nostalgebraist | 6mo | 52
92 | wrapper-minds are the enemy | nostalgebraist | 6mo | 36
55 | Finding Goals in the World Model | Jeremy Gillen | 4mo | 8
164 | 2021 AI Alignment Literature Review and Charity Comparison | Larks | 12mo | 26
69 | The "Measuring Stick of Utility" Problem | johnswentworth | 6mo | 22
147 | Can you control the past? | Joe Carlsmith | 1y | 93
24 | Quantilizers and Generative Models | Adam Jermyn | 5mo | 5
137 | 2020 AI Alignment Literature Review and Charity Comparison | Larks | 1y | 14
21 | Exploring Mild Behaviour in Embedded Agents | Megan Kinniment | 5mo | 3
88 | Coherence arguments imply a force for goal-directed behavior | KatjaGrace | 1y | 27
128 | An Orthodox Case Against Utility Functions | abramdemski | 2y | 53
52 | When Most VNM-Coherent Preference Orderings Have Convergent Instrumental Incentives | TurnTrout | 1y | 4