94 posts
Rationality
Abstraction
Finite Factored Sets
Causality
Open Problems
Consequentialism
Filtered Evidence
Techniques
Consciousness
Intuition
Free Will
Adding Up to Normality
83 posts
Decision Theory
Goal-Directedness
Utility Functions
Literature Reviews
Quantilization
Mild Optimization
Coherence Arguments
Bounded Rationality
Orthogonality Thesis
Law-Thinking
Coherent Extrapolated Volition
Indexical Information
125
Finite Factored Sets in Pictures
Magdalena Wache
9d
29
40
Counterfactability
Scott Garrabrant
1mo
4
78
Builder/Breaker for Deconfusion
abramdemski
2mo
9
122
Principles for Alignment/Agency Projects
johnswentworth
5mo
20
201
What's Up With Confusingly Pervasive Consequentialism?
Raemon
11mo
88
78
Distributed Decisions
johnswentworth
6mo
4
74
[Intro to brain-like-AGI safety] 15. Conclusion: Open problems, how to help, AMA
Steven Byrnes
7mo
11
35
All the posts I will never write
Alexander Gietelink Oldenziel
4mo
8
146
Saving Time
Scott Garrabrant
1y
19
133
Finite Factored Sets
Scott Garrabrant
1y
94
80
Testing The Natural Abstraction Hypothesis: Project Update
johnswentworth
1y
17
29
Open Problems in AI X-Risk [PAIS #5]
Dan H
6mo
3
34
Exploring Finite Factored Sets with some toy examples
Thomas Kehrenberg
9mo
1
76
Search-in-Territory vs Search-in-Map
johnswentworth
1y
13
59
Take 7: You should talk about "the human's utility function" less.
Charlie Steiner
12d
22
64
Notes on "Can you control the past"
So8res
2mo
40
93
wrapper-minds are the enemy
nostalgebraist
6mo
36
92
why assume AGIs will optimize for fixed goals?
nostalgebraist
6mo
52
50
Finding Goals in the World Model
Jeremy Gillen
4mo
8
82
The "Measuring Stick of Utility" Problem
johnswentworth
6mo
22
153
2021 AI Alignment Literature Review and Charity Comparison
Larks
12mo
26
134
Can you control the past?
Joe Carlsmith
1y
93
28
Quantilizers and Generative Models
Adam Jermyn
5mo
5
137
2020 AI Alignment Literature Review and Charity Comparison
Larks
1y
14
102
Coherence arguments imply a force for goal-directed behavior
KatjaGrace
1y
27
69
When Most VNM-Coherent Preference Orderings Have Convergent Instrumental Incentives
TurnTrout
1y
4
126
An Orthodox Case Against Utility Functions
abramdemski
2y
53
72
My Current Take on Counterfactuals
abramdemski
1y
57