125 posts
World Modeling
Impact Regularization
Anthropics
Exercises / Problem-Sets
Fixed Point Theorems
AIXI
Updateless Decision Theory
Sleeping Beauty Paradox
Cognitive Science
Extraterrestrial Life
Economics
Grabby Aliens
47 posts
Human Values
Shard Theory
Complexity of Value
Gradient Hacking
Heuristics & Biases
Evolution
Value Drift
Information Theory
Gradient Descent
Modularity
Ontology
Biology
981 | Where I agree and disagree with Eliezer | paulfchristiano | 6mo | 205
381 | Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover | Ajeya Cotra | 5mo | 89
137 | My research methodology | paulfchristiano | 1y | 36
134 | Towards a New Impact Measure | TurnTrout | 4y | 159
131 | Testing The Natural Abstraction Hypothesis: Project Intro | johnswentworth | 1y | 34
106 | Selection Theorems: A Program For Understanding Agents | johnswentworth | 1y | 23
101 | Fixing The Good Regulator Theorem | johnswentworth | 1y | 25
97 | Reframing Impact | TurnTrout | 3y | 15
93 | Frequent arguments about alignment | John Schulman | 1y | 16
89 | Worrying about the Vase: Whitelisting | TurnTrout | 4y | 26
85 | Topological Fixed Point Exercises | Scott Garrabrant | 4y | 52
78 | There is essentially one best-validated theory of cognition. | abramdemski | 1y | 34
70 | Fixed Point Exercises | Scott Garrabrant | 4y | 8
63 | The Goldbach conjecture is probably correct; so was Fermat's last theorem | Stuart_Armstrong | 2y | 27
249 | The shard theory of human values | Quintin Pope | 3mo | 57
196 | Utility Maximization = Description Length Minimization | johnswentworth | 1y | 40
191 | Humans provide an untapped wealth of evidence about alignment | TurnTrout | 5mo | 92
163 | Evolution of Modularity | johnswentworth | 3y | 12
120 | Shard Theory: An Overview | David Udell | 4mo | 34
93 | A broad basin of attraction around human values? | Wei_Dai | 8mo | 16
92 | Human values & biases are inaccessible to the genome | TurnTrout | 5mo | 51
81 | The Telephone Theorem: Information At A Distance Is Mediated By Deterministic Constraints | johnswentworth | 1y | 21
73 | Contra shard theory, in the context of the diamond maximizer problem | So8res | 2mo | 16
71 | Conditions for mathematical equivalence of Stochastic Gradient Descent and Natural Selection | Oliver Sourbut | 7mo | 12
71 | Two Neglected Problems in Human-AI Safety | Wei_Dai | 4y | 24
69 | Ten experiments in modularity, which we'd like you to run! | TheMcDouglas | 6mo | 2
63 | Why we need a *theory* of human values | Stuart_Armstrong | 4y | 15
62 | Shard Theory in Nine Theses: a Distillation and Critical Appraisal | LawrenceC | 1d | 9