71 posts
Outer Alignment
Optimization
Mesa-Optimization
Neuroscience
Neuromorphic AI
General Intelligence
Predictive Processing
AI Services (CAIS)
Selection vs Control
Neocortex
Distinctions
Computing Overhang
47 posts
Inner Alignment
Solomonoff Induction
Priors
Occam's Razor
217 | The ground of optimization | Alex Flint | 2y | 74
166 | Risks from Learned Optimization: Introduction | evhub | 3y | 42
147 | Matt Botvinick on the spontaneous emergence of learning algorithms | Adam Scholl | 2y | 87
144 | My computational framework for the brain | Steven Byrnes | 2y | 26
139 | Selection vs Control | abramdemski | 3y | 25
136 | Inner Alignment in Salt-Starved Rats | Steven Byrnes | 2y | 39
118 | Reframing Superintelligence: Comprehensive AI Services as General Intelligence | Rohin Shah | 3y | 75
110 | Book review: "A Thousand Brains" by Jeff Hawkins | Steven Byrnes | 1y | 18
103 | What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems? | johnswentworth | 4mo | 15
98 | Optimization Amplifies | Scott Garrabrant | 4y | 12
79 | Bottle Caps Aren't Optimisers | DanielFilan | 4y | 21
78 | Risks from Learned Optimization: Conclusion and Related Work | evhub | 3y | 4
78 | How uniform is the neocortex? | zhukeepa | 2y | 23
78 | How special are human brains among animal brains? | zhukeepa | 2y | 38
175 | Inner Alignment: Explain like I'm 12 Edition | Rafael Harth | 2y | 46
148 | The Solomonoff Prior is Malign | Mark Xu | 2y | 52
127 | A Semitechnical Introductory Dialogue on Solomonoff Induction | Eliezer Yudkowsky | 1y | 34
103 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
103 | Demons in Imperfect Search | johnswentworth | 2y | 21
99 | The Inner Alignment Problem | evhub | 3y | 17
99 | Gradient hacking | evhub | 3y | 39
96 | Inner and outer alignment decompose one hard problem into two extremely hard problems | TurnTrout | 18d | 18
87 | Tessellating Hills: a toy model for demons in imperfect search | DaemonicSigil | 2y | 17
81 | Open question: are minimal circuits daemon-free? | paulfchristiano | 4y | 70
79 | Learning the prior | paulfchristiano | 2y | 29
77 | 2-D Robustness | vlad_m | 3y | 8
70 | A simple environment for showing mesa misalignment | Matthew Barnett | 3y | 9
66 | Are minimal circuits deceptive? | evhub | 3y | 11