71 posts
Outer Alignment
Optimization
Mesa-Optimization
Neuroscience
Neuromorphic AI
General Intelligence
Predictive Processing
AI Services (CAIS)
Selection vs Control
Neocortex
Distinctions
Computing Overhang
47 posts
Inner Alignment
Solomonoff Induction
Priors
Occam's Razor
206 | The ground of optimization | Alex Flint | 2y | 74
151 | Matt Botvinick on the spontaneous emergence of learning algorithms | Adam Scholl | 2y | 87
148 | Risks from Learned Optimization: Introduction | evhub | 3y | 42
141 | Inner Alignment in Salt-Starved Rats | Steven Byrnes | 2y | 39
137 | Selection vs Control | abramdemski | 3y | 25
123 | Book review: "A Thousand Brains" by Jeff Hawkins | Steven Byrnes | 1y | 18
114 | My computational framework for the brain | Steven Byrnes | 2y | 26
102 | Optimization Amplifies | Scott Garrabrant | 4y | 12
96 | Reframing Superintelligence: Comprehensive AI Services as General Intelligence | Rohin Shah | 3y | 75
95 | What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems? | johnswentworth | 4mo | 15
89 | Bottle Caps Aren't Optimisers | DanielFilan | 4y | 21
85 | Risks from Learned Optimization: Conclusion and Related Work | evhub | 3y | 4
80 | Reflective Bayesianism | abramdemski | 1y | 27
78 | "Inner Alignment Failures" Which Are Actually Outer Alignment Failures | johnswentworth | 2y | 38
175 | Inner Alignment: Explain like I'm 12 Edition | Rafael Harth | 2y | 46
162 | The Solomonoff Prior is Malign | Mark Xu | 2y | 52
122 | A Semitechnical Introductory Dialogue on Solomonoff Induction | Eliezer Yudkowsky | 1y | 34
113 | Demons in Imperfect Search | johnswentworth | 2y | 21
103 | Gradient hacking | evhub | 3y | 39
98 | The Inner Alignment Problem | evhub | 3y | 17
96 | Tessellating Hills: a toy model for demons in imperfect search | DaemonicSigil | 2y | 17
94 | Learning the prior | paulfchristiano | 2y | 29
90 | Inner and outer alignment decompose one hard problem into two extremely hard problems | TurnTrout | 18d | 18
87 | Open question: are minimal circuits daemon-free? | paulfchristiano | 4y | 70
79 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
79 | Are minimal circuits deceptive? | evhub | 3y | 11
77 | 2-D Robustness | vlad_m | 3y | 8
73 | Concrete experiments in inner alignment | evhub | 3y | 12