42 posts
Outer Alignment
Mesa-Optimization
Neuroscience
Neuromorphic AI
Predictive Processing
Neocortex
Computing Overhang
Planning & Decision-Making
Intentionality
Hansonian Pre-Rationality
Emergent Behavior (Emergence)
29 posts
Optimization
General Intelligence
AI Services (CAIS)
Selection vs Control
Distinctions
Adaptation Executors
Narrow AI
World Modeling Techniques
60
Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)
LawrenceC
4d
10
61
My take on Jacob Cannell’s take on AGI safety
Steven Byrnes
22d
13
68
Human Mimicry Mainly Works When We’re Already Close
johnswentworth
4mo
16
7
Inner alignment: what are we pointing at?
lcmgcd
3mo
2
24
Outer alignment and imitative amplification
evhub
2y
11
60
Multi-agent predictive minds and AI alignment
Jan_Kulveit
4y
18
78
Risks from Learned Optimization: Conclusion and Related Work
evhub
3y
4
43
The Steering Problem
paulfchristiano
4y
12
18
Gary Marcus vs Cortical Uniformity
Steven Byrnes
2y
0
55
What Decision Theory is Implied By Predictive Processing?
johnswentworth
2y
17
20
Towards an Intentional Research Agenda
romeostevensit
3y
8
47
Formal Solution to the Inner Alignment Problem
michaelcohen
1y
123
15
Minimization of prediction error as a foundation for human values in AI alignment
Gordon Seidoh Worley
3y
42
76
Inner alignment in the brain
Steven Byrnes
2y
16
37
Don't align agents to evaluations of plans
TurnTrout
24d
46
14
Take 6: CAIS is actually Orwellian.
Charlie Steiner
13d
5
103
What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?
johnswentworth
4mo
15
52
Humans aren't fitness maximizers
So8res
2mo
45
45
Mesa-Optimizers vs “Steered Optimizers”
Steven Byrnes
2y
7
79
Bottle Caps Aren't Optimisers
DanielFilan
4y
21
15
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4)
Davidmanheim
3y
3
52
Aligning a toy model of optimization
paulfchristiano
3y
26
76
Comments on CAIS
Richard_Ngo
3y
14
16
Motivations, Natural Selection, and Curriculum Engineering
Oliver Sourbut
1y
0
8
A summary of aligning narrowly superhuman models
gugu
10mo
0
139
Selection vs Control
abramdemski
3y
25
38
Optimization Provenance
Adele Lopez
3y
5
9
Quantifying General Intelligence
JasonBrown
6mo
6