42 posts
Outer Alignment
Mesa-Optimization
Neuroscience
Neuromorphic AI
Predictive Processing
Neocortex
Computing Overhang
Planning & Decision-Making
Intentionality
Hansonian Pre-Rationality
Emergent Behavior ( Emergence )
29 posts
Optimization
General Intelligence
AI Services (CAIS)
Selection vs Control
Distinctions
Adaptation Executors
Narrow AI
World Modeling Techniques
64
Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)
LawrenceC
4d
10
75
My take on Jacob Cannell’s take on AGI safety
Steven Byrnes
22d
13
71
Human Mimicry Mainly Works When We’re Already Close
johnswentworth
4mo
16
6
Inner alignment: what are we pointing at?
lcmgcd
3mo
2
32
Outer alignment and imitative amplification
evhub
2y
11
61
Multi-agent predictive minds and AI alignment
Jan_Kulveit
4y
18
85
Risks from Learned Optimization: Conclusion and Related Work
evhub
3y
4
46
The Steering Problem
paulfchristiano
4y
12
17
Gary Marcus vs Cortical Uniformity
Steven Byrnes
2y
0
63
What Decision Theory is Implied By Predictive Processing?
johnswentworth
2y
17
16
Towards an Intentional Research Agenda
romeostevensit
3y
8
38
Formal Solution to the Inner Alignment Problem
michaelcohen
1y
123
16
Minimization of prediction error as a foundation for human values in AI alignment
Gordon Seidoh Worley
3y
42
73
Inner alignment in the brain
Steven Byrnes
2y
16
42
Don't align agents to evaluations of plans
TurnTrout
24d
46
20
Take 6: CAIS is actually Orwellian.
Charlie Steiner
13d
5
95
What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?
johnswentworth
4mo
15
59
Humans aren't fitness maximizers
So8res
2mo
45
41
Mesa-Optimizers vs “Steered Optimizers”
Steven Byrnes
2y
7
89
Bottle Caps Aren't Optimisers
DanielFilan
4y
21
17
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4)
Davidmanheim
3y
3
60
Aligning a toy model of optimization
paulfchristiano
3y
26
77
Comments on CAIS
Richard_Ngo
3y
14
14
Motivations, Natural Selection, and Curriculum Engineering
Oliver Sourbut
1y
0
6
A summary of aligning narrowly superhuman models
gugu
10mo
0
137
Selection vs Control
abramdemski
3y
25
41
Optimization Provenance
Adele Lopez
3y
5
4
Quantifying General Intelligence
JasonBrown
6mo
6