42 posts
Outer Alignment
Mesa-Optimization
Neuroscience
Neuromorphic AI
Predictive Processing
Neocortex
Computing Overhang
Planning & Decision-Making
Intentionality
Hansonian Pre-Rationality
Emergent Behavior (Emergence)
29 posts
Optimization
General Intelligence
AI Services (CAIS)
Selection vs Control
Distinctions
Adaptation Executors
Narrow AI
World Modeling Techniques
56
Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)
LawrenceC
4d
10
47
My take on Jacob Cannell’s take on AGI safety
Steven Byrnes
22d
13
65
Human Mimicry Mainly Works When We’re Already Close
johnswentworth
4mo
16
8
Inner alignment: what are we pointing at?
lcmgcd
3mo
2
16
Outer alignment and imitative amplification
evhub
2y
11
59
Multi-agent predictive minds and AI alignment
Jan_Kulveit
4y
18
71
Risks from Learned Optimization: Conclusion and Related Work
evhub
3y
4
40
The Steering Problem
paulfchristiano
4y
12
19
Gary Marcus vs Cortical Uniformity
Steven Byrnes
2y
0
47
What Decision Theory is Implied By Predictive Processing?
johnswentworth
2y
17
24
Towards an Intentional Research Agenda
romeostevensit
3y
8
56
Formal Solution to the Inner Alignment Problem
michaelcohen
1y
123
14
Minimization of prediction error as a foundation for human values in AI alignment
Gordon Seidoh Worley
3y
42
79
Inner alignment in the brain
Steven Byrnes
2y
16
32
Don't align agents to evaluations of plans
TurnTrout
24d
46
8
Take 6: CAIS is actually Orwellian.
Charlie Steiner
13d
5
111
What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?
johnswentworth
4mo
15
45
Humans aren't fitness maximizers
So8res
2mo
45
49
Mesa-Optimizers vs “Steered Optimizers”
Steven Byrnes
2y
7
69
Bottle Caps Aren't Optimisers
DanielFilan
4y
21
13
Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4)
Davidmanheim
3y
3
44
Aligning a toy model of optimization
paulfchristiano
3y
26
75
Comments on CAIS
Richard_Ngo
3y
14
18
Motivations, Natural Selection, and Curriculum Engineering
Oliver Sourbut
1y
0
10
A summary of aligning narrowly superhuman models
gugu
10mo
0
141
Selection vs Control
abramdemski
3y
25
35
Optimization Provenance
Adele Lopez
3y
5
14
Quantifying General Intelligence
JasonBrown
6mo
6