22 posts
Outer Alignment
Mesa-Optimization
20 posts
Neuroscience
Neuromorphic AI
Predictive Processing
Neocortex
Computing Overhang
Planning & Decision-Making
Hansonian Pre-Rationality
Intentionality
Emergent Behavior (Emergence)
56
Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)
LawrenceC
4d
10
65
Human Mimicry Mainly Works When We’re Already Close
johnswentworth
4mo
16
8
Inner alignment: what are we pointing at?
lcmgcd
3mo
2
16
Outer alignment and imitative amplification
evhub
2y
11
71
Risks from Learned Optimization: Conclusion and Related Work
evhub
3y
4
40
The Steering Problem
paulfchristiano
4y
12
56
Formal Solution to the Inner Alignment Problem
michaelcohen
1y
123
51
An Increasingly Manipulative Newsfeed
Michaël Trazzi
3y
16
184
Risks from Learned Optimization: Introduction
evhub
3y
42
44
"Inner Alignment Failures" Which Are Actually Outer Alignment Failures
johnswentworth
2y
38
26
If I were a well-intentioned AI... III: Extremal Goodhart
Stuart_Armstrong
2y
0
34
Mesa-Search vs Mesa-Control
abramdemski
2y
45
58
[AN #58] Mesa optimization: what it is, and why we should care
Rohin Shah
3y
9
20
If I were a well-intentioned AI... II: Acting in a world
Stuart_Armstrong
2y
0
47
My take on Jacob Cannell’s take on AGI safety
Steven Byrnes
22d
13
59
Multi-agent predictive minds and AI alignment
Jan_Kulveit
4y
18
19
Gary Marcus vs Cortical Uniformity
Steven Byrnes
2y
0
47
What Decision Theory is Implied By Predictive Processing?
johnswentworth
2y
17
24
Towards an Intentional Research Agenda
romeostevensit
3y
8
14
Minimization of prediction error as a foundation for human values in AI alignment
Gordon Seidoh Worley
3y
42
79
Inner alignment in the brain
Steven Byrnes
2y
16
61
Human instincts, symbol grounding, and the blank-slate neocortex
Steven Byrnes
3y
23
49
Building brain-inspired AGI is infinitely easier than understanding the brain
Steven Byrnes
2y
14
51
Brain-inspired AGI and the "lifetime anchor"
Steven Byrnes
1y
16
78
How uniform is the neocortex?
zhukeepa
2y
23
143
Matt Botvinick on the spontaneous emergence of learning algorithms
Adam Scholl
2y
87
60
Predictive coding = RL + SL + Bayes + MPC
Steven Byrnes
3y
8
174
My computational framework for the brain
Steven Byrnes
2y
26