76 posts
Inner Alignment
Outer Alignment
Mesa-Optimization
78 posts
Neuroscience
Predictive Processing
Neuromorphic AI
Brain-Computer Interfaces
Neocortex
Neuralink
Systems Thinking
Emergent Behavior (Emergence)
108
Inner and outer alignment decompose one hard problem into two extremely hard problems
TurnTrout
18d
18
59
Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)
LawrenceC
4d
10
21
Value Formation: An Overarching Model
Thane Ruthenis
1mo
6
5
Don't you think RLHF solves outer alignment?
Raphaël S
1mo
19
29
Mesa-Optimizers via Grokking
orthonormal
14d
4
93
Trying to Make a Treacherous Mesa-Optimizer
MadHatter
1mo
13
24
Take 8: Queer the inner/outer alignment dichotomy.
Charlie Steiner
11d
2
20
Is there a demo of "You can't fetch the coffee if you're dead"?
Ram Rachum
1mo
9
84
How likely is deceptive alignment?
evhub
3mo
21
80
2-D Robustness
vlad_m
3y
8
185
Inner Alignment: Explain like I'm 12 Edition
Rafael Harth
2y
46
6
How much should we worry about mesa-optimization challenges?
sudo -i
4mo
13
20
Greed Is the Root of This Evil
Thane Ruthenis
2mo
4
15
Alignment as Game Design
Shoshannah Tekofsky
5mo
7
33
Unpacking "Shard Theory" as Hunch, Question, Theory, and Insight
Jacy Reese Anthis
1mo
8
28
Predictive Processing, Heterosexuality and Delusions of Grandeur
lsusr
3d
2
50
My take on Jacob Cannell’s take on AGI safety
Steven Byrnes
22d
13
43
AI researchers announce NeuroAI agenda
Cameron Berg
1mo
12
39
[Hebbian Natural Abstractions] Introduction
Samuel Nellessen
29d
3
22
On oxytocin-sensitive neurons in auditory cortex
Steven Byrnes
3mo
6
24
A physicist's approach to Origins of Life
pchvykov
5mo
6
28
Quick notes on “mirror neurons”
Steven Byrnes
2mo
2
45
[Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain
Steven Byrnes
10mo
12
13
(Link) I'm Missing a Chunk of My Brain
mukashi
3mo
2
138
Inner Alignment in Salt-Starved Rats
Steven Byrnes
2y
39
11
Brain-Brain communication
Jordan
11y
22
30
A future for neuroscience
Mike Johnson
4y
12
10
FAI and the Information Theory of Pleasure
johnsonmx
7y
19