Tree of Tags

71 posts: Outer Alignment, Optimization, Mesa-Optimization, Neuroscience, Neuromorphic AI, General Intelligence, Predictive Processing, AI Services (CAIS), Selection vs Control, Neocortex, Distinctions, Computing Overhang

47 posts: Inner Alignment, Solomonoff Induction, Priors, Occam's Razor

217 | The ground of optimization | Alex Flint | 2y | 74
166 | Risks from Learned Optimization: Introduction | evhub | 3y | 42
147 | Matt Botvinick on the spontaneous emergence of learning algorithms | Adam Scholl | 2y | 87
144 | My computational framework for the brain | Steven Byrnes | 2y | 26
139 | Selection vs Control | abramdemski | 3y | 25
136 | Inner Alignment in Salt-Starved Rats | Steven Byrnes | 2y | 39
118 | Reframing Superintelligence: Comprehensive AI Services as General Intelligence | Rohin Shah | 3y | 75
110 | Book review: "A Thousand Brains" by Jeff Hawkins | Steven Byrnes | 1y | 18
103 | What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems? | johnswentworth | 4mo | 15
98 | Optimization Amplifies | Scott Garrabrant | 4y | 12
79 | Bottle Caps Aren't Optimisers | DanielFilan | 4y | 21
78 | Risks from Learned Optimization: Conclusion and Related Work | evhub | 3y | 4
78 | How uniform is the neocortex? | zhukeepa | 2y | 23
78 | How special are human brains among animal brains? | zhukeepa | 2y | 38

175 | Inner Alignment: Explain like I'm 12 Edition | Rafael Harth | 2y | 46
148 | The Solomonoff Prior is Malign | Mark Xu | 2y | 52
127 | A Semitechnical Introductory Dialogue on Solomonoff Induction | Eliezer Yudkowsky | 1y | 34
103 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
103 | Demons in Imperfect Search | johnswentworth | 2y | 21
99 | The Inner Alignment Problem | evhub | 3y | 17
99 | Gradient hacking | evhub | 3y | 39
96 | Inner and outer alignment decompose one hard problem into two extremely hard problems | TurnTrout | 18d | 18
87 | Tessellating Hills: a toy model for demons in imperfect search | DaemonicSigil | 2y | 17
81 | Open question: are minimal circuits daemon-free? | paulfchristiano | 4y | 70
79 | Learning the prior | paulfchristiano | 2y | 29
77 | 2-D Robustness | vlad_m | 3y | 8
70 | A simple environment for showing mesa misalignment | Matthew Barnett | 3y | 9
66 | Are minimal circuits deceptive? | evhub | 3y | 11