Tree of Tags

22 posts: Outer Alignment, Mesa-Optimization

20 posts: Neuroscience, Neuromorphic AI, Predictive Processing, Neocortex, Computing Overhang, Planning & Decision-Making, Hansonian Pre-Rationality, Intentionality, Emergent Behavior (Emergence)

60 Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)

LawrenceC

4d

10

68 Human Mimicry Mainly Works When We’re Already Close

johnswentworth

4mo

16

55 Agency As a Natural Abstraction

Thane Ruthenis

7mo

9

54 Meta learning to gradient hack

Quintin Pope

1y

11

166 Risks from Learned Optimization: Introduction

evhub

3y

42

24 [ASoT] Some thoughts about deceptive mesaoptimization

leogao

8mo

5

7 Inner alignment: what are we pointing at?

lcmgcd

3mo

2

61 "Inner Alignment Failures" Which Are Actually Outer Alignment Failures

johnswentworth

2y

38

33 Thoughts on gradient hacking

Richard_Ngo

1y

12

47 Formal Solution to the Inner Alignment Problem

michaelcohen

1y

123

54 Mesa-Search vs Mesa-Control

abramdemski

2y

45

78 Risks from Learned Optimization: Conclusion and Related Work

evhub

3y

4

75 Conditions for Mesa-Optimization

evhub

3y

48

62 An Increasingly Manipulative Newsfeed

Michaël Trazzi

3y

16

61 My take on Jacob Cannell’s take on AGI safety

Steven Byrnes

22d

13

136 Inner Alignment in Salt-Starved Rats

Steven Byrnes

2y

39

144 My computational framework for the brain

Steven Byrnes

2y

26

110 Book review: "A Thousand Brains" by Jeff Hawkins

Steven Byrnes

1y

18

41 [Intro to brain-like-AGI safety] 8. Takeaways from neuro 1/2: On AGI development

Steven Byrnes

9mo

2

147 Matt Botvinick on the spontaneous emergence of learning algorithms

Adam Scholl

2y

87

64 Brain-inspired AGI and the "lifetime anchor"

Steven Byrnes

1y

16

43 [Intro to brain-like-AGI safety] 2. “Learning from scratch” in the brain

Steven Byrnes

10mo

12

45 Value loading in the human brain: a worked example

Steven Byrnes

1y

2

78 How uniform is the neocortex?

zhukeepa

2y

23

76 Inner alignment in the brain

Steven Byrnes

2y

16

55 What Decision Theory is Implied By Predictive Processing?

johnswentworth

2y

17

51 Building brain-inspired AGI is infinitely easier than understanding the brain

Steven Byrnes

2y

14

58 Human instincts, symbol grounding, and the blank-slate neocortex

Steven Byrnes

3y

23