Tree of Tags

42 posts: Outer Alignment, Mesa-Optimization, Neuroscience, Neuromorphic AI, Predictive Processing, Neocortex, Computing Overhang, Planning & Decision-Making, Intentionality, Hansonian Pre-Rationality, Emergent Behavior (Emergence)

29 posts: Optimization, General Intelligence, AI Services (CAIS), Selection vs Control, Distinctions, Adaptation Executors, Narrow AI, World Modeling Techniques

56 Paper: Constitutional AI: Harmlessness from AI Feedback (Anthropic)

LawrenceC

4d

10

47 My take on Jacob Cannell’s take on AGI safety

Steven Byrnes

22d

13

65 Human Mimicry Mainly Works When We’re Already Close

johnswentworth

4mo

16

8 Inner alignment: what are we pointing at?

lcmgcd

3mo

2

16 Outer alignment and imitative amplification

evhub

2y

11

59 Multi-agent predictive minds and AI alignment

Jan_Kulveit

4y

18

71 Risks from Learned Optimization: Conclusion and Related Work

evhub

3y

4

40 The Steering Problem

paulfchristiano

4y

12

19 Gary Marcus vs Cortical Uniformity

Steven Byrnes

2y

0

47 What Decision Theory is Implied By Predictive Processing?

johnswentworth

2y

17

24 Towards an Intentional Research Agenda

romeostevensit

3y

8

56 Formal Solution to the Inner Alignment Problem

michaelcohen

1y

123

14 Minimization of prediction error as a foundation for human values in AI alignment

Gordon Seidoh Worley

3y

42

79 Inner alignment in the brain

Steven Byrnes

2y

16

32 Don't align agents to evaluations of plans

TurnTrout

24d

46

8 Take 6: CAIS is actually Orwellian.

Charlie Steiner

13d

5

111 What's General-Purpose Search, And Why Might We Expect To See It In Trained ML Systems?

johnswentworth

4mo

15

45 Humans aren't fitness maximizers

So8res

2mo

45

49 Mesa-Optimizers vs “Steered Optimizers”

Steven Byrnes

2y

7

69 Bottle Caps Aren't Optimisers

DanielFilan

4y

21

13 Mesa-Optimizers and Over-optimization Failure (Optimizing and Goodhart Effects, Clarifying Thoughts - Part 4)

Davidmanheim

3y

3

44 Aligning a toy model of optimization

paulfchristiano

3y

26

75 Comments on CAIS

Richard_Ngo

3y

14

18 Motivations, Natural Selection, and Curriculum Engineering

Oliver Sourbut

1y

0

10 A summary of aligning narrowly superhuman models

gugu

10mo

0

141 Selection vs Control

abramdemski

3y

25

35 Optimization Provenance

Adele Lopez

3y

5

14 Quantifying General Intelligence

JasonBrown

6mo

6