Tree of Tags

4 posts The Pointers Problem

59 posts Value Learning

104 The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables

johnswentworth

2y

43

19 Stable Pointers to Value III: Recursive Quantilization

abramdemski

4y

4

18 Stable Pointers to Value II: Environmental Goals

abramdemski

4y

2

15 Stable Pointers to Value: An Agent Embedded in Its Own Utility Function

abramdemski

5y

9

22 Character alignment

p.b.

3mo

0

42 Different perspectives on concept extrapolation

Stuart_Armstrong

8mo

7

16 Value extrapolation vs Wireheading

Stuart_Armstrong

6mo

1

26 How an alien theory of mind might be unlearnable

Stuart_Armstrong

11mo

35

19 An Open Philanthropy grant proposal: Causal representation learning of human preferences

PabloAMC

11mo

6

14 Value extrapolation, concept extrapolation, model splintering

Stuart_Armstrong

9mo

1

9 The Pointers Problem - Distilled

NinaR

6mo

0

17 Morally underdefined situations can be deadly

Stuart_Armstrong

1y

8

10 AIs should learn human preferences, not biases

Stuart_Armstrong

8mo

1

69 The E-Coli Test for AI Alignment

johnswentworth

4y

24

68 Preface to the sequence on value learning

Rohin Shah

4y

6

65 Why we need a *theory* of human values

Stuart_Armstrong

4y

15

64 Clarifying "AI Alignment"

paulfchristiano

4y

82

41 Using vector fields to visualise preferences and make them consistent

MichaelA

2y

32