Value Learning (37 posts) · Kolmogorov Complexity (5 posts)
The Pointers Problem
Score | Title | Author | Posted | Comments
80 | Beyond Kolmogorov and Shannon | Alexander Gietelink Oldenziel | 1mo | 14
58 | Humans can be assigned any values whatsoever… | Stuart_Armstrong | 4y | 26
10 | AIs should learn human preferences, not biases | Stuart_Armstrong | 8mo | 1
31 | Human-AI Interaction | Rohin Shah | 3y | 10
1 | Humans can be assigned any values whatsoever... | Stuart_Armstrong | 5y | 0
0 | Kolmogorov complexity makes reward learning worse | Stuart_Armstrong | 5y | 0
44 | What is ambitious value learning? | Rohin Shah | 4y | 28
68 | Parsing Chris Mingard on Neural Networks | Alex Flint | 1y | 27
13 | Morally underdefined situations can be deadly | Stuart_Armstrong | 1y | 8
21 | Thoughts on implementing corrigible robust alignment | Steven Byrnes | 3y | 2
22 | Learning human preferences: black-box, white-box, and structured white-box access | Stuart_Armstrong | 2y | 9
8 | Values, Valence, and Alignment | Gordon Seidoh Worley | 3y | 4
7 | What's the dream for giving natural language commands to AI? | Charlie Steiner | 3y | 8
51 | Intuitions about goal-directed behavior | Rohin Shah | 4y | 15
67 | Don't design agents which exploit adversarial inputs | TurnTrout | 1mo | 61
25 | People care about each other even though they have imperfect motivational pointers? | TurnTrout | 1mo | 25
10 | Stable Pointers to Value: An Agent Embedded in Its Own Utility Function | abramdemski | 5y | 9
109 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43
13 | Stable Pointers to Value II: Environmental Goals | abramdemski | 4y | 2