Tree of Tags

17 posts Corrigibility 2017-2019 AI Alignment Prize Petrov Day

3 posts Treacherous Turn Programming

17 Corrigibility Via Thought-Process Deference

Thane Ruthenis

26d

5

17 On corrigibility and its basin

Donald Hobson

6mo

3

27 Petrov corrigibility

Stuart_Armstrong

4y

10

7 Corrigible omniscient AI capable of making clones

Kaj_Sotala

7y

0

8 An Idea For Corrigible, Recursively Improving Math Oracles

jimrandomh

7y

0

15 A first look at the hard problem of corrigibility

jessicata

7y

0

38 Do what we mean vs. do what we say

Rohin Shah

4y

14

131 Let's See You Write That Corrigibility Tag

Eliezer Yudkowsky

6mo

67

39 Solve Corrigibility Week

Logan Riggs

1y

21

26 Formalizing Policy-Modification Corrigibility

TurnTrout

1y

6

15 Corrigibility as Constrained Optimisation

Henrik Åslund

3y

3

55 Corrigibility

paulfchristiano

4y

7

101 Announcement: AI alignment prize round 3 winners and next round

cousin_it

4y

7

90 Announcement: AI alignment prize round 4 winners

cousin_it

3y

41

40 [Linkpost] Treacherous turns in the wild

Mark Xu

1y

6

27 [AN #165]: When large models are more likely to lie

Rohin Shah

1y

0

70 A Gym Gridworld Environment for the Treacherous Turn

Michaël Trazzi

4y

9