Game Theory — 26 posts
Related tags: Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group
Research Agendas — 20 posts
Related tags: Goal Factoring, Mind Crime
Posts tagged "Game Theory":

| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 44 | «Boundaries», Part 3b: Alignment problems in terms of boundaries | Andrew_Critch | 6d | 2 |
| 174 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38 |
| 80 | Threat-Resistant Bargaining Megapost: Introducing the ROSE Value | Diffractor | 2mo | 11 |
| 125 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26 |
| 84 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11 |
| 33 | Announcing: Mechanism Design for AI Safety - Reading Group | Rubi J. Hudson | 4mo | 3 |
| 73 | CLR's recent work on multi-agent systems | JesseClifton | 1y | 1 |
| 143 | The Commitment Races problem | Daniel Kokotajlo | 3y | 39 |
| 81 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35 |
| 32 | My take on higher-order game theory | Nisan | 1y | 6 |
| 56 | When Hindsight Isn't 20/20: Incentive Design With Imperfect Credit Allocation | johnswentworth | 2y | 24 |
| 77 | Preface to CLR's Research Agenda on Cooperation, Conflict, and TAI | JesseClifton | 3y | 10 |
| 60 | What counts as defection? | TurnTrout | 2y | 21 |
| 41 | Sections 1 & 2: Introduction, Strategy and Governance | JesseClifton | 3y | 5 |
Posts tagged "Research Agendas":

| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 33 | My AGI safety research—2022 review, ’23 plans | Steven Byrnes | 6d | 6 |
| 285 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81 |
| 181 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14 |
| 16 | Theories of impact for Science of Deep Learning | Marius Hobbhahn | 19d | 0 |
| 20 | Distilled Representations Research Agenda | Hoagy | 2mo | 2 |
| 102 | Our take on CHAI’s research agenda in under 1500 words | Alex Flint | 2y | 19 |
| 39 | Research agenda update | Steven Byrnes | 1y | 40 |
| 24 | New year, new research agenda post | Charlie Steiner | 11mo | 4 |
| 121 | Thoughts on Human Models | Ramana Kumar | 3y | 32 |
| 52 | Resources for AI Alignment Cartography | Gyrodiot | 2y | 8 |
| 19 | Immobile AI makes a move: anti-wireheading, ontology change, and model splintering | Stuart_Armstrong | 1y | 3 |
| 75 | The Learning-Theoretic AI Alignment Research Agenda | Vanessa Kosoy | 4y | 39 |
| 52 | Research Agenda v0.9: Synthesising a human's preferences into a utility function | Stuart_Armstrong | 3y | 25 |
| 42 | Technical AGI safety research outside AI | Richard_Ngo | 3y | 3 |