Game Theory (26 posts)
Related tags: Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group

Research Agendas (20 posts)
Related tags: Goal Factoring, Mind Crime
Top posts: Game Theory (karma · title · author · age · comment count)

174 · Unifying Bargaining Notions (1/2) · Diffractor · 4mo · 38
143 · The Commitment Races problem · Daniel Kokotajlo · 3y · 39
125 · «Boundaries», Part 1: a key missing concept from utility theory · Andrew_Critch · 4mo · 26
84 · Unifying Bargaining Notions (2/2) · Diffractor · 4mo · 11
81 · "Zero Sum" is a misnomer. · abramdemski · 2y · 35
80 · Threat-Resistant Bargaining Megapost: Introducing the ROSE Value · Diffractor · 2mo · 11
77 · Preface to CLR's Research Agenda on Cooperation, Conflict, and TAI · JesseClifton · 3y · 10
73 · CLR's recent work on multi-agent systems · JesseClifton · 1y · 1
67 · Siren worlds and the perils of over-optimised search · Stuart_Armstrong · 8y · 417
60 · What counts as defection? · TurnTrout · 2y · 21
58 · Reducing collective rationality to individual optimization in common-payoff games using MCMC · jessicata · 4y · 12
56 · When Hindsight Isn't 20/20: Incentive Design With Imperfect Credit Allocation · johnswentworth · 2y · 24
44 · «Boundaries», Part 3b: Alignment problems in terms of boundaries · Andrew_Critch · 6d · 2
41 · Sections 1 & 2: Introduction, Strategy and Governance · JesseClifton · 3y · 5
Top posts: Research Agendas (karma · title · author · age · comment count)

285 · On how various plans miss the hard bits of the alignment challenge · So8res · 5mo · 81
181 · Some conceptual alignment research projects · Richard_Ngo · 3mo · 14
121 · Thoughts on Human Models · Ramana Kumar · 3y · 32
102 · Our take on CHAI’s research agenda in under 1500 words · Alex Flint · 2y · 19
75 · The Learning-Theoretic AI Alignment Research Agenda · Vanessa Kosoy · 4y · 39
52 · Resources for AI Alignment Cartography · Gyrodiot · 2y · 8
52 · Research Agenda v0.9: Synthesising a human's preferences into a utility function · Stuart_Armstrong · 3y · 25
42 · Technical AGI safety research outside AI · Richard_Ngo · 3y · 3
39 · Research agenda update · Steven Byrnes · 1y · 40
33 · My AGI safety research—2022 review, ’23 plans · Steven Byrnes · 6d · 6
28 · Ultra-simplified research agenda · Stuart_Armstrong · 3y · 4
26 · New safety research agenda: scalable agent alignment via reward modeling · Vika · 4y · 13
24 · Research Agenda in reverse: what *would* a solution look like? · Stuart_Armstrong · 3y · 25
24 · New year, new research agenda post · Charlie Steiner · 11mo · 4