Game Theory — 26 posts
Related tags: Center on Long-Term Risk (CLR), Risks of Astronomical Suffering (S-risks), Mechanism Design, Suffering, Fairness, Blackmail / Extortion, Group Rationality, Terminology / Jargon (meta), Reading Group

Research Agendas — 20 posts
Related tags: Goal Factoring, Mind Crime
Posts tagged Game Theory:

Karma | Title | Author | Posted | Comments
204 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38
155 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26
131 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35
120 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11
102 | What counts as defection? | TurnTrout | 2y | 21
101 | The Commitment Races problem | Daniel Kokotajlo | 3y | 39
98 | Threat-Resistant Bargaining Megapost: Introducing the ROSE Value | Diffractor | 2mo | 11
86 | When Hindsight Isn't 20/20: Incentive Design With Imperfect Credit Allocation | johnswentworth | 2y | 24
79 | Siren worlds and the perils of over-optimised search | Stuart_Armstrong | 8y | 417
60 | Reducing collective rationality to individual optimization in common-payoff games using MCMC | jessicata | 4y | 12
54 | «Boundaries», Part 3b: Alignment problems in terms of boundaries | Andrew_Critch | 6d | 2
50 | Book report: Theory of Games and Economic Behavior (von Neumann & Morgenstern) | Nisan | 2y | 4
48 | Nash equilibriums can be arbitrarily bad | Stuart_Armstrong | 3y | 24
44 | My take on higher-order game theory | Nisan | 1y | 6
Posts tagged Research Agendas:

Karma | Title | Author | Posted | Comments
231 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81
155 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14
127 | Thoughts on Human Models | Ramana Kumar | 3y | 32
122 | Our take on CHAI’s research agenda in under 1500 words | Alex Flint | 2y | 19
82 | Research Agenda v0.9: Synthesising a human's preferences into a utility function | Stuart_Armstrong | 3y | 25
77 | The Learning-Theoretic AI Alignment Research Agenda | Vanessa Kosoy | 4y | 39
69 | Research agenda update | Steven Byrnes | 1y | 40
45 | Immobile AI makes a move: anti-wireheading, ontology change, and model splintering | Stuart_Armstrong | 1y | 3
44 | Technical AGI safety research outside AI | Richard_Ngo | 3y | 3
44 | Research Agenda in reverse: what *would* a solution look like? | Stuart_Armstrong | 3y | 25
42 | New safety research agenda: scalable agent alignment via reward modeling | Vika | 4y | 13
40 | Ultra-simplified research agenda | Stuart_Armstrong | 3y | 4
38 | Resources for AI Alignment Cartography | Gyrodiot | 2y | 8
36 | Why I am not currently working on the AAMLS agenda | jessicata | 5y | 1