AI (1014 posts): AI Timelines, Value Learning, AI Takeoff, Embedded Agency, Community, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, Interviews

Iterated Amplification (111 posts): Game Theory, Factored Cognition, Humans Consulting HCH, Research Agendas, Ought, Debate (AI safety technique), Risks of Astronomical Suffering (S-risks), Center on Long-Term Risk (CLR), Mechanism Design, Fairness, Group Rationality
| Karma | Title | Author | Age | Comments |
|------:|-------|--------|-----|---------:|
| 259 | Humans are very reliable agents | alyssavance | 6mo | 35 |
| 259 | Two-year update on my personal AI timelines | Ajeya Cotra | 4mo | 60 |
| 252 | What 2026 looks like | Daniel Kokotajlo | 1y | 98 |
| 242 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 241 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257 |
| 233 | Reward is not the optimization target | TurnTrout | 4mo | 97 |
| 231 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 219 | ARC's first technical report: Eliciting Latent Knowledge | paulfchristiano | 1y | 88 |
| 217 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 214 | Are we in an AI overhang? | Andy Jones | 2y | 109 |
| 213 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5 |
| 213 | Hiring engineers and researchers to help align GPT-3 | paulfchristiano | 2y | 14 |
| 212 | EfficientZero: How It Works | 1a3orn | 1y | 42 |
| 211 | Safetywashing | Adam Scholl | 5mo | 17 |
| 231 | On how various plans miss the hard bits of the alignment challenge | So8res | 5mo | 81 |
| 204 | Unifying Bargaining Notions (1/2) | Diffractor | 4mo | 38 |
| 155 | Some conceptual alignment research projects | Richard_Ngo | 3mo | 14 |
| 155 | «Boundaries», Part 1: a key missing concept from utility theory | Andrew_Critch | 4mo | 26 |
| 139 | Debate update: Obfuscated arguments problem | Beth Barnes | 1y | 21 |
| 131 | "Zero Sum" is a misnomer. | abramdemski | 2y | 35 |
| 127 | Thoughts on Human Models | Ramana Kumar | 3y | 32 |
| 124 | My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda | Chi Nguyen | 2y | 21 |
| 122 | Supervise Process, not Outcomes | stuhlmueller | 8mo | 8 |
| 122 | Our take on CHAI's research agenda in under 1500 words | Alex Flint | 2y | 19 |
| 120 | Unifying Bargaining Notions (2/2) | Diffractor | 4mo | 11 |
| 118 | Paul's research agenda FAQ | zhukeepa | 4y | 73 |
| 106 | Imitative Generalisation (AKA 'Learning the Prior') | Beth Barnes | 1y | 14 |
| 102 | What counts as defection? | TurnTrout | 2y | 21 |