Tree of Tags

593 posts AI Social Media Autonomy and Choice Truthful AI

27 posts Eliciting Latent Knowledge (ELK)

242 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104
241 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257
231 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53
217 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29
213 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5
213 | Hiring engineers and researchers to help align GPT-3 | paulfchristiano | 2y | 14
211 | Safetywashing | Adam Scholl | 5mo | 17
205 | The Plan | johnswentworth | 1y | 77
205 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143
202 | Attempted Gears Analysis of AGI Intervention Discussion With Eliezer | Zvi | 1y | 48
194 | Announcing the Alignment Research Center | paulfchristiano | 1y | 6
185 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83
177 | A note about differential technological development | So8res | 5mo | 31
164 | Alignment By Default | johnswentworth | 2y | 92

219 | ARC's first technical report: Eliciting Latent Knowledge | paulfchristiano | 1y | 88
128 | ELK prize results | paulfchristiano | 9mo | 50
115 | Prizes for ELK proposals | paulfchristiano | 11mo | 156
113 | Mechanistic anomaly detection and ELK | paulfchristiano | 25d | 17
106 | Finding gliders in the game of life | paulfchristiano | 19d | 7
85 | ARC paper: Formalizing the presumption of independence | Erik Jenner | 1mo | 2
77 | ELK First Round Contest Winners | Mark Xu | 10mo | 6
73 | Can we efficiently explain model behaviors? | paulfchristiano | 4d | 0
70 | ELK Thought Dump | abramdemski | 9mo | 18
67 | Where I currently disagree with Ryan Greenblatt’s version of the ELK approach | So8res | 2mo | 7
64 | Counterexamples to some ELK proposals | paulfchristiano | 11mo | 10
60 | ELK Computational Complexity: Three Levels of Difficulty | abramdemski | 8mo | 9
46 | Some Hacky ELK Ideas | johnswentworth | 10mo | 8
46 | Eliciting Latent Knowledge Via Hypothetical Sensors | John_Maxwell | 11mo | 2