AI (808 posts)
Embedded Agency, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, AI Capabilities, Interviews, Audio, Subagents, Wireheading

Value Learning (126 posts)
Inverse Reinforcement Learning, Machine Intelligence Research Institute (MIRI), Agent Foundations, Meta-Philosophy, Metaethics, Community, Philosophy, The Pointers Problem, Moral Uncertainty, Cognitive Reduction, Center for Human-Compatible AI (CHAI)
| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 503 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83 |
| 409 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257 |
| 334 | EfficientZero: How It Works | 1a3orn | 1y | 42 |
| 271 | Reward is not the optimization target | TurnTrout | 4mo | 97 |
| 265 | The Plan | johnswentworth | 1y | 77 |
| 265 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143 |
| 265 | An overview of 11 proposals for building safe advanced AI | evhub | 2y | 36 |
| 263 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53 |
| 257 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29 |
| 253 | Embedded Agents | abramdemski | 4y | 41 |
| 251 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5 |
| 248 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104 |
| 247 | Optimality is the tiger, and agents are its teeth | Veedrac | 8mo | 31 |
| 237 | Humans are very reliable agents | alyssavance | 6mo | 35 |
| 297 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54 |
| 250 | The Rocket Alignment Problem | Eliezer Yudkowsky | 4y | 42 |
| 203 | 2018 AI Alignment Literature Review and Charity Comparison | Larks | 4y | 26 |
| 127 | 2019 AI Alignment Literature Review and Charity Comparison | Larks | 3y | 18 |
| 109 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43 |
| 100 | What I’ll be doing at MIRI | evhub | 3y | 6 |
| 99 | Announcing the Introduction to ML Safety course | Dan H | 4mo | 6 |
| 94 | Call for research on evaluating alignment (funding + advice available) | Beth Barnes | 1y | 11 |
| 94 | Introducing the ML Safety Scholars Program | Dan H | 7mo | 2 |
| 93 | Full-time AGI Safety! | Steven Byrnes | 1y | 3 |
| 91 | AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022 | Sam Bowman | 3mo | 2 |
| 87 | Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility | Akash | 28d | 20 |
| 85 | Prize and fast track to alignment research at ALTER | Vanessa Kosoy | 3mo | 4 |
| 85 | Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] | habryka | 1y | 4 |