AI (808 posts): Embedded Agency, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, AI Capabilities, Interviews, Audio, Subagents, Wireheading

Value Learning (126 posts): Inverse Reinforcement Learning, Machine Intelligence Research Institute (MIRI), Agent Foundations, Meta-Philosophy, Metaethics, Community, Philosophy, The Pointers Problem, Moral Uncertainty, Cognitive Reduction, Center for Human-Compatible AI (CHAI)
Karma | Title | Author | Posted | Comments
259 | Humans are very reliable agents | alyssavance | 6mo | 35
242 | Visible Thoughts Project and Bounty Announcement | So8res | 1y | 104
241 | Discussion with Eliezer Yudkowsky on AGI interventions | Rob Bensinger | 1y | 257
233 | Reward is not the optimization target | TurnTrout | 4mo | 97
231 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53
219 | ARC's first technical report: Eliciting Latent Knowledge | paulfchristiano | 1y | 88
217 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29
213 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5
213 | Hiring engineers and researchers to help align GPT-3 | paulfchristiano | 2y | 14
212 | EfficientZero: How It Works | 1a3orn | 1y | 42
211 | Safetywashing | Adam Scholl | 5mo | 17
205 | The Plan | johnswentworth | 1y | 77
205 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143
202 | Attempted Gears Analysis of AGI Intervention Discussion With Eliezer | Zvi | 1y | 48
197 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54
177 | 2018 AI Alignment Literature Review and Charity Comparison | Larks | 4y | 26
146 | The Rocket Alignment Problem | Eliezer Yudkowsky | 4y | 42
139 | Full-time AGI Safety! | Steven Byrnes | 1y | 3
133 | 2019 AI Alignment Literature Review and Charity Comparison | Larks | 3y | 18
120 | What I’ll be doing at MIRI | evhub | 3y | 6
116 | Call for research on evaluating alignment (funding + advice available) | Beth Barnes | 1y | 11
105 | Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] | habryka | 1y | 4
99 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43
82 | Comparing Utilities | abramdemski | 2y | 31
76 | Some Thoughts on Metaphilosophy | Wei_Dai | 3y | 27
71 | AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah | Palus Astra | 2y | 27
71 | Clarifying "AI Alignment" | paulfchristiano | 4y | 82
70 | Encultured AI Pre-planning, Part 1: Enabling New Benchmarks | Andrew_Critch | 4mo | 2