67 posts. Tags: Value Learning, Inverse Reinforcement Learning, The Pointers Problem, Meta-Philosophy, Metaethics, Kolmogorov Complexity, Philosophy, Book Reviews, Perceptual Control Theory
59 posts. Tags: Community, Agent Foundations, Machine Intelligence Research Institute (MIRI), Cognitive Reduction, Center for Human-Compatible AI (CHAI), Regulation and AI Risk, Grants & Fundraising Opportunities, Future of Humanity Institute (FHI), Population Ethics, Utilitarianism, Moral Uncertainty, The SF Bay Area
Karma | Title | Author | Age | Comments
109 | The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables | johnswentworth | 2y | 43
80 | Beyond Kolmogorov and Shannon | Alexander Gietelink Oldenziel | 1mo | 14
74 | Preface to the sequence on value learning | Rohin Shah | 4y | 6
71 | [Book Review] "The Alignment Problem" by Brian Christian | lsusr | 1y | 16
68 | Parsing Chris Mingard on Neural Networks | Alex Flint | 1y | 27
67 | Thoughts on "Human-Compatible" | TurnTrout | 3y | 35
67 | Don't design agents which exploit adversarial inputs | TurnTrout | 1mo | 61
58 | Humans can be assigned any values whatsoever… | Stuart_Armstrong | 4y | 26
57 | Clarifying "AI Alignment" | paulfchristiano | 4y | 82
51 | Intuitions about goal-directed behavior | Rohin Shah | 4y | 15
48 | Some Thoughts on Metaphilosophy | Wei_Dai | 3y | 27
48 | The easy goal inference problem is still hard | paulfchristiano | 4y | 19
47 | Future directions for ambitious value learning | Rohin Shah | 4y | 9
46 | Policy Alignment | abramdemski | 4y | 25
Karma | Title | Author | Age | Comments
297 | Why Agent Foundations? An Overly Abstract Explanation | johnswentworth | 9mo | 54
250 | The Rocket Alignment Problem | Eliezer Yudkowsky | 4y | 42
203 | 2018 AI Alignment Literature Review and Charity Comparison | Larks | 4y | 26
127 | 2019 AI Alignment Literature Review and Charity Comparison | Larks | 3y | 18
100 | What I’ll be doing at MIRI | evhub | 3y | 6
99 | Announcing the Introduction to ML Safety course | Dan H | 4mo | 6
94 | Call for research on evaluating alignment (funding + advice available) | Beth Barnes | 1y | 11
94 | Introducing the ML Safety Scholars Program | Dan H | 7mo | 2
93 | Full-time AGI Safety! | Steven Byrnes | 1y | 3
91 | AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022 | Sam Bowman | 3mo | 2
87 | Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility | Akash | 28d | 20
85 | Prize and fast track to alignment research at ALTER | Vanessa Kosoy | 3mo | 4
85 | Apply to the ML for Alignment Bootcamp (MLAB) in Berkeley [Jan 3 - Jan 22] | habryka | 1y | 4
83 | Challenges with Breaking into MIRI-Style Research | Chris_Leong | 11mo | 15