67 posts: Value Learning, Inverse Reinforcement Learning, The Pointers Problem, Meta-Philosophy, Metaethics, Kolmogorov Complexity, Philosophy, Book Reviews, Perceptual Control Theory
59 posts: Community, Agent Foundations, Machine Intelligence Research Institute (MIRI), Cognitive Reduction, Center for Human-Compatible AI (CHAI), Regulation and AI Risk, Grants & Fundraising Opportunities, Future of Humanity Institute (FHI), Population Ethics, Utilitarianism, Moral Uncertainty, The SF Bay Area
Karma · Title · Author · Age · Comments
53 · Don't design agents which exploit adversarial inputs · TurnTrout · 1mo · 61
39 · People care about each other even though they have imperfect motivational pointers? · TurnTrout · 1mo · 25
40 · Beyond Kolmogorov and Shannon · Alexander Gietelink Oldenziel · 1mo · 14
50 · Different perspectives on concept extrapolation · Stuart_Armstrong · 8mo · 7
15 · What Should AI Owe To Us? Accountable and Aligned AI Systems via Contractualist AI Alignment · xuan · 3mo · 15
69 · [Book Review] "The Alignment Problem" by Brian Christian · lsusr · 1y · 16
99 · The Pointers Problem: Human Values Are A Function Of Humans' Latent Variables · johnswentworth · 2y · 43
66 · Parsing Chris Mingard on Neural Networks · Alex Flint · 1y · 27
30 · How an alien theory of mind might be unlearnable · Stuart_Armstrong · 11mo · 35
20 · Value extrapolation, concept extrapolation, model splintering · Stuart_Armstrong · 9mo · 1
54 · Normativity · abramdemski · 2y · 11
21 · Morally underdefined situations can be deadly · Stuart_Armstrong · 1y · 8
40 · Recursive Quantilizers II · abramdemski · 2y · 15
76 · Some Thoughts on Metaphilosophy · Wei_Dai · 3y · 27
23 · Event [Berkeley]: Alignment Collaborator Speed-Meeting · AlexMennen · 1d · 2
17 · Looking for an alignment tutor · JanBrauner · 3d · 2
51 · Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility · Akash · 28d · 20
197 · Why Agent Foundations? An Overly Abstract Explanation · johnswentworth · 9mo · 54
25 · A newcomer’s guide to the technical AI safety field · zeshen · 1mo · 1
45 · Clarifying the Agent-Like Structure Problem · johnswentworth · 2mo · 14
57 · AI Safety and Neighboring Communities: A Quick-Start Guide, as of Summer 2022 · Sam Bowman · 3mo · 2
70 · Encultured AI Pre-planning, Part 1: Enabling New Benchmarks · Andrew_Critch · 4mo · 2
45 · Prize and fast track to alignment research at ALTER · Vanessa Kosoy · 3mo · 4
59 · Seeking Interns/RAs for Mechanistic Interpretability Projects · Neel Nanda · 4mo · 0
12 · The Slippery Slope from DALLE-2 to Deepfake Anarchy · scasper · 1mo · 9
39 · Announcing the Introduction to ML Safety course · Dan H · 4mo · 6
62 · Jobs: Help scale up LM alignment research at NYU · Sam Bowman · 7mo · 1
18 · CHAI, Assistance Games, And Fully-Updated Deference [Scott Alexander] · berglund · 2mo · 1