AI (3083 posts): GPT, AI Timelines, Machine Learning (ML), AI Takeoff, Interpretability (ML & AI), Language Models, Conjecture (org), Careers, Instrumental Convergence, Iterated Amplification, Art

Anthropics (763 posts): Existential Risk, Whole Brain Emulation, Sleeping Beauty Paradox, Threat Models, Academic Papers, Space Exploration & Colonization, Great Filter, Paradoxes, Extraterrestrial Life, Pascal's Mugging, Longtermism
Karma | Title | Author | Posted | Comments
27 | Discovering Language Model Behaviors with Model-Written Evaluations | evhub | 4h | 3
40 | Towards Hodge-podge Alignment | Cleo Nardo | 1d | 20
10 | An Open Agency Architecture for Safe Transformative AI | davidad | 11h | 11
108 | The next decades might be wild | Marius Hobbhahn | 5d | 21
0 | I believe some AI doomers are overconfident | FTPickle | 6h | 4
33 | The "Minimal Latents" Approach to Natural Abstractions | johnswentworth | 22h | 6
57 | Reframing inner alignment | davidad | 9d | 13
3 | Will research in AI risk jinx it? Consequences of training AI on AI risk arguments | Yann Dubois | 1d | 6
70 | Bad at Arithmetic, Promising at Math | cohenmacaulay | 2d | 17
22 | Existential AI Safety is NOT separate from near-term applications | scasper | 7d | 15
43 | Next Level Seinfeld | Zvi | 1d | 6
46 | Take 9: No, RLHF/IDA/debate doesn't solve outer alignment. | Charlie Steiner | 8d | 14
13 | Will Machines Ever Rule the World? MLAISU W50 | Esben Kran | 4d | 4
106 | How "Discovering Latent Knowledge in Language Models Without Supervision" Fits Into a Broader Alignment Scheme | Collin | 5d | 18
58 | Who are some prominent reasonable people who are confident that AI won't kill everyone? | Optimization Process | 15d | 40
93 | AI will change the world, but won’t take it over by playing “3-dimensional chess”. | boazbarak | 28d | 86
87 | Meta AI announces Cicero: Human-Level Diplomacy play (with dialogue) | Jacy Reese Anthis | 28d | 64
34 | All AGI Safety questions welcome (especially basic ones) [~monthly thread] | Robert Miles | 1mo | 100
217 | Counterarguments to the basic AI x-risk case | KatjaGrace | 2mo | 122
140 | Worlds Where Iterative Design Fails | johnswentworth | 3mo | 26
63 | Could a single alien message destroy us? | Writer | 25d | 23
23 | Three Fables of Magical Girls and Longtermism | Ulisse Mini | 18d | 11
-3 | AGI Impossible due to Energy Constrains | TheKlaus | 20d | 13
5 | Introducing The Logical Foundation, A Plan to End Poverty With Guaranteed Income | Michael Simm | 1mo | 23
38 | All AGI safety questions welcome (especially basic ones) [July 2022] | plex | 5mo | 130
25 | AI X-risk >35% mostly based on a recent peer-reviewed argument | michaelcohen | 1mo | 31
72 | Far-UVC Light Update: No, LEDs are not around the corner (tweetstorm) | Davidmanheim | 1mo | 27
20 | Quantifying anthropic effects on the Fermi paradox | Lanrian | 3y | 5