Tree of Tags

0 posts AI Sentience

1854 posts AI

84 Towards Hodge-podge Alignment

Cleo Nardo

1d

20

41 The "Minimal Latents" Approach to Natural Abstractions

johnswentworth

22h

6

198 The next decades might be wild

Marius Hobbhahn

5d

21

265 AI alignment is distinct from its near-term applications

paulfchristiano

7d

5

6 I believe some AI doomers are overconfident

FTPickle

6h

4

13 Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems.

Charlie Steiner

19h

0

19 Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend)

Remmelt

1d

6

323 A challenge for AGI organizations, and a challenge for readers

Rob Bensinger

19d

30

107 Okay, I feel it now

g1

7d

14

111 Revisiting algorithmic progress

Tamay

7d

6

89 Trying to disambiguate different questions about whether RLHF is “good”

Buck

6d

39

16 Hacker-AI and Cyberwar 2.0+

Erland Wittkotter

1d

0

190 Using GPT-Eliezer against ChatGPT Jailbreaking

Stuart_Armstrong

14d

77

74 Predicting GPU performance

Marius Hobbhahn

6d

24