Tree of Tags

593 posts AI Autonomy and Choice Truthful AI Social Media

49 Existential AI Safety is NOT separate from near-term applications

scasper

7d

15

79 Towards Hodge-podge Alignment

Cleo Nardo

1d

20

39 The "Minimal Latents" Approach to Natural Abstractions

johnswentworth

22h

6

85 Trying to disambiguate different questions about whether RLHF is “good”

Buck

6d

39

182 Using GPT-Eliezer against ChatGPT Jailbreaking

Stuart_Armstrong

14d

77

72 A shot at the diamond-alignment problem

TurnTrout

2mo

53

251 AI alignment is distinct from its near-term applications

paulfchristiano

7d

5

43 In defense of probably wrong mechanistic models

evhub

14d

10

48 Verification Is Not Easier Than Generation In General

johnswentworth

14d

23

8 Concept extrapolation for hypothesis generation

Stuart_Armstrong

8d

2

73 Update to Mysteries of mode collapse: text-davinci-002 not RLHF

janus

1mo

8

87 How could we know that an AGI system will have good consequences?

So8res

1mo

24

69 Automating Auditing: An ambitious concrete technical research proposal

evhub

1y

9

73 Response to Katja Grace's AI x-risk counterarguments

Erik Jenner

2mo

18