Tree of Tags

0 posts Truthful AI

593 posts AI Autonomy and Choice

37 Existential AI Safety is NOT separate from near-term applications (scasper, 7d, 15)

62 Towards Hodge-podge Alignment (Cleo Nardo, 1d, 20)

37 The "Minimal Latents" Approach to Natural Abstractions (johnswentworth, 22h, 6)

92 Trying to disambiguate different questions about whether RLHF is “good” (Buck, 6d, 39)

159 Using GPT-Eliezer against ChatGPT Jailbreaking (Stuart_Armstrong, 14d, 77)

77 A shot at the diamond-alignment problem (TurnTrout, 2mo, 53)

232 AI alignment is distinct from its near-term applications (paulfchristiano, 7d, 5)

41 In defense of probably wrong mechanistic models (evhub, 14d, 10)

56 Verification Is Not Easier Than Generation In General (johnswentworth, 14d, 23)

20 Concept extrapolation for hypothesis generation (Stuart_Armstrong, 8d, 2)

69 Update to Mysteries of mode collapse: text-davinci-002 not RLHF (janus, 1mo, 8)

86 How could we know that an AGI system will have good consequences? (So8res, 1mo, 24)

77 Automating Auditing: An ambitious concrete technical research proposal (evhub, 1y, 9)

75 Response to Katja Grace's AI x-risk counterarguments (Erik Jenner, 2mo, 18)