Tree of Tags

80 posts: Oracle AI, Myopia, AI Boxing (Containment), Deceptive Alignment, Deception, Acausal Trade, Self Fulfilling/Refuting Prophecies, Bounties (closed), Parables & Fables, Superrationality, Values handshakes, Computer Security & Cryptography

88 posts: Conjecture (org), Language Models, Refine, Agency, Deconfusion, Scaling Laws, Project Announcement, Encultured AI (org), Tool AI, Definitions, PaLM, Prompt Engineering

258 The Parable of Predict-O-Matic

abramdemski

3y

42

145 Decision theory does not imply that we get to have nice things

So8res

2mo

53

119 Monitoring for deceptive alignment

evhub

3mo

7

107 The Credit Assignment Problem

abramdemski

3y

40

85 Trying to Make a Treacherous Mesa-Optimizer

MadHatter

1mo

13

72 Prize for probable problems

paulfchristiano

4y

63

67 Arguments against myopic training

Richard_Ngo

2y

39

67 Results of $1,000 Oracle contest!

Stuart_Armstrong

2y

2

65 Cryptographic Boxes for Unfriendly AI

paulfchristiano

12y

162

65 Why GPT wants to mesa-optimize & how we might change this

John_Maxwell

2y

32

64 How likely is deceptive alignment?

evhub

3mo

21

63 Partial Agency

abramdemski

3y

18

60 Contest: $1,000 for good questions to ask to an Oracle AI

Stuart_Armstrong

3y

156

57 Open Problems with Myopia

Mark Xu

1y

16

234 chinchilla's wild implications

nostalgebraist

4mo

114

185 Simulators

janus

3mo

103

178 Mysteries of mode collapse

janus

1mo

35

163 Language models seem to be much better than humans at next-token prediction

Buck

4mo

56

157 Transformer Circuits

evhub

12mo

4

143 Conjecture: a retrospective after 8 months of work

Connor Leahy

27d

9

141 Announcing the Inverse Scaling Prize ($250k Prize Pool)

Ethan Perez

5mo

14

118 We Are Conjecture, A New Alignment Research Startup

Connor Leahy

8mo

24

113 The case for becoming a black-box investigator of language models

Buck

7mo

19

108 What I Learned Running Refine

adamShimi

26d

5

108 Beyond Astronomical Waste

Wei_Dai

4y

41

106 Testing PaLM prompts on GPT3

Yitz

8mo

15

105 Announcing Encultured AI: Building a Video Game

Andrew_Critch

4mo

26

99 Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes

5mo

6