Tree of Tags

128 posts: GPT, Bounties & Prizes (active), QURI, AI Safety Public Materials, Squiggle, Generativity

112 posts: Conjecture (org), Language Models, Refine, Project Announcement, Anthropic, Exploratory Engineering, Encultured AI (org), Transformer Circuits, Transformers

112 Bad at Arithmetic, Promising at Math

cohenmacaulay

2d

17

47 Next Level Seinfeld

Zvi

1d

6

314 Jailbreaking ChatGPT on Release Day

Zvi

18d

74

15 Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]

Bill Benzon

1d

2

26 Best introductory overviews of AGI safety?

Jakub Kraus

7d

5

26 Is the ChatGPT-simulated Linux virtual machine real?

Kenoubi

7d

7

66 [ASoT] Finetuning, RL, and GPT's world prior

Jozdien

18d

8

93 Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility

Akash

28d

20

19 A crisis for online communication: bots and bot users will overrun the Internet?

Mitchell_Porter

9d

11

5 Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity?

Bill Benzon

4d

0

20 Testing Ways to Bypass ChatGPT's Safety Features

Robert_AIZI

15d

2

126 AI Timelines via Cumulative Optimization Power: Less Long, More Short

jacob_cannell

2mo

32

147 Announcing $5,000 bounty for (responsibly) ending malaria

lc

2mo

42

21 ChatGPT is surprisingly and uncanningly good at pretending to be sentient

ZT5

17d

11

27 Discovering Language Model Behaviors with Model-Written Evaluations

evhub

4h

3

32 Take 11: "Aligning language models" should be weirder.

Charlie Steiner

2d

0

101 [Interim research report] Taking features out of superposition with sparse autoencoders

Lee Sharkey

7d

10

7 Will research in AI risk jinx it? Consequences of training AI on AI risk arguments

Yann Dubois

1d

6

52 Discovering Latent Knowledge in Language Models Without Supervision

Xodarap

6d

1

234 Conjecture: a retrospective after 8 months of work

Connor Leahy

27d

9

148 Did ChatGPT just gaslight me?

ThomasW

19d

45

808 Simulators

janus

3mo

103

262 Mysteries of mode collapse

janus

1mo

35

36 An exploration of GPT-2's embedding weights

Adam Scherlis

7d

2

37 A brainteaser for language models

Adam Scherlis

8d

3

103 Searching for Search

NicholasKees

22d

6

125 Conjecture Second Hiring Round

Connor Leahy

27d

0

32 Tradeoffs in complexity, abstraction, and generality

remember

8d

0