Tree of Tags

240 posts: GPT, Language Models, Conjecture (org), Bounties & Prizes (active), Refine, Project Announcement, QURI, AI Safety Public Materials, Anthropic, Squiggle, Exploratory Engineering, Encultured AI (org)

248 posts: Machine Learning (ML), Art, Music, OpenAI, Scaling Laws, DALL-E, Symbol Grounding, Meta-Humor, Computing Overhang, GAN

27 Discovering Language Model Behaviors with Model-Written Evaluations

evhub

4h

3

112 Bad at Arithmetic, Promising at Math

cohenmacaulay

2d

17

47 Next Level Seinfeld

Zvi

1d

6

32 Take 11: "Aligning language models" should be weirder.

Charlie Steiner

2d

0

314 Jailbreaking ChatGPT on Release Day

Zvi

18d

74

101 [Interim research report] Taking features out of superposition with sparse autoencoders

Lee Sharkey

7d

10

15 Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]

Bill Benzon

1d

2

7 Will research in AI risk jinx it? Consequences of training AI on AI risk arguments

Yann Dubois

1d

6

52 Discovering Latent Knowledge in Language Models Without Supervision

Xodarap

6d

1

234 Conjecture: a retrospective after 8 months of work

Connor Leahy

27d

9

148 Did ChatGPT just gaslight me?

ThomasW

19d

45

808 Simulators

janus

3mo

103

262 Mysteries of mode collapse

janus

1mo

35

36 An exploration of GPT-2's embedding weights

Adam Scherlis

7d

2

37 Reframing inner alignment

davidad

9d

13

521 chinchilla's wild implications

nostalgebraist

4mo

114

22 My thoughts on OpenAI's Alignment plan

Donald Hobson

10d

0

21 Neural networks biased towards geometrically simple functions?

DavidHolmes

12d

2

264 Common misconceptions about OpenAI

Jacob_Hilton

3mo

138

24 ChatGPT seems overconfident to me

qbolec

16d

3

415 What DALL-E 2 can and cannot do

Swimmer963

7mo

305

46 love, not competition

carado

1mo

20

26 Updates on scaling laws for foundation models from 'Transcending Scaling Laws with 0.1% Extra Compute'

Nick_Greig

1mo

2

102 Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible

Sam Bowman

3mo

6

25 Why don't we have self driving cars yet?

Linda Linsefors

1mo

16

27 Inverse scaling can become U-shaped

Edouard Harris

1mo

15

197 dalle2 comments

nostalgebraist

7mo

13

184 A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]

Dan H

7mo

5