240 posts
GPT
Language Models
Conjecture (org)
Bounties & Prizes (active)
Refine
Project Announcement
QURI
AI Safety Public Materials
Anthropic
Squiggle
Exploratory Engineering
Encultured AI (org)
248 posts
Machine Learning (ML)
Art
Music
OpenAI
Scaling Laws
DALL-E
Symbol Grounding
Meta-Humor
Computing Overhang
GAN
27
Discovering Language Model Behaviors with Model-Written Evaluations
evhub
4h
3
112
Bad at Arithmetic, Promising at Math
cohenmacaulay
2d
17
47
Next Level Seinfeld
Zvi
1d
6
32
Take 11: "Aligning language models" should be weirder.
Charlie Steiner
2d
0
314
Jailbreaking ChatGPT on Release Day
Zvi
18d
74
101
[Interim research report] Taking features out of superposition with sparse autoencoders
Lee Sharkey
7d
10
15
Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.]
Bill Benzon
1d
2
7
Will research in AI risk jinx it? Consequences of training AI on AI risk arguments
Yann Dubois
1d
6
52
Discovering Latent Knowledge in Language Models Without Supervision
Xodarap
6d
1
234
Conjecture: a retrospective after 8 months of work
Connor Leahy
27d
9
148
Did ChatGPT just gaslight me?
ThomasW
19d
45
808
Simulators
janus
3mo
103
262
Mysteries of mode collapse
janus
1mo
35
36
An exploration of GPT-2's embedding weights
Adam Scherlis
7d
2
37
Reframing inner alignment
davidad
9d
13
521
chinchilla's wild implications
nostalgebraist
4mo
114
22
My thoughts on OpenAI's Alignment plan
Donald Hobson
10d
0
21
Neural networks biased towards geometrically simple functions?
DavidHolmes
12d
2
264
Common misconceptions about OpenAI
Jacob_Hilton
3mo
138
24
ChatGPT seems overconfident to me
qbolec
16d
3
415
What DALL-E 2 can and cannot do
Swimmer963
7mo
305
46
love, not competition
carado
1mo
20
26
Updates on scaling laws for foundation models from 'Transcending Scaling Laws with 0.1% Extra Compute'
Nick_Greig
1mo
2
102
Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible
Sam Bowman
3mo
6
25
Why don't we have self driving cars yet?
Linda Linsefors
1mo
16
27
Inverse scaling can become U-shaped
Edouard Harris
1mo
15
197
dalle2 comments
nostalgebraist
7mo
13
184
A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]
Dan H
7mo
5