128 posts
GPT
Bounties & Prizes (active)
QURI
AI Safety Public Materials
Squiggle
Generativity
112 posts
Conjecture (org)
Language Models
Refine
Project Announcement
Anthropic
Exploratory Engineering
Encultured AI (org)
Transformer Circuits
Transformers
112 | Bad at Arithmetic, Promising at Math | cohenmacaulay | 2d | 17
47 | Next Level Seinfeld | Zvi | 1d | 6
314 | Jailbreaking ChatGPT on Release Day | Zvi | 18d | 74
15 | Does ChatGPT’s performance warrant working on a tutor for children? [It’s time to take it to the lab.] | Bill Benzon | 1d | 2
26 | Best introductory overviews of AGI safety? | Jakub Kraus | 7d | 5
26 | Is the ChatGPT-simulated Linux virtual machine real? | Kenoubi | 7d | 7
66 | [ASoT] Finetuning, RL, and GPT's world prior | Jozdien | 18d | 8
93 | Announcing AI Alignment Awards: $100k research contests about goal misgeneralization & corrigibility | Akash | 28d | 20
19 | A crisis for online communication: bots and bot users will overrun the Internet? | Mitchell_Porter | 9d | 11
5 | Abstract concepts and metalingual definition: Does ChatGPT understand justice and charity? | Bill Benzon | 4d | 0
20 | Testing Ways to Bypass ChatGPT's Safety Features | Robert_AIZI | 15d | 2
126 | AI Timelines via Cumulative Optimization Power: Less Long, More Short | jacob_cannell | 2mo | 32
147 | Announcing $5,000 bounty for (responsibly) ending malaria | lc | 2mo | 42
21 | ChatGPT is surprisingly and uncanningly good at pretending to be sentient | ZT5 | 17d | 11
27 | Discovering Language Model Behaviors with Model-Written Evaluations | evhub | 4h | 3
32 | Take 11: "Aligning language models" should be weirder. | Charlie Steiner | 2d | 0
101 | [Interim research report] Taking features out of superposition with sparse autoencoders | Lee Sharkey | 7d | 10
7 | Will research in AI risk jinx it? Consequences of training AI on AI risk arguments | Yann Dubois | 1d | 6
52 | Discovering Latent Knowledge in Language Models Without Supervision | Xodarap | 6d | 1
234 | Conjecture: a retrospective after 8 months of work | Connor Leahy | 27d | 9
148 | Did ChatGPT just gaslight me? | ThomasW | 19d | 45
808 | Simulators | janus | 3mo | 103
262 | Mysteries of mode collapse | janus | 1mo | 35
36 | An exploration of GPT-2's embedding weights | Adam Scherlis | 7d | 2
37 | A brainteaser for language models | Adam Scherlis | 8d | 3
103 | Searching for Search | NicholasKees | 22d | 6
125 | Conjecture Second Hiring Round | Connor Leahy | 27d | 0
32 | Tradeoffs in complexity, abstraction, and generality | remember | 8d | 0