49 posts
Conjecture (org)
Refine
Project Announcement
Encultured AI (org)
63 posts
Language Models
Anthropic
Exploratory Engineering
Transformer Circuits
Transformers
80
[Interim research report] Taking features out of superposition with sparse autoencoders
Lee Sharkey
7d
10
213
Mysteries of mode collapse
janus
1mo
35
183
Conjecture: a retrospective after 8 months of work
Connor Leahy
27d
9
123
Refine: An Incubator for Conceptual Alignment Research Bets
adamShimi
8mo
13
64
Searching for Search
NicholasKees
22d
6
55
The First Filter
adamShimi
24d
5
103
What I Learned Running Refine
adamShimi
26d
5
103
Understanding Conjecture: Notes from Connor Leahy interview
Akash
3mo
24
28
Good Futures Initiative: Winter Project Internship
Aris
23d
4
103
Announcing Encultured AI: Building a Video Game
Andrew_Critch
4mo
26
46
AMA Conjecture, A New Alignment Startup
adamShimi
8mo
42
176
Connor Leahy on Dying with Dignity, EleutherAI and Conjecture
Michaël Trazzi
5mo
29
2
Creating a database for base rates
nikos
8d
1
28
confusion about alignment requirements
carado
2mo
10
27
Discovering Language Model Behaviors with Model-Written Evaluations
evhub
4h
3
5
Will research in AI risk jinx it? Consequences of training AI on AI risk arguments
Yann Dubois
1d
6
123
Did ChatGPT just gaslight me?
ThomasW
19d
45
472
Simulators
janus
3mo
103
19
Shh, don't tell the AI it's likely to be evil
naterush
14d
9
0
Simulators and Mindcrime
DragonGod
11d
4
16
Does a LLM have a utility function?
Dagon
11d
6
27
Gliders in Language Models
Alexandre Variengien
25d
11
46
A brainteaser for language models
Adam Scherlis
8d
3
164
Language models seem to be much better than humans at next-token prediction
Buck
4mo
56
26
An exploration of GPT-2's embedding weights
Adam Scherlis
7d
2
52
Paper: Large Language Models Can Self-improve [Linkpost]
Evan R. Murphy
2mo
14
50
They gave LLMs access to physics simulators
ryan_b
2mo
18
45
Discovering Latent Knowledge in Language Models Without Supervision
Xodarap
6d
1