49 posts
Conjecture (org)
Refine
Project Announcement
Encultured AI (org)
63 posts
Language Models
Anthropic
Exploratory Engineering
Transformer Circuits
Transformers
101
[Interim research report] Taking features out of superposition with sparse autoencoders
Lee Sharkey
7d
10
262
Mysteries of mode collapse
janus
1mo
35
234
Conjecture: a retrospective after 8 months of work
Connor Leahy
27d
9
180
Refine: An Incubator for Conceptual Alignment Research Bets
adamShimi
8mo
13
103
Searching for Search
NicholasKees
22d
6
68
The First Filter
adamShimi
24d
5
103
What I Learned Running Refine
adamShimi
26d
5
105
Understanding Conjecture: Notes from Connor Leahy interview
Akash
3mo
24
44
Good Futures Initiative: Winter Project Internship
Aris
23d
4
106
Announcing Encultured AI: Building a Video Game
Andrew_Critch
4mo
26
57
AMA Conjecture, A New Alignment Startup
adamShimi
8mo
42
234
Connor Leahy on Dying with Dignity, EleutherAI and Conjecture
Michaël Trazzi
5mo
29
5
Creating a database for base rates
nikos
8d
1
34
confusion about alignment requirements
carado
2mo
10
27
Discovering Language Model Behaviors with Model-Written Evaluations
evhub
4h
3
7
Will research in AI risk jinx it? Consequences of training AI on AI risk arguments
Yann Dubois
1d
6
148
Did ChatGPT just gaslight me?
ThomasW
19d
45
808
Simulators
janus
3mo
103
21
Shh, don't tell the AI it's likely to be evil
naterush
14d
9
4
Simulators and Mindcrime
DragonGod
11d
4
9
Does a LLM have a utility function?
Dagon
11d
6
37
Gliders in Language Models
Alexandre Variengien
25d
11
37
A brainteaser for language models
Adam Scherlis
8d
3
173
Language models seem to be much better than humans at next-token prediction
Buck
4mo
56
36
An exploration of GPT-2's embedding weights
Adam Scherlis
7d
2
67
Paper: Large Language Models Can Self-improve [Linkpost]
Evan R. Murphy
2mo
14
61
They gave LLMs access to physics simulators
ryan_b
2mo
18
52
Discovering Latent Knowledge in Language Models Without Supervision
Xodarap
6d
1