62 posts
Language Models
Anthropic
Transformer Circuits
Transformers
1 post
Exploratory Engineering
27
Discovering Language Model Behaviors with Model-Written Evaluations
evhub
4h
3
32
Take 11: "Aligning language models" should be weirder.
Charlie Steiner
2d
0
7
Will research in AI risk jinx it? Consequences of training AI on AI risk arguments
Yann Dubois
1d
6
52
Discovering Latent Knowledge in Language Models Without Supervision
Xodarap
6d
1
148
Did ChatGPT just gaslight me?
ThomasW
19d
45
808
Simulators
janus
3mo
103
36
An exploration of GPT-2's embedding weights
Adam Scherlis
7d
2
37
A brainteaser for language models
Adam Scherlis
8d
3
21
Shh, don't tell the AI it's likely to be evil
naterush
14d
9
37
Gliders in Language Models
Alexandre Variengien
25d
11
173
Language models seem to be much better than humans at next-token prediction
Buck
4mo
56
105
Inverse Scaling Prize: Round 1 Winners
Ethan Perez
2mo
16
9
Does a LLM have a utility function?
Dagon
11d
6
61
They gave LLMs access to physics simulators
ryan_b
2mo
18
14
von Neumann probes and Dyson spheres: what exploratory engineering can tell us about the Fermi paradox
Stuart_Armstrong
10y
21