24 posts
Language Models
Robotics
0 posts
26 | Discovering Language Model Behaviors with Model-Written Evaluations | evhub | 4h | 3
31 | Take 11: "Aligning language models" should be weirder. | Charlie Steiner | 2d | 0
101 | Inverse Scaling Prize: Round 1 Winners | Ethan Perez | 2mo | 16
165 | Language models seem to be much better than humans at next-token prediction | Buck | 4mo | 56
64 | Paper: Large Language Models Can Self-improve [Linkpost] | Evan R. Murphy | 2mo | 14
140 | Who models the models that model models? An exploration of GPT-3's in-context model fitting ability | Lovre | 6mo | 14
89 | Help ARC evaluate capabilities of current language models (still need people) | Beth Barnes | 5mo | 6
116 | RL with KL penalties is better seen as Bayesian inference | Tomek Korbak | 6mo | 15
61 | Deep learning curriculum for large language model alignment | Jacob_Hilton | 5mo | 3
58 | Conditioning Generative Models for Alignment | Jozdien | 5mo | 8
127 | Transformer Circuits | evhub | 12mo | 4
32 | Strategy For Conditioning Generative Models | james.lucassen | 3mo | 4
70 | Gears-Level Mental Models of Transformer Interpretability | KevinRoWang | 8mo | 4
21 | A Test for Language Model Consciousness | Ethan Perez | 3mo | 14