49 posts
Conjecture (org)
Refine
Project Announcement
Encultured AI (org)
63 posts
Language Models
Anthropic
Exploratory Engineering
Transformer Circuits
Transformers
267
We Are Conjecture, A New Alignment Research Startup
Connor Leahy
8mo
24
262
Mysteries of mode collapse
janus
1mo
35
234
Conjecture: a retrospective after 8 months of work
Connor Leahy
27d
9
234
Connor Leahy on Dying with Dignity, EleutherAI and Conjecture
Michaël Trazzi
5mo
29
180
Refine: An Incubator for Conceptual Alignment Research Bets
adamShimi
8mo
13
130
Current themes in mechanistic interpretability research
Lee Sharkey
1mo
3
125
Conjecture Second Hiring Round
Connor Leahy
27d
0
106
Announcing Encultured AI: Building a Video Game
Andrew_Critch
4mo
26
105
Understanding Conjecture: Notes from Connor Leahy interview
Akash
3mo
24
103
What I Learned Running Refine
adamShimi
26d
5
103
Searching for Search
NicholasKees
22d
6
101
[Interim research report] Taking features out of superposition with sparse autoencoders
Lee Sharkey
7d
10
93
How to Diversify Conceptual Alignment: the Model Behind Refine
adamShimi
5mo
11
88
Abstracting The Hardness of Alignment: Unbounded Atomic Optimization
adamShimi
4mo
3
808
Simulators
janus
3mo
103
267
New Scaling Laws for Large Language Models
1a3orn
8mo
21
179
The case for aligning narrowly superhuman models
Ajeya Cotra
1y
74
173
Language models seem to be much better than humans at next-token prediction
Buck
4mo
56
148
Did ChatGPT just gaslight me?
ThomasW
19d
45
147
Who models the models that model models? An exploration of GPT-3's in-context model fitting ability
Lovre
6mo
14
138
GPT-3 Catching Fish in Morse Code
Megan Kinniment
5mo
27
133
Transformer Circuits
evhub
12mo
4
123
RL with KL penalties is better seen as Bayesian inference
Tomek Korbak
6mo
15
106
Paper: Teaching GPT3 to express uncertainty in words
Owain_Evans
6mo
7
105
Inverse Scaling Prize: Round 1 Winners
Ethan Perez
2mo
16
104
Testing PaLM prompts on GPT3
Yitz
8mo
15
103
A Summary Of Anthropic's First Paper
Sam Ringer
11mo
0
94
Help ARC evaluate capabilities of current language models (still need people)
Beth Barnes
5mo
6