22 posts
Conjecture (org)
Project Announcement
Encultured AI (org)
11 posts
Refine
Analogy
Score | Title | Author | Age | Comments
254 | We Are Conjecture, A New Alignment Research Startup | Connor Leahy | 8mo | 24
248 | Mysteries of mode collapse | janus | 1mo | 35
223 | Conjecture: a retrospective after 8 months of work | Connor Leahy | 27d | 9
222 | The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable | beren | 22d | 27
190 | Interpreting Neural Networks through the Polytope Lens | Sid Black | 2mo | 26
170 | Refine: An Incubator for Conceptual Alignment Research Bets | adamShimi | 8mo | 13
127 | Circumventing interpretability: How to defeat mind-readers | Lee Sharkey | 5mo | 8
123 | Current themes in mechanistic interpretability research | Lee Sharkey | 1mo | 3
118 | Conjecture Second Hiring Round | Connor Leahy | 27d | 0
101 | Announcing Encultured AI: Building a Video Game | Andrew_Critch | 4mo | 26
98 | What I Learned Running Refine | adamShimi | 26d | 5
97 | Searching for Search | NicholasKees | 22d | 6
96 | [Interim research report] Taking features out of superposition with sparse autoencoders | Lee Sharkey | 7d | 10
88 | How to Diversify Conceptual Alignment: the Model Behind Refine | adamShimi | 5mo | 11
58 | I missed the crux of the alignment problem the whole time | zeshen | 4mo | 7
56 | My Thoughts on the ML Safety Course | zeshen | 2mo | 3
37 | the Insulated Goal-Program idea | carado | 4mo | 3
36 | Benchmarking Proposals on Risk Scenarios | Paul Bricman | 4mo | 2
33 | Steelmining via Analogy | Paul Bricman | 4mo | 0
33 | Refine Blogpost Day #3: The shortforms I did write | Alexander Gietelink Oldenziel | 3mo | 0
32 | confusion about alignment requirements | carado | 2mo | 10
29 | goal-program bricks | carado | 4mo | 2
25 | Embedding safety in ML development | zeshen | 1mo | 1
25 | (Structural) Stability of Coupled Optimizers | Paul Bricman | 2mo | 0
23 | Refine's Third Blog Post Day/Week | adamShimi | 3mo | 0