22 posts
Conjecture (org)
Project Announcement
Encultured AI (org)
11 posts
Refine
Analogy
178 | Mysteries of mode collapse | janus | 1mo | 35
143 | Conjecture: a retrospective after 8 months of work | Connor Leahy | 27d | 9
118 | We Are Conjecture, A New Alignment Research Startup | Connor Leahy | 8mo | 24
108 | What I Learned Running Refine | adamShimi | 26d | 5
105 | Announcing Encultured AI: Building a Video Game | Andrew_Critch | 4mo | 26
96 | The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable | beren | 22d | 27
76 | Refine: An Incubator for Conceptual Alignment Research Bets | adamShimi | 8mo | 13
70 | Announcing the Vitalik Buterin Fellowships in AI Existential Safety! | DanielFilan | 1y | 2
68 | How to Diversify Conceptual Alignment: the Model Behind Refine | adamShimi | 5mo | 11
64 | [Interim research report] Taking features out of superposition with sparse autoencoders | Lee Sharkey | 7d | 10
61 | Circumventing interpretability: How to defeat mind-readers | Lee Sharkey | 5mo | 8
56 | Interpreting Neural Networks through the Polytope Lens | Sid Black | 2mo | 26
52 | Conjecture Second Hiring Round | Connor Leahy | 27d | 0
47 | Refine's First Blog Post Day | adamShimi | 4mo | 3
48 | I missed the crux of the alignment problem the whole time | zeshen | 4mo | 7
42 | My Thoughts on the ML Safety Course | zeshen | 2mo | 3
41 | the Insulated Goal-Program idea | carado | 4mo | 3
25 | goal-program bricks | carado | 4mo | 2
25 | (Structural) Stability of Coupled Optimizers | Paul Bricman | 2mo | 0
24 | confusion about alignment requirements | carado | 2mo | 10
23 | Embedding safety in ML development | zeshen | 1mo | 1
15 | Steelmining via Analogy | Paul Bricman | 4mo | 0
14 | Benchmarking Proposals on Risk Scenarios | Paul Bricman | 4mo | 2
13 | Refine's Third Blog Post Day/Week | adamShimi | 3mo | 0
13 | Refine Blogpost Day #3: The shortforms I did write | Alexander Gietelink Oldenziel | 3mo | 0