3 posts
Project Announcement
Encultured AI (org)
19 posts
Conjecture (org)
32
Encultured AI Pre-planning, Part 2: Providing a Service
Andrew_Critch
4mo
4
105
Announcing Encultured AI: Building a Video Game
Andrew_Critch
4mo
26
70
Announcing the Vitalik Buterin Fellowships in AI Existential Safety!
DanielFilan
1y
2
64
[Interim research report] Taking features out of superposition with sparse autoencoders
Lee Sharkey
7d
10
96
The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable
beren
22d
27
178
Mysteries of mode collapse
janus
1mo
35
143
Conjecture: a retrospective after 8 months of work
Connor Leahy
27d
9
108
What I Learned Running Refine
adamShimi
26d
5
56
Interpreting Neural Networks through the Polytope Lens
Sid Black
2mo
26
41
Current themes in mechanistic interpretability research
Lee Sharkey
1mo
3
61
Circumventing interpretability: How to defeat mind-readers
Lee Sharkey
5mo
8
41
Abstracting The Hardness of Alignment: Unbounded Atomic Optimization
adamShimi
4mo
3
68
How to Diversify Conceptual Alignment: the Model Behind Refine
adamShimi
5mo
11
31
Mosaic and Palimpsests: Two Shapes of Research
adamShimi
5mo
3
39
Epistemological Vigilance for Alignment
adamShimi
6mo
11
47
Refine's First Blog Post Day
adamShimi
4mo
3
76
Refine: An Incubator for Conceptual Alignment Research Bets
adamShimi
8mo
13