Tree of Tags

3 posts Project Announcement Encultured AI (org)

19 posts Conjecture (org)

32 Encultured AI Pre-planning, Part 2: Providing a Service

Andrew_Critch

4mo

4

105 Announcing Encultured AI: Building a Video Game

Andrew_Critch

4mo

26

70 Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan

1y

2

64 [Interim research report] Taking features out of superposition with sparse autoencoders

Lee Sharkey

7d

10

96 The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable

beren

22d

27

178 Mysteries of mode collapse

janus

1mo

35

143 Conjecture: a retrospective after 8 months of work

Connor Leahy

27d

9

108 What I Learned Running Refine

adamShimi

26d

5

56 Interpreting Neural Networks through the Polytope Lens

Sid Black

2mo

26

41 Current themes in mechanistic interpretability research

Lee Sharkey

1mo

3

61 Circumventing interpretability: How to defeat mind-readers

Lee Sharkey

5mo

8

41 Abstracting The Hardness of Alignment: Unbounded Atomic Optimization

adamShimi

4mo

3

68 How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi

5mo

11

31 Mosaic and Palimpsests: Two Shapes of Research

adamShimi

5mo

3

39 Epistemological Vigilance for Alignment

adamShimi

6mo

11

47 Refine's First Blog Post Day

adamShimi

4mo

3

76 Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi

8mo

13