Tree of Tags

22 posts Conjecture (org) Project Announcement Encultured AI (org)

11 posts Refine Analogy

178 Mysteries of mode collapse

janus

1mo

35

143 Conjecture: a retrospective after 8 months of work

Connor Leahy

27d

9

118 We Are Conjecture, A New Alignment Research Startup

Connor Leahy

8mo

24

108 What I Learned Running Refine

adamShimi

26d

5

105 Announcing Encultured AI: Building a Video Game

Andrew_Critch

4mo

26

96 The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable

beren

22d

27

76 Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi

8mo

13

70 Announcing the Vitalik Buterin Fellowships in AI Existential Safety!

DanielFilan

1y

2

68 How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi

5mo

11

64 [Interim research report] Taking features out of superposition with sparse autoencoders

Lee Sharkey

7d

10

61 Circumventing interpretability: How to defeat mind-readers

Lee Sharkey

5mo

8

56 Interpreting Neural Networks through the Polytope Lens

Sid Black

2mo

26

52 Conjecture Second Hiring Round

Connor Leahy

27d

0

47 Refine's First Blog Post Day

adamShimi

4mo

3

48 I missed the crux of the alignment problem the whole time

zeshen

4mo

7

42 My Thoughts on the ML Safety Course

zeshen

2mo

3

41 the Insulated Goal-Program idea

carado

4mo

3

25 goal-program bricks

carado

4mo

2

25 (Structural) Stability of Coupled Optimizers

Paul Bricman

2mo

0

24 confusion about alignment requirements

carado

2mo

10

23 Embedding safety in ML development

zeshen

1mo

1

15 Steelmining via Analogy

Paul Bricman

4mo

0

14 Benchmarking Proposals on Risk Scenarios

Paul Bricman

4mo

2

13 Refine's Third Blog Post Day/Week

adamShimi

3mo

0

13 Refine Blogpost Day #3: The shortforms I did write

Alexander Gietelink Oldenziel

3mo

0