Tags: Language Models, Agency, Deconfusion, Scaling Laws, Tool AI, Definitions, Simulation Hypothesis, PaLM, Prompt Engineering, Philosophy of Language, Carving / Clustering Reality, Astronomical Waste (55 posts)
Karma | Title | Author | Posted | Comments
234 | chinchilla's wild implications | nostalgebraist | 4mo | 114
185 | Simulators | janus | 3mo | 103
163 | Language models seem to be much better than humans at next-token prediction | Buck | 4mo | 56
157 | Transformer Circuits | evhub | 12mo | 4
141 | Announcing the Inverse Scaling Prize ($250k Prize Pool) | Ethan Perez | 5mo | 14
113 | The case for becoming a black-box investigator of language models | Buck | 7mo | 19
108 | Beyond Astronomical Waste | Wei_Dai | 4y | 41
106 | Testing PaLM prompts on GPT3 | Yitz | 8mo | 15
99 | Help ARC evaluate capabilities of current language models (still need people) | Beth Barnes | 5mo | 6
84 | Who models the models that model models? An exploration of GPT-3's in-context model fitting ability | Lovre | 6mo | 14
78 | Thoughts on the Alignment Implications of Scaling Language Models | leogao | 1y | 11
75 | Inverse Scaling Prize: Round 1 Winners | Ethan Perez | 2mo | 16
65 | Causal confusion as an argument against the scaling hypothesis | RobertKirk | 6mo | 30
64 | RL with KL penalties is better seen as Bayesian inference | Tomek Korbak | 6mo | 15
Tags: Conjecture (org), Refine, Project Announcement, Encultured AI (org), Analogy (33 posts)

Karma | Title | Author | Posted | Comments
178 | Mysteries of mode collapse | janus | 1mo | 35
143 | Conjecture: a retrospective after 8 months of work | Connor Leahy | 27d | 9
118 | We Are Conjecture, A New Alignment Research Startup | Connor Leahy | 8mo | 24
108 | What I Learned Running Refine | adamShimi | 26d | 5
105 | Announcing Encultured AI: Building a Video Game | Andrew_Critch | 4mo | 26
96 | The Singular Value Decompositions of Transformer Weight Matrices are Highly Interpretable | beren | 22d | 27
76 | Refine: An Incubator for Conceptual Alignment Research Bets | adamShimi | 8mo | 13
70 | Announcing the Vitalik Buterin Fellowships in AI Existential Safety! | DanielFilan | 1y | 2
68 | How to Diversify Conceptual Alignment: the Model Behind Refine | adamShimi | 5mo | 11
64 | [Interim research report] Taking features out of superposition with sparse autoencoders | Lee Sharkey | 7d | 10
61 | Circumventing interpretability: How to defeat mind-readers | Lee Sharkey | 5mo | 8
56 | Interpreting Neural Networks through the Polytope Lens | Sid Black | 2mo | 26
52 | Conjecture Second Hiring Round | Connor Leahy | 27d | 0
48 | I missed the crux of the alignment problem the whole time | zeshen | 4mo | 7