Tree of Tags

49 posts: Conjecture (org), Refine, Project Announcement, Encultured AI (org)

63 posts: Language Models, Anthropic, Exploratory Engineering, Transformer Circuits, Transformers

267 We Are Conjecture, A New Alignment Research Startup

Connor Leahy

8mo

24

262 Mysteries of mode collapse

janus

1mo

35

234 Conjecture: a retrospective after 8 months of work

Connor Leahy

27d

9

234 Connor Leahy on Dying with Dignity, EleutherAI and Conjecture

Michaël Trazzi

5mo

29

180 Refine: An Incubator for Conceptual Alignment Research Bets

adamShimi

8mo

13

130 Current themes in mechanistic interpretability research

Lee Sharkey

1mo

3

125 Conjecture Second Hiring Round

Connor Leahy

27d

0

106 Announcing Encultured AI: Building a Video Game

Andrew_Critch

4mo

26

105 Understanding Conjecture: Notes from Connor Leahy interview

Akash

3mo

24

103 What I Learned Running Refine

adamShimi

26d

5

103 Searching for Search

NicholasKees

22d

6

101 [Interim research report] Taking features out of superposition with sparse autoencoders

Lee Sharkey

7d

10

93 How to Diversify Conceptual Alignment: the Model Behind Refine

adamShimi

5mo

11

88 Abstracting The Hardness of Alignment: Unbounded Atomic Optimization

adamShimi

4mo

3

808 Simulators

janus

3mo

103

267 New Scaling Laws for Large Language Models

1a3orn

8mo

21

179 The case for aligning narrowly superhuman models

Ajeya Cotra

1y

74

173 Language models seem to be much better than humans at next-token prediction

Buck

4mo

56

148 Did ChatGPT just gaslight me?

ThomasW

19d

45

147 Who models the models that model models? An exploration of GPT-3's in-context model fitting ability

Lovre

6mo

14

138 GPT-3 Catching Fish in Morse Code

Megan Kinniment

5mo

27

133 Transformer Circuits

evhub

12mo

4

123 RL with KL penalties is better seen as Bayesian inference

Tomek Korbak

6mo

15

106 Paper: Teaching GPT3 to express uncertainty in words

Owain_Evans

6mo

7

105 Inverse Scaling Prize: Round 1 Winners

Ethan Perez

2mo

16

104 Testing PaLM prompts on GPT3

Yitz

8mo

15

103 A Summary Of Anthropic's First Paper

Sam Ringer

11mo

0

94 Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes

5mo

6