Tree of Tags

29 posts: Language Models, Definitions, PaLM, Prompt Engineering, Robotics

10 posts: Scaling Laws

163 Language models seem to be much better than humans at next-token prediction

Buck

4mo

56

157 Transformer Circuits

evhub

12mo

4

113 The case for becoming a black-box investigator of language models

Buck

7mo

19

106 Testing PaLM prompts on GPT3

Yitz

8mo

15

99 Help ARC evaluate capabilities of current language models (still need people)

Beth Barnes

5mo

6

84 Who models the models that model models? An exploration of GPT-3's in-context model fitting ability

Lovre

6mo

14

75 Inverse Scaling Prize: Round 1 Winners

Ethan Perez

2mo

16

64 RL with KL penalties is better seen as Bayesian inference

Tomek Korbak

6mo

15

61 Language Model Alignment Research Internships

Ethan Perez

1y

1

45 Deep learning curriculum for large language model alignment

Jacob_Hilton

5mo

3

44 Compact vs. Wide Models

Vaniver

4y

5

42 NLP Position Paper: When Combatting Hype, Proceed with Caution

Sam Bowman

1y

15

42 Gears-Level Mental Models of Transformer Interpretability

KevinRoWang

8mo

4

40 Paper: Large Language Models Can Self-improve [Linkpost]

Evan R. Murphy

2mo

14

234 chinchilla's wild implications

nostalgebraist

4mo

114

141 Announcing the Inverse Scaling Prize ($250k Prize Pool)

Ethan Perez

5mo

14

78 Thoughts on the Alignment Implications of Scaling Language Models

leogao

1y

11

65 Causal confusion as an argument against the scaling hypothesis

RobertKirk

6mo

30

50 NVIDIA and Microsoft releases 530B parameter transformer model, Megatron-Turing NLG

Ozyrus

1y

36

46 [Link] Training Compute-Optimal Large Language Models

nostalgebraist

8mo

23

45 Smoke without fire is scary

Adam Jermyn

2mo

22

34 Parameter counts in Machine Learning

Jsevillamol

1y

16

28 Inverse scaling can become U-shaped

Edouard Harris

1mo

15

5 Updates on scaling laws for foundation models from ' Transcending Scaling Laws with 0.1% Extra Compute'

Nick_Greig

1mo

2