Tree of Tags

32 posts: Machine Learning (ML); OpenAI; Lottery Ticket Hypothesis

19 posts: DeepMind; Truth, Semantics, & Meaning; Anthropic; Honesty; Map and Territory; Calibration

199 Common misconceptions about OpenAI · Jacob_Hilton · 3mo · 138
123 the scaling “inconsistency”: openAI’s new insight · nostalgebraist · 2y · 14
114 Understanding “Deep Double Descent” · evhub · 3y · 51
100 Gradations of Inner Alignment Obstacles · abramdemski · 1y · 22
84 Safety Implications of LeCun's path to machine intelligence · Ivan Vendrov · 5mo · 16
81 Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible · Sam Bowman · 3mo · 6
77 A Bird's Eye View of the ML Field [Pragmatic AI Safety #2] · Dan H · 7mo · 5
75 SGD's Bias · johnswentworth · 1y · 16
74 Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda · Logan Riggs · 2y · 12
70 Inductive biases stick around · evhub · 3y · 14
62 Multimodal Neurons in Artificial Neural Networks · Kaj_Sotala · 1y · 2
60 Unsolved ML Safety Problems · jsteinhardt · 1y · 2
59 Reframing inner alignment · davidad · 9d · 13
58 Understanding the Lottery Ticket Hypothesis · Alex Flint · 1y · 9

318 DeepMind alignment team opinions on AGI ruin arguments · Vika · 4mo · 34
223 A challenge for AGI organizations, and a challenge for readers · Rob Bensinger · 19d · 30
104 Caution when interpreting Deepmind's In-context RL paper · Sam Marks · 1mo · 6
91 Paper: Teaching GPT3 to express uncertainty in words · Owain_Evans · 6mo · 7
86 Paper: Discovering novel algorithms with AlphaTensor [Deepmind] · LawrenceC · 2mo · 18
66 Toy Models of Superposition · evhub · 3mo · 2
64 Clarifying AI X-risk · zac_kenton · 1mo · 23
62 Truthful LMs as a warm-up for aligned AGI · Jacob_Hilton · 11mo · 14
55 Autonomy as taking responsibility for reference maintenance · Ramana Kumar · 4mo · 3
48 A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment · AlexMennen · 4y · 1
44 AlphaGo Zero and capability amplification · paulfchristiano · 3y · 23
39 Finding the variables · Stuart_Armstrong · 3y · 1
38 The accumulation of knowledge: literature review · Alex Flint · 1y · 3
35 How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA? · Owain_Evans · 9mo · 3