Truth, Semantics, & Meaning (11 posts)
Related tags: Honesty, Anthropic, Map and Territory, Calibration

Karma · Title · Author · Posted · Comments
66 · Toy Models of Superposition · evhub · 3mo · 2
27 · Maps and Blueprint; the Two Sides of the Alignment Equation · Nora_Ammann · 1mo · 1
55 · Autonomy as taking responsibility for reference maintenance · Ramana Kumar · 4mo · 3
91 · Paper: Teaching GPT3 to express uncertainty in words · Owain_Evans · 6mo · 7
62 · Truthful LMs as a warm-up for aligned AGI · Jacob_Hilton · 11mo · 14
35 · How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA? · Owain_Evans · 9mo · 3
38 · The accumulation of knowledge: literature review · Alex Flint · 1y · 3
25 · Knowledge is not just precipitation of action · Alex Flint · 1y · 6
39 · Finding the variables · Stuart_Armstrong · 3y · 1
28 · Cartographic Processes · johnswentworth · 3y · 3
29 · Bridging syntax and semantics, empirically · Stuart_Armstrong · 4y · 4

DeepMind (8 posts)

Karma · Title · Author · Posted · Comments
223 · A challenge for AGI organizations, and a challenge for readers · Rob Bensinger · 19d · 30
318 · DeepMind alignment team opinions on AGI ruin arguments · Vika · 4mo · 34
104 · Caution when interpreting Deepmind's In-context RL paper · Sam Marks · 1mo · 6
64 · Clarifying AI X-risk · zac_kenton · 1mo · 23
86 · Paper: Discovering novel algorithms with AlphaTensor [Deepmind] · LawrenceC · 2mo · 18
19 · Paper: In-context Reinforcement Learning with Algorithm Distillation [Deepmind] · LawrenceC · 1mo · 5
44 · AlphaGo Zero and capability amplification · paulfchristiano · 3y · 23
48 · A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment · AlexMennen · 4y · 1