11 posts
Truth, Semantics, & Meaning
Honesty
Anthropic
Map and Territory
Calibration
8 posts
DeepMind
49
Autonomy as taking responsibility for reference maintenance
Ramana Kumar
4mo
3
62
Toy Models of Superposition
evhub
3mo
2
21
Bridging syntax and semantics, empirically
Stuart_Armstrong
4y
4
17
Knowledge is not just precipitation of action
Alex Flint
1y
6
20
The accumulation of knowledge: literature review
Alex Flint
1y
3
68
Truthful LMs as a warm-up for aligned AGI
Jacob_Hilton
11mo
14
21
Finding the variables
Stuart_Armstrong
3y
1
16
Cartographic Processes
johnswentworth
3y
3
101
Paper: Teaching GPT3 to express uncertainty in words
Owain_Evans
6mo
7
49
How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA?
Owain_Evans
9mo
3
15
Maps and Blueprint; the Two Sides of the Alignment Equation
Nora_Ammann
1mo
1
307
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger
19d
30
140
Clarifying AI X-risk
zac_kenton
1mo
23
74
Paper: Discovering novel algorithms with AlphaTensor [Deepmind]
LawrenceC
2mo
18
410
DeepMind alignment team opinions on AGI ruin arguments
Vika
4mo
34
104
Caution when interpreting Deepmind's In-context RL paper
Sam Marks
1mo
6
37
Paper: In-context Reinforcement Learning with Algorithm Distillation [Deepmind]
LawrenceC
1mo
5
20
AlphaGo Zero and capability amplification
paulfchristiano
3y
23
32
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment
AlexMennen
4y
1