Truth, Semantics, & Meaning (11 posts)
Related tags: Honesty, Anthropic, Map and Territory, Calibration

Karma · Title · Author · Posted · Comments
66 · Toy Models of Superposition · evhub · 3mo · 2
27 · Maps and Blueprint; the Two Sides of the Alignment Equation · Nora_Ammann · 1mo · 1
55 · Autonomy as taking responsibility for reference maintenance · Ramana Kumar · 4mo · 3
91 · Paper: Teaching GPT3 to express uncertainty in words · Owain_Evans · 6mo · 7
62 · Truthful LMs as a warm-up for aligned AGI · Jacob_Hilton · 11mo · 14
35 · How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA? · Owain_Evans · 9mo · 3
38 · The accumulation of knowledge: literature review · Alex Flint · 1y · 3
25 · Knowledge is not just precipitation of action · Alex Flint · 1y · 6
39 · Finding the variables · Stuart_Armstrong · 3y · 1
28 · Cartographic Processes · johnswentworth · 3y · 3
29 · Bridging syntax and semantics, empirically · Stuart_Armstrong · 4y · 4

DeepMind (8 posts)

Karma · Title · Author · Posted · Comments
223 · A challenge for AGI organizations, and a challenge for readers · Rob Bensinger · 19d · 30
318 · DeepMind alignment team opinions on AGI ruin arguments · Vika · 4mo · 34
104 · Caution when interpreting Deepmind's In-context RL paper · Sam Marks · 1mo · 6
64 · Clarifying AI X-risk · zac_kenton · 1mo · 23
86 · Paper: Discovering novel algorithms with AlphaTensor [Deepmind] · LawrenceC · 2mo · 18
19 · Paper: In-context Reinforcement Learning with Algorithm Distillation [Deepmind] · LawrenceC · 1mo · 5
44 · AlphaGo Zero and capability amplification · paulfchristiano · 3y · 23
48 · A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment · AlexMennen · 4y · 1