32 posts
Machine Learning (ML)
OpenAI
Lottery Ticket Hypothesis
19 posts
DeepMind
Truth, Semantics, & Meaning
Anthropic
Honesty
Map and Territory
Calibration
199
Common misconceptions about OpenAI
Jacob_Hilton
3mo
138
123
the scaling “inconsistency”: openAI’s new insight
nostalgebraist
2y
14
114
Understanding “Deep Double Descent”
evhub
3y
51
100
Gradations of Inner Alignment Obstacles
abramdemski
1y
22
84
Safety Implications of LeCun's path to machine intelligence
Ivan Vendrov
5mo
16
81
Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible
Sam Bowman
3mo
6
77
A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]
Dan H
7mo
5
75
SGD's Bias
johnswentworth
1y
16
74
Using GPT-N to Solve Interpretability of Neural Networks: A Research Agenda
Logan Riggs
2y
12
70
Inductive biases stick around
evhub
3y
14
62
Multimodal Neurons in Artificial Neural Networks
Kaj_Sotala
1y
2
60
Unsolved ML Safety Problems
jsteinhardt
1y
2
59
Reframing inner alignment
davidad
9d
13
58
Understanding the Lottery Ticket Hypothesis
Alex Flint
1y
9
318
DeepMind alignment team opinions on AGI ruin arguments
Vika
4mo
34
223
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger
19d
30
104
Caution when interpreting Deepmind's In-context RL paper
Sam Marks
1mo
6
91
Paper: Teaching GPT3 to express uncertainty in words
Owain_Evans
6mo
7
86
Paper: Discovering novel algorithms with AlphaTensor [Deepmind]
LawrenceC
2mo
18
66
Toy Models of Superposition
evhub
3mo
2
64
Clarifying AI X-risk
zac_kenton
1mo
23
62
Truthful LMs as a warm-up for aligned AGI
Jacob_Hilton
11mo
14
55
Autonomy as taking responsibility for reference maintenance
Ramana Kumar
4mo
3
48
A comment on the IDA-AlphaGoZero metaphor; capabilities versus alignment
AlexMennen
4y
1
44
AlphaGo Zero and capability amplification
paulfchristiano
3y
23
39
Finding the variables
Stuart_Armstrong
3y
1
38
The accumulation of knowledge: literature review
Alex Flint
1y
3
35
How do new models from OpenAI, DeepMind and Anthropic perform on TruthfulQA?
Owain_Evans
9mo
3