32 posts
Machine Learning (ML)
OpenAI
Lottery Ticket Hypothesis
19 posts
DeepMind
Truth, Semantics, & Meaning
Anthropic
Honesty
Map and Territory
Calibration
35
Reframing inner alignment
davidad
9d
13
253
Common misconceptions about OpenAI
Jacob_Hilton
3mo
138
55
A Data limited future
Donald Hobson
4mo
25
47
Steganography in Chain of Thought Reasoning
A Ray
4mo
13
34
Prosaic AI alignment
paulfchristiano
4y
10
97
Survey of NLP Researchers: NLP is contributing to AGI progress; major catastrophe plausible
Sam Bowman
3mo
6
24
Train first VS prune first in neural networks.
Donald Hobson
5mo
5
17
[MLSN #5]: Prize Compilation
Dan H
2mo
1
17
Grouped Loss may disfavor discontinuous capabilities
Adam Jermyn
5mo
2
173
A Bird's Eye View of the ML Field [Pragmatic AI Safety #2]
Dan H
7mo
5
21
My thoughts on OpenAI's Alignment plan
Donald Hobson
10d
0
24
Discussion on the machine learning approach to AI safety
Vika
4y
3
54
Unsolved ML Safety Problems
jsteinhardt
1y
2
12
Automated Fact Checking: A Look at the Field
Hoagy
1y
0
307
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger
19d
30
140
Clarifying AI X-risk
zac_kenton
1mo
23
74
Paper: Discovering novel algorithms with AlphaTensor [Deepmind]
LawrenceC
2mo
18
410
DeepMind alignment team opinions on AGI ruin arguments
Vika
4mo
34
104
Caution when interpreting Deepmind's In-context RL paper
Sam Marks
1mo
6
37
Paper: In-context Reinforcement Learning with Algorithm Distillation [Deepmind]
LawrenceC
1mo
5
49
Autonomy as taking responsibility for reference maintenance
Ramana Kumar
4mo
3
62
Toy Models of Superposition
evhub
3mo
2
21
Bridging syntax and semantics, empirically
Stuart_Armstrong
4y
4
20
AlphaGo Zero and capability amplification
paulfchristiano
3y
23
17
Knowledge is not just precipitation of action
Alex Flint
1y
6
20
The accumulation of knowledge: literature review
Alex Flint
1y
3
68
Truthful LMs as a warm-up for aligned AGI
Jacob_Hilton
11mo
14
21
Finding the variables
Stuart_Armstrong
3y
1