Tags: Distributional Shifts (2 posts) · SERI MATS (26 posts)
Karma | Title | Author | Posted | Comments
7 | Distribution Shifts and The Importance of AI Safety | Leon Lang | 2mo | 2
-1 | Mesa-optimization for goals defined only within a training environment is dangerous | Rubi J. Hudson | 4mo | 2
39 | Proper scoring rules don't guarantee predicting fixed points | Johannes_Treutlein | 4d | 2
16 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9
28 | The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard) | Jessica Mary | 1mo | 2
61 | SERI MATS Program - Winter 2022 Cohort | Ryan Kidd | 2mo | 12
96 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48
54 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20
20 | Why I'm Working On Model Agnostic Interpretability | Jessica Mary | 1mo | 9
72 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
13 | Auditing games for high-level interpretability | Paul Colognese | 1mo | 1
30 | Framing AI Childhoods | David Udell | 3mo | 8
28 | Behaviour Manifolds and the Hessian of the Total Loss - Notes and Criticism | Spencer Becker-Kahn | 3mo | 4
6 | [ASoT] Reflectivity in Narrow AI | Ulisse Mini | 29d | 1
36 | Race Along Rashomon Ridge | Stephen Fowler | 5mo | 15
4 | Guardian AI (Misaligned systems are all around us.) | Jessica Mary | 25d | 6