Tags: Distributional Shifts (2 posts) · SERI MATS (26 posts)
Karma | Title | Author | Posted | Comments
7 | Distribution Shifts and The Importance of AI Safety | Leon Lang | 2mo | 2
-1 | Mesa-optimization for goals defined only within a training environment is dangerous | Rubi J. Hudson | 4mo | 2
39 | Proper scoring rules don't guarantee predicting fixed points | Johannes_Treutlein | 4d | 2
16 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9
28 | The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard) | Jessica Mary | 1mo | 2
61 | SERI MATS Program - Winter 2022 Cohort | Ryan Kidd | 2mo | 12
96 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48
54 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20
20 | Why I'm Working On Model Agnostic Interpretability | Jessica Mary | 1mo | 9
72 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
13 | Auditing games for high-level interpretability | Paul Colognese | 1mo | 1
30 | Framing AI Childhoods | David Udell | 3mo | 8
28 | Behaviour Manifolds and the Hessian of the Total Loss - Notes and Criticism | Spencer Becker-Kahn | 3mo | 4
6 | [ASoT] Reflectivity in Narrow AI | Ulisse Mini | 29d | 1
36 | Race Along Rashomon Ridge | Stephen Fowler | 5mo | 15
4 | Guardian AI (Misaligned systems are all around us.) | Jessica Mary | 25d | 6