Tags (post counts as shown on the page):
- 106 posts — Careers, Infra-Bayesianism, SERI MATS, Formal Proof, Domain Theory, Distributional Shifts
- 79 posts — Audio, Interviews, Organization Updates, Redwood Research, AXRP, Adversarial Examples, Adversarial Training, AI Robustness
| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 39 | Proper scoring rules don’t guarantee predicting fixed points | Johannes_Treutlein | 4d | 2 |
| 28 | Where to be an AI Safety Professor | scasper | 13d | 12 |
| 5 | What about non-degree seeking? | Lao Mein | 3d | 5 |
| 45 | Career Scouting: Dentistry | koratkar | 1mo | 5 |
| 16 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9 |
| 28 | The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard) | Jessica Mary | 1mo | 2 |
| 61 | SERI MATS Program - Winter 2022 Cohort | Ryan Kidd | 2mo | 12 |
| 31 | Some advice on independent research | Marius Hobbhahn | 1mo | 4 |
| 96 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48 |
| 71 | Understanding Infra-Bayesianism: A Beginner-Friendly Video Series | Jack Parker | 2mo | 6 |
| 85 | An Update on Academia vs. Industry (one year into my faculty job) | David Scott Krueger (formerly: capybaralet) | 3mo | 18 |
| 54 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20 |
| 74 | Evaluations project @ ARC is hiring a researcher and a webdev/engineer | Beth Barnes | 3mo | 7 |
| 20 | Why I'm Working On Model Agnostic Interpretability | Jessica Mary | 1mo | 9 |
| 7 | Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic | Akash | 2h | 0 |
| 96 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9 |
| 109 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14 |
| 26 | Causal scrubbing: results on a paren balance checker | LawrenceC | 17d | 0 |
| 32 | Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas | Akash | 25d | 2 |
| 67 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5 |
| 127 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9 |
| 14 | Causal scrubbing: Appendix | LawrenceC | 17d | 0 |
| 16 | Interview with Matt Freeman | Evenflair | 29d | 0 |
| 26 | Me (Steve Byrnes) on the “Brain Inspired” podcast | Steven Byrnes | 1mo | 1 |
| 106 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17 |
| 36 | Which LessWrong content would you like recorded into audio/podcast form? | Ruby | 3mo | 11 |
| 49 | How and why to turn everything into audio | KatWoods | 4mo | 18 |
| 88 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29 |