Tags (post counts as shown on the page):
- 106 posts — Careers, Infra-Bayesianism, SERI MATS, Formal Proof, Domain Theory, Distributional Shifts
- 79 posts — Audio, Interviews, Organization Updates, Redwood Research, AXRP, Adversarial Examples, Adversarial Training, AI Robustness
| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 39 | Proper scoring rules don’t guarantee predicting fixed points | Johannes_Treutlein | 4d | 2 |
| 28 | Where to be an AI Safety Professor | scasper | 13d | 12 |
| 5 | What about non-degree seeking? | Lao Mein | 3d | 5 |
| 45 | Career Scouting: Dentistry | koratkar | 1mo | 5 |
| 16 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9 |
| 28 | The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard) | Jessica Mary | 1mo | 2 |
| 61 | SERI MATS Program - Winter 2022 Cohort | Ryan Kidd | 2mo | 12 |
| 31 | Some advice on independent research | Marius Hobbhahn | 1mo | 4 |
| 96 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48 |
| 71 | Understanding Infra-Bayesianism: A Beginner-Friendly Video Series | Jack Parker | 2mo | 6 |
| 85 | An Update on Academia vs. Industry (one year into my faculty job) | David Scott Krueger (formerly: capybaralet) | 3mo | 18 |
| 54 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20 |
| 74 | Evaluations project @ ARC is hiring a researcher and a webdev/engineer | Beth Barnes | 3mo | 7 |
| 20 | Why I'm Working On Model Agnostic Interpretability | Jessica Mary | 1mo | 9 |
| 7 | Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic | Akash | 2h | 0 |
| 96 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9 |
| 109 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14 |
| 26 | Causal scrubbing: results on a paren balance checker | LawrenceC | 17d | 0 |
| 32 | Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas | Akash | 25d | 2 |
| 67 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5 |
| 127 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9 |
| 14 | Causal scrubbing: Appendix | LawrenceC | 17d | 0 |
| 16 | Interview with Matt Freeman | Evenflair | 29d | 0 |
| 26 | Me (Steve Byrnes) on the “Brain Inspired” podcast | Steven Byrnes | 1mo | 1 |
| 106 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17 |
| 36 | Which LessWrong content would you like recorded into audio/podcast form? | Ruby | 3mo | 11 |
| 49 | How and why to turn everything into audio | KatWoods | 4mo | 18 |
| 88 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29 |