Branch 1 (106 posts)
Tags: Careers, Infra-Bayesianism, SERI MATS, Formal Proof, Domain Theory, Distributional Shifts
Branch 2 (79 posts)
Tags: Audio, Interviews, Organization Updates, Redwood Research, AXRP, Adversarial Examples, Adversarial Training, AI Robustness
Branch 1 posts:

| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 55 | Proper scoring rules don’t guarantee predicting fixed points | Johannes_Treutlein | 4d | 2 |
| 2 | Career Scouting: Housing Coordination | koratkar | 5h | 0 |
| 30 | Where to be an AI Safety Professor | scasper | 13d | 12 |
| 67 | Career Scouting: Dentistry | koratkar | 1mo | 5 |
| 5 | What about non-degree seeking? | Lao Mein | 3d | 5 |
| 114 | Understanding Infra-Bayesianism: A Beginner-Friendly Video Series | Jack Parker | 2mo | 6 |
| 15 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9 |
| 41 | Some advice on independent research | Marius Hobbhahn | 1mo | 4 |
| 118 | An Update on Academia vs. Industry (one year into my faculty job) | David Scott Krueger (formerly: capybaralet) | 3mo | 18 |
| 71 | SERI MATS Program - Winter 2022 Cohort | Ryan Kidd | 2mo | 12 |
| 117 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48 |
| 68 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20 |
| 94 | Evaluations project @ ARC is hiring a researcher and a webdev/engineer | Beth Barnes | 3mo | 7 |
| 26 | The Ground Truth Problem (Or, Why Evaluating Interpretability Methods Is Hard) | Jessica Mary | 1mo | 2 |
Branch 2 posts:

| Karma | Title | Author | Posted | Comments |
|---|---|---|---|---|
| 6 | Podcast: Tamera Lanham on AI risk, threat models, alignment proposals, externalized reasoning oversight, and working at Anthropic | Akash | 2h | 0 |
| 130 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9 |
| 134 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14 |
| 26 | Causal scrubbing: results on a paren balance checker | LawrenceC | 17d | 0 |
| 37 | Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas | Akash | 25d | 2 |
| 86 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5 |
| 135 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9 |
| 16 | Causal scrubbing: Appendix | LawrenceC | 17d | 0 |
| 131 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17 |
| 14 | Interview with Matt Freeman | Evenflair | 29d | 0 |
| 26 | Me (Steve Byrnes) on the “Brain Inspired” podcast | Steven Byrnes | 1mo | 1 |
| 136 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29 |
| 31 | Shahar Avin On How To Regulate Advanced AI Systems | Michaël Trazzi | 2mo | 0 |
| 46 | How and why to turn everything into audio | KatWoods | 4mo | 18 |