106 posts: Careers, Infra-Bayesianism, SERI MATS, Formal Proof, Domain Theory, Distributional Shifts
79 posts: Audio, Interviews, Organization Updates, Redwood Research, AXRP, Adversarial Examples, Adversarial Training, AI Robustness
Karma | Title | Author | Posted | Comments
5 | What about non-degree seeking? | Lao Mein | 3d | 5
6 | [ASoT] Reflectivity in Narrow AI | Ulisse Mini | 29d | 1
28 | Where to be an AI Safety Professor | scasper | 13d | 12
39 | Proper scoring rules don’t guarantee predicting fixed points | Johannes_Treutlein | 4d | 2
16 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9
2 | Vanessa Kosoy's PreDCA, distilled | Martín Soto | 1mo | 17
93 | Infra-Bayesian physicalism: a formal theory of naturalized induction | Vanessa Kosoy | 1y | 20
96 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48
4 | Guardian AI (Misaligned systems are all around us.) | Jessica Mary | 25d | 6
54 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20
20 | Why I'm Working On Model Agnostic Interpretability | Jessica Mary | 1mo | 9
45 | Career Scouting: Dentistry | koratkar | 1mo | 5
1 | Working towards AI alignment is better | Johannes C. Mayer | 11d | 2
15 | How do you get a job as a software developer? | lsusr | 4mo | 24
96 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
17 | Latent Adversarial Training | Adam Jermyn | 5mo | 9
109 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14
49 | How and why to turn everything into audio | KatWoods | 4mo | 18
36 | Which LessWrong content would you like recorded into audio/podcast form? | Ruby | 3mo | 11
67 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5
32 | Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas | Akash | 25d | 2
127 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9
106 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17
88 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29
165 | Redwood Research’s current project | Buck | 1y | 29
26 | Me (Steve Byrnes) on the “Brain Inspired” podcast | Steven Byrnes | 1mo | 1
8 | AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong | DanielFilan | 3mo | 1
44 | Listen to top LessWrong posts with The Nonlinear Library | KatWoods | 1y | 27