Tags: Interviews (62 posts), Redwood Research, Organization Updates, AXRP, Adversarial Examples, Adversarial Training, AI Robustness, Audio (17 posts)
Karma | Title | Author | Posted | Comments
164 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
31 | Latent Adversarial Training | Adam Jermyn | 5mo | 9
159 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14
105 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5
143 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9
184 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29
121 | Redwood Research’s current project | Buck | 1y | 29
12 | AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong | DanielFilan | 3mo | 1
44 | Redwood's Technique-Focused Epistemic Strategy | adamShimi | 1y | 1
15 | AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler | DanielFilan | 4mo | 0
22 | deluks917 on Online Weirdos | Jacob Falkovich | 4y | 3
48 | AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah | Palus Astra | 2y | 27
8 | AXRP Episode 1 - Adversarial Policies with Adam Gleave | DanielFilan | 1y | 5
6 | Bloggingheads: Yudkowsky and Horgan | Eliezer Yudkowsky | 14y | 37
43 | How and why to turn everything into audio | KatWoods | 4mo | 18
22 | Which LessWrong content would you like recorded into audio/podcast form? | Ruby | 3mo | 11
42 | Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas | Akash | 25d | 2
156 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17
26 | Me (Steve Byrnes) on the “Brain Inspired” podcast | Steven Byrnes | 1mo | 1
104 | Listen to top LessWrong posts with The Nonlinear Library | KatWoods | 1y | 27
6 | Cognitive scientist Joel Chan on metascience, scaling and automating innovation, collective intelligence, and tools for thought. | fowlertm | 1y | 3
13 | Podcasts on surveys, slower AI, AI arguments, etc | KatjaGrace | 3mo | 0
34 | AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger | DanielFilan | 1y | 10
12 | Interview with Matt Freeman | Evenflair | 29d | 0
37 | Shahar Avin On How To Regulate Advanced AI Systems | Michaël Trazzi | 2mo | 0
20 | Feelings of Admiration, Ruby <=> Miranda | Ruby | 1y | 0
57 | New: use The Nonlinear Library to listen to the top LessWrong posts of all time | KatWoods | 8mo | 9
165 | Curated conversations with brilliant rationalists | spencerg | 1y | 18