27 posts
Redwood Research
Organization Updates
Adversarial Examples
Adversarial Training
AI Robustness
35 posts
Interviews
AXRP
165
Redwood Research’s current project
Buck
1y
29
127
Takeaways from our robust injury classifier project [Redwood Research]
dmz
3mo
9
126
Why I'm excited about Redwood Research's current project
paulfchristiano
1y
6
109
Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley
maxnadeau
1mo
14
96
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
LawrenceC
17d
9
88
High-stakes alignment via adversarial training [Redwood Research report]
dmz
7mo
29
67
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small
KevinRoWang
1mo
5
64
What I've been doing instead of writing
benkuhn
1y
3
62
We're Redwood Research, we do applied alignment research, AMA
Nate Thomas
1y
3
61
Get genotyped for free ( If your IQ is high enough)
David Althaus
11y
63
55
What's up with Arbital?
Alexei
5y
91
52
Redwood's Technique-Focused Epistemic Strategy
adamShimi
1y
1
49
Two clarifications about "Strategic Background"
Rob Bensinger
4y
6
43
Help the Brain Preservation Foundation
aurellem
9y
20
135
[Transcript] Richard Feynman on Why Questions
Grognor
10y
45
112
I wanted to interview Eliezer Yudkowsky but he's busy so I simulated him instead
lsusr
1y
33
73
AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant
DanielFilan
1y
2
68
AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah
Palus Astra
2y
27
46
AXRP Episode 10 - AI’s Future and Impacts with Katja Grace
DanielFilan
1y
2
45
AXRP Episode 7 - Side Effects with Victoria Krakovna
DanielFilan
1y
6
43
Muehlhauser-Wang Dialogue
lukeprog
10y
288
41
My hour-long interview with Yudkowsky on "Becoming a Rationalist"
lukeprog
11y
22
39
AXRP Episode 12 - AI Existential Risk with Paul Christiano
DanielFilan
1y
0
37
Conversation with Paul Christiano
abergal
3y
6
36
AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch
DanielFilan
1y
0
34
See Eliezer talk with PZ Myers and David Brin (and me) about immortality this Sunday
Eneasz
9y
5
34
Robin Hanson on the futurist focus on AI
abergal
3y
24
33
AXRP Episode 7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra
DanielFilan
1y
1