27 posts
Redwood Research
Organization Updates
Adversarial Examples
Adversarial Training
AI Robustness
35 posts
Interviews
AXRP
143
Redwood Research’s current project
Buck
1y
29
136
High-stakes alignment via adversarial training [Redwood Research report]
dmz
7mo
29
135
Takeaways from our robust injury classifier project [Redwood Research]
dmz
3mo
9
134
Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley
maxnadeau
1mo
14
130
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
LawrenceC
17d
9
112
Why I'm excited about Redwood Research's current project
paulfchristiano
1y
6
86
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small
KevinRoWang
1mo
5
57
What I've been doing instead of writing
benkuhn
1y
3
56
We're Redwood Research, we do applied alignment research, AMA
Nate Thomas
1y
3
48
Redwood's Technique-Focused Epistemic Strategy
adamShimi
1y
1
48
Get genotyped for free (If your IQ is high enough)
David Althaus
11y
63
47
Two clarifications about "Strategic Background"
Rob Bensinger
4y
6
41
What's up with Arbital?
Alexei
5y
91
38
Genomic Prediction is now offering embryo selection
gwern
4y
1
116
[Transcript] Richard Feynman on Why Questions
Grognor
10y
45
110
I wanted to interview Eliezer Yudkowsky but he's busy so I simulated him instead
lsusr
1y
33
58
AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah
Palus Astra
2y
27
56
AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant
DanielFilan
1y
2
44
Conversation with Paul Christiano
abergal
3y
6
36
AXRP Episode 12 - AI Existential Risk with Paul Christiano
DanielFilan
1y
0
34
AXRP Episode 7 - Side Effects with Victoria Krakovna
DanielFilan
1y
6
34
Muehlhauser-Wang Dialogue
lukeprog
10y
288
34
AXRP Episode 10 - AI’s Future and Impacts with Katja Grace
DanielFilan
1y
2
33
My hour-long interview with Yudkowsky on "Becoming a Rationalist"
lukeprog
11y
22
32
AXRP Episode 15 - Natural Abstractions with John Wentworth
DanielFilan
7mo
1
31
Robin Hanson on the futurist focus on AI
abergal
3y
24
26
See Eliezer talk with PZ Myers and David Brin (and me) about immortality this Sunday
Eneasz
9y
5
26
AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch
DanielFilan
1y
0