Tree of Tags

9 posts Organization Updates

14 posts Redwood Research Adversarial Training AI Robustness

50 What I've been doing instead of writing

benkuhn

1y

3

45 Two clarifications about "Strategic Background"

Rob Bensinger

4y

6

37 Genomic Prediction is now offering embryo selection

gwern

4y

1

35 Get genotyped for free ( If your IQ is high enough)

David Althaus

11y

63

27 What's up with Arbital?

Alexei

5y

91

25 Help the Brain Preservation Foundation

aurellem

9y

20

25 RAISE is looking for full-time content developers

4y

5

24 Giving What We Can needs your help!

RobertWiblin

7y

6

7 Symbiosis - An Intentional Community For Radical Self-Improvement

Matt Goldenberg

4y

0

184 High-stakes alignment via adversarial training [Redwood Research report]

dmz

7mo

29

164 Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

LawrenceC

17d

9

159 Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley

maxnadeau

1mo

14

143 Takeaways from our robust injury classifier project [Redwood Research]

dmz

3mo

9

121 Redwood Research’s current project

Buck

1y

29

105 Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small

KevinRoWang

1mo

5

98 Why I'm excited about Redwood Research's current project

paulfchristiano

1y

6

50 We're Redwood Research, we do applied alignment research, AMA

Nate Thomas

1y

3

44 Redwood's Technique-Focused Epistemic Strategy

adamShimi

1y

1

31 Latent Adversarial Training

Adam Jermyn

5mo

9

28 Adversarial training, importance sampling, and anti-adversarial training for AI whistleblowing

Buck

6mo

0

26 Causal scrubbing: results on a paren balance checker

LawrenceC

17d

0

18 Causal scrubbing: Appendix

LawrenceC

17d

0

15 AXRP Episode 17 - Training for Very High Reliability with Daniel Ziegler

DanielFilan

4mo

0