Tree of Tags

27 posts: Redwood Research, Organization Updates, Adversarial Examples, Adversarial Training, AI Robustness

35 posts: Interviews, AXRP

165 Redwood Research’s current project

Buck

1y

29

127 Takeaways from our robust injury classifier project [Redwood Research]

dmz

3mo

9

126 Why I'm excited about Redwood Research's current project

paulfchristiano

1y

6

109 Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley

maxnadeau

1mo

14

96 Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

LawrenceC

17d

9

88 High-stakes alignment via adversarial training [Redwood Research report]

dmz

7mo

29

67 Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small

KevinRoWang

1mo

5

64 What I've been doing instead of writing

benkuhn

1y

3

62 We're Redwood Research, we do applied alignment research, AMA

Nate Thomas

1y

3

61 Get genotyped for free (If your IQ is high enough)

David Althaus

11y

63

55 What's up with Arbital?

Alexei

5y

91

52 Redwood's Technique-Focused Epistemic Strategy

adamShimi

1y

1

49 Two clarifications about "Strategic Background"

Rob Bensinger

4y

6

43 Help the Brain Preservation Foundation

aurellem

9y

20

135 [Transcript] Richard Feynman on Why Questions

Grognor

10y

45

112 I wanted to interview Eliezer Yudkowsky but he's busy so I simulated him instead

lsusr

1y

33

73 AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant

DanielFilan

1y

2

68 AI Alignment Podcast: An Overview of Technical AI Alignment in 2018 and 2019 with Buck Shlegeris and Rohin Shah

Palus Astra

2y

27

46 AXRP Episode 10 - AI’s Future and Impacts with Katja Grace

DanielFilan

1y

2

45 AXRP Episode 7 - Side Effects with Victoria Krakovna

DanielFilan

1y

6

43 Muehlhauser-Wang Dialogue

lukeprog

10y

288

41 My hour-long interview with Yudkowsky on "Becoming a Rationalist"

lukeprog

11y

22

39 AXRP Episode 12 - AI Existential Risk with Paul Christiano

DanielFilan

1y

0

37 Conversation with Paul Christiano

abergal

3y

6

36 AXRP Episode 3 - Negotiable Reinforcement Learning with Andrew Critch

DanielFilan

1y

0

34 See Eliezer talk with PZ Myers and David Brin (and me) about immortality this Sunday

Eneasz

9y

5

34 Robin Hanson on the futurist focus on AI

abergal

3y

24

33 AXRP Episode 7.5 - Forecasting Transformative AI from Biological Anchors with Ajeya Cotra

DanielFilan

1y

1