Tree of Tags

1855 posts: AI, SERI MATS, AI Sentience, Distributional Shifts, AI Robustness, Truthful AI, Adversarial Examples

185 posts: Careers, Audio, Interviews, Infra-Bayesianism, Organization Updates, AXRP, Formal Proof, Redwood Research, Domain Theory, Adversarial Training

344 (My understanding of) What Everyone in Technical Alignment is Doing and Why

Thomas Larsen

3mo

83

314 How To Get Into Independent Research On Alignment/Agency

johnswentworth

1y

33

310 Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover

Ajeya Cotra

5mo

89

303 What should you change in response to an "emergency"? And AI risk

AnnaSalamon

5mo

60

265 A challenge for AGI organizations, and a challenge for readers

Rob Bensinger

19d

30

259 We Choose To Align AI

johnswentworth

11mo

15

247 DeepMind: Generally capable agents emerge from open-ended play

Daniel Kokotajlo

1y

53

245 Visible Thoughts Project and Bounty Announcement

So8res

1y

104

243 Don't die with dignity; instead play to your outs

Jeffrey Ladish

8mo

58

237 larger language models may disappoint you [or, an eternally unfinished draft]

nostalgebraist

1y

29

235 The Plan

johnswentworth

1y

77

235 Ngo and Yudkowsky on alignment difficulty

Eliezer Yudkowsky

1y

143

235 Contra Hofstadter on GPT-3 Nonsense

rictic

6mo

22

232 AI alignment is distinct from its near-term applications

paulfchristiano

7d

5

153 Curated conversations with brilliant rationalists

spencerg

1y

18

143 Redwood Research’s current project

Buck

1y

29

136 High-stakes alignment via adversarial training [Redwood Research report]

dmz

7mo

29

135 Job Offering: Help Communicate Infrabayesianism

abramdemski

9mo

21

135 Takeaways from our robust injury classifier project [Redwood Research]

dmz

3mo

9

134 Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley

maxnadeau

1mo

14

131 Announcing the LessWrong Curated Podcast

Ben Pace

6mo

17

130 Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

LawrenceC

17d

9

120 Proofs, Implications, and Models

Eliezer Yudkowsky

10y

218

118 An Update on Academia vs. Industry (one year into my faculty job)

David Scott Krueger (formerly: capybaralet)

3mo

18

117 Taking the parameters which seem to matter and rotating them until they don't

Garrett Baker

3mo

48

116 [Transcript] Richard Feynman on Why Questions

Grognor

10y

45

114 Understanding Infra-Bayesianism: A Beginner-Friendly Video Series

Jack Parker

2mo

6

112 Why I'm excited about Redwood Research's current project

paulfchristiano

1y

6