1855 posts
AI
SERI MATS
AI Sentience
Distributional Shifts
AI Robustness
Truthful AI
Adversarial Examples
185 posts
Careers
Audio
Interviews
Infra-Bayesianism
Organization Updates
AXRP
Formal Proof
Redwood Research
Domain Theory
Adversarial Training
344
(My understanding of) What Everyone in Technical Alignment is Doing and Why
Thomas Larsen
3mo
83
314
How To Get Into Independent Research On Alignment/Agency
johnswentworth
1y
33
310
Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover
Ajeya Cotra
5mo
89
303
What should you change in response to an "emergency"? And AI risk
AnnaSalamon
5mo
60
265
A challenge for AGI organizations, and a challenge for readers
Rob Bensinger
19d
30
259
We Choose To Align AI
johnswentworth
11mo
15
247
DeepMind: Generally capable agents emerge from open-ended play
Daniel Kokotajlo
1y
53
245
Visible Thoughts Project and Bounty Announcement
So8res
1y
104
243
Don't die with dignity; instead play to your outs
Jeffrey Ladish
8mo
58
237
larger language models may disappoint you [or, an eternally unfinished draft]
nostalgebraist
1y
29
235
The Plan
johnswentworth
1y
77
235
Ngo and Yudkowsky on alignment difficulty
Eliezer Yudkowsky
1y
143
235
Contra Hofstadter on GPT-3 Nonsense
rictic
6mo
22
232
AI alignment is distinct from its near-term applications
paulfchristiano
7d
5
153
Curated conversations with brilliant rationalists
spencerg
1y
18
143
Redwood Research’s current project
Buck
1y
29
136
High-stakes alignment via adversarial training [Redwood Research report]
dmz
7mo
29
135
Job Offering: Help Communicate Infrabayesianism
abramdemski
9mo
21
135
Takeaways from our robust injury classifier project [Redwood Research]
dmz
3mo
9
134
Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley
maxnadeau
1mo
14
131
Announcing the LessWrong Curated Podcast
Ben Pace
6mo
17
130
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]
LawrenceC
17d
9
120
Proofs, Implications, and Models
Eliezer Yudkowsky
10y
218
118
An Update on Academia vs. Industry (one year into my faculty job)
David Scott Krueger (formerly: capybaralet)
3mo
18
117
Taking the parameters which seem to matter and rotating them until they don't
Garrett Baker
3mo
48
116
[Transcript] Richard Feynman on Why Questions
Grognor
10y
45
114
Understanding Infra-Bayesianism: A Beginner-Friendly Video Series
Jack Parker
2mo
6
112
Why I'm excited about Redwood Research's current project
paulfchristiano
1y
6