1855 posts · Tags: AI, SERI MATS, AI Sentience, Distributional Shifts, AI Robustness, Truthful AI, Adversarial Examples
185 posts · Tags: Careers, Audio, Interviews, Infra-Bayesianism, Organization Updates, AXRP, Formal Proof, Redwood Research, Domain Theory, Adversarial Training
Score | Title | Author | Posted | Comments
531 | (My understanding of) What Everyone in Technical Alignment is Doing and Why | Thomas Larsen | 3mo | 83
436 | How To Get Into Independent Research On Alignment/Agency | johnswentworth | 1y | 33
404 | Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover | Ajeya Cotra | 5mo | 89
373 | We Choose To Align AI | johnswentworth | 11mo | 15
331 | What should you change in response to an "emergency"? And AI risk | AnnaSalamon | 5mo | 60
323 | A challenge for AGI organizations, and a challenge for readers | Rob Bensinger | 19d | 30
287 | Don't die with dignity; instead play to your outs | Jeffrey Ladish | 8mo | 58
282 | AGI Safety FAQ / all-dumb-questions-allowed thread | Aryeh Englander | 6mo | 514
281 | Ngo and Yudkowsky on alignment difficulty | Eliezer Yudkowsky | 1y | 143
281 | An overview of 11 proposals for building safe advanced AI | evhub | 2y | 36
279 | The Plan | johnswentworth | 1y | 77
276 | DeepMind: Generally capable agents emerge from open-ended play | Daniel Kokotajlo | 1y | 53
271 | larger language models may disappoint you [or, an eternally unfinished draft] | nostalgebraist | 1y | 29
265 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5
184 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29
165 | Curated conversations with brilliant rationalists | spencerg | 1y | 18
164 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
159 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14
157 | Understanding Infra-Bayesianism: A Beginner-Friendly Video Series | Jack Parker | 2mo | 6
156 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17
151 | An Update on Academia vs. Industry (one year into my faculty job) | David Scott Krueger (formerly: capybaralet) | 3mo | 18
143 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9
138 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48
134 | Externalized reasoning oversight: a research direction for language model alignment | tamera | 4mo | 22
124 | Job Offering: Help Communicate Infrabayesianism | abramdemski | 9mo | 21
121 | Redwood Research’s current project | Buck | 1y | 29
114 | Evaluations project @ ARC is hiring a researcher and a webdev/engineer | Beth Barnes | 3mo | 7
108 | I wanted to interview Eliezer Yudkowsky but he's busy so I simulated him instead | lsusr | 1y | 33