0 posts
AI Sentience
1854 posts
AI
84 | Towards Hodge-podge Alignment | Cleo Nardo | 1d | 20
41 | The "Minimal Latents" Approach to Natural Abstractions | johnswentworth | 22h | 6
198 | The next decades might be wild | Marius Hobbhahn | 5d | 21
265 | AI alignment is distinct from its near-term applications | paulfchristiano | 7d | 5
6 | I believe some AI doomers are overconfident | FTPickle | 6h | 4
13 | Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems. | Charlie Steiner | 19h | 0
19 | Why mechanistic interpretability does not and cannot contribute to long-term AGI safety (from messages with a friend) | Remmelt | 1d | 6
323 | A challenge for AGI organizations, and a challenge for readers | Rob Bensinger | 19d | 30
107 | Okay, I feel it now | g1 | 7d | 14
111 | Revisiting algorithmic progress | Tamay | 7d | 6
89 | Trying to disambiguate different questions about whether RLHF is “good” | Buck | 6d | 39
16 | Hacker-AI and Cyberwar 2.0+ | Erland Wittkotter | 1d | 0
190 | Using GPT-Eliezer against ChatGPT Jailbreaking | Stuart_Armstrong | 14d | 77
74 | Predicting GPU performance | Marius Hobbhahn | 6d | 24