28 posts
GPT
List of Links
14 posts
AI-assisted Alignment
Bounties & Prizes (active)
AI Safety Public Materials
18
An exploration of GPT-2's embedding weights
Adam Scherlis
7d
2
45
By Default, GPTs Think In Plain Sight
Fabien Roger
1mo
16
12
[LINK] - ChatGPT discussion
JanBrauner
19d
7
191
New Scaling Laws for Large Language Models
1a3orn
8mo
21
23
Recall and Regurgitation in GPT2
Megan Kinniment
2mo
1
204
The case for aligning narrowly superhuman models
Ajeya Cotra
1y
74
164
MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models"
Rob Bensinger
1y
13
145
interpreting GPT: the logit lens
nostalgebraist
2y
32
125
Developmental Stages of GPTs
orthonormal
2y
74
105
Alignment As A Bottleneck To Usefulness Of GPT-3
johnswentworth
2y
57
24
GPT-3 and concept extrapolation
Stuart_Armstrong
8mo
28
28
More GPT-3 and symbol grounding
Stuart_Armstrong
10mo
7
96
Can you get AGI from a Transformer?
Steven Byrnes
2y
39
96
Collection of GPT-3 results
Kaj_Sotala
2y
24
90
[Link] Why I’m optimistic about OpenAI’s alignment approach
janleike
15d
13
12
Research request (alignment strategy): Deep dive on "making AI solve alignment for us"
JanBrauner
19d
3
36
Prizes for ML Safety Benchmark Ideas
joshc
1mo
3
4
Alignment with argument-networks and assessment-predictions
Tor Økland Barstad
7d
3
85
Beliefs and Disagreements about Automating Alignment Research
Ian McKenzie
3mo
4
127
Godzilla Strategies
johnswentworth
6mo
65
68
NeurIPS ML Safety Workshop 2022
Dan H
4mo
2
52
$20K In Bounties for AI Safety Public Materials
Dan H
4mo
7
135
How much chess engine progress is about adapting to bigger computers?
paulfchristiano
1y
23
8
Distribution Shifts and The Importance of AI Safety
Leon Lang
2mo
2
22
[$20K in Prizes] AI Safety Arguments Competition
Dan H
7mo
543
5
AI-assisted list of ten concrete alignment things to do right now
lcmgcd
3mo
5
6
Getting from an unaligned AGI to an aligned AGI?
Tor Økland Barstad
6mo
7
2
Making it harder for an AGI to "trick" us, with STVs
Tor Økland Barstad
5mo
5