Tree of Tags

28 posts GPT List of Links

14 posts AI-assisted Alignment Bounties & Prizes (active) AI Safety Public Materials

18 An exploration of GPT-2's embedding weights

Adam Scherlis

7d

2

45 By Default, GPTs Think In Plain Sight

Fabien Roger

1mo

16

12 [LINK] - ChatGPT discussion

JanBrauner

19d

7

191 New Scaling Laws for Large Language Models

1a3orn

8mo

21

23 Recall and Regurgitation in GPT2

Megan Kinniment

2mo

1

204 The case for aligning narrowly superhuman models

Ajeya Cotra

1y

74

164 MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models"

Rob Bensinger

1y

13

145 interpreting GPT: the logit lens

nostalgebraist

2y

32

125 Developmental Stages of GPTs

orthonormal

2y

74

105 Alignment As A Bottleneck To Usefulness Of GPT-3

johnswentworth

2y

57

24 GPT-3 and concept extrapolation

Stuart_Armstrong

8mo

28

28 More GPT-3 and symbol grounding

Stuart_Armstrong

10mo

7

96 Can you get AGI from a Transformer?

Steven Byrnes

2y

39

96 Collection of GPT-3 results

Kaj_Sotala

2y

24

90 [Link] Why I’m optimistic about OpenAI’s alignment approach

janleike

15d

13

12 Research request (alignment strategy): Deep dive on "making AI solve alignment for us"

JanBrauner

19d

3

36 Prizes for ML Safety Benchmark Ideas

joshc

1mo

3

4 Alignment with argument-networks and assessment-predictions

Tor Økland Barstad

7d

3

85 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

127 Godzilla Strategies

johnswentworth

6mo

65

68 NeurIPS ML Safety Workshop 2022

Dan H

4mo

2

52 $20K In Bounties for AI Safety Public Materials

Dan H

4mo

7

135 How much chess engine progress is about adapting to bigger computers?

paulfchristiano

1y

23

8 Distribution Shifts and The Importance of AI Safety

Leon Lang

2mo

2

22 [$20K in Prizes] AI Safety Arguments Competition

Dan H

7mo

543

5 AI-assisted list of ten concrete alignment things to do right now

lcmgcd

3mo

5

6 Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad

6mo

7

2 Making it harder for an AGI to "trick" us, with STVs

Tor Økland Barstad

5mo

5