Tree of Tags

28 posts: GPT, List of Links

14 posts: AI-assisted Alignment, Bounties & Prizes (active), AI Safety Public Materials

255 New Scaling Laws for Large Language Models

1a3orn

8mo

21

171 interpreting GPT: the logit lens

nostalgebraist

2y

32

170 The case for aligning narrowly superhuman models

Ajeya Cotra

1y

74

155 Developmental Stages of GPTs

orthonormal

2y

74

132 Can you get AGI from a Transformer?

Steven Byrnes

2y

39

117 Alignment As A Bottleneck To Usefulness Of GPT-3

johnswentworth

2y

57

108 MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models"

Rob Bensinger

1y

13

82 Collection of GPT-3 results

Kaj_Sotala

2y

24

75 By Default, GPTs Think In Plain Sight

Fabien Roger

1mo

16

73 To what extent is GPT-3 capable of reasoning?

TurnTrout

2y

74

62 [ASoT] Finetuning, RL, and GPT's world prior

Jozdien

18d

8

59 OpenAI announces GPT-3

gwern

2y

23

58 Will OpenAI's work unintentionally increase existential risks related to AI?

adamShimi

2y

56

57 How "honest" is GPT-3?

abramdemski

2y

18

175 Godzilla Strategies

johnswentworth

6mo

65

126 [$20K in Prizes] AI Safety Arguments Competition

Dan H

7mo

543

99 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

96 [Link] Why I’m optimistic about OpenAI’s alignment approach

janleike

15d

13

93 How much chess engine progress is about adapting to bigger computers?

paulfchristiano

1y

23

84 $20K In Bounties for AI Safety Public Materials

Dan H

4mo

7

76 NeurIPS ML Safety Workshop 2022

Dan H

4mo

2

36 Prizes for ML Safety Benchmark Ideas

joshc

1mo

3

26 Distribution Shifts and The Importance of AI Safety

Leon Lang

2mo

2

26 Making it harder for an AGI to "trick" us, with STVs

Tor Økland Barstad

5mo

5

20 Research request (alignment strategy): Deep dive on "making AI solve alignment for us"

JanBrauner

19d

3

12 Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad

6mo

7

11 AI-assisted list of ten concrete alignment things to do right now

lcmgcd

3mo

5

10 Alignment with argument-networks and assessment-predictions

Tor Økland Barstad

7d

3