Tree of Tags

28 posts: GPT, List of Links

14 posts: AI-assisted Alignment, Bounties & Prizes (active), AI Safety Public Materials

18 An exploration of GPT-2's embedding weights

Adam Scherlis

7d

2

12 [LINK] - ChatGPT discussion

JanBrauner

19d

7

125 Developmental Stages of GPTs

orthonormal

2y

74

0 New(ish) AI control ideas

Stuart_Armstrong

5y

0

96 Collection of GPT-3 results

Kaj_Sotala

2y

24

145 interpreting GPT: the logit lens

nostalgebraist

2y

32

164 MIRI comments on Cotra's "Case for Aligning Narrowly Superhuman Models"

Rob Bensinger

1y

13

105 Alignment As A Bottleneck To Usefulness Of GPT-3

johnswentworth

2y

57

42 Will OpenAI's work unintentionally increase existential risks related to AI?

adamShimi

2y

56

50 [AN #102]: Meta learning by GPT-3, and a list of full proposals for AI alignment

Rohin Shah

2y

6

44 AI Alignment Writing Day Roundup #2

Ben Pace

3y

2

63 $1000 bounty for OpenAI to show whether GPT3 was "deliberately" pretending to be stupider than it is

jacobjacob

2y

40

42 To what extent are the scaling properties of Transformer networks exceptional?

abramdemski

2y

1

67 To what extent is GPT-3 capable of reasoning?

TurnTrout

2y

74

90 [Link] Why I’m optimistic about OpenAI’s alignment approach

janleike

15d

13

12 Research request (alignment strategy): Deep dive on "making AI solve alignment for us"

JanBrauner

19d

3

8 Distribution Shifts and The Importance of AI Safety

Leon Lang

2mo

2

5 AI-assisted list of ten concrete alignment things to do right now

lcmgcd

3mo

5

68 NeurIPS ML Safety Workshop 2022

Dan H

4mo

2

22 [$20K in Prizes] AI Safety Arguments Competition

Dan H

7mo

543

127 Godzilla Strategies

johnswentworth

6mo

65

135 How much chess engine progress is about adapting to bigger computers?

paulfchristiano

1y

23

85 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

6 Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad

6mo

7

52 $20K In Bounties for AI Safety Public Materials

Dan H

4mo

7

4 Alignment with argument-networks and assessment-predictions

Tor Økland Barstad

7d

3

36 Prizes for ML Safety Benchmark Ideas

joshc

1mo

3

2 Making it harder for an AGI to "trick" us, with STVs

Tor Økland Barstad

5mo

5