Tree of Tags

46 posts Factored Cognition Experiments Ought AI-assisted Alignment Memory and Mnemonics Air Conditioning

23 posts Debate (AI safety technique)

118 Godzilla Strategies

johnswentworth

6mo

65

3 Alignment with argument-networks and assessment-predictions

Tor Økland Barstad

7d

3

11 Research request (alignment strategy): Deep dive on "making AI solve alignment for us"

JanBrauner

19d

3

84 Rant on Problem Factorization for Alignment

johnswentworth

4mo

48

3 Provably Honest - A First Step

Srijanak De

1mo

2

5 Getting from an unaligned AGI to an aligned AGI?

Tor Økland Barstad

6mo

7

80 Air Conditioner Test Results & Discussion

johnswentworth

6mo

38

4 AI-assisted list of ten concrete alignment things to do right now

lcmgcd

3mo

5

116 Supervise Process, not Outcomes

stuhlmueller

8mo

8

79 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

0 Making it harder for an AGI to "trick" us, with STVs

Tor Økland Barstad

5mo

5

1 A Deceptively Simple Argument in favor of Problem Factorization

Logan Zoellner

4mo

4

11 Discussion on utilizing AI for alignment

elifland

3mo

3

5 Sufficiently many Godzillas as an alignment strategy

142857

3mo

3

46 Take 9: No, RLHF/IDA/debate doesn't solve outer alignment.

Charlie Steiner

8d

14

34 AI Safety via Debate

ESRogs

4y

13

27 Briefly thinking through some analogs of debate

Eli Tyre

3mo

3

32 A Small Negative Result on Debate

Sam Bowman

8mo

11

91 Writeup: Progress on AI Safety via Debate

Beth Barnes

2y

18

62 A guide to Iterated Amplification & Debate

Rafael Harth

2y

10

26 AI Safety Debate and Its Applications

VojtaKovarik

3y

5

12 Debate AI and the Decision to Release an AI

Chris_Leong

3y

18

9 Splitting Debate up into Two Subsystems

Nandi

2y

5

30 Learning the smooth prior

Geoffrey Irving

7mo

0

102 Imitative Generalisation (AKA 'Learning the Prior')

Beth Barnes

1y

14

77 Why I'm excited about Debate

Richard_Ngo

1y

12

34 New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik

2y

7

16 Thoughts on "AI safety via debate"

Gordon Seidoh Worley

4y

4