Tree of Tags

46 posts: Factored Cognition, Experiments, Ought, AI-assisted Alignment, Memory and Mnemonics, Air Conditioning

23 posts Debate (AI safety technique)

151 Godzilla Strategies

johnswentworth

6mo

65

118 Supervise Process, not Outcomes

stuhlmueller

8mo

8

109 Preregistration: Air Conditioner Test

johnswentworth

8mo

64

98 Solving Math Problems by Relay

bgold

2y

26

92 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

87 Ought: why it matters and ways to help

paulfchristiano

3y

7

80 Air Conditioner Test Results & Discussion

johnswentworth

6mo

38

73 Rant on Problem Factorization for Alignment

johnswentworth

4mo

48

50 Experiment: a good researcher is hard to find

gwern

10y

21

48 Vaniver's View on Factored Cognition

Vaniver

3y

4

47 A Library and Tutorial for Factored Cognition with Language Models

stuhlmueller

2mo

0

45 Factored Cognition

stuhlmueller

4y

6

42 Scientific Wrestling: Beyond Passive Hypothesis-Testing

adamShimi

9mo

6

42 The Majority Is Always Wrong

Eliezer Yudkowsky

15y

54

94 Writeup: Progress on AI Safety via Debate

Beth Barnes

2y

18

92 Imitative Generalisation (AKA 'Learning the Prior')

Beth Barnes

1y

14

73 Why I'm excited about Debate

Richard_Ngo

1y

12

68 A guide to Iterated Amplification & Debate

Rafael Harth

2y

10

55 Three mental images from thinking about AGI debate & corrigibility

Steven Byrnes

2y

35

52 Looking for adversarial collaborators to test our Debate protocol

Beth Barnes

2y

5

49 How should AI debate be judged?

abramdemski

2y

27

42 A Small Negative Result on Debate

Sam Bowman

8mo

11

37 Debate Minus Factored Cognition

abramdemski

1y

42

36 AI Safety Debate and Its Applications

VojtaKovarik

3y

5

36 Take 9: No, RLHF/IDA/debate doesn't solve outer alignment.

Charlie Steiner

8d

14

35 Thoughts on AI Safety via Debate

Vaniver

4y

12

35 Can there be an indescribable hellworld?

Stuart_Armstrong

3y

19

32 New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik

2y

7