Tree of Tags

14 posts: Factored Cognition, Ought

14 posts: Debate (AI safety technique), Adversarial Collaboration

63 A Library and Tutorial for Factored Cognition with Language Models

stuhlmueller

2mo

0

49 Ought will host a factored cognition “Lab Meeting”

jungofthewon

3mo

1

61 Rant on Problem Factorization for Alignment

johnswentworth

4mo

48

114 Supervise Process, not Outcomes

stuhlmueller

8mo

8

93 Ought: why it matters and ways to help

paulfchristiano

3y

7

26 Idealized Factored Cognition

Rafael Harth

2y

6

25 Preface to the Sequence on Factored Cognition

Rafael Harth

2y

7

49 Factored Cognition

stuhlmueller

4y

6

29 Update on Ought's experiments on factored evaluation of arguments

Owain_Evans

2y

0

31 Vaniver's View on Factored Cognition

Vaniver

3y

4

17 Clarifying Factored Cognition

Rafael Harth

2y

2

12 Traversing a Cognition Space

Rafael Harth

2y

5

19 Alignment Newsletter #36

Rohin Shah

4y

0

10 [AN #86]: Improving debate and factored cognition through human experiments

Rohin Shah

2y

0

25 Take 9: No, RLHF/IDA/debate doesn't solve outer alignment.

Charlie Steiner

8d

14

49 A Small Negative Result on Debate

Sam Bowman

8mo

11

78 Imitative Generalisation (AKA 'Learning the Prior')

Beth Barnes

1y

14

66 Why I'm excited about Debate

Richard_Ngo

1y

12

70 A guide to Iterated Amplification & Debate

Rafael Harth

2y

10

91 Writeup: Progress on AI Safety via Debate

Beth Barnes

2y

18

39 Looking for adversarial collaborators to test our Debate protocol

Beth Barnes

2y

5

37 How should AI debate be judged?

abramdemski

2y

27

27 Debate Minus Factored Cognition

abramdemski

1y

42

44 AI Safety Debate and Its Applications

VojtaKovarik

3y

5

29 New paper: (When) is Truth-telling Favored in AI debate?

VojtaKovarik

2y

7

18 Problems with AI debate

Stuart_Armstrong

3y

3

19 AI Safety via Debate

ESRogs

4y

13

8 Thoughts on "AI safety via debate"

Gordon Seidoh Worley

4y

4