Tree of Tags

46 posts: Factored Cognition, Experiments, Ought, AI-assisted Alignment, Memory and Mnemonics, Air Conditioning

23 posts: Debate (AI safety technique)

11 Alignment with argument-networks and assessment-predictions

Tor Økland Barstad

7d

3

21 Research request (alignment strategy): Deep dive on "making AI solve alignment for us"

JanBrauner

19d

3

184 Godzilla Strategies

johnswentworth

6mo

65

105 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

66 A Library and Tutorial for Factored Cognition with Language Models

stuhlmueller

2mo

0

52 Ought will host a factored cognition “Lab Meeting”

jungofthewon

3mo

1

62 Rant on Problem Factorization for Alignment

johnswentworth

4mo

48

17 Provably Honest - A First Step

Srijanak De

1mo

2

80 Air Conditioner Test Results & Discussion

johnswentworth

6mo

38

112 Preregistration: Air Conditioner Test

johnswentworth

8mo

64

120 Supervise Process, not Outcomes

stuhlmueller

8mo

8

21 Discussion on utilizing AI for alignment

elifland

3mo

3

28 Making it harder for an AGI to "trick" us, with STVs

Tor Økland Barstad

5mo

5

46 A bicycle for your memory

sortega

8mo

8

26 Take 9: No, RLHF/IDA/debate doesn't solve outer alignment.

Charlie Steiner

8d

14

52 A Small Negative Result on Debate

Sam Bowman

8mo

11

13 Briefly thinking through some analogs of debate

Eli Tyre

3mo

3

32 Learning the smooth prior

Geoffrey Irving

7mo

0

82 Imitative Generalisation (AKA 'Learning the Prior')

Beth Barnes

1y

14

24 An AI-in-a-box success model

azsantosk

8mo

1

69 Why I'm excited about Debate

Richard_Ngo

1y

12

74 A guide to Iterated Amplification & Debate

Rafael Harth

2y

10

97 Writeup: Progress on AI Safety via Debate

Beth Barnes

2y

18

44 Three mental images from thinking about AGI debate & corrigibility

Steven Byrnes

2y

35

41 Looking for adversarial collaborators to test our Debate protocol

Beth Barnes

2y

5

39 How should AI debate be judged?

abramdemski

2y

27

28 Debate Minus Factored Cognition

abramdemski

1y

42

46 AI Safety Debate and Its Applications

VojtaKovarik

3y

5