Tree of Tags

69 posts: Debate (AI safety technique), Factored Cognition, Experiments, Ought, AI-assisted Alignment, Memory and Mnemonics, Air Conditioning

43 posts: Iterated Amplification, Humans Consulting HCH

184 Godzilla Strategies

johnswentworth

6mo

65

120 Supervise Process, not Outcomes

stuhlmueller

8mo

8

112 Preregistration: Air Conditioner Test

johnswentworth

8mo

64

105 Beliefs and Disagreements about Automating Alignment Research

Ian McKenzie

3mo

4

98 Ought: why it matters and ways to help

paulfchristiano

3y

7

97 Writeup: Progress on AI Safety via Debate

Beth Barnes

2y

18

89 Solving Math Problems by Relay

bgold

2y

26

82 Imitative Generalisation (AKA 'Learning the Prior')

Beth Barnes

1y

14

80 Air Conditioner Test Results & Discussion

johnswentworth

6mo

38

74 A guide to Iterated Amplification & Debate

Rafael Harth

2y

10

69 Why I'm excited about Debate

Richard_Ngo

1y

12

66 A Library and Tutorial for Factored Cognition with Language Models

stuhlmueller

2mo

0

62 Rant on Problem Factorization for Alignment

johnswentworth

4mo

48

52 Ought will host a factored cognition “Lab Meeting”

jungofthewon

3mo

1

139 Paul's research agenda FAQ

zhukeepa

4y

73

121 My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda

Chi Nguyen

2y

21

118 Debate update: Obfuscated arguments problem

Beth Barnes

1y

21

63 Model splintering: moving from one imperfect model to another

Stuart_Armstrong

2y

10

61 Relaxed adversarial training for inner alignment

evhub

3y

28

46 Iterated Distillation and Amplification

Ajeya Cotra

4y

13

46 Garrabrant and Shah on human modeling in AGI

Rob Bensinger

1y

10

45 Preface to the sequence on iterated amplification

paulfchristiano

4y

8

44 Notes on OpenAI’s alignment plan

Alex Flint

12d

5

38 Directions and desiderata for AI alignment

paulfchristiano

3y

1

37 HCH is not just Mechanical Turk

William_S

3y

6

37 Can HCH epistemically dominate Ramanujan?

zhukeepa

3y

4

36 HCH Speculation Post #2A

Charlie Steiner

1y

7

34 Relating HCH and Logical Induction

abramdemski

2y

4