Tree of Tags

26 posts Iterated Amplification

17 posts Humans Consulting HCH

47 Notes on OpenAI’s alignment plan

Alex Flint

12d

5

125 Debate update: Obfuscated arguments problem

Beth Barnes

1y

21

18 Surprised by ELK report's counterexample to Debate, IDA

Evan R. Murphy

4mo

0

119 My Understanding of Paul Christiano's Iterated Amplification AI Safety Research Agenda

Chi Nguyen

2y

21

74 Model splintering: moving from one imperfect model to another

Stuart_Armstrong

2y

10

125 Paul's research agenda FAQ

zhukeepa

4y

73

61 Relaxed adversarial training for inner alignment

evhub

3y

28

33 Synthesizing amplification and debate

evhub

2y

10

47 Directions and desiderata for AI alignment

paulfchristiano

3y

1

45 Iterated Distillation and Amplification

Ajeya Cotra

4y

13

42 Preface to the sequence on iterated amplification

paulfchristiano

4y

8

37 My confusions with Paul's Agenda

Vaniver

4y

1

19 How does iterated amplification exceed human abilities?

riceissa

2y

9

29 Supervising strong learners by amplifying weak experts

paulfchristiano

3y

1

57 Garrabrant and Shah on human modeling in AGI

Rob Bensinger

1y

10

42 HCH Speculation Post #2A

Charlie Steiner

1y

7

15 HCH and Adversarial Questions

David Udell

10mo

7

47 Relating HCH and Logical Induction

abramdemski

2y

4

28 Universality Unwrapped

adamShimi

2y

2

10 Universality and the “Filter”

maggiehayes

1y

3

41 HCH is not just Mechanical Turk

William_S

3y

6

16 Epistemology of HCH

adamShimi

1y

2

30 What are the differences between all the iterative/recursive approaches to AI alignment?

riceissa

3y

14

35 What's wrong with these analogies for understanding Informed Oversight and IDA?

Wei_Dai

3y

3

15 Mapping the Conceptual Territory in AI Existential Safety and Alignment

jbkjr

1y

0

34 Can HCH epistemically dominate Ramanujan?

zhukeepa

3y

4

32 Humans Consulting HCH

paulfchristiano

4y

10

27 Towards formalizing universality

paulfchristiano

3y

19