74 posts
Tags: Infra-Bayesianism, Counterfactuals, Logic & Mathematics, Formal Proof, Domain Theory, Functional Decision Theory, Counterfactual Mugging, Newcomb's Problem, Futarchy, Ontological Crisis, Meta-Honesty, Intelligence Explosion
40 posts
Tags: Interviews, Audio, Redwood Research, AXRP, Transcripts, Adversarial Examples, Adversarial Training, AI Robustness, Autonomous Weapons
Karma | Title | Author | Posted | Comments
104 | Introduction To The Infra-Bayesianism Sequence | Diffractor | 2y | 64
98 | Infra-Bayesian physicalism: a formal theory of naturalized induction | Vanessa Kosoy | 1y | 20
87 | Counterfactual Mugging Poker Game | Scott Garrabrant | 4y | 2
84 | Zoom In: An Introduction to Circuits | evhub | 2y | 11
76 | Recent Progress in the Theory of Neural Networks | interstice | 3y | 9
50 | Complete Class: Consequentialist Foundations | abramdemski | 4y | 34
49 | Probability as Minimal Map | johnswentworth | 3y | 10
49 | Infra-Exercises, Part 1 | Diffractor | 3mo | 9
47 | MIRI/OP exchange about decision theory | Rob Bensinger | 1y | 7
46 | Self-confirming predictions can be arbitrarily bad | Stuart_Armstrong | 3y | 11
37 | The Promise and Peril of Finite Sets | davidad | 1y | 4
35 | Basic Inframeasure Theory | Diffractor | 2y | 16
33 | AXRP Episode 5 - Infra-Bayesianism with Vanessa Kosoy | DanielFilan | 1y | 12
33 | Hessian and Basin volume | Vivek Hebbar | 5mo | 9
Karma | Title | Author | Posted | Comments
143 | Redwood Research’s current project | Buck | 1y | 29
136 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29
135 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9
134 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14
130 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
112 | Why I'm excited about Redwood Research's current project | paulfchristiano | 1y | 6
86 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5
56 | AXRP Episode 9 - Finite Factored Sets with Scott Garrabrant | DanielFilan | 1y | 2
48 | Redwood's Technique-Focused Epistemic Strategy | adamShimi | 1y | 1
44 | Conversation with Paul Christiano | abergal | 3y | 6
43 | A conversation about Katja's counterarguments to AI risk | Matthew Barnett | 2mo | 9
41 | AXRP Episode 4 - Risks from Learned Optimization with Evan Hubinger | DanielFilan | 1y | 10
40 | Rohin Shah on reasons for AI optimism | abergal | 3y | 58
36 | AXRP Episode 12 - AI Existential Risk with Paul Christiano | DanielFilan | 1y | 0