Tag group (106 posts): Careers, Infra-Bayesianism, SERI MATS, Formal Proof, Domain Theory, Distributional Shifts
Tag group (79 posts): Audio, Interviews, Organization Updates, Redwood Research, AXRP, Adversarial Examples, Adversarial Training, AI Robustness
What about non-degree seeking? · Lao Mein · 3d · 5 karma · 5 comments
[ASoT] Reflectivity in Narrow AI · Ulisse Mini · 29d · 6 karma · 1 comment
Where to be an AI Safety Professor · scasper · 13d · 30 karma · 12 comments
Proper scoring rules don’t guarantee predicting fixed points · Johannes_Treutlein · 4d · 55 karma · 2 comments
Is the "Valley of Confused Abstractions" real? · jacquesthibs · 15d · 15 karma · 9 comments
Vanessa Kosoy's PreDCA, distilled · Martín Soto · 1mo · 16 karma · 17 comments
Infra-Bayesian physicalism: a formal theory of naturalized induction · Vanessa Kosoy · 1y · 98 karma · 20 comments
Taking the parameters which seem to matter and rotating them until they don't · Garrett Baker · 3mo · 117 karma · 48 comments
Guardian AI (Misaligned systems are all around us.) · Jessica Mary · 25d · 15 karma · 6 comments
Neural Tangent Kernel Distillation · Thomas Larsen · 2mo · 68 karma · 20 comments
Why I'm Working On Model Agnostic Interpretability · Jessica Mary · 1mo · 28 karma · 9 comments
Career Scouting: Dentistry · koratkar · 1mo · 67 karma · 5 comments
Working towards AI alignment is better · Johannes C. Mayer · 11d · 7 karma · 2 comments
How do you get a job as a software developer? · lsusr · 4mo · 22 karma · 24 comments
Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] · LawrenceC · 17d · 130 karma · 9 comments
Latent Adversarial Training · Adam Jermyn · 5mo · 24 karma · 9 comments
Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley · maxnadeau · 1mo · 134 karma · 14 comments
How and why to turn everything into audio · KatWoods · 4mo · 46 karma · 18 comments
Which LessWrong content would you like recorded into audio/podcast form? · Ruby · 3mo · 29 karma · 11 comments
Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small · KevinRoWang · 1mo · 86 karma · 5 comments
Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas · Akash · 25d · 37 karma · 2 comments
Takeaways from our robust injury classifier project [Redwood Research] · dmz · 3mo · 135 karma · 9 comments
Announcing the LessWrong Curated Podcast · Ben Pace · 6mo · 131 karma · 17 comments
High-stakes alignment via adversarial training [Redwood Research report] · dmz · 7mo · 136 karma · 29 comments
Redwood Research’s current project · Buck · 1y · 143 karma · 29 comments
Me (Steve Byrnes) on the “Brain Inspired” podcast · Steven Byrnes · 1mo · 26 karma · 1 comment
AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong · DanielFilan · 3mo · 10 karma · 1 comment
Listen to top LessWrong posts with The Nonlinear Library · KatWoods · 1y · 74 karma · 27 comments