106 posts: Careers, Infra-Bayesianism, SERI MATS, Formal Proof, Domain Theory, Distributional Shifts
79 posts: Audio, Interviews, Organization Updates, Redwood Research, AXRP, Adversarial Examples, Adversarial Training, AI Robustness
Karma | Title | Author | Posted | Comments
5 | What about non-degree seeking? | Lao Mein | 3d | 5
6 | [ASoT] Reflectivity in Narrow AI | Ulisse Mini | 29d | 1
28 | Where to be an AI Safety Professor | scasper | 13d | 12
39 | Proper scoring rules don’t guarantee predicting fixed points | Johannes_Treutlein | 4d | 2
16 | Is the "Valley of Confused Abstractions" real? | jacquesthibs | 15d | 9
2 | Vanessa Kosoy's PreDCA, distilled | Martín Soto | 1mo | 17
93 | Infra-Bayesian physicalism: a formal theory of naturalized induction | Vanessa Kosoy | 1y | 20
96 | Taking the parameters which seem to matter and rotating them until they don't | Garrett Baker | 3mo | 48
4 | Guardian AI (Misaligned systems are all around us.) | Jessica Mary | 25d | 6
54 | Neural Tangent Kernel Distillation | Thomas Larsen | 2mo | 20
20 | Why I'm Working On Model Agnostic Interpretability | Jessica Mary | 1mo | 9
45 | Career Scouting: Dentistry | koratkar | 1mo | 5
1 | Working towards AI alignment is better | Johannes C. Mayer | 11d | 2
15 | How do you get a job as a software developer? | lsusr | 4mo | 24
96 | Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research] | LawrenceC | 17d | 9
17 | Latent Adversarial Training | Adam Jermyn | 5mo | 9
109 | Apply to the Redwood Research Mechanistic Interpretability Experiment (REMIX), a research program in Berkeley | maxnadeau | 1mo | 14
49 | How and why to turn everything into audio | KatWoods | 4mo | 18
36 | Which LessWrong content would you like recorded into audio/podcast form? | Ruby | 3mo | 11
67 | Some Lessons Learned from Studying Indirect Object Identification in GPT-2 small | KevinRoWang | 1mo | 5
32 | Podcast: Shoshannah Tekofsky on skilling up in AI safety, visiting Berkeley, and developing novel research ideas | Akash | 25d | 2
127 | Takeaways from our robust injury classifier project [Redwood Research] | dmz | 3mo | 9
106 | Announcing the LessWrong Curated Podcast | Ben Pace | 6mo | 17
88 | High-stakes alignment via adversarial training [Redwood Research report] | dmz | 7mo | 29
165 | Redwood Research’s current project | Buck | 1y | 29
26 | Me (Steve Byrnes) on the “Brain Inspired” podcast | Steven Byrnes | 1mo | 1
8 | AXRP Episode 18 - Concept Extrapolation with Stuart Armstrong | DanielFilan | 3mo | 1
44 | Listen to top LessWrong posts with The Nonlinear Library | KatWoods | 1y | 27