Tree of Tags

31 posts: SERI MATS, AI Alignment Fieldbuilding, Intellectual Progress (Society-Level), Distillation & Pedagogy, Practice & Philosophy of Science, Information Hazards, PIBBSS, Intellectual Progress via LessWrong, Economic Consequences of AGI, Privacy, Superintelligence, Automation

532 posts: Epistemology, Intellectual Progress (Individual-Level), Research Taste, Epistemic Review, Selection Effects, Social & Cultural Dynamics, Humility

135 Your posts should be on arXiv

JanBrauner

3mo

39

68 Principles of Privacy for Alignment Research

johnswentworth

4mo

30

35 Behaviour Manifolds and the Hessian of the Total Loss - Notes and Criticism

Spencer Becker-Kahn

3mo

4

161 Most People Start With The Same Few Bad Ideas

johnswentworth

3mo

30

119 Conjecture: Internal Infohazard Policy

Connor Leahy

4mo

6

92 Intuitions about solving hard problems

Richard_Ngo

7mo

23

15 Abram Demski's ELK thoughts and proposal - distillation

Rubi J. Hudson

5mo

4

15 A distillation of Evan Hubinger's training stories (for SERI MATS)

Daphne_W

5mo

1

136 The Fusion Power Generator Scenario

johnswentworth

2y

29

61 Needed: AI infohazard policy

Vanessa Kosoy

2y

17

56 Suggestions of posts on the AF to review

adamShimi

1y

20

29 Characterizing Real-World Agents as a Research Meta-Strategy

johnswentworth

3y

4

95 Productive Mistakes, Not Perfect Answers

adamShimi

8mo

11

30 Epistemic Artefacts of (conceptual) AI alignment research

Nora_Ammann

4mo

1

54 Methodological Therapy: An Agenda For Tackling Research Bottlenecks

adamShimi

2mo

6

17 What are concrete examples of potential "lock-in" in AI research?

Grue_Slinky

3y

6

23 Attempts at Forwarding Speed Priors

james.lucassen

2mo

2

21 Rob B's Shortform Feed

Rob Bensinger

3y

79

84 How to do theoretical research, a personal perspective

Mark Xu

4mo

4

11 Thoughts on Retrieving Knowledge from Neural Networks

Jaime Ruiz

3y

2

9 Vague Thoughts and Questions about Agent Structures

loriphos

3y

3

14 Very different, very adequate outcomes

Stuart_Armstrong

3y

10

11 Impact Measure Testing with Honey Pots and Myopia

michaelcohen

4y

5

13 Toy model piece #4: partial preferences, re-re-visited

Stuart_Armstrong

3y

5

17 Hackable Rewards as a Safety Valve?

Davidmanheim

3y

17

16 Computational complexity of RL with traps

Vanessa Kosoy

4y

2

19 Torture and Dust Specks and Joy--Oh my! or: Non-Archimedean Utility Functions as Pseudograded Vector Spaces

Louis_Brown

3y

29

6 Safety in Machine Learning

Gordon Seidoh Worley

4y

0