31 posts: SERI MATS, AI Alignment Fieldbuilding, Intellectual Progress (Society-Level), Distillation & Pedagogy, Practice & Philosophy of Science, Information Hazards, PIBBSS, Intellectual Progress via LessWrong, Economic Consequences of AGI, Privacy, Superintelligence, Automation

532 posts: Epistemology, Intellectual Progress (Individual-Level), Research Taste, Epistemic Review, Selection Effects, Social & Cultural Dynamics, Humility
Score | Title | Author | Posted | Comments
144 | Your posts should be on arXiv | JanBrauner | 3mo | 39
54 | Principles of Privacy for Alignment Research | johnswentworth | 4mo | 30
40 | Behaviour Manifolds and the Hessian of the Total Loss - Notes and Criticism | Spencer Becker-Kahn | 3mo | 4
158 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
167 | Conjecture: Internal Infohazard Policy | Connor Leahy | 4mo | 6
97 | Intuitions about solving hard problems | Richard_Ngo | 7mo | 23
15 | Abram Demski's ELK thoughts and proposal - distillation | Rubi J. Hudson | 5mo | 4
13 | A distillation of Evan Hubinger's training stories (for SERI MATS) | Daphne_W | 5mo | 1
133 | The Fusion Power Generator Scenario | johnswentworth | 2y | 29
52 | Needed: AI infohazard policy | Vanessa Kosoy | 2y | 17
40 | Suggestions of posts on the AF to review | adamShimi | 1y | 20
32 | Characterizing Real-World Agents as a Research Meta-Strategy | johnswentworth | 3y | 4
100 | Productive Mistakes, Not Perfect Answers | adamShimi | 8mo | 11
35 | Epistemic Artefacts of (conceptual) AI alignment research | Nora_Ammann | 4mo | 1
55 | Methodological Therapy: An Agenda For Tackling Research Bottlenecks | adamShimi | 2mo | 6
25 | What are concrete examples of potential "lock-in" in AI research? | Grue_Slinky | 3y | 6
29 | Attempts at Forwarding Speed Priors | james.lucassen | 2mo | 2
13 | Rob B's Shortform Feed | Rob Bensinger | 3y | 79
73 | How to do theoretical research, a personal perspective | Mark Xu | 4mo | 4
9 | Thoughts on Retrieving Knowledge from Neural Networks | Jaime Ruiz | 3y | 2
10 | Vague Thoughts and Questions about Agent Structures | loriphos | 3y | 3
9 | Very different, very adequate outcomes | Stuart_Armstrong | 3y | 10
12 | Impact Measure Testing with Honey Pots and Myopia | michaelcohen | 4y | 5
11 | Toy model piece #4: partial preferences, re-re-visited | Stuart_Armstrong | 3y | 5
14 | Hackable Rewards as a Safety Valve? | Davidmanheim | 3y | 17
13 | Computational complexity of RL with traps | Vanessa Kosoy | 4y | 2
38 | Torture and Dust Specks and Joy--Oh my! or: Non-Archimedean Utility Functions as Pseudograded Vector Spaces | Louis_Brown | 3y | 29
5 | Safety in Machine Learning | Gordon Seidoh Worley | 4y | 0