31 posts: SERI MATS, AI Alignment Fieldbuilding, Intellectual Progress (Society-Level), Distillation & Pedagogy, Practice & Philosophy of Science, Information Hazards, PIBBSS, Intellectual Progress via LessWrong, Economic Consequences of AGI, Privacy, Superintelligence, Automation

532 posts: Epistemology, Intellectual Progress (Individual-Level), Research Taste, Epistemic Review, Selection Effects, Social & Cultural Dynamics, Humility
Score | Title | Author | Posted | Comments
144 | Your posts should be on arXiv | JanBrauner | 3mo | 39
54 | Principles of Privacy for Alignment Research | johnswentworth | 4mo | 30
40 | Behaviour Manifolds and the Hessian of the Total Loss - Notes and Criticism | Spencer Becker-Kahn | 3mo | 4
158 | Most People Start With The Same Few Bad Ideas | johnswentworth | 3mo | 30
167 | Conjecture: Internal Infohazard Policy | Connor Leahy | 4mo | 6
97 | Intuitions about solving hard problems | Richard_Ngo | 7mo | 23
15 | Abram Demski's ELK thoughts and proposal - distillation | Rubi J. Hudson | 5mo | 4
13 | A distillation of Evan Hubinger's training stories (for SERI MATS) | Daphne_W | 5mo | 1
133 | The Fusion Power Generator Scenario | johnswentworth | 2y | 29
52 | Needed: AI infohazard policy | Vanessa Kosoy | 2y | 17
40 | Suggestions of posts on the AF to review | adamShimi | 1y | 20
32 | Characterizing Real-World Agents as a Research Meta-Strategy | johnswentworth | 3y | 4
100 | Productive Mistakes, Not Perfect Answers | adamShimi | 8mo | 11
35 | Epistemic Artefacts of (conceptual) AI alignment research | Nora_Ammann | 4mo | 1
55 | Methodological Therapy: An Agenda For Tackling Research Bottlenecks | adamShimi | 2mo | 6
25 | What are concrete examples of potential "lock-in" in AI research? | Grue_Slinky | 3y | 6
29 | Attempts at Forwarding Speed Priors | james.lucassen | 2mo | 2
13 | Rob B's Shortform Feed | Rob Bensinger | 3y | 79
73 | How to do theoretical research, a personal perspective | Mark Xu | 4mo | 4
9 | Thoughts on Retrieving Knowledge from Neural Networks | Jaime Ruiz | 3y | 2
10 | Vague Thoughts and Questions about Agent Structures | loriphos | 3y | 3
9 | Very different, very adequate outcomes | Stuart_Armstrong | 3y | 10
12 | Impact Measure Testing with Honey Pots and Myopia | michaelcohen | 4y | 5
11 | Toy model piece #4: partial preferences, re-re-visited | Stuart_Armstrong | 3y | 5
14 | Hackable Rewards as a Safety Valve? | Davidmanheim | 3y | 17
13 | Computational complexity of RL with traps | Vanessa Kosoy | 4y | 2
38 | Torture and Dust Specks and Joy--Oh my! or: Non-Archimedean Utility Functions as Pseudograded Vector Spaces | Louis_Brown | 3y | 29
5 | Safety in Machine Learning | Gordon Seidoh Worley | 4y | 0