17 posts
Complexity of Value
Value Drift
Whole Brain Emulation
Motivations
LessWrong Review
Psychology
Futurism
Superstimuli
5 posts
Ontology
General Alignment Properties
52 | Alignment allows "nonrobust" decision-influences and doesn't require robust grading | TurnTrout | 21d | 27
33 | Understanding and avoiding value drift | TurnTrout | 3mo | 9
120 | Shard Theory: An Overview | David Udell | 4mo | 34
50 | The two-layer model of human values, and problems with synthesizing preferences | Kaj_Sotala | 2y | 16
2 | Chatbots or set answers, not WBEs | Stuart_Armstrong | 7y | 0
15 | Would I think for ten thousand years? | Stuart_Armstrong | 3y | 13
71 | Two Neglected Problems in Human-AI Safety | Wei_Dai | 4y | 24
9 | Towards deconfusing values | Gordon Seidoh Worley | 2y | 4
34 | Broad Picture of Human Values | Thane Ruthenis | 4mo | 5
47 | Acknowledging Human Preference Types to Support Value Learning | Nandi Sabrina Erin | 4y | 4
12 | Working towards AI alignment is better | Johannes C. Mayer | 11d | 2
37 | Review of 'But exactly how complex and fragile?' | TurnTrout | 1y | 0
28 | Can there be an indescribable hellworld? | Stuart_Armstrong | 3y | 19
58 | Three AI Safety Related Ideas | Wei_Dai | 4y | 38
191 | Humans provide an untapped wealth of evidence about alignment | TurnTrout | 5mo | 92
31 | How are you dealing with ontology identification? | Erik Jenner | 2mo | 10
41 | General alignment properties | TurnTrout | 4mo | 2
7 | A sketch of a value-learning sovereign | jessicata | 7y | 0
45 | Test Cases for Impact Regularisation Methods | DanielFilan | 3y | 5