41 posts
Reinforcement Learning
AI Capabilities
Wireheading
Reward Functions
EfficientZero
Tradeoffs
33 posts
Embedded Agency
Subagents
Robust Agents
Category Theory
Spurious Counterfactuals
Memetics
Autonomous Vehicles
13
Note on algorithms with multiple trained components
Steven Byrnes
7h
1
233
Reward is not the optimization target
TurnTrout
4mo
97
83
Scaling Laws for Reward Model Overoptimization
leogao
2mo
11
37
A Short Dialogue on the Meaning of Reward Functions
Leon Lang
1mo
0
44
Will we run out of ML data? Evidence from projecting dataset size trends
Pablo Villalobos
1mo
12
77
Towards deconfusing wireheading and reward maximization
leogao
3mo
7
81
Evaluations project @ ARC is hiring a researcher and a webdev/engineer
Beth Barnes
3mo
7
80
Seriously, what goes wrong with "reward the agent when it makes you smile"?
TurnTrout
4mo
41
42
Four usages of "loss" in AI
TurnTrout
2mo
18
38
It matters when the first sharp left turn happens
Adam Jermyn
2mo
9
108
We have achieved Noob Gains in AI
phdead
7mo
21
212
EfficientZero: How It Works
1a3orn
1y
42
38
Conditioning, Prompts, and Fine-Tuning
Adam Jermyn
4mo
9
144
EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised
gwern
1y
52
259
Humans are very reliable agents
alyssavance
6mo
35
37
Gradations of Agency
Daniel Kokotajlo
7mo
6
103
Reward Is Not Enough
Steven Byrnes
1y
18
155
Introduction to Cartesian Frames
Scott Garrabrant
2y
29
20
You Only Get One Shot: an Intuition Pump for Embedded Agency
Oliver Sourbut
6mo
4
93
Updates and additions to "Embedded Agency"
Rob Bensinger
2y
1
134
Why Subagents?
johnswentworth
3y
42
70
Additive Operations on Cartesian Frames
Scott Garrabrant
2y
6
93
Humans Are Embedded Agents Too
johnswentworth
2y
19
62
Functors and Coarse Worlds
Scott Garrabrant
2y
4
59
Time in Cartesian Frames
Scott Garrabrant
2y
16
109
Robust Delegation
abramdemski
4y
10
49
Subagents of Cartesian Frames
Scott Garrabrant
2y
5
49
Biextensional Equivalence
Scott Garrabrant
2y
13