Reinforcement Learning, Wireheading, Reward Functions (30 posts)

Karma | Title | Author | Age | Comments
233 | Reward is not the optimization target | TurnTrout | 4mo | 97
42 | Four usages of "loss" in AI | TurnTrout | 2mo | 18
80 | Seriously, what goes wrong with "reward the agent when it makes you smile"? | TurnTrout | 4mo | 41
83 | Scaling Laws for Reward Model Overoptimization | leogao | 2mo | 11
77 | Towards deconfusing wireheading and reward maximization | leogao | 3mo | 7
38 | Conditioning, Prompts, and Fine-Tuning | Adam Jermyn | 4mo | 9
29 | The reward engineering problem | paulfchristiano | 3y | 3
6 | Some work on connecting UDT and Reinforcement Learning | IAFF-User-111 | 7y | 0
6 | Modeling the capabilities of advanced AI systems as episodic reinforcement learning | jessicata | 6y | 0
2 | Vector-Valued Reinforcement Learning | orthonormal | 6y | 0
0 | Reward/value learning for reinforcement learning | Stuart_Armstrong | 5y | 0
1 | Delegative Reinforcement Learning with a Merely Sane Advisor | Vanessa Kosoy | 5y | 2
34 | Wireheading as a potential problem with the new impact measure | Stuart_Armstrong | 4y | 20
35 | Wireheading is in the eye of the beholder | Stuart_Armstrong | 3y | 10
AI Capabilities, EfficientZero, Tradeoffs (11 posts)

Karma | Title | Author | Age | Comments
81 | Evaluations project @ ARC is hiring a researcher and a webdev/engineer | Beth Barnes | 3mo | 7
38 | It matters when the first sharp left turn happens | Adam Jermyn | 2mo | 9
44 | Will we run out of ML data? Evidence from projecting dataset size trends | Pablo Villalobos | 1mo | 12
98 | The alignment problem in different capability regimes | Buck | 1y | 12
108 | We have achieved Noob Gains in AI | phdead | 7mo | 21
25 | Remaking EfficientZero (as best I can) | Hoagy | 5mo | 9
8 | Epistemic Strategies of Safety-Capabilities Tradeoffs | adamShimi | 1y | 0
144 | EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised | gwern | 1y | 52
212 | EfficientZero: How It Works | 1a3orn | 1y | 42
77 | OpenAI Solves (Some) Formal Math Olympiad Problems | Michaël Trazzi | 10mo | 26
70 | Misc. questions about EfficientZero | Daniel Kokotajlo | 1y | 17