73 posts
Reinforcement Learning
Inverse Reinforcement Learning
Wireheading
Reward Functions
Road To AI Safety Excellence
28 posts
AI Capabilities
Definitions
Stag Hunt
Goals
Prompt Engineering
PaLM
EfficientZero
13
Note on algorithms with multiple trained components
Steven Byrnes
6h
1
218
Reward is not the optimization target
TurnTrout
4mo
97
35
A Short Dialogue on the Meaning of Reward Functions
Leon Lang
1mo
0
16
generalized wireheading
carado
1mo
7
77
Seriously, what goes wrong with "reward the agent when it makes you smile"?
TurnTrout
4mo
41
5
AGIs may value intrinsic rewards more than extrinsic ones
catubc
1mo
6
84
Jitters No Evidence of Stupidity in RL
1a3orn
1y
18
24
Is CIRL a promising agenda?
Chris_Leong
6mo
12
11
An investigation into when agents may be incentivized to manipulate our beliefs.
Felix Hofstätter
3mo
0
10
A Survey of Foundational Methods in Inverse Reinforcement Learning
adamk
3mo
0
63
My take on Michael Littman on "The HCI of HAI"
Alex Flint
1y
4
92
Book Review: Human Compatible
Scott Alexander
2y
6
27
$100/$50 rewards for good references
Stuart_Armstrong
1y
5
51
Draft papers for REALab and Decoupled Approval on tampering
Jonathan Uesato
2y
2
71
When AI solves a game, focus on the game's mechanics, not its theme.
Cleo Nardo
27d
7
39
Will we run out of ML data? Evidence from projecting dataset size trends
Pablo Villalobos
1mo
12
281
Is AI Progress Impossible To Predict?
alyssavance
7mo
38
10
Can GPT-3 Write Contra Dances?
jefftk
16d
0
194
EfficientZero: How It Works
1a3orn
1y
42
5
Mastering Stratego (Deepmind)
svemirski
18d
0
139
EfficientZero: human ALE sample-efficiency w/MuZero+self-supervised
gwern
1y
52
35
The Problem With The Current State of AGI Definitions
Yitz
6mo
22
69
Misc. questions about EfficientZero
Daniel Kokotajlo
1y
17
51
Competitive programming with AlphaCode
Algon
10mo
37
23
Remaking EfficientZero (as best I can)
Hoagy
5mo
9
11
What's the Most Impressive Thing That GPT-4 Could Plausibly Do?
bayesed
3mo
24
20
What Belongs in my Glossary?
Zvi
2y
8
44
Compact vs. Wide Models
Vaniver
4y
5