Tree of Tags

934 posts: AI, Value Learning, Embedded Agency, Community, Eliciting Latent Knowledge (ELK), Reinforcement Learning, Infra-Bayesianism, Counterfactuals, Logic & Mathematics, Interviews, AI Capabilities, Inverse Reinforcement Learning

80 posts: AI Timelines, AI Takeoff, AI Persuasion, History, Forecasting & Prediction, Dialogue (format), Technological Forecasting, Forecasts (Specific Predictions), Industrial Revolution, Effective Altruism, Transformative AI, Progress Studies

79 Towards Hodge-podge Alignment

Cleo Nardo

1d

20

39 The "Minimal Latents" Approach to Natural Abstractions

johnswentworth

22h

6

251 AI alignment is distinct from its near-term applications

paulfchristiano

7d

5

7 Note on algorithms with multiple trained components

Steven Byrnes

7h

1

12 Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems.

Charlie Steiner

19h

0

53 Can we efficiently explain model behaviors?

paulfchristiano

4d

0

85 Trying to disambiguate different questions about whether RLHF is “good”

Buck

6d

39

182 Using GPT-Eliezer against ChatGPT Jailbreaking

Stuart_Armstrong

14d

77

13 Event [Berkeley]: Alignment Collaborator Speed-Meeting

AlexMennen

1d

2

154 Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

LawrenceC

17d

9

49 Existential AI Safety is NOT separate from near-term applications

scasper

7d

15

29 High-level hopes for AI alignment

HoldenKarnofsky

5d

3

13 Looking for an alignment tutor

JanBrauner

3d

2

129 Mechanistic anomaly detection and ELK

paulfchristiano

25d

17

182 What does it take to defend the world against out-of-control AGIs?

Steven Byrnes

1mo

31

84 Disagreement with bio anchors that lead to shorter timelines

Marius Hobbhahn

1mo

16

104 Applying superintelligence without collusion

Eric Drexler

1mo

56

315 Two-year update on my personal AI timelines

Ajeya Cotra

4mo

60

13 How promising are legal avenues to restrict AI training data?

thehalliard

10d

2

110 How might we align transformative AI if it’s developed very soon?

HoldenKarnofsky

3mo

17

29 The economy as an analogy for advanced AI systems

rosehadshar

1mo

0

62 my current outlook on AI risk mitigation

carado

2mo

4

206 Deepmind's Gato: Generalist Agent

Daniel Kokotajlo

7mo

61

62 A review of the Bio-Anchors report

jylin04

2mo

4

486 What 2026 looks like

Daniel Kokotajlo

1y

98

84 AI strategy nearcasting

HoldenKarnofsky

3mo

3

110 Announcing Epoch: A research organization investigating the road to Transformative AI

Jsevillamol

5mo

2

118 Everything I Need To Know About Takeoff Speeds I Learned From Air Conditioner Ratings On Amazon

johnswentworth

8mo

130