Tree of Tags

934 posts AI Value Learning Embedded Agency Community Eliciting Latent Knowledge (ELK) Reinforcement Learning Infra-Bayesianism Counterfactuals Logic & Mathematics Interviews AI Capabilities Inverse Reinforcement Learning

80 posts AI Timelines AI Takeoff AI Persuasion History Forecasting & Prediction Dialogue (format) Technological Forecasting Forecasts (Specific Predictions) Industrial Revolution Effective Altruism Transformative AI Progress Studies

13 Note on algorithms with multiple trained components

Steven Byrnes

7h

1

45 Towards Hodge-podge Alignment

Cleo Nardo

1d

20

30 Take 12: RLHF's use is evidence that orgs will jam RL at real-world problems.

Charlie Steiner

19h

0

35 The "Minimal Latents" Approach to Natural Abstractions

johnswentworth

22h

6

213 AI alignment is distinct from its near-term applications

paulfchristiano

7d

5

73 Can we efficiently explain model behaviors?

paulfchristiano

4d

0

99 Trying to disambiguate different questions about whether RLHF is “good”

Buck

6d

39

23 Event [Berkeley]: Alignment Collaborator Speed-Meeting

AlexMennen

1d

2

55 High-level hopes for AI alignment

HoldenKarnofsky

5d

3

136 Using GPT-Eliezer against ChatGPT Jailbreaking

Stuart_Armstrong

14d

77

17 Looking for an alignment tutor

JanBrauner

3d

2

106 Causal Scrubbing: a method for rigorously testing interpretability hypotheses [Redwood Research]

LawrenceC

17d

9

106 Finding gliders in the game of life

paulfchristiano

19d

7

37 Take 10: Fine-tuning with RLHF is aesthetically unsatisfying.

Charlie Steiner

7d

3

60 Disagreement with bio anchors that lead to shorter timelines

Marius Hobbhahn

1mo

16

100 What does it take to defend the world against out-of-control AGIs?

Steven Byrnes

1mo

31

70 Applying superintelligence without collusion

Eric Drexler

1mo

56

259 Two-year update on my personal AI timelines

Ajeya Cotra

4mo

60

104 How might we align transformative AI if it’s developed very soon?

HoldenKarnofsky

3mo

17

23 The economy as an analogy for advanced AI systems

rosehadshar

1mo

0

54 my current outlook on AI risk mitigation

carado

2mo

4

5 How promising are legal avenues to restrict AI training data?

thehalliard

10d

2

180 Everything I Need To Know About Takeoff Speeds I Learned From Air Conditioner Ratings On Amazon

johnswentworth

8mo

130

74 AI strategy nearcasting

HoldenKarnofsky

3mo

3

122 Deepmind's Gato: Generalist Agent

Daniel Kokotajlo

7mo

61

52 Replacement for PONR concept

Daniel Kokotajlo

3mo

6

124 Takeoff speeds have a huge effect on what it means to work on AI x-risk

Buck

8mo

25

51 AGI Timelines Are Mostly Not Strategically Relevant To Alignment

johnswentworth

3mo

35