“Value fragility and AI takeover” by Joe_Carlsmith
EA Forum Podcast (All audio) - A podcast by EA Forum Team
1. Introduction

“Value fragility,” as I’ll construe it, is the claim that slightly-different value systems tend to lead in importantly-different directions when subject to extreme optimization. I think the idea of value fragility haunts the AI risk discourse in various ways – and in particular, that it informs a backdrop prior that adequately aligning a superintelligence requires an extremely precise and sophisticated kind of technical and ethical achievement. That is, the thought goes: if you get a superintelligence’s values even slightly wrong, you’re screwed.

This post is a collection of loose and not-super-organized reflections on value fragility and its role in arguments for pessimism about AI risk. I start by trying to tease apart a number of different claims in the vicinity of value fragility. In particular: I distinguish between questions about value fragility and questions about how different agents would converge on the same values given adequate [...]

---
Outline:
(00:04) 1. Introduction
(03:46) 2. Variants of value fragility
(03:57) 2.1 Some initial definitions
(09:02) 2.2 Are these claims true?
(11:23) 2.3 Value fragility in the real world
(11:59) 2.3.1 Will agents optimize for their values on reflection, and does this matter?
(14:59) 2.3.2 Will agents optimize extremely/intensely, and does this matter?
(24:06) 2.4 Multipolar value fragility
(28:21) 2.4.1 Does multipolarity diffuse value fragility somehow?
(32:10) 3. What’s the role of value fragility in the case for AI risk?
(35:43) 3.1 The value of what an AI does after taking over the world
(37:15) 3.2 Value fragility in the context of extremely-easy takeovers
(45:43) 3.3 Value fragility in cases where takeover isn’t extremely easy
(52:36) 4. The possible role of niceness and power-sharing in diffusing these dynamics

The original text contained 16 footnotes which were omitted from this narration. The original text contained 2 images which were described by AI.
---
First published: August 5th, 2024
Source: https://forum.effectivealtruism.org/posts/fhkkScpkrLPNzsQqF/value-fragility-and-ai-takeover
---
Narrated by TYPE III AUDIO.