Why, despite the red flags, leftists should take existential risks from AI seriously

Let me tell a story about a group of people who are worried that superhuman artificial intelligence might turn against humanity and kill us all.

In the 90s and 2000s, the story goes, even as popular media worried about Y2K and the Mayan calendar, a few internet weirdos proclaimed themselves prophets of a coming AI apocalypse. Godlike AI was coming, they claimed, and we could either “align” it to be benevolent and obtain a paradise, or fail and burn. They slowly developed a cult following. The AI-catastrophe narrative appealed to rich white techbros by placing them, their work, and their nerdy scifi cultural references at the center of history. It also took advantage of the age-old human impulse toward millenarian religious belief, which has regularly produced new cults since before the time of Jesus. This particular cult earned the backing of a few tech billionaires, who got to sublimate their fear of an uprising of oppressed and exploited workers into a story about demonic rebellious AIs; it also managed to hijack the naive “effective altruism” movement, turning do-gooder energy away from distributing cash to impoverished Africans and towards fantasy, self-promotion and self-dealing. These key successes, and the financial and media backing that came with them, let the AI doom-mongers spread out of obscure forums into mainstream influence, diverting attention from real problems both within AI (e.g. bias and deepfakes) and elsewhere (e.g. climate change, poverty, and war).

I would find it hard to deny that, at least sociologically, there’s a fair bit of truth to this uncharitable story. And yet, I’m going to argue that the risk of AI catastrophe is worth taking seriously anyway. The argument won’t be that it should become a drop-all-else priority for organizers who care about the future of humanity, but just that, as a threat, it’s more like nuclear war than like alien invasion: something that, while tough to address directly, is a real danger to people alive today, and one that intersects with all of the problems which are more familiar to those of us on the left.


First, let’s spend a little more time with the story above. It gestures towards at least a couple of heuristics that could justify dismissing AI fearmongering without deeper engagement: “that’s techbro billionaire nonsense” and “that’s millenarian apocalyptic nonsense.” Both of these are valid heuristics, in general! People are capable of making up all kinds of lofty justifications for behaving according to their material interests, and so it’s entirely reasonable to default to suspicion of any narrative that seems to be pushed by powerful resource hoarders and to direct attention away from inequity. Meanwhile, humans have been inventing stories about an imminent apocalypse for thousands of years. And the followers of these prophecies often have a pretty persuasive schtick! If a member of some strange cult came to your door, you shouldn’t necessarily expect to win an argument with them; they’ve had a lot more practice talking about the subject than you and likely have pat answers to the typical things people like you say. That doesn’t mean that you’re rationally obligated to join the next cult that stops by; it’s perfectly reasonable to instead just say, “sorry, not interested in cults”, and close the door.

But, while valid, neither of those heuristics is necessarily decisive. There are situations where further consideration ought to cause you to set them aside. Leftists of many varieties should be used to this already: US conservatives routinely dismiss fear of climate change as just a sort of apocalyptic religion, and the same has been said about Marxism and anarchism since the 19th century. Marx lived off capitalist money (from Engels!), and so does just about every foundation-funded NGO today. More importantly, digging into the class angle surfaces a case that it cuts the other way.

While plenty of rich individuals like to speculate about the dangers of AI, corporations including Google, Microsoft and Facebook are plowing ahead with billions of dollars of investment into advancing capabilities as rapidly as possible, with typically token attention to a narrow definition of safety. (Much corporate AI safety investment in practice goes towards preventing chatbots from saying openly bigoted things, which is primarily a brand risk to the corporations themselves. Biases in algorithms that make actual decisions, e.g. about who gets credit or gets to use a service at all, get much less press and PR attention, and the same goes for existential risk.) Mark Zuckerberg has called worrying about AI at all “irresponsible”, while Larry Page has said it’s “speciesist” to prioritize actual humans above hypothetical sentient AIs. Even Elon Musk, who has actually expressed worries about malign AI, parts ways in practice with essentially everyone who has thought about the problem for more than two seconds by advocating for powerful models to be released with fewer restrictions.

There can be a lot of ideological animosity between AI critics who are worried about superhuman existential threats and those who are primarily concerned about present-day impacts like bias and the theft of artistic work. AI-catastrophe prophet Eliezer Yudkowsky has called near-term safety work a “derailment” from efforts to forestall existential risk, while many tech industry critics like Timnit Gebru view Yudkowsky-style alarmism as just the flip side of techno-utopianism and a force that has “shaped the field of AI and its priorities in ways that harm people in marginalized groups.” However, there is in fact a lot of convergence of interests on practical questions. Yudkowsky and Gebru both oppose a race-to-the-bottom proliferation of dangerous technology, breathless AI hype, and the routine publication of state-of-the-art models whose inner workings nobody actually understands. And either position points towards a world with a great deal more regulatory friction around AI development than ours has in 2022. This doesn’t necessarily mean that some sort of alliance ought to be anybody’s priority, but it does make it awkward to label AI alarmism as simply an expression of tech billionaire class interest.


This essay is in part an argument with my past self. I’m a rich white tech worker, and yet I’ve considered myself a supporter of the anti-capitalist left for all of my adult life. I’ve known the general outline of the AI catastrophe story for several years, but didn’t take it seriously until pretty recently. Meanwhile, I haven’t been engaged in any serious activism since my now-preschooler was born. You could certainly read this change of mind as my social position finally overcoming my past ideological commitments, and in fact I write this essay in part as an attempt to explore that question. While in the end I don’t think I’m simply expressing a class affiliation here, readers will of course draw their own conclusions.

One obvious place to start would be asking actual experts (i.e. not me). Unfortunately, if you ask a reasonably broad sample, you’re not likely to find the answers very reassuring:

One survey asked machine-learning researchers about the potential effects of A.I. Nearly half said there was a 10 percent or greater chance that the outcome would be “extremely bad (e.g., human extinction).” These are people saying that their life’s work could destroy humanity.

Perhaps then it’s worth sketching the argument for being worried in a slightly more charitable form than we did at the start! I don’t have anything especially original to say here, so let’s quote an outline from a thoughtful skeptic, Katja Grace:

1. If superhuman AI systems are built, any given system is likely to be ‘goal-directed’;

2. If goal-directed superhuman AI systems are built, their desired outcomes will probably be about as bad as an empty universe by human lights;

3. If most goal-directed superhuman AI systems have bad goals, the future will very likely be bad [either because]

  1. Superhuman AI would destroy humanity rapidly [or because]
  2. Superhuman AI would gradually come to control the future via accruing power and resources.

This sketch leaves open whether superhuman AI systems will be built in the foreseeable future, so you could insert, as a step zero, the assertion that they will.

A side note here: Grace’s outline doesn’t depend on the notion that the first superhuman AI will rapidly improve itself, going from just-smarter-than-us to godlike power in a matter of days or even minutes. Debates about AI risk often center on the plausibility of this idea, but it’s not ultimately essential.

Grace nevertheless goes on to make a case for skepticism. In one notable bit, she analogizes AI to corporations and points out that the outline above could easily be modified into an argument that corporations will destroy humanity. Grace views this as a rebuttal, showing that the anti-AI argument would prove too much. But a better interpretation might be that the left ought to get some credit for first noticing that the alienated products of human labor could come to dominate us, especially in a world where, if someone does develop a dangerous AI, it’s likely to be capital’s competitive self-expansion that provides the motivation and resources!


Those who don’t worry about AI catastrophe part ways with the case sketched by Grace at a variety of points. My impression is that machine learning researchers are most likely to dispute step 2, as, for example, Facebook researcher Yann LeCun does in this debate with a co-author of my college AI textbook, Stuart Russell. Sanguine researchers believe we’ll naturally develop the skill to align AIs to our goals at the same time as we advance their capabilities.

I think, however, there are good reasons to doubt this. At the abstract level, there’s what worriers call “instrumental convergence”, which is basically just a fact that anyone who’s tried to make the world better runs into: it’s hard to make change without power. And as those who’ve been at the sharp end of state violence know well, if you’re at risk of someone simply turning you off, even selfless strategic considerations will make you direct some effort towards self-preservation. At a more concrete level, there’s the problem that we don’t actually understand the internal algorithms that the most advanced existing models use to make decisions, which contributes to every kind of bias that AI critics point out. And of course there’s our notable failure to align even human-powered economic and political institutions such as capitalism to human needs, which, besides being evidence that alignment is far from trivial, brings the risk that even if humans somewhere develop the technical capability to align AI, the people who actually build it might make it malign anyway.

Some ML researchers also dispute Grace’s step 1, arguing that machine learning models are simple tools which will not develop goals of their own. This argument perhaps tends to have less intuitive force for those of us who are inclined to analogize AIs to out-of-control capital or rebellious workers, but even if one wants to carefully avoid any anthropomorphism, there’s a case that competitive pressure will promote the development of AIs with some level of agency. We see hints of this already in the tendency to “fine-tune” the basically undirected language models underlying tools like ChatGPT by rewarding them for the “helpfulness” of their answers.

Leftists, in my experience, are more likely to depart either at step 0 or step 3. It’s hard for a good materialist to deny the very possibility of an artificial mind. But skeptics point out that intelligence isn’t a single thing which can simply be scaled up, that despite the hype there are plenty of ways in which existing AI is far short of human intelligence, and that power in the present world isn’t distributed according to intelligence anyway.

It’s true that intelligence isn’t linear. One way to see this is to note that computers have already surpassed human capabilities in many realms, without being generally more intelligent! So this point isn’t necessarily reassuring; AIs don’t have to be better than us at everything in order to be dangerous. And while it’s possible that humans are near the peak of what’s possible at some key capabilities like long-term planning, computer security, emotional manipulation, and engineering, it seems hubristic to be very confident of that–especially when we know that AIs can be copied, backed up, edited, and sped up by hardware advancements in ways that don’t have clear human analogies.

What about, then, the fact that the human ruling class is pretty plainly not selected according to intelligence? Isn’t that evidence that being smart alone doesn’t make you powerful? There’s some reassurance here, but the question of hubris arises again. It’s one thing to say that within the range of human variation, smarts are evidently not enough to take over the world, and another to say that’s true for every possible mind. It’s also true that the many human flaws exhibited by the rich and powerful have tended to play an important role in making space for successful resistance by the oppressed and exploited. Better a world in which self-expanding capital is managed by calculating AIs which are as far beyond the most ruthlessly effective of today’s executives as computer chess and go programs are beyond the strongest human players, than a world in which all of our atoms are repurposed to the manufacture of paperclips, but let’s not settle for either.

Finally, there’s the question of timelines. If superhuman AI is hundreds or thousands of years away, it’d likely be infeasible to predict how anything we might do today would affect our prospects. Such a long wait is certainly possible, but trends in achievement on both quantitative and qualitative benchmarks suggest it’s a risky bet, and both algorithms and hardware continue to advance. Would you have predicted a decade ago that, by today, illustrators would be organizing to save their jobs from AI, or that language models would be scoring as well on SAT reading comprehension questions as an average high schooler? I didn’t. If you didn’t either, how confident should you be in any prediction about what will be possible in another decade or two?


Where does this leave us? There’s plenty of uncertainty about the path to doom, but not, in my mind, the sort of knockdown refutation that one might hope to see when the stakes are this high. While any quantification of a risk this complex is going to be illustrative at best, even something like a 1-in-10 chance this century of humanity surrendering control of our destiny to AIs with incomprehensible goals would be pretty bad! Seeing that doesn’t require you to get into weird “longtermist” philosophical questions about the importance of marginal effects today on quadrillions of potential future lives.

Still, for leftists who don’t happen to be working in the field of machine learning (ML), I don’t think the immediate implications are that big. The left isn’t in a position to write policy in any of the countries on the leading edge of AI research, and on the whole we’re already doing our imperfect best in the struggle to give reasoned debate about human needs and risks any weight against the imperatives of capitalist and imperialist competition.

It is, though, maybe worth noting that sometimes the left has more power over culture than policy, especially in media and academic circles. And in the academy, I’m afraid it’s presently more embarrassing for an ML researcher to worry publicly about existential risk than it is to publish something that advances the state of the art on a benchmark by tweaking a few numbers on a model while having essentially no insight into what the model is actually doing internally. If you’re at all concerned about either short-term or long-term AI risks, that’s backwards, and it’d be a shame if leftists outside of ML were to reinforce it. The people who could actually use a bit more scorn are the ones who are convinced that their research has transformative potential but simultaneously that it’s not their job to care whether that potential is used for good. If you’re going to strap a few million dollars of compute to a machine learning algorithm, it’s better to think, even a little, about how to align the result with human flourishing, than to take the all-too-common attitude of “I just build the rockets; where they come down is the Luftwaffe’s concern.”