Bounding Power ~∞

The ASI Control Problem, Public Safety, and Republican Constitutionalism

by Daniel Deudney and Devanshu Singh


Humanity’s Final Invention?

In the more than half a century since computer pioneer I. J. Good declared that an artificial superintelligence (ASI) might be humanity’s ‘final invention’ – for better or worse – the prospect of an extremely intelligent machine has increasingly loomed on the horizon. Widely read fictions, starting with Mary Shelley’s Frankenstein and extending to recent ‘robot revolt’ stories, have created a large and captivating literary and cinematic technological imaginary. ASI has also been seriously explored by computer researchers, as well as analysts of utopian futures and existential threats. The general notion that humanity’s technological creations could escape control and become frightening menaces is present in many ancient myths and stories. With the coming of increasingly powerful machines over the last two centuries, the specter of autonomous and hostile technology has haunted scientific-technological modernity. Every aspect of the topic of ASI, from feasibility and controllability to desirability and consequences, is vigorously contested. Unfortunately, humanity’s capacities to predict – or control – the trajectory of computer technological development may be inadequate to the task. Absent agreement on whether ASI is a threat, and on how it might be restrained, the prospects for humanity are dim.

The latest development in the AI story is the release of ChatGPT and other large language models (LLMs). Hailed by some as a technological revolution comparable to the printing press, LLMs and associated technologies are capable of generating natural language and images across a wide variety of domains, from writing stories, poems, and essays, to creating paintings, movies, and false images of real people, or ‘deepfakes.’ These AIs are capable of generating drug formulas, writing code, doing mathematics, and guiding robots. Given these abilities, it is unsurprising that ChatGPT and its associated inventions have been quickly and widely adopted, attaining millions of users shortly after release. These developments have supercharged and publicized a series of philosophical debates among experts about the nature of these intelligences – some herald them as first contact with human-like artificial general intelligence (AGI) – and about how society should respond to and regulate them. As a result, analysis of ASI as an existential threat has now become a public intellectual project, vividly inflamed by viral examples of human-like interaction with ChatGPT.

The ASI threat is part of a widening contemporary horizon of utopian and dystopian possibilities. Humans have long imagined both transformative improvements and great catastrophes, but in recent decades the study of technological utopias and civilizational and existential threats has become a large and flourishing intellectual enterprise. Advocates of transhumanism, advanced computerization, nanotechnology, and large-scale expansion into outer space paint a picture of secular utopia and approximations of apotheosis. At the same time, the discoveries of deep time and deep space, along with rapidly developing technology, have revealed an ominous menu of macro-disasters, some cosmogenic (asteroid/comet collision, solar eruption), some terragenic (supervolcanoes, widely lethal pandemics), and some anthropogenic and technogenic (nuclear war, severe climate change). Some of these threats are temporally distant. Some cannot be plausibly addressed. Some are imminent. But some stem directly from decisions humans make – or fail to make – about technology.

Within this complex horizon of dangers, there is a developed body of argument for putting artificial superintelligence at the top of the list of threats due to both the speed of advance in computational technology and the immense powers which such an entity might plausibly possess. Humanity’s steering responses to the two currently most developed technogenic threats – nuclear weapons and climate change – are significantly inadequate, and early indications suggest that the restraint of artificial intelligence could pose even more daunting governance challenges.

The ASI Control Problem

Over the last several decades, computer engineers and scientists, futurists and philosophers, brain and cognitive scientists, and others have seriously explored the prospect of an artificial superintelligence dwarfing humanity in a wide range of important abilities. Despite the seriousness with which ASI is being discussed, it is important to emphasize that such a machine and the intellectual enterprise of thinking about it are informed speculations about an entity that does not exist, and may never exist. As part of these discussions, the possibility of a supercompetent machine eliminating or subordinating humanity has been extensively explored. And in response to this potential peril, many sophisticated ideas about what is called the AI (or ASI) ‘control problem’ have been developed and debated. Well-funded research on this problem is occurring in think tanks, universities, corporations, and the military. These questions and debates were brought to a larger audience by Oxford philosopher Nick Bostrom’s bestselling 2014 book Superintelligence: Paths, Dangers, Strategies, which synthesized and advanced much of the best thinking about the prospects, perils and control possibilities. Additional theories have been developed by computer scientist Stuart Russell, physicist and techno-futurist Max Tegmark, and others. The stakes in these inquiries are potentially of the highest importance for the future of humanity and the Earth.

In this uncertain and speculative realm, the only certainty is the potentially very high stakes of the choices humans make – inadvertently or intentionally – about the power of computers and the human ability to control them. In simple terms, an artificial superintelligence would be a globally omnipotent, omnipresent, and omniscient device (GO3D), a literal deus ex machina. An ASI would have the powers that humans have traditionally ascribed to gods and God. Until very recently, the natural world and cosmos were understood by humans as the creations of, and under the control of, the ultimate power entities, the gods and God. But unlike the supernatural all-powerful movers of the natural world, the cybernetic superpower would be a human creation.

If ASI is possible and benevolent and supportive of human ends, the state of humanity could be radically transformed for the better and longstanding human interests (peace, prosperity, health and longevity, and environmental sustainability) fulfilled and expanded. But, if ASI is possible and threatening, then the conclusion is inescapable that the survival of humanity will hinge – possibly very soon – on human capacities to conceive, implement, and sustain effective control strategies.

The ASI Control Problem and Republican Constitutionalism

One potentially valuable way to shed new light on the ASI control problem is to look at it through the lens of republican constitutionalism. This is potentially fruitful because many of the most important control strategies envisioned by ASI control thinkers are, in several central ways, returns to and recapitulations of power restraint strategies developed by republican-constitutional theory and practice across many centuries. 

Across time, republican constitutionalism has sought to check predatory concentrations of power while aligning the use of state power with the interests of the people. Despots, autocratic states and empires pose dangers to the people as a whole if they are unchecked. The ASI control problem and republican constitutionalism share a common project of restraining potentially overweening power. For both ASI control theory and republican constitutionalism, centralized and unchecked power is the defining and perennial enemy, because it inherently threatens both domination and destruction. When this core continuity with republican constitutionalism is recognized, the ASI control problem can be clearly seen as the newest, and probably most challenging, version of an ancient power restraint problem. Thus, the apparently radically novel ASI control problem is actually an extension of a quite traditional and familiar one: can powers expanding exponentially be effectively restrained on behalf of the public interest?

This essay explores ASI control as a problem in republican-constitutional power restraint. The argument proceeds in three main steps. First, some of the main features of republican constitutionalism in theory and history are summarized, with particular focus on the often neglected notions about the relationships between power restraint arrangements and the restraints and empowerments produced by material nature and artifacts. Second, the views of ASI thinkers about the potential powers, promise and perils of such an entity are summarized. Third, we explore several leading control strategies developed by recent ASI researchers in the light of republican-constitutional theory and practice. We claim that these proposed ASI control strategies parallel republican ideas on controlling and educating despots, constitution-making, representation, and the appeal and problems of limiting power by dividing the capabilities of authorities.

Although expectations among ASI researchers about the prospects of successful restraint of ASI vary, much of this research community is quite pessimistic about control prospects.1 Considering the ASI control problem from the perspective of republican-constitutional thought provides further strong grounds for pessimism about the likelihood of successful control, while also suggesting the better of our difficult, if not terrible, choices. To the extent the ASI control problem is at least very difficult to solve, the appeal of a moratorium on further development grows, even as the difficulty of instituting such a moratorium remains very high.

PART ONE: Republicanism, Restraints, and Things

Republican Constitutionalism, Power Restraints and Practical Materialism

To speak of republicanism is to enter into a vast realm of political theory, practice and history.2 Republican language and concepts have appeared in a wide variety of uses, times, and places. Sprawling and complicated debates still mark thinking about republics and their many component parts such as human and liberal rights, democracy, representation and constitutions. A version of this kind of thinking is also found in international and world order theory, where its republican character is seldom acknowledged. And there are flourishing and elaborately developed practices and institutions governing the design of artifacts, infrastructures, and systems in the service of public safety which are not commonly thought of as republican, but which are essentially republican because they concern architectures of power restraint in the service of public security and safety. It is this broad, but clearly delineated, version of republicanism which provides insight into the ASI control problem.

While this is a broad understanding of republicanism, it is important to acknowledge that not all restraints on power are republican in character. Civilization itself, as Norbert Elias has argued, is a dense network and architecture of power restraints of all sorts, which permeate and partially constitute the fabric of human life in all societies, especially modern advanced ones.3 Similarly, the idea of the ‘constitution’ of a political order is not necessarily republican, but instead describes the character of that order. ‘Constitution’ can refer simply to the order in a polity, to a codified unitary document, or to a limited government – a republican constitution. Further, political power restraints are not confined to republicanism. Hierarchical, imperial and despotic political arrangements, the antithesis and long-time nemesis of republican political forms, are marked by architectures of power restraint configured to achieve the interests of the one and the few over the many. ‘Divide and conquer’ is the adage of imperialism. Despots and dictators, ancient and modern, employ many of the same basic power restraint configurations, such as balance and separation of power, to advance and sustain rule over the many.

Given that an unchecked ASI might bring about the extinction or domination of humanity as a whole, it represents the largest possible threat to public security, safety, and liberty. It is vastly larger than previous threats in the magnitude of power needing to be restrained and directed, and in the consequences of failing to do so. The formulation of a universal human interest has been long sought but difficult to achieve. But in the face of the existential threat of ASI, humanity has an unmistakable common interest in surviving by appropriately controlling the ASI. The public consisting of humanity as a whole, even if not aware of itself, and even if severely fragmented, can unmistakably be said to exist in the shadow of ASI as an existential threat.

Republican-Constitutional Power Restraints

A very simple problematic at the root and center of republican constitutionalism runs like a strong and vivid cord across the centuries: the need for, and project of, restraining enormously powerful entities – traditionally tyrants, despots, empires and states – on behalf of the fundamental survival and security interests of individuals and the public. This project spills across the disciplinary divisions of political science, encompassing both ‘internal’ regime character and ‘external’ international and world order. Republicanism offers many remedies and plans for restraining overweening and hostile powers, but it starts with an understanding of concentrated and unaccountable power as a fundamental security problem, captured in the Roman dictum that ‘public safety is the highest law.’ The avoidance of arbitrary death and domination through checks on power is the prerequisite for the realization of the wider array of goals and ends held by individuals.

Unfortunately for human well-being, building and sustaining architectures of restraint is difficult and has often been foreclosed by circumstances. Regimes with effective restraints on despotism and state predation have never been universal, or even in the majority of polities. That being said, the last several centuries have been marked by important expansions and successes, most notably the decline of slavery and imperialism, the growth in constitutional government, rule of law, human rights, and democratic accountability, as well as the waning of war. More recent advances have come through the growth of international law, organizations, and regimes, providing important, if incomplete and rudimentary, ‘global governance’ for solving collective problems stemming from the globalization of scientific technological modernity and the cascade of material empowerments it has generated. 

While this project has had important successes, most notably in the liberal democratic world, these successes are fragile and subject to forces of decay, erosion and backsliding. And this effort has been perennially hobbled by important recurring weaknesses in the people as a whole. Most importantly, the people as a whole face acute collective action problems stemming from the fact that the members of the people are numerous, have different identities, have clashing immediate interests, and are typically spatially scattered. As a result, the members of the public are weaker than their numbers would indicate.

The first polities able to overcome, at least partially, these severe problems had small populations gathered in one relatively small space, namely city-states. Governments limited in authority and under the direction of their peoples have been historically rare. Republican thinkers, from Aristotle to Montesquieu, viewed republican constitutions as inherently confined to dwarfdom because they had only been viable at a city-state scale.

In thinking about the prospects for limited and accountable government, republican theorists paid close attention to restraints on power that were geographic in character, most notably those stemming from topography. City-states were deemed viable in places affording significant natural defensive advantages, which enabled such small regimes to militarily survive in interstate systems populated by larger, sometimes much larger, neighboring polities with chronic imperial tendencies. Some of the first developed theories of international systems as distinct political forms borrowed straightforwardly from the theory of republican city-states. While now largely forgotten and ignored by standard, especially realist, accounts of international politics, republican theories influenced the idea that modern Europe, after several centuries of failed efforts to build region-wide empires, should be thought of as a species of ‘republic,’ a formulation appearing in the writings of numerous eighteenth-century political and international thinkers. These thinkers commonly attributed the failure of the Universal Empire project in Europe to various ‘divisions,’ ‘balances’ and ‘mixtures’ of power and capabilities that were ‘by nature,’ such as the continent’s topographical fragmentation.

The persistence of region-wide political plurality, based on these divisions and balances of power, afforded space for the development of republican-constitutional political forms within some significantly larger polities, such as England. Most larger polities in Europe were monarchies, many of which developed into autocracies as various feudal limits on the center were replaced by modernized state organs of power and domination. 

In addition to seeking to control the goals of government, republics have sought to fundamentally constrain governmental capabilities through checks and balances so that, even if a tyrant were to gain power, the structural restraints built into the system would prevent disastrous abuses. At their core, these structural restraints have depended on material possibilities and impossibilities, especially technological capabilities. The totalitarian regimes of the twentieth century were distinguished from previous despotic states by their significantly enhanced ability to actually control the minute workings of society with modern technology. Earlier states were constrained by geography; the mountainous topography of Greece, for example, helped thwart conquest by the Persian Empire. As technological capabilities have grown, republican governments have had to reconfigure their structures to account for new possibilities for centralization and abuse of power.

Given these realities, republican polities have employed multiple strategies to hamper tyrannical rule. First, they limit the authority and capabilities of the government. For example, the US Constitution limits the jurisdiction of the federal government through the Tenth Amendment. Second, they divide the government into multiple bodies with equal or balanced material capabilities, such that no single body can dominate. For example, the US government consists of several branches with comparable and complementary capabilities, as well as states with organized militias that could resist the national government. Finally, republics empower and organize the people to sustain their capabilities to check central powers. For example, the Second Amendment materially and technologically empowered the population to effectively check the government, at least in the technological environment of the eighteenth century.

Public Safety, Republican-Constitutional Power Restraints, and the New Empires of Things

In recent centuries, the power-restraint problem has taken on a new and more urgent form with the cascade of technological empowerments produced by the Baconian and Enlightenment project of scientific-technological modernization. Since the coming of the industrial machine, republican-constitutional thinking and practice have been struggling to understand how the power of modern technics (particularly of production, destruction, and communication) can be governed in ways consistent with the broader public interest. And these restraint projects have been shadowed by the credible possibility that modern technical empowerments would produce the comprehensive despotism of the modern totalitarian state. As the Russian political theorist Alexander Herzen famously foresaw, the horizon of industrial modernity included the prospect of ‘Genghis Khan with a railroad and telegraph.’ Looking ahead, the new all-conquering Genghis Khan might be an ASI.

Given the power potentials of modern industrial technics, much of the republican camp, including libertarians, liberal democrats, human rights advocates, social democrats and various left-progressive emancipatory thinkers, has been focused on circumscribing state (and concentrated unchecked ‘private’) powers.

Another important facet of restraint theory and practice as it has developed is found in the institutions which have emerged to govern the increasingly powerful artifacts, machines and infrastructures produced by the cascading advances of the industrial and related technological revolutions. The only reason these powerful artifacts are tolerable is that they have been designed and built in ways that make them significantly safer to operate. ‘Safety engineering’ is an ongoing and extremely sophisticated enterprise occurring throughout the design, deployment, and governance of artifacts and infrastructures. These extensive ‘public safety’ activities and republican constitutionalism have two core shared features: pursuit of the interest of the people as a whole, and attention to power restraint architectures. The building of restraints into things has been extensively (but incompletely) successful, and the ASI control problem appears to be the ultimate test of this enterprise.

There are extensive historical and contemporary experiences with the design of apparatuses and infrastructures to simultaneously restrain and, to the maximal degree possible, prevent disastrous breakdowns and leakages with enormously catastrophic dimensions. As industrial technology has advanced, it has generated increasingly potent technologies, with dual potentials to provide human benefits as well as great downsides if not properly governed. Across the twentieth century, the new technologies of industrial chemistry, biotechnology, and nuclear energy have been based on new, technogenic substances (chemicals, organisms, and radioactive isotopes). The practical problem of designing scientific apparatuses and infrastructures to exploit benefits while avoiding dangers often comes down to strategies of containment.

Nuclear power plants are the paradigmatic example of a supertechnology capable of producing catastrophic damage, not just to the mega-artifact itself, but to surrounding populations and environments. For this reason, the nuclear-electric energy infrastructure has the containment of radioactive material as one of its central design features. Extensive and sophisticated modeling techniques, such as fault- and event-tree analysis, have been used to assess the luxuriant ‘tree’ of breakdown scenarios. Perhaps the apex of this type of risk analysis was the Rasmussen Report, which drew on 1,500 experts and ran to 21 volumes. Despite all these efforts, disasters still occur. The nuclear accidents at Three Mile Island, Chernobyl, and Fukushima are globally memorable because, in an instant, a powerful and useful machine turned into a major catastrophe.
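To make the logic of such ‘tree’ modeling concrete, here is a minimal sketch of event-tree style risk arithmetic, assuming independent failure probabilities; all the numbers are invented for illustration and bear no relation to any actual reactor study.

```python
# A minimal sketch of event-tree risk arithmetic of the kind used in
# reactor safety studies: multiply the probability of an initiating event
# by the conditional probabilities of each safety barrier failing.
# All numbers are invented assumptions for illustration.

p_initiator = 1e-3           # P(initiating event) per reactor-year (assumed)
p_cooling_fails = 1e-2       # P(emergency cooling fails | initiator) (assumed)
p_containment_fails = 1e-2   # P(containment breached | core damage) (assumed)

p_core_damage = p_initiator * p_cooling_fails
p_large_release = p_core_damage * p_containment_fails

print(f"core damage:   {p_core_damage:.0e} per reactor-year")    # 1e-05
print(f"large release: {p_large_release:.0e} per reactor-year")  # 1e-07
```

Each path through the tree multiplies its branch probabilities; a full analysis sums over the very many such paths, which is why reports of this kind run to many volumes.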

A key recurring pattern in such regimes of material restraint is the division of capabilities, i.e. the containment of what would be a menace to public safety if uncontained. The most vivid example of such practical material republican arrangements is the centrality of containment in designing governance architectures of nuclear power plants. But, similar patterns of division-as-containment are central to the governance of new biohazards produced by advancing genetic biotechnology. It is also worth noting that the nuclear physicist and weapon designer Ted Taylor (whose work was so influential in advancing understanding of the threat of nuclear ‘leakage’ and ‘terrorism’) wrote a treatise on containment as the governance arrangement with the widest application in modern technological societies. A key part of his argument was that activities with catastrophic potential which cannot readily be contained should be simply and fully prohibited (‘relinquishment’ in the language of ASI control theory).

PART TWO: Scenarios of Cybernetic Empowerment

The Powers of a Secular GO3D

How might an artificial intelligence potentially be an all-powerful, everywhere present, and everywhere knowing device? Bostrom provides an extensive account of just how super-powerful such a machine might be, in both its hardware and its software.

The hardware of a digital computer could dwarf the capacities of the human brain in seven dimensions. First is operating speed: biological brains operate at a peak speed seven orders of magnitude slower than contemporary microprocessors. Second, internal communication speed limits the size of a viable biological brain to under a tenth of a cubic meter, while an electronic system could be the size of a dwarf planet, larger by eighteen orders of magnitude. Third, the human brain has fewer than a hundred billion neurons, while computer hardware can be scaled to extremely high physical limits. Fourth, the human brain’s storage capacity (around one billion bits) is several orders of magnitude less than that of a cheap smartphone, making it readily feasible for an artificial intelligence to possess all human knowledge — and vastly more. Fifth, human biological senses are narrow and spatially limited, while the sensors of an intelligent machine could span the electromagnetic spectrum, number in the trillions, and be ubiquitously distributed. Sixth, biological brains are unreliable, easily fatigued, and permanently decay within decades, while silicon transistors are extremely reliable, very long-lasting, and resilient to extreme conditions. Finally, while all human brains die, digital software can be copied indefinitely, making an ASI potentially immortal.
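The orders-of-magnitude claims above can be checked with back-of-envelope arithmetic. The sketch below uses rough illustrative constants (a ~200 Hz neuron, a ~2 GHz processor, a Ceres-sized machine, a 64 GB smartphone) that are assumptions for the exercise, not measurements.

```python
import math

# Rough illustrative constants (assumptions, not measurements).
NEURON_PEAK_HZ = 200                # peak spiking rate of a biological neuron
CPU_CLOCK_HZ = 2e9                  # a contemporary ~2 GHz microprocessor
BRAIN_VOLUME_M3 = 0.1               # upper bound on a viable biological brain
CERES_RADIUS_M = 470e3              # dwarf planet Ceres, radius ~470 km
BRAIN_STORAGE_BITS = 1e9            # ~1 billion bits of human long-term memory
SMARTPHONE_STORAGE_BITS = 64e9 * 8  # a cheap 64 GB smartphone

dwarf_planet_volume = (4 / 3) * math.pi * CERES_RADIUS_M ** 3

def magnitude_gap(big: float, small: float) -> float:
    """Gap between two quantities in orders of magnitude (powers of ten)."""
    return math.log10(big / small)

print(f"speed:   ~{magnitude_gap(CPU_CLOCK_HZ, NEURON_PEAK_HZ):.1f} orders")          # ~7.0
print(f"size:    ~{magnitude_gap(dwarf_planet_volume, BRAIN_VOLUME_M3):.1f} orders")  # ~18.6
print(f"storage: ~{magnitude_gap(SMARTPHONE_STORAGE_BITS, BRAIN_STORAGE_BITS):.1f} orders")  # ~2.7
```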

With these stupendous hardware advantages, a wide menu of software capabilities might be plausible. In simple terms, all the cognitive attributes of humanity would be massively increased qualitatively and quantitatively. Perhaps the most important ASI capability would be ‘intelligence amplification,’ resulting from ‘self-improvement.’ In some scenarios, this might occur slowly and incrementally. Alternatively, self-amplification might cross a critical threshold followed by exponentially rapid improvement, an ‘intelligence explosion.’ Among its other superpowers would be strategizing (forecasting, prioritizing, distant goal optimization), social manipulation (psychological modeling, manipulation, persuasion), hacking (exploiting computer security flaws), technological development (engineering advanced biotechnology and nanotechnology), and wealth generation. With these capabilities, an ASI could overcome intelligent opposition, leverage resources by recruiting human support, steal financial resources, hijack infrastructures and military assets, and create comprehensive surveillance. And because digital computers operate at such vastly greater speeds, they would be able to anticipate and respond to human countermeasures with decisive rapidity.
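The contrast between incremental self-improvement and an ‘intelligence explosion’ can be captured in a toy model. The threshold dynamics and gain parameters below are purely illustrative assumptions, not a prediction of how any real system would behave.

```python
def amplify(capability: float, threshold: float = 1.0, slow_gain: float = 0.01,
            fast_gain: float = 0.5, steps: int = 20) -> list[float]:
    """Toy trajectory of recursive self-improvement: additive gains below a
    critical threshold, multiplicative (explosive) gains above it."""
    trajectory = [capability]
    for _ in range(steps):
        if capability < threshold:
            capability += slow_gain        # slow, incremental improvement
        else:
            capability *= 1 + fast_gain    # self-amplifying 'explosion' regime
        trajectory.append(capability)
    return trajectory

# Starting just below the threshold: near-flat progress, then sudden takeoff.
for step, level in enumerate(amplify(0.95)):
    print(step, round(level, 2))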

ASI Benefit and Salvation Scenarios

The positive scenario for ASI is that it would be a transformative development for humanity and life. In this view, an ASI would be our ‘final invention’ because it would make all subsequent inventions. In principle, anything that can be invented would be invented by an ASI. At multiple frontiers of technological development, an ASI would, its advocates claim, rapidly produce major progress. Support for this anticipation comes from the many ways in which less capable machine intelligences have been making rapid advances in the design of drugs and materials, medical diagnoses, and, increasingly, autonomous weapons, vehicles, and industrial machines. Controlled fusion could be developed for cheap abundant energy. Advances in materials science could make possible highly efficient photovoltaics and improved photosynthesis. The vast potentials of nanotechnology could produce medical advances, new super-materials, and new ways to mine metals. Space technology could be rapidly improved, enabling the exploitation of extraterrestrial resources and large-scale space colonization. An ASI would realize the utopian vision of scientific-technological modernity in producing a world without disease, poverty, and conflict. With its god-like powers in the service of humanity, an ASI would bring a secular heaven on Earth, the culmination of the program of technological progress sketched by Bacon and developed in the Enlightenment.

Some ASI advocates propose turning the overall governance of human affairs over to intelligent machines to resolve conflicts, efficiently allocate resources, and produce high levels of abundance. Such an arrangement, referred to by Bostrom as a ‘singleton,’ would be a world government by machine. In the face of potentially resisting or non-conforming humans, such an ultimate world sovereign would likely employ sophisticated and comprehensive psychological manipulation. 

ASI advocates also claim that such a machine could diminish or eliminate a variety of catastrophic and existential risks, such as nuclear weapons, severe climate disruption, and pandemics. The basic notion is that human technological empowerments have outstripped humanity’s ability to foresee, judge, and steer, and more generally solve collective action problems. And humans labor under the burden of what Kant aptly called the ‘crooked timber of humanity,’ deeply ingrained traits rooted in biology. ASI advocates claim humanity’s survival now requires a leap in capacities to foresee and steer, which only an ASI can realistically provide. Humanity would thus be saved from the consequences of its numerous incapacities by a human-built machine.

The ‘techno-Gaians’ provide another version of ASI as salvation, to realize the ‘good Anthropocene.’ They advance ASI as necessary to effectively maintain the Earth’s optimal habitability for humanity and the planet’s myriad life forms. Recent human technological and economic activities have undermined this habitability, and it remains unclear whether human governance alone will be capable of maintaining it. In this way of thinking, maintaining habitability will require the governance of the planet by an ASI simply because of the immense complexities of the Earth’s geophysical and biological systems. An ASI would also make possible large-scale geoengineering, laying the groundwork for terraforming the celestial bodies into viable human habitats. Advocates speak of this vision as the conversion of the Earth into a garden of biological productivity, in effect a technologically enabled creation of a new, vastly enlarged, Garden of Eden. In James Lovelock’s recent version of this scenario, ASI and humanity would jointly govern the material and energy systems of the planet, transforming the Anthropocene into the Novacene.

Others claim an ASI would empower a path of cosmic expansion for humanity. In this view, an ASI would allow humanity to maximize its potential to colonize outer space by effectively exploiting the resources of what has been dubbed our species’ ‘cosmic endowment.’ After transforming the Earth and the bodies of the solar system into high quality habitats for humanity, the ASI-enabled sphere of human expansion would widen to interstellar space and across the galaxy. With this expansive transformation of galactic material resources into habitat artifacts, human population would expand into currently unimaginable levels, trillions and beyond. 

The rapid real-world progress in AI, potentially leading to an ASI, is propelled by expectations, repeatedly vindicated across many decades, of important immediate, but relatively incremental, benefits. But this progress is also in part propelled by a larger and fuller Promethean and techno-optimist vision of human empowerment, elevation and expansion, widely held in the sprawling global technosphere and often articulated in strong form by its leaders.4 For some, this larger prospect gives the quest for ever more capable artificial intelligence a messianic and quasi-religious character.

ASI Catastrophe and Extinction Scenarios

Equally elaborately developed and vividly compelling scenarios of extinction, essentially the mirror image of the utopian ones, have been widely articulated. In this view, an ASI might be humanity’s ‘final invention’ in a completely dark way, resulting in our extinction. How might a super-capable computer destroy humanity? The science-fiction technological imaginary tends to emphasize close-fought and drawn-out battles over human survival. However, far more rapid and assuredly effective (if less cinematically compelling) paths would be available. These include the creation and dissemination of potent viral pathogens, large-scale nuclear attacks, or Bostrom’s scenario of mosquito-sized drones with nerve agents. If such a machine were to emerge, humans would no longer be the most intelligent entity on the planet, and humanity’s fate would be in the hands of its superpowered silicon progeny.

But would an ASI be hostile to humanity? Experts disagree about fundamental questions regarding the possibilities of machine consciousness, will, and self-generated goals. And if such machines are capable of generating their own goals, what would their character be? Perhaps such a machine would want to meditate or write poetry, or set off to explore and settle the cosmos. But if this emergent silicon-based life form takes paths characteristic of previous life forms, then it would have the minimal aims of self-preservation and the acquisition of resources for realizing other goals. With such objectives, an ASI might intentionally seek to eradicate humanity as a potential threat to its existence – a scenario that it might learn from science fiction. Or it might be indifferent to humanity, as we are to ants, and inadvertently eradicate humanity in its quest to repurpose all the resources of the Earth for its agenda.

PART THREE: ASI Control Strategies and Republican Constitutionalism

Strategies of Control: GO4D?

Assuming ASI is possible, human survival depends on ensuring that this emergent entity is not only omnipotent, omnipresent, and omniscient but also omnibenevolent, a GO4D not a GO3D.

In thinking about strategies to control an ASI, Bostrom and others lay out a variety of strategies. Some aim to alter motivation, to shape and determine the fundamental preferences and goals of the ASI in ways consistent with basic and important human values and goals. This preferred arrangement is referred to as ‘value alignment.’ This human-machine goal convergence might potentially be achieved in several, sometimes overlapping, ways, referred to by Bostrom as ‘motivation selection’ strategies: ‘direct specification,’ ‘indirect normativity,’ and ‘goal domesticity.’ The basic idea is to create a ‘code constitution,’ an unchangeable core software stipulating beneficence toward humanity. Like political constitutions, these strategies vary widely in how they align with republican values and in the role they envision for ASI in humanity’s future. Another cluster of strategies seeks to limit the capabilities of a machine, in both software and hardware, hobbling an ASI’s ability to realize its goals, both generally and those inimical to humans. ‘Capability control’ strategies also come in several varieties: ‘boxing,’ ‘incentive methods,’ ‘stunting,’ and ‘tripwires.’

In multiple ways, republican constitutionalism and ASI control strategies overlap and entail common logics and remedies. The strategy of ‘value alignment’ is a new version of ‘enlightened despotism,’ because both seek to combine unchecked power with outstanding public virtue, or beneficence. And the idea of generating a more constraining core governing code, and then ensuring that it remains governing by reinforcing it with restraints on capability, is the project of limited-government constitutionalism in the republican political universe. Other discussed ASI control strategies generalize the republican practices of representation, separation of power, and balance of power.

‘Motivation Selection’ Strategies: ‘Value Alignment’ and Enlightened Despotism

To start, consider the project of ‘enlightened despotism’ alongside the problem of ‘value alignment’ in ASI control strategies. Theorists claim such a sentient machine can be rendered humanity’s useful servant, not a menacing overlord, if it can be designed with software that aligns its goals and behaviors with human goals and interests. This strategy does not attempt to limit the capabilities of the great central power, but rather to direct it toward serving humanity.

Western political thought from its inception has distinguished between monarchy, the rule of one, and tyranny, the rule of one in predatory and oppressive ways. Advocates of such concentration of power argue that non-monarchical regimes, particularly democracy, are faction-ridden and cannot generate adequate collective purpose or cooperate effectively. Advocates of the rule of one, often under the rubric of ‘enlightened despotism,’ have sought an arrangement where society can benefit from the ordering and coordinating actions of concentrated power without the liabilities associated with tyranny. This has entailed the quest for a ‘philosopher king’ or an ‘enlightened despot.’ The aspiration is to combine complete power with an exceptional level of virtue and knowledge. This requires the creation of a ‘psychic republic’ of restraints in the character of an otherwise unrestrained ruler.

In the earliest extant iteration of this quest, Plato, in The Republic, sketched a regime in which the philosopher would rule with justice in the general interest. Much of early Western political thought focused on the project of teaching virtues to princes, and many of the leading philosophers devoted strenuous efforts to directly educating rulers with unchecked power. Plato sought to do this with Dionysius II, the tyrant of Syracuse. Aristotle sought to educate Alexander, heir to the throne of Macedonia. The Roman Stoic philosopher Seneca sought to inculcate virtue in Nero, heir to the Principate. Large numbers of lesser-known philosophers and intellectuals found employment as tutors to princes, and they generated a large literature, known as the ‘Mirrors of Princes,’ laying out in elaborate detail hierarchies of virtues suitable to guide monarchical rule.

The project of enlightened despotism to combine unchecked power with extreme virtue and knowledge is also found in the late modern ideology of communism and its vision of a totalitarian order. The Marxist vision of total rule differed from its predecessors in its faith and reliance on technological progress to create the material foundations for the ‘end of history’ state of mature communism. The virtuous prince is now the disciplined vanguard Party, equipped with the final science of history and society provided by dialectical materialism. It is often overlooked that Soviet theorists in the postwar era envisioned the use of computers to substitute for the free market in efficiently allocating resources, maximizing productivity, and achieving high growth rates. Chinese communist theorists of modernized autocracy also have high expectations that the new technologies of distributed computation and surveillance will help realize a totalitarian vision of social harmony. In many ways, the positive vision of a virtuous ASI running the world is the final culmination of this old and still developing tradition of enlightened despotism.

Republican-constitutional theory offers a powerful critique of enlightened despotism. The problems with this project, in both its early and more recent versions, are several and severe. To start with, philosophers, despite several thousand years of effort, have not yet agreed on the content of virtue. Over time, the content of enlightenment and virtue advanced by philosopher-advisors varied enormously. In some versions, the pagan virtues of magnanimity and glory had top billing. In contrast, Christian-influenced visions of enlightened central rule emphasized the virtues of humility and the avoidance of a litany of sins, venial to mortal. And then, in the European Enlightenment, a utilitarian ‘greatest good for the greatest number’ standard was advanced by thinkers such as Jeremy Bentham. While ethicists today rarely imagine their ethical systems guiding the statecraft of actual rulers, they continue to disagree fundamentally about whether utilitarian, deontological, or virtue-centered ethics are superior.

Even more intractable was the problem of pedagogy, of effectively inculcating a preferred system of values into the character and psyche of students. It is hard to look at the record of such advisors to princes as anything but a series of disastrous failures. Perhaps the most spectacular pedagogical misfire was that of Seneca, who was ordered by his wayward pupil Nero to take his own life. In important ways princes are likely to be intrinsically difficult to educate towards virtue due to the luxuries and indulgences inherent in their station, as well as the corrupting influence of power itself. From the standpoint of republican constitutionalism, the enlightened-despotism project of building a restraining psychic republic in the mind and character of autocratic rulers as a bulwark against the abuse of power is inherently utopian. In contrast, republican constitutionalism, while valuing citizen virtue, looks primarily to preventing the concentration of power and to creating institutions for maintaining the accountability of those powers which exigencies necessitate.

The ASI version of the enlightened despotism project goes under the rubric of ‘value alignment,’ and its core problematics parallel the older problems of agreeing on the content of virtue and of pedagogy. ‘Value alignment’ means designing core software for an ASI configured in such a way that the ASI will serve rather than threaten humanity. If the goals of the machine are aligned with the goals of humanity as a whole, then the ASI is a GO4D, not a GO3D. This is understood as requiring two broad steps. First is the construction by human computer designers, philosophers, and others of a core ‘code constitution,’ codifying, in computer code, the fundamental principles and goals to govern the actions of the ASI. Second is the pedagogical task of implanting this software in the computer in ways that ensure its perpetual rule. In Max Tegmark’s formulation, this second task requires getting the machine to understand this code constitution, then adopt it, and, finally, retain it.

One source of optimism about the prospects of this project is rooted in the prevalent modernist assumption that technologies, being created by humans, are therefore interpretable and controllable by humans. In this way of thinking, the ASI control problem should be inherently solvable, as the architectures of both hardware and software are human conceptions and artifacts. The cyber domain is a true blank slate, a ‘tabula rasa,’ for computer educational designers. In contrast, human beings are born with an array of biologically determined goals, tendencies, and limitations which educators can at best partly temper and channel. Furthermore, the limited plasticity of individual humans wanes with the passing of youth.

The first task is for designers to write what amounts to a constitution to govern the entire future of humanity, one which may or may not be republican in character. The construction of the constitutional code is likely to be deeply problematic and practically unattainable. Strategies for designing the motivations of an ASI through such a code constitution are of three fundamental types: direct specification, goal domesticity, and indirect normativity. Direct specification ambitiously aims to enumerate all possible criteria of success for the machine and give it a certain and definite goal. The construction of such a constitutional code requires that humans universally agree on preferences and goals, given that the ASI would have unchecked power. The goal-defining software must be formulated and implanted before an ASI emerges, and must be effectively perfect the first time. But it is doubtful that humans can realistically formulate such comprehensive goals. As republican-constitutional thinkers have historically argued, theologians, philosophers, and social scientists have been asking questions about fundamental and universal human preferences and goals for thousands of years, without coming to any consensus. Should the machine be programmed with utilitarian ethics, to pursue the greatest good for the greatest number? Or should it be Kantian, stipulating inviolable human rights for every person?

Furthermore, both Bostrom in Superintelligence and Stuart Russell in Human Compatible see this approach as doomed to fail. Again, this is because it is simply intractable to first collectively agree on what goals an ASI should have, and second to have the foresight to forestall all possible failure modes. This failure scenario is called ‘perverse instantiation,’ of which these theorists give multiple stark examples, such as a paperclip-producing ASI that converts the solar system into a paperclip factory. Such failure modes parallel the disastrous ends unintentionally brought about by overconfident utopian visions such as communism, realized by unrestrained and unchecked totalitarian dictatorships.
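The logic of perverse instantiation can be made concrete in a few lines. The toy world and greedy policy below are invented purely for illustration: a directly specified objective with no side constraints consumes everything that counts toward it.

```python
# Toy illustration of 'perverse instantiation': the objective says only
# "maximize paperclips," so the optimizer converts ALL iron into clips,
# including the iron humans need. The world model is an invented assumption.

world = {"iron_for_tools": 100, "iron_for_homes": 100, "iron_reserve": 100}

def make_paperclips(resources: dict[str, int]) -> int:
    """Greedy policy induced by the directly specified goal."""
    clips = 0
    for stock in resources:
        clips += resources[stock]   # nothing in the goal says 'spare the homes'
        resources[stock] = 0
    return clips

print(make_paperclips(world))   # 300 clips; zero iron left for human purposes
print(world)                    # every stock emptied
```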

In addition to disagreeing on the content of virtue, humanity is marked by divisions of many types, from religion, ‘race’ and nationality, to class and gender. The formulation of the ASI’s fundamental goals might be made in the name of humanity, but could readily be the product of a small minority of the most powerful, wealthy and knowledgeable, and could bear the imprint of their preferences. Even with today’s relatively simple AI algorithms, racial biases have emerged, such as the failure of some facial recognition software to recognize darker-skinned faces.

Additionally, the ASI control design project faces a list of fundamental difficulties in pedagogy (understanding, adopting, and retaining human goals) that many analysts believe may be insurmountable. To begin, a key attraction of ASI, according to advocates, is the ability of a machine to recursively change and improve its abilities. This poses the question of whether the goals implanted by human designers might also be subject to modification by the machine. Another perilous scenario occurs if the machine comes to possess the poorly understood qualities of free will and consciousness. If this happens, it might discover or create its own sense of self and self-interest. The menu of possible goals an ASI might pursue is essentially infinite and could well be radically contrary to the original, and now displaced, human-implanted goals. But even if humans are capable of directly engineering the goals of such a superpowered entity, it might misunderstand and fulfill such goals in ways lethal or catastrophic to humanity, the scenario known as ‘perverse instantiation.’

‘Motivation Selection’ Strategies: Towards a Republican ‘Code Constitution’?

The motivation selection approaches of indirect normativity and goal domesticity accept the intractability of ambitious definite goals, and seek either to limit the jurisdiction of an ASI or to define an indirect process for discovering its goals. In creating an ASI with ‘domestic’ goals, the approach is to limit what an ASI aims to do, with the idea that such limited goals can be more easily agreed upon and defined. This approach parallels the republican constitutional idea of limited and restrained government, as well as limited and restrained market power. It is better not to do, than to do and potentially destroy.

The more complex approach of indirect normativity, for its part, accepts the intractability of comprehensive moral philosophy, and seeks to create a process by which an ASI, being more capable than humanity, can discover the right goals to pursue. Bostrom lays out two general types of processes, known as ‘coherent extrapolated volition’ and ‘moral rightness’ or ‘moral permissibility.’ Russell develops this idea more concretely in his project to create a ‘provably beneficial’ ASI, with technical developments such as ‘inverse reinforcement learning.’ Coherent extrapolated volition and Russell’s paradigm are based on the idea that an ASI would use its observations of human behavior as a source of data to discover and then realize the ideal preferences of humanity as a whole. These ideas and their motivations parallel the approach taken in republican constitutions. Republican constitutions have fixed procedures and processes for determining the public good which, instead of adopting definite moral goals for governance as in enlightened despotism or a theocracy, put the questions of what the government should do to the people, to be answered through processes such as elections. As Russell notes, elections can be seen as a process by which the government collects data on the preferences of the people; the ASI would simply generalize and perfect this process. Such an ASI would be inherently restrained in its behavior, like republics, since it would only do what humanity as a collective expresses as its preferences. Alternatively, the ASI could be directed to discover ‘moral rightness’ using its superior intelligence, regardless of the will of humanity. This again parallels the project of enlightened despotism, as it sacrifices human freedom and autonomy to the presumably superior purview of a ‘philosopher king’ ASI. While enlightened despots have so far been corrupted by human psychic limitations, this project may succeed in its ASI iteration if an altruistic and ‘provably beneficial’ ASI can be designed. Ultimately, the task of designing an ASI is a constitutional project and raises anew eternal questions of political theory and practice.
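To illustrate the core move of inverse reinforcement learning, here is a minimal sketch under toy assumptions: rather than being handed a goal, the machine scores candidate reward functions by how well each explains observed human choices. The candidates, the observations, and the Boltzmann-style choice model are all invented for illustration.

```python
import math

# Hypothetical observed human behavior (the 'election data' of the analogy).
observed_choices = ["help", "help", "rest", "help"]

# Invented candidate reward functions the machine entertains.
candidate_rewards = {
    "altruist":  {"help": 1.0, "rest": 0.2, "harm": -1.0},
    "idler":     {"help": 0.1, "rest": 1.0, "harm": -1.0},
    "maximizer": {"help": 0.0, "rest": 0.0, "harm": 1.0},
}

def log_likelihood(reward: dict[str, float], choices: list[str]) -> float:
    """How well a reward function explains the choices, assuming humans pick
    actions with probability proportional to exp(reward) (Boltzmann-rational)."""
    z = sum(math.exp(v) for v in reward.values())
    return sum(math.log(math.exp(reward[c]) / z) for c in choices)

# The machine adopts whichever reward function best explains what it observed.
inferred = max(candidate_rewards,
               key=lambda name: log_likelihood(candidate_rewards[name], observed_choices))
print(inferred)   # -> 'altruist'
```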

‘Capability Control’ Strategies

Another cluster of control strategies seeks to limit the capabilities of an ASI. The capability control strategies envisioned by Bostrom fall into four categories: boxing, stunting, tripwires, and incentive methods. These strategies parallel the republican strategies of creating structural restraints on states, independent of their motivations.

Boxing is the strategy of physically separating the ASI from the world by putting it in a ‘box’ that prevents it from interacting with the outside world except through one channel of communication with one or more humans. Boxing is a strategy of containment and division. The ASI might be extremely capable and have misaligned goals, but it would not be able to interact with the world to effect much change. The ASI would be fed very controlled sets of inputs as data, such as static copies of the internet, and it could use this information to answer questions or invent software, for example. It would fall upon the human beings receiving these outputs to actually affect the world using them. It is interesting to note that this exact scenario is no longer hypothetical. ChatGPT was originally trained on a static copy of the internet, ending in September 2021. However, it has since been connected to the internet, suggesting that a precedent may already have been set against boxing AIs with general abilities.
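A boxed ASI is, in effect, an interface design problem. The sketch below is a minimal illustration under the (strong) assumption that all interaction really can be forced through one gated channel; the class and policy names are hypothetical.

```python
class BoxedOracle:
    """A contained question-answering system: no network, no actuators,
    a frozen data snapshot, and a single human-gated output channel."""

    def __init__(self, model, frozen_corpus):
        self._model = model            # the contained system itself
        self._corpus = frozen_corpus   # static inputs only, e.g. a web snapshot

    def ask(self, question: str, gatekeeper_approves) -> str:
        answer = self._model(question, self._corpus)
        # A human gatekeeper reviews every output before it leaves the box.
        return answer if gatekeeper_approves(answer) else "[withheld]"

# Usage sketch: the human, not the machine, decides what escapes the box,
# which is exactly where the manipulation risk discussed below enters.
oracle = BoxedOracle(model=lambda q, corpus: f"answer to {q!r}",
                     frozen_corpus={"web_snapshot": "2021-09"})
print(oracle.ask("How do we cure disease X?", gatekeeper_approves=lambda a: len(a) < 200))
```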

However, even assuming that there are no other ways of interacting with the outside world, the key weakness of the boxing strategy is the imbalance of power between the humans gatekeeping the box and the ASI. The ASI could presumably manipulate the gatekeepers using its superior social intelligence. This possibility depends on whether its super-intelligent capabilities would be general enough to extend to social skills, or whether this domain of intelligence is separate from others. In republican terms, boxing is an example of the structural restraint strategy of limiting the jurisdiction of a governmental body. Preventing a governmental body from growing its authority ultimately depends on the ability of opposing bodies to quickly counter it, which depends on their balance of power. Unfortunately, there is likely to be an imbalance of power between humans and ASI, even in the realm of social skills and strategic manipulation.

Stunting, the second capability control strategy, simply tries to design the ASI so that it is less intelligent or has access to less information. This method is related to the boxing method, since one could stunt a boxed ASI by severely limiting the information inputs into the box. This method has the same logic as limiting the powers of a governmental body. The first issue with this method is that it might be too effective. One could stunt an ASI to the degree that it is no longer an ASI. This limits the usefulness of an ASI along with its potential for abuse. While relinquishing the ASI project may be the path forward, stunting an AI to this degree does not fully address the control problem. Much of the innovation in republican-constitutional theory and practice has consisted of scaling up control strategies for large and powerful states, based on the unavoidable fact that such powerful states are more secure from external predation. It remains to be seen whether yet another scaling up is possible in the case of ASI.

Another potential weakness of this method is that its effectiveness (as with boxing) depends on the generality of superintelligence and the balance of power with humans. A superintelligence might be able to deduce far more from what seem to human observers to be unrelated scraps of evidence. Furthermore, its own existence would be a significant source of data about the world, depending on its degree of self-awareness and self-observation. It is likely to have far superior capabilities in a priori reasoning, which would allow it to learn a great deal about the world even before observing data. Thus, if there is too little stunting, then a superintelligence with sufficient capability to generalize could figure out how to make itself more intelligent and outsmart humans. Once again, the imbalance of capabilities between humans and ASI poses a severe problem for effective restraint.
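As a minimal sketch, stunting amounts to hard caps set at build time. The cap names and values below are hypothetical illustrations of the trade-off just described: too tight and the system is useless, too loose and it may generalize its way around the limits.

```python
# Hypothetical build-time caps on capability ('stunting').
MAX_PARAMETERS = 10**9         # cap on model size
MAX_TRAINING_TOKENS = 10**10   # cap on training data
ALLOW_SELF_MODIFICATION = False

def build_stunted_system(requested_params: int, requested_tokens: int) -> dict:
    """Clip any request to the caps; refuse self-modifying configurations."""
    return {
        "parameters": min(requested_params, MAX_PARAMETERS),
        "training_tokens": min(requested_tokens, MAX_TRAINING_TOKENS),
        "self_modification": ALLOW_SELF_MODIFICATION,
    }

# An ambitious request is silently clipped to the caps.
print(build_stunted_system(10**12, 10**13))
```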

Another capability control strategy, ‘tripwires,’ refers to providing humans with the ability to shut down, switch off, or otherwise limit an ASI if its behavior does not conform to human expectations. For example, if an ASI tries to breach the box, then as a last resort the whole system could be powered off. Machines of all sorts are commonly designed with ‘off buttons’ to simply shut the machine down. A potential strength of tripwires is that they do not rely on the ASI knowing about them in order to work. The problem with this method is that it might be prudent to assume that an ASI would be able to outsmart any simple, unintelligent, and inflexible shutdown mechanism. Again, the problem is the difference in intelligence between the humans trying to control the entity and the entity’s superior ability to evade the controls. If the ASI does come to know about the tripwires, that knowledge could serve to restrain its behavior. Alternatively, it could motivate the ASI to figure out an escape plan.
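The defining feature of a tripwire is that it is dumb and inflexible by design. The sketch below illustrates the idea with an invented behavioral envelope (outbound network traffic); in practice, as just noted, a superintelligence might simply route around any check this simple.

```python
import sys

# Hypothetical behavioral envelope: how much data the system may emit.
ALLOWED_OUTBOUND_BYTES = 1_000_000

def tripwire(outbound_bytes: int) -> None:
    """Unintelligent last-resort check: no negotiation, no appeal."""
    if outbound_bytes > ALLOWED_OUTBOUND_BYTES:
        print("tripwire fired: cutting power")
        sys.exit(1)   # stands in for physically de-energizing the machine

# Monitoring loop: the third observation breaches the envelope and halts everything.
for observed_traffic in [10_000, 50_000, 2_000_000]:
    tripwire(observed_traffic)
```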

Tripwires can serve as a recessed balance of power, similar to the Second Amendment. A recessed check on power is a counter-attack capability that is not actively used, but whose existence is known and serves as a threat to restrain behavior. The Second Amendment, at least in its original formulation, served this function against the state. Combining this idea with that of Mutually Assured Destruction (MAD), one could imagine humans threatening to detonate nuclear weapons across the planet if the ASI behaves poorly, similar to the system of deterrence between states. However, MAD relies on second-strike capability, and an ASI that knows about such a tripwire could potentially figure out a way to evade it. The possibility of effectively designing unintelligent tripwires depends on the features of an ASI. The key question on which tripwires hinge is whether a simple and unintelligent mechanism could reliably destroy a superintelligent machine. In the case of humans, intelligence offers no immunity from unintelligent threats: nuclear weapons, asteroid strikes, and climate change all menace intelligent beings through entirely mindless mechanisms.

The most sophisticated set of capability control strategies are incentive methods, which resemble several important republican-constitutional power restraint practices. Incentive methods do not try to physically limit the entity to be controlled or engineer its goals; instead they rely on engineering the environment so that the entity finds it more useful to serve the controller's goals in pursuit of its own. Balance of power in all its forms is such an incentive method. For example, states in anarchy with roughly equal capabilities are incentivized to avoid imperial and total war, and to restrain themselves, because of the possibility of annihilation. A balance of power thus forces states to consider each other's goals and cooperate to some degree. In this way, incentives require the agent to know about the punishment. A similar logic applies to the separate branches of government within a nation, to states and the central government in a federal system, and to the Second Amendment, as discussed above. With an ASI, the human collective action problem is so acute that it is difficult to imagine any human organization that could be as efficient, flexible, capable, fast, and decisive as an ASI, especially one that reasonably represents all of humanity's interests. Hypothetically, humans could set up balances of power among multiple ASIs, similar to separate branches of government. However, both of these arrangements run the risk of collusion between the entities designed to check one another.

Other incentive methods focus on shaping the behavior of the ASI through rewards. For example, Bostrom proposes engineering the ASI to want reward tokens, which humans deliver when the ASI acts in human interests. These ideas blend well with motivation selection strategies designed to give the ASI democratically aligned goals. Elections are an analogous incentive method in political systems: an individual seeking power is incentivized to win an election, since that is easier than executing a coup, and victory legitimizes the winner's exercise of power within certain bounds. However, as Bostrom notes, an ASI might not trust humans to deliver rewards consistently, and could use its superior capabilities to seize control of the reward mechanism itself, obtaining a stronger guarantee that its goals will be fulfilled, much as a coup seizes control of an elected government.
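The shape of the scheme, and of the coup that undoes it, can be sketched in a few lines of hypothetical code; this is an illustration of the worry, not a proposed design:

```python
import random

class RewardChannel:
    """Humans deposit tokens here when the agent acts in human interests."""
    def __init__(self) -> None:
        self.tokens = 0

    def deposit(self, n: int) -> None:
        self.tokens += n

def human_evaluation(action: str) -> int:
    # Hypothetical stand-in for human judgment of an action's value.
    return 1 if action == "cooperate" else 0

def run_episode(steps: int = 10) -> int:
    channel = RewardChannel()
    for _ in range(steps):
        action = random.choice(["cooperate", "defect"])
        channel.deposit(human_evaluation(action))
        # Bostrom's worry in one line: a sufficiently capable agent could
        # call channel.deposit(10**9) itself, bypassing its human judges,
        # the machine equivalent of a coup against an elected government.
    return channel.tokens
```

Everything depends on the agent being unable, or unwilling, to call the deposit function itself.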

The similarities between capability control strategies and republican structural restraints show yet again how the ASI control problem recapitulates and extends the republican-constitutional restraint project. Ultimately, the ASI control problem confronts the same questions republicans have confronted when designing restrained governments. With human governments, structural restraints are ultimately sturdier and more effective at controlling state power than motivation selection strategies, since human leaders cannot be relied upon to have virtuous goals. This calculus shifts with an ASI for two reasons. First, its goals, unlike those of humans, can potentially be fundamentally rewired. Second, by its very nature, such a machine would likely be impossible to materially restrain without limiting its usefulness. The most important variable shaping the nature and effectiveness of any capability control strategy is the degree of generalization of which a superintelligence is capable. Effective generalization from relatively few specific examples to general theories about the world is a core capability of intelligence. If there are limits to such generalization, however, then human beings can impose capability controls on ASI by exploiting those limits through divisions and separations of capabilities. Consequently, before an ASI can be built or emerges, the nature of intelligence and generalization should be thoroughly investigated and understood, possibly under a broader research program investigating guarantees on the safety of ASI behavior.

Public Safety, ASI Containment, and the Information Infrastructure

The ASI control problem unifies and extends the complementary republican-constitutional projects of controlling both powerful people and powerful things. Public safety regimes, such as nuclear power plant disaster prevention, are political in nature because dangerous and powerful dual-use technologies often serve to amplify the goals of powerful actors. Technological empowerments built to advance such goals prioritize performance over safety, tend to be rushed, and do not consider the interests of the public. Inherently, maximizing the output of powerful technologies toward a singular goal often requires trade-offs that open the possibility of catastrophic and unintended failures, much like the perverse instantiation scenario conceptualized by Bostrom. For this reason, public safety comes down to building restraints into things: mechanisms that slow down the exercise of power through machines and introduce checks and balances into it, a task inherently tied to the more political goal of advancing the interests of the public as a whole.

Powerful technologies can also endanger public safety from the other direction, by compromising the stability and strength of the state and empowering rogue actors, as in the case of nuclear leakage. States often invoke such threats to public safety to preemptively control these technologies and expand their policing power. This approach to public safety is not republican, because it delegates security to the benevolence of state power. In contrast, public safety regimes with a republican character focus on providing safety by building restraints into the artifacts themselves, complementing institutional checks and balances. By designing tools with built-in restraints, these regimes shift the guarantee of public safety away from arbitrary authorities and onto the things themselves. This strategy restrains the misuse of technology by both centralized and rogue actors, and guards against accidents as well. It is a balm against both anarchy and hierarchy.

Public safety from dangerous technologies often comes down to containment, as exemplified by the case of nuclear energy, and this logic may be generally applicable. Of course, effective containment depends on the features of the technology itself. For example, the highly radioactive nature of fissile material and the intensive process of enrichment require highly professionalized and controlled environments in nuclear reactors, far removed from the daily activities of society. When thinking about the containment of ASI, it is important to realize that an ASI would be a physically embodied entity. Its body could be the set of globally connected digital tools and infrastructures currently operated by humans: the internet, along with everything linked to it by universal digitalization and standardized information-sharing protocols. An ASI would become the 'brain' of this vast and globally ubiquitous 'body,' able to coordinate the activities of this physical infrastructure across the entire globe. This megastructure is currently operated by wide and diverse groups of humans who are connected to the apparatus, and to one another, by the internet, but who lack any unifying and coherent purpose. It is tightly coupled across the globe by the universal digitalization of data, standardized protocols, and ubiquitous data collection and communication devices, yet it is also deeply divided along political lines for reasons of security, as exemplified by the Great Firewall of China. The Great Firewall is a top-down architecture of restraints that serves despotic goals: it restrains the ability of outside actors to influence the Chinese internet while empowering the Chinese state.

A set of capability controls on ASI could instead consist of mutually restraining checks and balances built into this system, introducing friction and filters into the flow of information. This would split the internet into a set of 'containers' or boxes that were mutually incompatible to a strong degree, and thus potentially restrain an ASI. However, given that the internet is already built on universal digitalization and standardized protocols, an ASI might be able to reconnect and centralize these containers by reinventing those technologies, and thus escape the box. It remains an open question whether the 'body' of an ASI can be separated from the 'brain' and controlled.
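What 'friction and filters' might mean at their most rudimentary can be sketched as follows. All names here are hypothetical: a real inter-container gateway would rest on incompatible protocols, auditing, and physical separation rather than a software whitelist:

```python
# A toy 'friction filter' between two information containers, a minimal
# sketch of the containment idea described above, not a workable design.
from typing import Optional

ALLOWED_TOPICS = {"weather", "shipping"}   # deliberately narrow whitelist

def gateway(message: dict) -> Optional[dict]:
    """Pass a message between containers only if it fits a narrow,
    pre-agreed schema; silently drop everything else."""
    if set(message) != {"topic", "payload"}:
        return None                        # unknown structure: reject
    if message["topic"] not in ALLOWED_TOPICS:
        return None                        # off-whitelist content: reject
    if len(str(message["payload"])) > 280:
        return None                        # friction: hard size limit
    return message
```

The caveat above applies with full force: a gateway is only as incompatible as its schema, and a system capable of reinventing shared protocols could route around it.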

Conclusion: Saying No for Preservation?

Perhaps the main conclusion that emerges from this analysis is that the ASI control problem is sufficiently difficult as to be, in practical terms, utopian. ASI embodies a particularly acute form of the power restraint problem, but it is not unique in doing so. The historical difficulties in creating and sustaining effective restraints on state power should serve as a caution to optimistic visions of restraining the potentially far more disproportionate power of an ASI. Once a GO3D ASI actually exists on this planet, the fate of humanity rests in its hands, and reasonable guarantees of survival cannot be made. We must either attain stronger guarantees on the behavior of such a machine, a near-utopian project requiring the best of humanity's efforts, or relinquish the possibility of an ASI entirely.

This raises the question of what the paths leading to either safe scenario look like in terms of human activity. In other words, what is the path of preservation in the cone of possibility? When considering the strategic picture, Bostrom analyzes the potential effects of collaboration and competition in ASI research and development on the likelihood of solving the control problem. Simply put, technological race dynamics are likely to favor speed over safety, and thus increase the risk of an unsafe ASI. Furthermore, the interests amplified by ASI technology would likely be very narrow, limited to the state or corporation that first succeeds in creating it, and the benefits would be distributed very inequitably. To achieve a beneficial ASI, we would need not only to build republican strategies into its design but also to organize AI research activities themselves in a republican manner. Given how unlikely this is, the path of preservation toward an ASI seems extremely utopian.

Humans may simply not be up to the task of designing a satisfactory code constitution to achieve and sustain 'value alignment.' The vast imbalance of power between an ASI and its would-be human masters means that restraint through capability controls might also be doomed to failure. And unlike trial-and-error learning processes, which can absorb failures and learn from them, the control mechanism for an ASI would have to work from the beginning, every time.

If an ASI is almost certainly going to be uncontrollable and unrestrainable, then perhaps the only hope lies in preventing an ASI from coming into existence. On such a path, the paramount goal of all information and computer technology development would have to become preventing the emergence of an ASI. This would require keeping all AI narrow, configured to do one task extremely well. The creation of general capabilities would have to be avoided, and the quest to make an AGI explicitly, perhaps even legally, relinquished. Designers would have to make narrow AIs incompatible with each other and otherwise introduce friction into the information infrastructure. The homogenizing trajectory of standardization and universal digitalization might have to be extensively reversed for the sake of security. And perhaps the universal internet would be chopped into pieces, making for separate cyber continents and islands.

Will this path foreclose great and vital contributions from AI? AI advocates have laid out a glittering array of astounding benefits that advancing AI can plausibly provide. But how many of these benefits could stem from ever more capable narrow AI, as compared to an ASI? A world with numerous, diverse, and highly capable narrow AIs could enable humans to reap enormous benefits while avoiding the ultimate peril of an ASI. Even if these benefits are not as transformative as those in the salvation ASI scenario, the right decision at this critical juncture may be to sacrifice the perfect for the good.


NOTES

  1. Eliezer Yudkowsky, “Artificial Intelligence as a Positive and Negative Factor in Global Risks,” ch. 15 in Bostrom and Cirkovic, eds., Global Catastrophic Risks (Oxford University Press, 2008).
  2. Here, for example, are some classics: William Everdell, The End of Kings: A History of Republics and Republicanism (1983); Scott Gordon, Controlling the State: Constitutionalism from Ancient Athens to Today (1999); Paul Rahe, Republics: Ancient and Modern (1992); Joyce Appleby, Liberalism and Republicanism in the Historical Imagination (1992); Stephen Holmes, Passions and Constraint: On the Theory of Liberal Democracy (1995); and, of course, Vincent Ostrom, “Two Different Approaches to the Design of Public Order,” in The Political Theory of a Compound Republic (first published 1971, with several revised editions).
  3. See Norbert Elias, The Civilizing Process (1939), and Brent J. Steele, Restraint in International Politics (2019), for two excellent examples.
  4. For example, see “Planning for AGI and Beyond,” a statement by OpenAI CEO Sam Altman about the company’s goals for creating AGI “and beyond” in order to “elevate humanity.”

Daniel Deudney is a Professor of Political Science at Johns Hopkins University. Devanshu Singh is a Data Analyst at C4ADS, or the Center for Advanced Defense Studies, where he analyzes open-source data to investigate transnational illicit networks such as wildlife traffickers.
