We humans are well accustomed to controlling the technologies we develop. From the first flint axes to incredibly sophisticated contemporary machine and information systems, our technologies are designed to do what we want them to do. We largely take this for granted.1 We are also well used to things that are hard or even impossible to control. It is tricky for us to control animals, quite difficult to control people, enormously difficult to control adversaries (such as other nations) with comparable capabilities to our own, and impossible to control the weather on Jupiter.
This paper is an urgent warning: while AI thus far has largely been a controllable tool, as we develop increasingly autonomous, general, and capable AI systems, they will become progressively harder, and eventually impossible, to control. In particular, on our current developmental pathway, we are far closer to building autonomous superintelligent2 AI systems than to understanding and implementing reliable means of controlling them. Humanity is therefore currently on a trajectory to build uncontrolled machines more capable than ourselves.
After providing a framework for what “control” means, we give three basic arguments for our thesis. First, controlling superintelligence by default means being in an adversarial relationship with something more capable than we are. Second, even if the relationship were highly cooperative and aligned — which we do not know how to bring about — the incommensurability in speed, complexity, scope, and depth between humans and a superhuman machine intelligence renders control either meaningless or impossible. Third, even if the control problem could be solved in principle, evolutionary and game-theoretic considerations pose overwhelming obstacles to solving it in practice on our current trajectory.
This is a strong and important claim. For if true, it implies that a race to build AGI and superintelligence is a fool’s errand: superhuman AI will not ultimately grant capability, wealth, or power to those who get it first. Those seeking these advantages imagine that superintelligence would be their tool; it would not be. Getting there first simply determines who brings a new, uncontrolled, and potentially catastrophic power into the world.
Footnotes
1. But where the stakes are high, such as in nuclear command and control, we exert great care and effort in designing extremely robust control systems to ensure both that these powerful technologies “always” do what we want and “never” take unsanctioned actions.
2. Bostrom’s Superintelligence remains the classic text on the subject; it is somewhat dated given how AI technology has progressed, but more relevant than ever in many of its definitions and arguments.