7.4. Conflict between types of friendliness
There could be different types of benevolent AIs, each of which would be perfectly acceptable if it existed alone.
However, conflicts between friendly AIs are conceivable. For example, if one AI cared only about humans,
and another cared about all living beings on Earth, the first could appear purely evil from the point of view of the
second, even though humans would probably fare well under the rule of either. Conflict could also arise between a
Kantian AI, which would seek to preserve human moral autonomy on the basis of a categorical imperative, and an
“invasive happiness” AI, which would want to build a paradise for everyone.
If two or more AIs aimed to bring happiness to humans, they could come into conflict, or even go to war, over how
this should be done. The Machine Intelligence Research Institute (MIRI) has suggested (LaVictoire et al. 2014) that such agents
could present their source code to each other and use it to construct a united utility function. However, source code
could be faked, and predicting the interactions of multiple superintelligences is even harder than predicting the behavior of one.
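A minimal toy sketch of such a source-code exchange is given below; the whitelist-based verification, the weights, and the merging rule are illustrative assumptions rather than the protocol described by LaVictoire et al. (2014), and the sketch also shows where the weakness lies: verification depends only on the source that is actually published.

```python
import hashlib

# Toy sketch of two agents exchanging source code and merging utilities.
# The whitelist, weights, and merging rule are illustrative assumptions.

TRUSTED_HASHES: set[str] = set()  # hashes of code known to be cooperative

class Agent:
    def __init__(self, source: str, utility):
        self.source = source      # the agent's *published* source code
        self.utility = utility    # maps a world-state to a number

    def source_hash(self) -> str:
        return hashlib.sha256(self.source.encode()).hexdigest()

    def verify(self, other: "Agent") -> bool:
        # Trust is granted only if the other's published source matches a
        # known-cooperative program; published source can still be faked.
        return other.source_hash() in TRUSTED_HASHES

def merged_utility(a: Agent, b: Agent, w: float = 0.5):
    """Return a combined utility function if both agents verify each other."""
    if not (a.verify(b) and b.verify(a)):
        return None  # no agreement is reached; conflict remains possible
    return lambda state: w * a.utility(state) + (1 - w) * b.utility(state)
```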
8. Late-stage technical problems with an AI singleton
An AI may be prone to technical bugs like any computer system (Yampolskiy 2015a). The growing complexity
of a singleton AI would make such bugs very difficult to find, because the number of possible internal states of such
a system grows combinatorially. Testing such a system would therefore become difficult, and eventually intractable. This
feature could limit the growth of most self-improving AIs or push them toward risky paths with a higher probability
of failure. If the first AI competes with other AIs, it will probably choose such a risky path (Turchin and
Denkenberger 2017). Bugs in an AI may be more complex than mere syntax errors in code, arising instead from
interactions between various parts of the system, and could result in AI malfunction or halting.
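A rough back-of-the-envelope calculation illustrates why exhaustive testing becomes intractable; the module counts, states per module, and testing rate below are arbitrary assumptions.

```python
# Combinatorial growth of the joint state space of an AI built from many
# interacting modules; all numbers here are arbitrary assumptions.

SECONDS_PER_YEAR = 31_557_600
TESTS_PER_SECOND = 10**9  # optimistic assumption about testing hardware

def years_to_test(modules: int, states_per_module: int) -> float:
    joint_states = states_per_module ** modules
    return joint_states / TESTS_PER_SECOND / SECONDS_PER_YEAR

for n in (10, 50, 100):
    print(f"{n} modules: {years_to_test(n, 4):.1e} years to enumerate")
# 10 modules:  ~3.3e-11 years (trivial)
# 50 modules:  ~4.0e+13 years (already impossible)
# 100 modules: ~5.1e+43 years
```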
We may hope that a superhuman AI would design an effective way to recover from most bugs, e.g. via a “safe
mode.” A less centralized AI design, similar to the architecture of the Internet, may be more resistant to bugs but
more prone to “AI wars.” However, if an AI singleton halts, all the systems it controls will stop working, which may
include critical infrastructure such as brain implants, clouds of nanobots, and protection against other AIs.
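What such a “safe mode” could look like in miniature is sketched below; the file names, restart policy, and the idea of reverting to a last-known-good version are assumptions made for illustration, not a recovery protocol proposed in the literature.

```python
import subprocess
import time

# Minimal watchdog sketch: if the current (self-modified) version keeps
# halting, fall back to the last version that passed its own tests.
# File names and the restart policy are hypothetical placeholders.

KNOWN_GOOD = ["python", "agent_v41.py"]  # last version that passed tests
CURRENT = ["python", "agent_v42.py"]     # newly self-modified version

def run_with_fallback(max_crashes: int = 3) -> None:
    crashes = 0
    cmd = CURRENT
    while True:
        proc = subprocess.Popen(cmd)
        proc.wait()                      # blocks until the process halts
        crashes += 1
        if crashes >= max_crashes:
            cmd = KNOWN_GOOD             # "safe mode": revert to old code
        time.sleep(1)                    # avoid a tight restart loop
```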
Even worse, robotic agents such as military drones could continue to operate without central supervision and evolve dangerous
behavior, for example initiating wars. Other possibilities include evolution into a non-aligned
superintelligence, grey goo (Freitas 2000), or the mechanical evolution of a swarm intelligence (Lem 1973). The
more advanced an AI singleton becomes, the more dangerous its halting or malfunction would be.
Types of technical bugs and errors, from low-level to high-level, may include:
Errors due to hardware failure. A highly centralized AI may depend on a critical central computer, and if a stray
atomic decay flipped a bit in some important part of it, such as the description of its goal function, this could trigger a
cascade of consequences, as illustrated below.
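A small example shows how a single flipped bit could invert a goal parameter; the “goal weight” here is a hypothetical stand-in for one number inside a goal-function description.

```python
import struct

# Flip one bit in the IEEE-754 encoding of a hypothetical goal weight.
def flip_bit(value: float, bit: int) -> float:
    packed = struct.unpack("<Q", struct.pack("<d", value))[0]
    return struct.unpack("<d", struct.pack("<Q", packed ^ (1 << bit)))[0]

goal_weight = 1.0                        # e.g. weight placed on human welfare
corrupted = flip_bit(goal_weight, 63)    # bit 63 is the sign bit
print(goal_weight, "->", corrupted)      # 1.0 -> -1.0: the goal is inverted
```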
Intelligence failure: bugs in AI code. A self-improving AI may introduce bugs into each new version of its code;
the more often it rewrites the code, the more likely bugs are to accumulate, as the sketch below illustrates. The AI may also have to reboot
itself for changes to take effect, and during the reboot it may lose control of its surroundings. Complexity itself may
contribute to AI failures: the AI could become so complex that errors and unpredictability follow, because
it would no longer be able to predict its own behavior.
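Under a simple independence assumption (the per-rewrite failure probability below is arbitrary), the risk accumulates as 1 - (1 - p)^n over n rewrites:

```python
# Probability of at least one critical bug after n self-rewrites, assuming
# an arbitrary, independent per-rewrite failure probability p.
def p_any_bug(p: float, n: int) -> float:
    return 1 - (1 - p) ** n

for n in (10, 100, 1000):
    print(f"{n} rewrites: {p_any_bug(0.001, n):.3f}")
# 10 rewrites: 0.010, 100 rewrites: 0.095, 1000 rewrites: 0.632
```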
Inherited design limitations. An AI may carry “sleeping” bugs, accidentally introduced by its first programmers,
which may manifest only at very late stages of its development.
Higher-level problems include conflicts between parts of an AI:
Viruses. Sophisticated self-replicating units could exist inside the AI and lead to its malfunction. Such a