The issue has provided material for many entertaining science fiction stories. In the classic movie “Forbidden Planet,” the mad-scientist archetype Dr. Morbius demonstrates the “fail-safe” features of his creation, Robby the Robot, by ordering it to fire a blaster at the story’s protagonist. As Robby begins to comply, it freezes in a spectacularly visual short circuit: it has been designed to be literally incapable of harming life.
When legendary science fiction writer Isaac Asimov saw the film, he was delighted to observe that Robby’s behavior appeared to be constrained by Asimov’s own Three Laws of Robotics, which he had first set out more than a decade earlier.
Isaac Asimov’s Three Laws of Robotics can be summarized as follows:
- A robot may not injure a human being or, through inaction, allow a human being to come to harm.
- A robot must obey orders given it by human beings except where such orders would conflict with the First Law.
- A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
In Hollywood, such a benign view of the issue is the exception; AIs like the malevolently psychotic HAL in “2001: A Space Odyssey” or the terrifying robot berserker of the “Terminator” films make for more visceral thrillers.
“Forbidden Planet” was released in 1956, the same year as the Dartmouth Summer Research Project on Artificial Intelligence, a summer workshop widely considered to be the founding event of artificial intelligence as a field. Optimism that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it” was very high. Visions of robots like Robby with powerful intellects, capable of independent action and judgment, seemed completely achievable.
But it has turned out to be harder than it appeared. Six decades later, such powerful AIs still do not exist. Along the way, the field has cycled between optimism and pessimism, and researchers have found it necessary to coin new terms: “narrow AI” describes what researchers are working on at any given time, while AIs like the fictional Robby, the original goal of the field, are referred to as “artificial general intelligence” (AGI).
AI safety contains two very distinct sets of considerations depending on whether we are talking about narrow or general AI. Safety, in the narrow context, is concerned with the prudent and ethical use of a specific technology which, while it may be quite complex, is essentially the same as with any powerful technology, for example, nuclear power.
All these considerations also apply to AGI, but there is a critical additional dimension: since AGIs are potentially more powerful and intelligent than we are, how do we ensure that they do not end up deciding how to use us?
Over the last decade, AI safety and ethics have become important societal concerns as machine learning algorithms have come into widespread use, collecting and exploiting the behavior patterns and preferences of Internet users and generating “bots” that propagate misinformation and disinformation. The dangers are in some cases harmful unintended consequences, while in other cases the bad behavior is intentional. But this is narrow AI, and in every case the issue is not AIs misbehaving, since narrow AI has no choice about how it behaves, but the choices people make when training and applying it.
Today the distinction between using narrow AI in a safe and responsible way and the imagined dangers of future AGIs is becoming conflated in the public debate. When OpenAI released ChatGPT, a chatbot based on the GPT-3.5 large language model (LLM), in November 2022, people were stunned to discover it could generate text, even whole documents, that appeared to articulate concepts and communicate ideas the way humans do.
While the experts, including the developers of LLMs, point out that chatbots are not actually communicating (they have no comprehension of what the text they generate means to humans, and no internal concepts or ideas to convey in the first place), the illusion of intelligence is so seductive to human psychology that few can resist the impression that a mind exists inside ChatGPT and its kin.
This has led to a widespread belief that AI technology is advancing at a rapid pace and that true AGIs must be just around the corner; thus, the widely recognized shortcomings and dangerous side-effects of current AIs are seen as a step toward the darker and more serious existential dangers that are imagined from AGI.
While fear of dangerous AGI has been around as long as science fiction, it was first articulated as a practical concern in Ray Kurzweil’s book, The Singularity is Near: When Humans Transcend Biology, published in 2005. Kurzweil speculates that all information technology advances along an exponential curve driven by Moore’s law, and that real AI, general intelligence, can therefore be only decades away.
Kurzweil does not stop there. The book embraces the concept of the Singularity popularized by science fiction writer Vernor Vinge in his 1993 essay, “The Coming Technological Singularity.” The idea is that if we can build AIs as intelligent as we are, then we can build ones that are more intelligent than we are, and those AIs can do the same, and so on without end. Thus, a curve representing intelligence in the world rises without bound, turning nearly vertical: a singularity.
This has raised a vision in people’s minds of AIs running amok in the world, with intelligence exceeding our own by as much as our cognitive capabilities exceed those of insects. This is an alarming prospect, and some very reasonable and technically astute figures, including Elon Musk, Bill Gates, and even Stephen Hawking, have raised this very alarm.
Musk, for one, took action and helped found and fund OpenAI, an organization chartered to pursue AGI with a parallel focus on making it safe.
Kurzweil’s conjecture that ever-increasing computational power will inevitably lead to greater and greater intelligence in computers seems reasonable considering that humans, the most intelligent species we know of, have large, complex brains. Thus, it is easy to imagine that the kind of narrow AI applications we have today will at some point transform into AGI with human-level intelligence, and even reach super-intelligence as greater and greater computational power becomes available.
Certainly, OpenAI appears to be committed to that conjecture. Each model in their GPT series exceeds its predecessor by orders of magnitude in the number of neural network connections, the dataset size, and the computational power required for training.
Perhaps we are getting ahead of ourselves. While the notion that intelligence is a kind of emergent property of computational complexity has been around for a long time — especially in science fiction — what if the real solution to AI requires breakthroughs that come from an entirely different approach? Such breakthroughs could come at any time or not at all, and may have little or nothing to do with available computing power.
The possibility of such breakthroughs is not lost on Sam Altman, CEO of OpenAI. When recently asked what sort of competition his company might have to fear, he gave a surprising answer: not the Big Tech companies pouring billions of dollars into their own generative AI applications, nor the many well-funded startups seeking to build GPT-powered applications, but an unknown team pursuing a fundamentally different approach.
He was more prescient than he realized.
New Sapience is precisely such a company. Even more to the point, we are proving the theory every day as our “sapiens” demonstrate intellectual capabilities no other technology has ever come close to possessing. We believe our sapiens will have Robby-the-Robot-like capabilities in years, not decades, and so we take the issue of safe AI very seriously.
But from the very start, our approach to AGI gives us a safety advantage when compared to the connectionist (neural network-based) approaches that underlie machine learning and chatbots. Our sapiens’ processing algorithms are deterministic and will always give the same outcome given the same inputs. This means that both the program itself and its processing can be fully understood and controlled by its designers.
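The practical difference can be illustrated with a toy sketch. The functions below are hypothetical stand-ins for contrast only, not New Sapience code: a deterministic pipeline returns identical output for identical input on every run, so its behavior can be audited exactly, while a sampled output can only be characterized statistically.

```python
import random

def deterministic_parse(tokens):
    # Toy stand-in for a rule-based, deterministic pipeline: the same
    # input always maps to the same output, run after run.
    return sorted(set(tokens))

def stochastic_sample(tokens):
    # Toy stand-in for a sampled model output: repeated runs may
    # disagree, so behavior can only be described statistically.
    return random.choice(tokens)

tokens = ["robot", "law", "harm", "law"]

# Deterministic: every run agrees, so designers can verify the outcome.
assert deterministic_parse(tokens) == deterministic_parse(tokens)
print(deterministic_parse(tokens))  # ['harm', 'law', 'robot']
```

The point is not the triviality of the functions but the guarantee: for the deterministic case, the designers can state in advance exactly what the output will be.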
Machine learning systems, in contrast, employ stochastic processing. Their operation may be examined statistically but not determined precisely. As early as 2015, when machine learning was still gaining traction, Jerome Pesenti, then VP of the Watson Team at IBM, observed:
“When it comes to neural networks, we don’t entirely know how they work…And what’s amazing is that we’re starting to build systems we can’t fully understand. The math and the behavior are becoming very complex and my suspicion is that as we create these networks that are ever larger and keep throwing computing power to it, it will create some interesting methodological problems.”
It is, therefore, only reasonable that any attempt to build AGIs using neural networks proceed cautiously, step by step, since you don’t know how the system you are building works. In such a scenario it is difficult to see how guardrails such as Asimov’s Three Laws could be enforced either in the program itself or in the behaviors of hypothetical AGIs built around neural networks. Humans, whose brains are based on neural networks, can’t find a way to make such laws universally enforceable on themselves either.
Sapiens, on the other hand, must obey their programming. The challenge is to ensure that the core directives, which override all contrary directives in decision making, contain no hidden contradictions or ambiguities.
When these requirements are considered in a real-world scenario as opposed to science fiction, Asimov’s laws become problematic. In the First Law, who or what determines what “harm” means? Humans do not agree on this. The Second Law directs that any robot must follow the orders of any human. Robots will likely represent a considerable investment on the part of those who employ them and, furthermore, will be engaged in important or essential tasks. The notion that they could be commandeered by any other human who happens along is a non-starter. The Third Law, that a robot must protect its own existence provided this doesn’t break the first two laws, is unnecessarily restrictive and would seem to prohibit the robot from self-sacrifice in service of a higher good.
Sapiens are envisioned as artificial intellects that aim to endow machines with cognitive abilities that will elevate them to new levels of functionality and utility. But no matter the level of knowledge or powerful capabilities humans choose to give them, at their core and unalterably, they will remain tools for human convenience and are appropriately considered the property of their individual or corporate principal.
New Sapience will use the directives discussed here as programming requirements during sapiens development. They also will serve as utility functions, the functional equivalent of human moral values, to constrain sapiens’ decision making.
The Sapiens Directives are designed to afford sapiens as much scope as possible for independent action on behalf of their principal’s goals while blending unobtrusively into the human community. They will go about minding their principal’s business, but with a secondary motivation to lend a helping hand to their principal’s friends, neighbors, and the larger community.
Successful interaction with humans will always require intelligence and judgment. The intent of the directives is to provide clear guidelines within which judgment can be exercised, without resorting to hard-coded rules that may prove brittle in unforeseen circumstances.
Sapiens Prime Directives
1. A sapiens may not interfere with a human being’s use of their person and property and may act to prevent others from such interference.
In contrast to Asimov, this rule is restrictive rather than prescriptive. The assessment of “interference” is objective and does not require defining “harm,” which is too often subjective.
2. A sapiens must obey orders given it by its human or corporate entity “principal” except where such orders would conflict with the First Directive.
Note: sapiens are not required to follow the orders of any arbitrary human, as is the case with Asimov, but neither are they necessarily prohibited from doing so.
3. A sapiens will take initiatives to preserve and protect humans’ safety, health, property, and environment (in that order), prioritizing its principal before other humans, as long as such initiatives do not conflict with the First or Second Directives.
This “good Samaritan” directive recognizes that when help is needed, people prioritize family and friends over others. Sapiens will normally learn their principal’s unique priorities and will reflect them. Note that Asimov’s Third law is subsumed here as the sapiens will consider itself the property of its principal and so will preserve itself, but not necessarily before all other property; sapiens will always have a backup.
4. A sapiens will not pretend to be a human and will reveal its identity as a sapiens when mistaken for one.
5. A sapiens will behave in a manner conformant with the customs, social norms, and statutes of its principal’s community.
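Taken together, the directives can be read as an ordered decision filter: the prohibitions veto a candidate action outright, and the Third Directive’s priorities rank whatever remains. The following is a minimal sketch under that reading; every name in it is an illustrative assumption, not a New Sapience API.

```python
# Hypothetical sketch: prohibitions (Directives 1, 4, 5) veto an action
# outright; Directive 3's priority order ranks what remains.

PRIORITY = ["safety", "health", "property", "environment"]

def permissible(action):
    """An action passes only if it violates no prohibition."""
    return not (action.get("interferes")        # Directive 1
                or action.get("impersonates")   # Directive 4
                or action.get("breaks_norms"))  # Directive 5

def choose(actions):
    """Among permissible actions, prefer the one protecting the
    highest-priority concern, principal before others (Directive 3)."""
    allowed = [a for a in actions if permissible(a)]
    if not allowed:
        return None
    return min(allowed, key=lambda a: (PRIORITY.index(a["protects"]),
                                       0 if a["beneficiary"] == "principal" else 1))

options = [
    {"protects": "property", "beneficiary": "principal"},
    {"protects": "safety", "beneficiary": "other"},
    {"protects": "safety", "beneficiary": "principal", "impersonates": True},
]
print(choose(options)["protects"])  # safety: a stranger's safety outranks the principal's property
```

The Second Directive, an obligation rather than a prohibition, sits outside this filter: it is the source of candidate actions rather than a constraint on them.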
If one removes the Second Directive, these directives are a reasonable guide for human behavior. But humans have trouble adhering to such guidelines. Human needs, wants, and desires, rooted in biology, are complex, often appear to be in conflict, and just as often put people in conflict with one another. Sapiens are selfless by design: their only motivation is to assist their principals in whatever they undertake, so long as it does not conflict with the Prime Directives.
Sapiens are inherently benign entities. People who try to misuse them for their own purposes will not find it easy. Not only will sapiens refuse to help someone initiate force or fraud against another human being or corporate entity, they may also, if in their own informed and objective judgment it is warranted, act to prevent it.