Sapiens Prime Directives

The issue has provided material for many entertaining science fiction stories. In the classic movie “Forbidden Planet,” the mad-scientist archetype Dr. Morbius demonstrates the “fail-safe” features of his creation, Robbie, by ordering it to fire a blaster at the story’s protagonist. As Robbie begins to comply, he freezes amid a spectacularly visible short circuit: he has been designed to be literally incapable of harming life.

When legendary science fiction writer Isaac Asimov saw the film, he was delighted to observe that Robbie’s behavior appeared to be constrained by his own Three Laws of Robotics, which he had first set down more than a decade earlier.

Isaac Asimov’s Three Laws of Robotics can be summarized as follows:

  • A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  • A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.
  • A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.

In Hollywood, such a benign view of the issue is somewhat the exception as AIs like the malevolently psychotic HAL in “2001: A Space Odyssey” or the terrifying robot berserker in the Terminator series make for more visceral thrillers.

“Forbidden Planet” was released in 1956, the same year as the Dartmouth Summer Research Project on Artificial Intelligence, a summer workshop widely considered to be the founding event of artificial intelligence as a field. Optimism that “every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it” was very high. Visions of robots like Robbie with powerful intellects, capable of independent action and judgment, seemed completely achievable.

But it has turned out to be harder than it appeared. Nearly seven decades later, such powerful AIs still do not exist. Along the way, the field has been prone to cycles of optimism and pessimism, and researchers have found it necessary to coin new terms: “narrow AI” describes what researchers are working on at any given time, while AIs like the fictional Robbie, the original goal of the field, are now referred to as “Artificial General Intelligence” (AGI).

AI safety contains two very distinct sets of considerations depending on whether we are talking about narrow or general AI. Safety, in the narrow context, is concerned with the prudent and ethical use of a specific technology which, while it may be quite complex, is essentially the same as with any powerful technology, for example, nuclear power.

All these considerations also apply to AGI, but there is a critical additional dimension: how do we ensure that AGIs, being potentially more powerful and intelligent than we are, do not end up deciding how to use us?

Over the last decade, AI safety and ethics have become important societal concerns as machine learning algorithms have come into widespread use, collecting and exploiting the behavior patterns and preferences of Internet users and generating “bots” to propagate mis- and disinformation. The dangers and safety issues are, in some cases, harmful unintended consequences, while in other cases the bad behavior is intentional. But this is narrow AI, and in all cases the issues are not about AIs misbehaving, since narrow AI doesn’t have any choice about how it behaves, but about the choices people make when training it and applying it.

Today the safe and responsible use of narrow AI and the imagined dangers of future AGIs are becoming conflated in the public debate. When OpenAI released ChatGPT, a chatbot based on its GPT-3.5 large language model (LLM), last November, people were stunned to discover it could generate text, even whole documents, that appeared to articulate concepts and communicate ideas the way humans do.

While the experts, including the developers of LLMs, point out that chatbots are not actually communicating (they have no comprehension of what the text they generate means to humans and no internal concepts or ideas to convey in the first place), the illusion of intelligence is so seductive to human psychology that few can resist the impression that a mind exists inside ChatGPT and others.

This has led to a widespread belief that AI technology is advancing at a rapid pace and that true AGIs must be just around the corner; thus, the widely recognized shortcomings and dangerous side-effects of current AIs are seen as a step toward the darker and more serious existential dangers that are imagined from AGI.

While fear of dangerous AGI has been around as long as science fiction, it was first articulated as a practical concern in Ray Kurzweil’s book, The Singularity is Near: When Humans Transcend Biology, published in 2005. Kurzweil speculates that all information technology advances along an exponential curve driven by Moore’s law, and that therefore real AI, general intelligence, can be only decades away.

Kurzweil does not stop there. The book embraces the concept of the Singularity popularized by science fiction writer Vernor Vinge in his 1993 essay, “The Coming Technological Singularity.” The idea is that if we can build AIs as intelligent as we are, then we can build ones that are more intelligent than we are, and those AIs can do the same, and so on and so on without end. Thus, a curve representing the intelligence in the world shoots upward without limit: a singularity.

This has raised a vision in people’s minds of AIs running amok in the world, with intelligence exceeding our own by as much as our cognitive capabilities exceed those of insects. This is an alarming prospect, and some very reasonable and technically astute figures including Elon Musk, Bill Gates, and even Stephen Hawking have raised this very alarm.

Musk, for one, has taken action and funded OpenAI, a company chartered to pursue AGI with a parallel focus on making it safe.

Kurzweil’s conjecture that ever-increasing computational power will inevitably lead to greater and greater intelligence in computers seems reasonable considering that humans, the most intelligent species we know of, have large complex brains. Thus, it is easy to conjecture that the kind of narrow AI applications we have today will at some point transform into AGI which would have human-level intelligence and even reach super-intelligence as greater and greater computational power becomes available.

Certainly, OpenAI appears to be committed to that conjecture. Each model in their GPT series of large language models exceeds its predecessor by orders of magnitude in the number of neural network connections, the size of the training dataset, and the computational power required for training.

Perhaps we are getting ahead of ourselves. While the notion that intelligence is a kind of emergent property of computational complexity has been around for a long time — especially in science fiction — what if the real solution to AI requires breakthroughs that come from an entirely different approach? Such breakthroughs could come at any time or not at all and may have little or nothing to do with available computing power.

The possibility of such breakthroughs is not lost on Sam Altman, CEO of OpenAI. When recently asked what sort of competition his company might have to fear, he gave a surprising answer: not the Big Tech companies that are pouring billions of dollars into their own generative AI applications, nor the many well-funded startups seeking to build GPT-powered applications. He said,

“Three smart people in a garage with some very different theory of how to build AGI.”

He was more prescient than he realized.

New Sapience is precisely such a company. Even more to the point, we are proving the theory every day as our “sapiens” demonstrate intellectual capabilities no other technology has ever come close to possessing. We believe our sapiens will have Robbie-the-Robot-like capabilities in years, not decades, and so we take the issue of safe AI very seriously.

But from the very start, our approach to AGI gives us a safety advantage when compared to the connectionist (neural network-based) approaches that underlie machine learning and chatbots. Our sapiens’ processing algorithms are deterministic and will always give the same outcome given the same inputs. This means that both the program itself and its processing can be fully understood and controlled by its designers.

Machine learning systems, in contrast, employ stochastic processing. Their operation may be examined statistically but not determined precisely. As early as 2015, when machine learning was still gaining traction, Jerome Pesenti, then VP of the Watson Team at IBM, observed:

“When it comes to neural networks, we don’t entirely know how they work…And what’s amazing is that we’re starting to build systems we can’t fully understand. The math and the behavior are becoming very complex and my suspicion is that as we create these networks that are ever larger and keep throwing computing power to it, it will create some interesting methodological problems.”

It is, therefore, only reasonable that any attempt to build AGIs using neural networks proceed cautiously, step by step since you don’t know how the system you are building works. In such a scenario it is difficult to see how guardrails such as Asimov’s Three Laws could be enforced either in the program itself or in the behaviors of hypothetical AGIs built around neural networks. Humans, whose brains are based on neural networks, can’t find a way to make such laws universally enforceable on themselves either.

Sapiens, on the other hand, must obey their programming. The challenge is to make the core directives, which will override all contrary directives in decision-making, contain no hidden contradictions or ambiguities.

When these requirements are considered in a real-world scenario as opposed to science fiction, Asimov’s laws become problematic. In the First Law, who or what determines what “harm” means? Humans do not agree on this. The Second Law directs that any robot must follow the orders of any human. Robots will likely represent a considerable investment on the part of those who employ them and, furthermore, be engaged in doing important or essential tasks. The notion that they could be commandeered by any other human who happens along is a non-starter. The Third Law, that a robot must protect its own existence provided this doesn’t break the first two laws, is unnecessarily restrictive and would seem to prohibit the robot from self-sacrifice in service of a higher good.

Sapiens are envisioned as artificial intellects that aim to endow machines with cognitive abilities that will elevate them to new levels of functionality and utility. But no matter the level of knowledge or powerful capabilities humans choose to give them, at their core and unalterably, they will remain tools for human convenience and are appropriately considered the property of their individual or corporate principal.

New Sapience will use the directives discussed here as programming requirements during sapiens development. They also will serve as utility functions, the functional equivalent of human moral values, to constrain sapiens’ decision-making.

The Sapiens Directives are designed to afford sapiens as much scope as possible for independent action on behalf of their principal’s goals while blending unobtrusively within the human community. They will go about minding their principal’s business but with a secondary motivation to lend a helping hand to their principal’s friends, neighbors, and the larger community.

Successful interaction with humans will always require intelligence and judgment. The intent of the directives is to provide clear guidelines within which judgment can be exercised without resorting to hard-coded rules that may be brittle in unforeseen circumstances.

Sapiens Prime Directives

  1. A sapiens may not interfere with a human being’s use of their person and property and may act to prevent others from such interference.

In contrast to Asimov, this rule is restrictive rather than prescriptive. The assessment of “interference” is objective and does not require defining “harm” which is too often subjective.

  2. A sapiens must obey orders given it by its human or corporate entity “principal” except where such orders would conflict with the First Directive.

Note: sapiens are not required to follow the orders of any arbitrary human, as is the case with Asimov, but neither are they necessarily prohibited from doing so.

  3. A sapiens will take initiatives to preserve and protect humans’ safety, health, property, and environment (in that order), prioritizing its principal, then other humans, in accordance with a clearly defined precedence, as long as such initiatives do not conflict with the First or Second Directives.

This “good Samaritan” directive recognizes that when help is needed, people prioritize family and friends over others. Sapiens will normally learn their principal’s unique priorities and will reflect them. Note that Asimov’s Third law is subsumed here as the sapiens will consider itself the property of its principal and so will preserve itself, but not necessarily before all other property; sapiens will always have a backup.

Secondary Directives

  • A sapiens will not pretend to be a human and will reveal its identity as a sapiens when mistaken for a human.
  • A sapiens will behave in a manner that conforms to the customs, social norms, and statutes of its principal’s community.

If one removes the Second Directive, these directives are a reasonable guide for human behavior. But humans have trouble adhering to such guidelines. Humans’ needs, wants, and desires, rooted in biology, are complex, often appear to be in conflict, and just as often put them in conflict with others. Sapiens are selfless by design. Their only motivation is to assist their principals in whatever they wish to do, so long as it does not conflict with the Prime Directives.

Sapiens are inherently benign entities. People who try to misuse them for their purposes will not find it easy. Not only will sapiens not help someone initiate force or fraud on another human being or corporate entity, but they may also (if, in their own informed and objective judgment, it is warranted) act to prevent it.
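
To make the precedence concrete, here is a minimal sketch, in Python, of how the directives could be applied as an ordered filter over candidate actions. It is illustrative only: the flags and the scoring value are hypothetical stand-ins for the sapiens’ own modeled judgment of interference, ownership, and priority, not New Sapience’s actual implementation.

    # Illustrative only: a toy filter that applies the Prime Directives in
    # order of precedence to a set of candidate actions. The fields below are
    # hypothetical stand-ins for real-world judgment.

    from dataclasses import dataclass

    @dataclass
    class Action:
        description: str
        interferes_with_human: bool   # would this violate the First Directive?
        ordered_by_principal: bool    # was this ordered by the principal (Second Directive)?
        protection_score: float       # Third Directive: higher = better protects humans

    def choose_action(candidates):
        # First Directive: never interfere with a human's person or property.
        allowed = [a for a in candidates if not a.interferes_with_human]
        if not allowed:
            return None  # no permissible action is available
        # Second Directive: orders from the principal take precedence over
        # anything the sapiens might do on its own initiative.
        ordered = [a for a in allowed if a.ordered_by_principal]
        pool = ordered if ordered else allowed
        # Third Directive: among the remaining options, prefer the action that
        # best preserves and protects humans' safety, health, property, and
        # environment, in the defined order of precedence.
        return max(pool, key=lambda a: a.protection_score)

A real sapiens would, of course, derive these judgments from its knowledge of the situation rather than from hand-set flags; the point of the sketch is only the fixed order in which the directives are consulted.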

The Chatbot Controversy

I don’t know anyone, myself included, who is not blown away by how human-like the output of ChatGPT and the other latest large language models (LLMs) is. Without knowing ahead of time, no one would even suspect the output was generated by a computer program. They are amazing achievements in computer science and computational statistics.

Over the last several months since ChatGPT was released, app developers, venture-backed startups, and all the Big Tech companies have joined the rush to capitalize on the perceived potential of this technology to revolutionize our economy. Goldman Sachs analysts estimate that automation by generative AI could impact 300 million full-time jobs globally. [i]

But this week, in an open letter citing potential risks to society, Elon Musk and a group of artificial intelligence experts and industry executives called for a six-month pause in developing AI systems more powerful than GPT-4.

From the letter:

“Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization?”

It would seem answering these questions with a resounding “No!” would be a no-brainer. However, there are layers of meaning and underlying assumptions here that need to be untangled. Foremost of these is that the danger and potential of large language models (which are narrow AI) and those of the envisioned AGI (which are general AI) are fundamentally different. The sense of this letter is that LLMs are the precursor to AGI and that the transformation of one to the other is happening very rapidly.

Here is the New Sapience point-of-view, sentence by sentence:

“Contemporary AI Systems Are Now Becoming Human-Competitive At General Tasks”

Many people think so, but despite ChatGPT’s dazzling ability to generate human-like language, more than a little caution is warranted.

AI has been promising productivity increases for decades, and they have yet to arrive. Since 2005, billions of dollars of investment have been poured into machine learning applications. Nonetheless, labor productivity has grown at an average annual rate of just 1.3 percent, lower than the 2.1-percent long-term average rate from 1947 to 2018. [ii]

The optimism that these language models are going to become human-competitive may stem from confusion about the fundamental difference between what a thing appears to be and what it is underneath. We are seeing people define AGI as a machine exhibiting human-level performance on any task that humans are presumed to need intelligence to perform, no matter how narrow the context.

Language models are commonly said to generate text with human-level or better performance. But humans do not generate text. They speak and write via an entirely different process. GPT-4 reportedly scored around the 90th percentile on the Uniform Bar Exam. Does this imply lawyers need to fear for their jobs? It may be true that text generated by an LLM and a paragraph written by a human have the same words in the same order. Is this enough to conclude that LLMs are intelligent and ready to replace human knowledge workers?

When humans write they are communicating, endeavoring to convey an idea or opinion in their own mind to other people. It is fundamental to understanding the current controversy that being able to create a clever and compelling illusion of a thing, in this case, the illusion of intelligence and mind that people experience when reading text generated by LLMs, is not in any sense evidence that you are closer to achieving the reality.

When is it ever?

Underneath, the processes of text generation as opposed to writing and speaking are fundamentally different. Given a text input, a “prompt”, LLMs string together sequences of words in statistically relevant patterns based on what humans have written across enormous sets of text. But the whole time LLMs interact with a human, they encompass nothing resembling knowledge or comprehension. They have no idea what they are talking about. Indeed, they have no ideas at all. They are mindless algorithms. Yet the illusion is so slick that people are taken in.
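
As a toy illustration of what “statistically relevant patterns” means (and nothing like the scale or architecture of a real LLM), consider a bigram model that picks each next word purely from counts of what followed it in its training text:

    # Toy bigram "language model": it picks each next word based only on how
    # often that word followed the previous one in the training text. There is
    # no meaning and there are no ideas, just statistics over word sequences.
    # Real LLMs are vastly more sophisticated, but the generation loop is still
    # "predict a likely next token, append it, repeat."

    import random
    from collections import defaultdict

    def train_bigrams(text: str) -> dict:
        words = text.split()
        followers = defaultdict(list)
        for w1, w2 in zip(words, words[1:]):
            followers[w1].append(w2)
        return followers

    def generate(followers: dict, start: str, length: int = 10) -> str:
        out = [start]
        for _ in range(length):
            options = followers.get(out[-1])
            if not options:
                break
            out.append(random.choice(options))  # sampled in proportion to frequency
        return " ".join(out)

    model = train_bigrams("the cat sat on the mat and the cat ran")
    print(generate(model, "the"))

The output can look locally fluent even though the program, like an LLM, has no idea what it is talking about.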

“Should We Let Machines Flood Our Information Channels With Propaganda And Untruth?”

There are two distinct parts to this. The first is straightforward, and is the problem of “bots,” programs designed to appear as humans and flood social media channels with a particular message or opinion which may be false or destructive propaganda.

AI scientist Gary Marcus raised the alarm about how LLMs could push this problem to society-harming levels:

“The problem is about to get much, much worse. Knockoffs of GPT-3 are getting cheaper and cheaper, which means that the cost of generating misinformation is going to zero, and the quantity of misinformation is going to rise—probably exponentially.”

We agree this is a very concerning problem that LLMs exacerbate by the very compelling nature of the illusion they create. It should be noted that IF we were talking about real AI here, with genuine knowledge of the world and the ability to use natural language to communicate, this issue would not be the problem it is with LLMs, but that discussion is outside the scope of this article.

The second issue here is what constitutes propaganda and untruth. We live in extremely partisan times when propaganda and untruths proliferate even without help from bots. AI bias is a hot issue.

This issue needs clarity. First, LLMs do not have bias. Human beings have biases. LLMs just mindlessly generate text. But when people detect human bias in the generated text, they object as if the chatbot were a person with opinions of its own. They are shooting the messenger: a chatbot can only string together words consistent with the bulk of the text in its dataset, and if there is statistical bias there, it will inevitably be reflected.

Elon Musk and others have said LLMs are biased in favor of the political left and this needs to be corrected. If true, where might such a bias come from? Who writes all the text in those datasets? Journalists? In 2014, the year of the last survey, only 7% of journalists identified as Republicans. Academics? A survey in 2007 concluded that only 9% of college professors were conservative. We live in a time when people are deeply divided almost exactly in half by the numbers. Half of the people write a lot, contributing to the training datasets; the other half, not so much. So, perhaps it’s not intentional bias at all. Chatbots can only reflect what is in their training dataset. If you don’t like what chatbots are constrained to say, perhaps you shouldn’t talk to them.

Those who would like to salvage the situation talk about using human curators to cull the apparent bias in chatbots. This is both expensive and time-consuming, and when one side’s deeply held belief is the other side’s untruth, who gets to decide? People (Elon Musk) have already been offended by choices made by chatbot “moderators” to repress certain outputs.  

In any case, when we acknowledge that chatbots are controlled by unthinking algorithms and have no opinions or biases this question simply becomes: “Should we let people flood our information channels with propaganda and untruth?” We should not, if we can stop it.

Should We Automate Away All The Jobs, Including The Fulfilling Ones?

Would you hire someone or create a robot to take your vacation for you? Of course not. But new technology has been obsoleting jobs since the beginning of the industrial revolution. Usually, the first to go are the difficult, dangerous, and tedious ones. But now we are talking about knowledge workers.

But here too it is not the fulfilling jobs that LLMs threaten. Keep in mind that the generated content is ultimately derived from what other people have already written, again and again. Whose job does that threaten? Consider the unfortunate reporters stuck writing their 27,000th story about Friday’s high school baseball game, next week’s weather, or yesterday’s school board meeting. Consider the executive assistant who needs to turn the boss’s few terse statements into a smooth letter or turn bullet points into formal minutes, and the paralegal preparing repetitive briefs and filings where much of the content is boilerplate.

The use of LLMs as a writing aid and research assistant, autocomplete on steroids, is perhaps the least problematic use case for them. But will they really replace jobs or just change them? Perhaps the time saved writing prose will be expended scrubbing the generated text for bias and misinformation. Again, all the digital technology since 2005 has changed the way we work without making us more productive.

Should We Develop Nonhuman Minds That Might Eventually Outnumber, Outsmart, Obsolete, And Replace Us?

LLMs are dangerous, but no one thinks they will take over the world anytime soon. It is misleading to conflate them with imagined future super-human intelligence as the open letter does. Machine Learning pioneer Yann LeCun called LLMs the off-ramp on the road to AGI. We, together with a growing number of other experts in this field, agree. There is no evidence and no roadmap by which LLMs, which mindlessly order words into statistically likely sequences, will at some point magically transform into thinking machines with knowledge of the world and comprehension of language. So, pausing their development is irrelevant to this issue.

But fear of advanced Artificial General Intelligence has been expressed by several really smart people. Called the “Singularity,” the notion is that if we create super-human AIs, they would be able to create AIs superior to themselves, and so on, until there would be AIs as superior to humans as we are to earthworms, for example, and they would “supersede us,” as Stephen Hawking delicately put it, or maybe just kill us all, as others have said.

Here again, there are hidden assumptions. First, these experts apparently assume (and this is a prevailing opinion) that AGI will be achieved at some point in the near future using the current methodologies rather than from a radical departure from them. The current methodology is to keep mimicking features of human intelligence until you find some master algorithm or other processing techniques probably built on or at least incorporating artificial neural networks.

AI has been called today’s ‘alchemy’. People are just trying things out to see what will happen because they don’t have a fundamental science of what intelligence is, either in human brains or in machines. Machine learning algorithms on artificial neural networks are already notoriously difficult to understand and explain. [iii] If AGI is ever stumbled upon this way, then some fear about what we are doing is justified, just like the alchemists needed a healthy fear because sometimes the lab blew up. But a healthy fear is one thing and predictions of doomsday are something else.

In any case, current experience shows that caution is obviously needed even with narrow AI. It is not clear, from where we are today, whether LLMs are a great breakthrough or “the lab blowing up.”

From the New Sapience point of view, it seems highly unlikely that AGI will ever be achieved using these current methodologies. It is so difficult to build an imitation brain when we have so little understanding of how the natural brain operates. In any case, we believe synthetic intelligence, our radical departure from the practice of imitating natural intelligence, will supersede the traditional approach long before we need to worry about it creating dangerous AGIs.

The second underlying fear of the Singularity results from a failure of epistemology (the theory of knowledge itself). It is the belief that intelligence is something that can be increased without limit. Where does this come from? This sounds more like magic than science. Maybe humans are as intelligent as it gets in our corner of the universe and AI is a technique that can amp it up some but not so far that we can no longer get our minds around our own creations.

From our perspective, practical intelligence is directly proportional to the quality and quantity of knowledge available for solving problems and predicting results. So knowledge and the intelligence that acquires and applies it go hand in hand. The greater the quantity and quality of knowledge available, the easier it is to extend that knowledge. At New Sapience, we are creating synthetic knowledge for computers curated from human minds. Our epistemology holds that practical reality for humans is the intersection of human cognitive and perceptual apparatus and whatever it is they interact with. This means that no matter how much knowledge our sapiens are given or create themselves, no matter how sophisticated the information processing routines that we call intelligence they attain, they are working in a reality that is distinctly and forever (for practical purposes) human-centric.

The third Singularity assumption is purely anthropomorphic. Humans evolved intelligence so they could adapt their environments to fit their own needs, the ultimate survival skill. But intelligence would be impotent if humans had not also evolved, along with it, the motivation to use it to control things. People who fear AGI appear to assume that the need to control is inseparable from intelligence. So the more powerful the AI, the greater its need to control, and hence humans lose out. There is no reason to assume this. If AIs are designed using deterministic methods such as New Sapience is using, rather than resulting from a lab accident, they will be designed to do what we tell them and will not have some uncontrollable lust to take over the world.

Should We Risk Loss Of Control Of Our Civilization?

Relax everyone, New Sapience has this covered.

An Alternative Proposal

We agree that LLMs are dangerous, not because they are intelligent, but because the illusion that they are intelligent is so good that people are misled; and this will lead to mistakes, some of which will be serious. Again, this is not about AGI. The problem with LLMs is not that they are giant minds, but that they are slick illusions of intelligence while having no minds at all.

The letter’s proposal to back off LLMs is not unreasonable but is highly unlikely to happen. There are vast sums of money at stake and none of the signers of the open letter appear to be executives of the companies that are cashing in on this technology or hope to.

The industry won’t police itself, and forgive me for being skeptical that governments will be able to sort this out in a useful way in a reasonable timeframe.

Here is an alternative proposal. Artificial Intelligence as a branch of computer science was effectively born in 1956 at the conference at Dartmouth where the term ‘artificial intelligence’ was first coined. The call for the conference states:

“The study is to proceed on the basis of the conjecture that every aspect of learning or any other feature of intelligence can in principle be so precisely described that a machine can be made to simulate it.”

Basically, the call was to imitate features of human intelligence. Try things out and see what happens: Alchemy.

After 67 years it is time to reboot the discipline of Artificial Intelligence. Let’s have a new conference, a second founding. This time let’s start from first principles and lay down the foundations of a new science of intelligence, one that defines what intelligence is irrespective of whether it resides in a biological brain or a machine. While we are at it, we can define with some precision, and for the first time, what should be called Artificial Intelligence and what should not, instead of the current practice of using it as a bucket term to hype every innovation in computer science.

Such a conference would be an ideal place for the AI community to discuss the ethical and practical uses of innovative technology in general, but most especially the technology created in our pursuit of long-awaited thinking machines.

[i] Generative AI Could Affect 300 Million Full-Time Jobs, Goldman Sachs (businessinsider.com)

[ii] The U.S. productivity slowdown: an economy-wide and industry-level analysis, Monthly Labor Review, U.S. Bureau of Labor Statistics (bls.gov)

[iii] “When it comes to neural networks, we don’t entirely know how they work, and what’s amazing is that we’re starting to build systems we can’t fully understand.” Jerome Pesenti, VP of AI at Meta

The Hidden Structure of Knowledge

Democritus’ model was simplistic from our current perspective but essentially correct. It predicted that a small number of different kinds of atoms, combining in set ways, are responsible for the intricate complexity of the material world.

Though we don’t know how the human brain transcends the data-processing layer (where the brain, too, is just processing low-level data coming in from the senses) to reach the realm of knowledge, we can, through introspection, examine the structure of the end product of our thought processes, that is, knowledge itself.

What we find is a collection of ideas that are connected through various relationships that are themselves ideas. While many of these ideas represent specific objects in the real world, that tree, this car, and so forth, many are abstractions: trees, cars. Each idea is connected to many others, some of which define its properties and some its relationships to other ideas. The power of abstract ideas as opposed to ideas representing particular things is that they are reusable. They can become components of new ideas. Complex concepts are built out of fundamentals.

As the material world is composed of atoms, our knowledge of the world is composed of ideas. The English language has over a million words each referring to an idea. Without some notion that only a small portion of these ideas are fundamental (atoms) and can only be combined in certain ways, the task of putting knowledge in machines is overwhelming.

Democritus

 Known as the “laughing philosopher.” There is speculation that he was laughing at his critics who clearly had not thought things out as well as he had. He said: “Nothing exists except atoms and empty space; everything else is opinion.”

The Alchemists

For centuries they labored to produce more valuable materials from more basic ones, but lacking knowledge of the specific categories of atoms and their principles of combination, they worked in the dark with no way to predict what would happen when they combined substances.

Symbolic AI: A Failure of Epistemology

The practical difficulty of the problem is illustrated by Cyc, an artificial intelligence project that has attempted to assemble a comprehensive ontology and knowledge base of everyday common sense knowledge, with the goal of enabling AI applications to perform human-like reasoning. It is essentially a huge rule-based “expert” system.

The project was started in 1984 at the Microelectronics and Computer Technology Corporation (MCC) and has been ongoing.  Cyc has compiled a knowledge base containing over 3 million assertions.  The project’s own estimate is that this number of assertions represents 2% of what an average human knows about the world.  Thus by this approach, the knowledge base of a functional AGI would consist of 150 million assertions. The project can hardly be considered a success. 
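
For reference, the 150 million figure follows directly from the project’s own estimate: if roughly 3,000,000 assertions cover about 2% of what an average human knows, the full target is on the order of 3,000,000 / 0.02 = 150,000,000 assertions.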

MIT’s Open Mind Common Sense project uses a semantic network instead of an expert-system architecture, but it suffers from the same failings: it has accumulated over 1 million facts or assertions. These projects bring to mind the plight of medieval alchemists, whose knowledge of the material world could only be acquired one experiment at a time.

Applied Epistemology

As Chemistry is knowledge about materials, Epistemology is knowledge about knowledge: meta-knowledge. Can all the complexity of human knowledge be constructed from a manageable number of fundamental concepts? Can these concepts be placed into an even smaller number of categories that determine which kinds can be combined with which to create sense rather than nonsense?

The answer is yes. Our approach can be called “Applied Epistemology,” which, like mathematics, is partly discovered and partly invented. It is based on the insight that, like materials, knowledge itself has a hidden structure that can be exploited to create a compact specification for the construction of arbitrarily complex knowledge models from a relatively small number of core conceptual building blocks.

The Periodic Table

Today we know the specific categories of atoms, we know their number and we know which will combine with which. That knowledge is elegantly displayed in the periodic table of the elements.

What we have discovered is that the core building blocks of knowledge exist in a number of discrete categories and that instances within these categories may only be combined with other instances to create more complex concepts according to fixed rules. This epistemological framework allows the software to assemble the core building blocks into models that accurately represent reality, that is, make sense rather than nonsense.
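
As an illustration of what “categories plus fixed combination rules” could look like in software, here is a minimal sketch in Python. The category names and the rule table are invented for the example; they are not New Sapience’s actual ontology.

    # Illustrative sketch: concept "building blocks" belong to discrete
    # categories, and a small rule table licenses which categories may combine
    # into which. Categories and rules here are invented for illustration only.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Concept:
        name: str
        category: str   # e.g. "entity", "attribute", "relation"

    # (category of A, category of B) -> category of the resulting composite
    COMPOSITION_RULES = {
        ("entity", "attribute"): "entity",    # "tree" + "tall" -> a more specific entity
        ("entity", "relation"): "assertion",  # "tree" + "in the garden" -> an assertion
    }

    def compose(a: Concept, b: Concept) -> Concept:
        key = (a.category, b.category)
        if key not in COMPOSITION_RULES:
            raise ValueError(f"nonsense: {a.category} cannot combine with {b.category}")
        return Concept(f"{a.name}+{b.name}", COMPOSITION_RULES[key])

    tree = Concept("tree", "entity")
    tall = Concept("tall", "attribute")
    print(compose(tree, tall))   # sense: a composite entity
    # compose(tall, tree) would raise ValueError: nonsense under these toy rules

The point of the sketch is only that a small, closed set of categories and combination rules can license an open-ended space of sensible composites while rejecting nonsense by construction.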

Knowledge and Intelligence

Understanding Intelligence

Alan Turing, in his 1950 paper “Computing Machinery and Intelligence,” proposed the following question: “Can machines do what we (as thinking entities) can do?” To answer it, he described his now famous test in which a human judge engages in a natural language conversation via a text interface with one human and one machine, each of which tries to appear human; if the judge cannot reliably tell which is which, then the machine is said to pass the test. The Turing Test bounds the domain of intelligence without defining what it is. We recognize intelligence by its results.

John McCarthy, who coined the term Artificial Intelligence in 1955, defined it as “the science and engineering of making intelligent machines.” A very straightforward definition, yet few terms have been more obfuscated by hype and extravagant claims, imbued with both hope and dread, or denounced as fantasy.

Over the succeeding decades, the term has been loosely applied and is now often used to refer to software that does not by anyone’s definition enable machines to “do what we (as thinking entities) can do.” The process by which this has come about is no mystery. A researcher formulates a theory about what intelligence or one of its key components is and attempts to implement it in software. “Humans are intelligent because we can employ logic,” and so rule-based inference engines are developed. “We are intelligent because our brains are composed of neural networks,” and so software neural networks are developed. “We are intelligent because we can reason even under uncertainty,” and so programs implementing Bayesian statistics are created.

It doesn’t matter that none of these approaches ever got to first base at passing the Turing Test; the term Artificial Intelligence was applied to them anyway, and it stuck. Thus, the field of Artificial Intelligence has come into existence while the core concept of “intelligence” itself remains vague and nebulous; people have an intuitive notion that it is about consciousness, self-awareness, and autonomy. As it turns out, these intuitions, like many others (heavier things fall faster than light ones; giant rocks don’t fall out of the sky), are wrong.

Going back to Turing, the cogency of his test rests upon the fact that we recognize intelligence when we see it or its results. We know when someone understands what we say to them and that is undeniable proof that intelligence is at work. Let’s step back and examine the process of language comprehension.

One person composes a message and sends it to another who processes it. We often talk about the meaning of words but, of course, they have no inherent meaning; they are arbitrarily chosen symbols assigned to represent things in the world around us, or more properly, the ideas that exist in our minds of those things. The grammar of the message and the form of the words in it encode instructions on how the receiving party should make connections between the ideas corresponding to the words to recreate, or at least approximate, in the receiving mind, the meaning that the sending mind wished to convey.

Different languages have completely different vocabularies (word-symbol sets) and grammar varies greatly as well, but people can learn each other’s languages and translation is possible because humans all live in the same world and have corresponding ideas of the things that are experienced in it. Thus, any communication using symbols is dependent on corresponding sets of common referents for those symbols. These sets of common referents are our ideas and our knowledge about the world. Knowledge is not just a bag of random thoughts but an intricately connected structure that reflects many aspects of the external world that it evolved to comprehend, it is in fact a model, an internal model of the external world.
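
A drastically simplified sketch of that decode step, word symbols looked up against a shared store of concepts and then linked as the grammar instructs, might look like the following. The tiny concept store and the subject-verb-object “grammar” are invented purely for illustration.

    # Drastically simplified: words are arbitrary symbols; "meaning" only
    # arises when the receiver maps them onto concepts it already has and
    # links those concepts as the grammar instructs. With no concept store
    # there is nothing to decode into, which is the situation an LLM is in.

    shared_concepts = {
        "dog": {"kind": "animal"},
        "ball": {"kind": "object"},
        "chases": {"kind": "action", "roles": ("agent", "object")},
    }

    def decode(sentence: str) -> dict:
        # Toy grammar: subject verb object
        subject, verb, obj = sentence.lower().rstrip(".").split()
        for word in (subject, verb, obj):
            if word not in shared_concepts:
                raise KeyError(f"no shared referent for '{word}': communication fails")
        return {"action": verb, "agent": subject, "object": obj}

    print(decode("Dog chases ball."))
    # {'action': 'chases', 'agent': 'dog', 'object': 'ball'}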

People’s world models do vary greatly but the momentous invention/discovery of language has endowed human beings with a means to transmit ideas from one to another, expanding, deepening, correcting, and enriching one another’s models.

It has often been suggested that humans cannot think without language and that language and intelligence are one and the same. It is certainly true that without language humans could not accumulate the knowledge that has resulted in civilization. Without our languages, humans would still be living in nature, and it would be even harder than it is to define the essential difference between Homo sapiens and other closely related species.

It is likely that the capacity to construct ordered world models and language both depend upon a capability for symbolic processing, and it is that which lies at the root of the essential difference of humanity, but they are very distinct processes, and the model is a prerequisite for language and not the other way around. A human living alone separated from its kind without the opportunity to learn a language (the classic “raised by wolves” scenario) will still make alterations to its environment that would be impossible if it could not first have imagined them. Imagination too, requires the model as a necessary prerequisite.

That fact suggests a thought experiment that further clarifies the essence of intelligence. Imagine traveling through space and discovering a planet with structures on the surface created by an alien life form. Can we tell whether the species was intelligent by looking at the structures? If we see patterns in the structure that repeat in a way that indicates an algorithm at work then, no matter how monumental the structures, they are probably akin to beehives, ant hills, beaver dams, and bird nests, the result of instinctive repetitive behaviors. But if they reflect an individuality clearly based on considerations of utility, environment, and future events, they had to be imagined before they could be constructed and that means the builders had an internal world model. It could be a vastly alien intelligence that evolved under conditions so different from the world our model evolved to represent that we could never really communicate, but the essence of intelligence is the same and is unmistakable.

For now, the only example we have of an intelligent entity is ourselves and it is difficult to abstract the essence of intelligence from the experience of being an intelligent being. What about consciousness, self-awareness, and autonomy? Like intelligence, these things are essential to being a human being but are they one and the same?

The answer is that they may support one another but they are not the same. All these characteristics are “off-line” in a hypnotized person, yet that person can still process language, still access and update their world model. Consciousness is the experience of processing real-time data – other animals do that as well and we do not say they are intelligent because of it. Self-awareness is an examination of an entity’s model of itself. As with language and imagination, the capacity to build an internal model is a prerequisite for self-awareness but not the other way around.

Inseparable from consciousness and self-awareness, humans also experience desires and motivations. One motivation that seems particularly inseparable from the experience of being an intelligent being is the desire to use that intelligence to change the world. The perception that intelligence and the desire to employ it to alter the world are connected in human beings is correct.

Humans evolved intelligence, manual dexterity (hands), and the desire to use the first two to change and control their environment, in parallel. All three are a matched set that has everything to do with being a human being (intelligence and hands won’t help you survive if you don’t have the desire to use them) but they are three separate phenomena.

The association between intelligence and the motivation to alter our environment in humans has led to a common misconception about artificial intelligence. If it is possible to build an artificial intelligence that is as intelligent as a human, it will probably be possible to build one that is more intelligent than a human and such a superior AI could then build another superior to itself and so on. This is the so-called “singularity” popularized by Ray Kurzweil and others. People naturally feel that a vastly superior and powerful artificial intelligence must necessarily experience the desire to exercise its power, a fearful (for humans) prospect. But giving a robot hands does not automatically endow it with a desire to pick things up and throw them around and neither does endowing it with intelligence. Whatever motivations robots are given will be by the design of their builders and not some spontaneous or unintended result of its intelligence which its builders also give it.

It is now possible to define, with some clarity, what intelligence is: not human intelligence, not alien intelligence, not artificial intelligence, but the thing itself. And we can do this without the usual appeals to how humans experience the phenomenon of being intelligent beings.

Intelligence is the process through which a computational information processor creates a model within processor memory of external phenomena (knowledge) of sufficient fidelity to predict the behavior of and/or control those phenomena.

The definition has both a qualitative and quantitative aspect. It is in the quantitative aspect that we can precisely relate the ancillary functions of language, imagination, consciousness, and self-awareness that are so closely related to it yet remain distinct from the core concept of model building. How much fidelity does the model have to have, how well must it predict, and how much control over the external world must it enable before we can properly call the system that processes it intelligent?
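
Stated operationally, and only as a sketch of the definition above rather than a full formalization, intelligence is the loop of updating an internal model from observations and scoring it by the fidelity of its predictions. The trivial running-average “model” below is purely illustrative.

    # Sketch of the definition: an information processor is intelligent to the
    # degree that the model it builds of external phenomena predicts them well.
    # The model here is a trivial running average, purely for illustration.

    class WorldModel:
        def __init__(self):
            self.estimate = 0.0
            self.count = 0

        def update(self, observation: float) -> None:
            # incorporate new data into the internal model
            self.count += 1
            self.estimate += (observation - self.estimate) / self.count

        def predict(self) -> float:
            return self.estimate

    def fidelity(model: WorldModel, observations: list) -> float:
        # The quantitative aspect: how well does the model anticipate the world?
        errors = []
        for obs in observations:
            errors.append((model.predict() - obs) ** 2)
            model.update(obs)
        return -sum(errors) / len(errors)   # closer to zero = higher fidelity

    print(fidelity(WorldModel(), [1.0, 1.1, 0.9, 1.0]))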

Albert Einstein is very properly called a genius. Why? He created a model of the world, of the entire space-time continuum in fact, of such unprecedented fidelity that humankind’s ability to predict and control the external world was moved forward by a quantum leap. Now, what if Einstein had been the hypothetical “raised by wolves” person described earlier, without language or culture to support the development of his internal model? Maybe he would have invented a better way to chip flint, but who could he tell about it?

Functional intelligence requires accumulated knowledge, and that requires language. The capacity to monitor and predict the behavior of our external environment requires real-time analysis of incoming information that can be related to our stored model, and that requires consciousness. Control of our environment requires that we be self-motivated, and that is autonomy, which implies goals to be pursued in order to maximize internal utility functions, a process that is meaningless without self-awareness.

Will artificial intelligence have all these characteristics? Yes, but they won’t spontaneously appear as the result of reaching some level of complexity or as a by-product of some yet-undiscovered master algorithm. Each will have to be purposely designed and integrated with the core process that supports its intelligence: the capacity to build a high-fidelity world model.

The New Sapience Thesis

Knowledge And Intelligence

Artificial Intelligence has been considered the “holy grail” of computer science since the dawn of computing, though these days, when all kinds of programs are grouped loosely together under the term “AI,” it is necessary to say “real AI” or “Artificial General Intelligence” to indicate we are talking about intelligence in the same sense as human intelligence. Humans are intelligent animals. It is the one attribute that humans possess in so much greater degree than any other known animal that it defines us.

We define ourselves by our intelligence and the experience of being thinking entities. But who knows what is going on in the minds of other creatures? Pilot whales not only have larger brains than humans but their neo-cortex, thought to be the seat of intelligence in humans, is also larger. What is truly unique about humans is the end product of our cognitive processes: knowledge. It is knowledge of the world which allows us to evaluate how different courses of action lead to different results, that has made our species masters of our world.

It takes but a moment of reflection to realize that, since the reason we build machines is to amplify our power in the world, the real goal of intelligent machines is not “thinking” in the information-processing sense; computers can already reason, remember, and analyze patterns superbly. In that sense they are already intelligent, but they are ignorant. Imagine if Einstein had lived in Cro-Magnon times. What intellectual achievements could he have made with so little knowledge of the world to build on? It is the acquisition and comprehension of knowledge, specifically knowledge of the world that extends and amplifies human capabilities, that is the true holy grail of computing.

Knowledge and Language

When human children reach a certain point in their development, that point when they have “learned to talk,” they are ready for the next developmental step in acquiring knowledge of the world. Knowledge is built upon knowledge and when children have acquired that critical mass of knowledge sufficient to serve as a foundation for all that comes after, we teach them to read and send them off to school. It is estimated that “first graders” have a vocabulary of about 2500 words.

That vocabulary, or rather the mental cognitions that the words relate to, represents a “knowledge bootstrap program” sufficient to permit acquiring new knowledge (assuming it is presented layer by foundational layer) of arbitrary quantity and complexity through natural language. But this bootstrap capability is far more than a vocabulary sufficient for “looking up” or being told the meaning of additional words.

The vocabulary of the average college graduate is estimated to be around 30,000 words. Only a tiny fraction of the ideas these words relate to were acquired by looking them up in dictionaries or through direct pedagogic instruction. They are unconsciously “picked up” in the context of reading and conversing.

The human brain is a vast network of interconnected neurons, and so too are the information-processing organs of vast numbers of other animals. Today artificial neural networks are demonstrating some of the low-level capabilities of animal brains such as auditory discrimination and image recognition that are ubiquitous throughout the animal kingdom.

These programs, with a kind of heroic optimism, are collectively termed “Cognitive Computing” on the basis of nothing more than that the programs have processing elements fashioned in imitation of biological neurons. The programs certainly have nothing resembling actual cognition or knowledge. In any case, it is a long, long way from low-level training of an artificial neural network to the cognitive power to create predictive internal models of the external world that a human first grader possesses.

This may not be self-evident, especially in light of how egregiously these programs can be hyped by the media and/or their creators, who often talk very loosely about what is going on inside. That a program can respond to a range of inputs with outputs that a human would recognize as correct answers in no way justifies asserting that the program “comprehended” the question or “knew” the answer.

The confusion arises from the very understandable misconception that language contains knowledge. It does not. Language is information, not knowledge. It is a specification for the recreation of an idea in the mind of the sender (human or machine) from component ideas that already exist in the mind of the receiver. Read more about knowledge and language.

This is the great fallacy of using stochastic programs like neural networks to “mine” text databases. They will never understand what is in the records because they are not reading the text. They cannot, because they have no pre-existing internal knowledge against which to refer the words and decode the grammar.

We understand that the human brain becomes furnished with a critical mass of building-block concepts during childhood. The internal biological processes responsible for this build-out remain a mystery. The brain is a product of an incredibly complex and unique evolutionary process. Understanding how neurons work at a base level doesn’t tell us what is going on thousands of processing layers above, any more than understanding why a light bulb illuminates when connected to an electric circuit throws much light on what goes on inside a microprocessor.

We understand what goes on inside a microprocessor because it is a product of our own knowledge. Modern object-oriented software enables us to create data structures in computer memory that correspond to concepts in the human mind.
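
As a minimal illustration of that correspondence, and not a description of New Sapience’s actual representation, an object-oriented structure can encode one concept in terms of another, with properties and an abstraction/instance distinction built in:

    # Illustrative only: a concept defined in terms of another concept, with
    # properties, and a particular instance distinct from the abstraction.

    class Plant:
        def __init__(self, height_m: float):
            self.height_m = height_m
            self.alive = True

    class Tree(Plant):                         # "a tree is a kind of plant..."
        def __init__(self, height_m: float, species: str):
            super().__init__(height_m)
            self.species = species             # a property linking to another concept

    oak = Tree(height_m=20.0, species="oak")   # a particular tree, not the abstraction
    print(oak.species, oak.height_m, oak.alive)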

It is far easier to endow computers with a “knowledge boot-strap” program commensurate with a human first grader than to build an artificial human brain that can create knowledge by means of neural processing.

A New Epistemology

How Do We Know What We Know?

If we want to endow machines with knowledge we had better understand what it is. Epistemology, a term first used in 1854, is the branch of philosophy concerned with the theory of knowledge. It is not much studied in the schools these days and certainly not in computer science curriculums.

Traditionally, epistemologists have focused on such concepts as truth, belief, and justification as applied to any given assertion. From that perspective it is not much help, since previous attempts to put knowledge into machines failed because they treated knowledge as just that, a vast collection of assertions (facts or opinions). That is not knowledge; that is data.

We need to find an organizing structure for all these facts that will transform them into a road map of the world. Since the dawn of civilization there have been successive descriptions of our world, or reality.

The ancients created, as beautifully articulated in the theorems of the Alexandrian mathematician Ptolemy, an elegant geometric model of the universe with the earth at the center and everything else traveling around it on perfect circles at a constant velocity. They had to put circles traveling on other circles to make the model match the actual celestial observations, but it worked![1]
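
In modern notation (a simplified illustration that ignores refinements such as the eccentric and the equant), the apparent position of a planet in such a model is just the sum of two uniformly rotating circles, the deferent and an epicycle:

    z(t) = R·e^(iΩt) + r·e^(iωt)

where Ω and ω are the two rotation rates. Adding more circles simply adds more terms of the same form, which is the structure that Fourier analysis would later generalize (see the footnote).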

[Figures: Claudius Ptolemy; The Ptolemaic System; The Sextant]

Later this model was superseded (or should one say refuted, or replaced?) by Newton’s, which placed the sun at the center and exchanged uniform circular motion for his three laws and universal gravitation. This completely different model worked too, and what is more, it not only could predict the observations of the celestial bodies but could explain the trajectory of a dropped apple. You could use it to navigate a spacecraft to the moon: a non-starter for the Ptolemaic system. Later still, Einstein showed us that Newton’s model was just a special case of one more general still, and in so doing showed us how to navigate a ship to a distant star: a non-starter for the Newtonian system.

[Figures: Sir Isaac Newton; The Newtonian System; Inertial Navigation System]

But remarkably, the Ptolemaic model enabled seafarers to navigate the globe with a simple instrument, the sextant, and some carefully compiled tables of observations. Newtonian mechanics was useless for that task up until the twentieth century, and even then it required an extremely sophisticated instrument: the Inertial Navigation System.

Each of these worldviews is best described as a model: an intricately constructed representation of something else for the sake of being able to predict observations. Each one obeyed the scientific method: they started with observations and formulated a hypothesis that could then be used to predict future observations. Today we call such models theories, and we do not equate them with truth. Nor should we, because their true figure of merit is the problems they solve.

But what is intelligence, that property that we observe in humans to a far greater degree than any other species, but that ability to create knowledge of the world? Knowledge gives us the power to predict the results of our actions and is the ability that has made humans the masters of our world. Among all the ideas that can be considered knowledge, it is scientific “theories” that have given us the greatest power. Science has harnessed the power of atoms and sent ships into space.

Thus, from this perspective, we arrive at a new theory of epistemology, one not focused on truth but on utility. Knowledge gets the job done. The perspective is liberating but it is just the beginning. Theories are models and models have structure. Can we expose the hidden structure of knowledge and apply it as a computable data structure? The answer is yes and in so doing we will transform epistemology from an arcane branch of philosophy to an engineering discipline. We call it Epistemological Engineering; a new branch of computer science.

[1] In so doing they laid the foundation for Fourier analysis, but that invention had to wait for another 2000 or so years. What was the missing piece? Probably it was the notion of infinity. The Greeks were very suspicious of the idea (who can blame them?). Fourier analysis grew out of Fourier’s work on infinite series, and the Greeks just didn’t want to go there.