GPT-3 is ushering in an era of aggregation computation in which computers synthesise useful answers from vast bodies of previously solved problems. This smart aggregation and synthesis capacity is profound and will change the world.
It has been seen as a first sign of true machine intelligence. But is it? Does it represent a step towards instilling human intelligence in a machine, or a step away? Is it the rise of some other kind of intelligence?
I want to review here three assumptions inside the idea that GPT-3 exemplifies machine intelligence:
- intelligence is logical
- intelligence is mono-dimensional
- intelligence is accessible to machines
1 THE BROKEN METRONOME OF LOGIC
Science is likened to a march, implying if nothing else a regimented process. That process is logic, formal or intuitive abstract reasoning. And from this has arisen computation and all its analogies of intelligence.
The idea that logic is a drumbeat to guide thought and creativity, a drillbit with which we can mine all knowledge, if only we apply it vigorously and extensively enough, pervades all dreams of artificial intelligence. The artificial mind is supposed to be the Everest of logic.
But logic has shortcomings. Logic is not very … logical. It has texture, and holes, and it's not the metronome that people imagine. There are many ways to assess the limits of logic; three of them are limits of decidability, contestability, and applicability.
The modern history of logic ought already to be cautionary enough for those assuming logic is the high road to intelligence. It gave us a series of limits to the decidability of logic: its capacity to provide definite conclusions to meaningful questions.
Gödel's Incompleteness, Russell's Paradox, and Turing's Halting Problem all indicate the same class of limit in logic: that when a logical system is applied to itself, some aspects of it become incomplete or inconsistent, and thus some features of the system and its outputs are undecidable. These anomalies in pure logic contribute to a failure of one of the overarching ambitions for logic, to offer a comprehensive framework for logical analysis, something seriously proposed by great thinkers such as David Hilbert.
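To make the self-application point concrete, here is a minimal sketch of the diagonal construction behind the Halting Problem. The names `make_contrary` and `pessimist` are hypothetical, chosen only for illustration, and the sketch refutes one naive candidate decider rather than proving the general theorem.

```python
def make_contrary(halts):
    """Build a program that does the opposite of what `halts` predicts for it."""
    def contrary():
        if halts(contrary):
            while True:      # predicted to halt, so loop forever instead
                pass
        return "halted"      # predicted to loop, so halt at once instead
    return contrary


def pessimist(program):
    """Hypothetical candidate decider: claims that no program ever halts."""
    return False


contrary = make_contrary(pessimist)
prediction = pessimist(contrary)   # the decider says: does not halt
result = contrary()                # yet the program halts immediately
print(prediction, result)
```

Whatever candidate is passed in, the program built from it is arranged to contradict the prediction made about it. A candidate that answered "halts" would be refuted by an endless loop instead, which is why only the pessimistic case is actually run here.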
These are just the limits to logic discovered by logicians: the limits to logic discovered by other disciplines are many. One whole addition to the classes of decidability limits came out of the modern mathematical modelling of complex dynamical systems, so-called chaos.
These decidability limits are not randomness, or just temporary limits on clarity: they are formal descriptions of how logic just runs out of road. The march stops, because the metronome goes crazy.
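One way to see how a fully deterministic rule can still run out of road is the logistic map, a standard textbook example of chaos. It is my choice of illustration, not one the argument above names, and the parameter `r = 4.0`, the starting value and the step count are all assumptions picked for the sketch.

```python
def logistic_orbit(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x), a simple deterministic rule."""
    x = x0
    orbit = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit


# Two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2)
b = logistic_orbit(0.2 + 1e-10)

# Distance between the two orbits at every step.
gaps = [abs(p - q) for p, q in zip(a, b)]
print("initial gap:", gaps[0])
print("largest gap:", max(gaps))
```

No randomness is involved at any step, yet the two orbits end up bearing no resemblance to each other. Prediction fails because any finite precision in the starting point is eventually amplified to the size of the whole system.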
Is it likely that there are more such formal inconsistencies and limits embedded in logic? I believe it's inevitable, that logic has a texture and character—and the more we become truly familiar with it, the more we will understand its capacity and constraint.
So it seems to signal a lack of technical rigor that proponents of GPT-3 imagine its embedded and emergent logics will not be, or are not already, subject to decidability constraints. Is the right solution A, B or something else? Computer says…don't know.
The decidability limits of logic are the most native constraints: reflexive problems largely revealed by logicians themselves, even if ignored by everyone else so far. But there are more constraints.
Social science describes some problems as essentially contestable, meaning that some issues just cannot be resolved by appeal to reason, since they exist at the boundary of competing, equally logical (or illogical) standpoints.
Nicholas Georgescu-Roegen, the mathematical economist whose mission was to render economic value creation as logically visible as possible, called this out early in his investigation. He talked about the 'dialectical penumbra' that exists at the boundary between some apparently logically framed facts about the world. An example would be "the US is a democracy" and "the US is not a democracy". Both are true and logical, depending on your standpoint, and between the two there is a dialectical penumbra, an essential contestation, and not just an insufficiently processed logical operation that has yet to reveal a non-contestable outcome.
We can even find, as we move further from the abstract to the more concrete relevance of logic, applicability limits, different from decidability and contestability. In these cases, logic just seems to skim the surface of phenomena, without revealing anything very interesting at all.
Two examples, I suggest, of this applicability limit might be Arrow's Impossibility Theorem, and Boltzmann's statistical model of thermodynamics.
Kenneth Arrow showed, with the application of logic to voting systems, that no system of capturing individual preferences could adequately translate those preferences into an interpretable and consistent set of group preferences, under reasonable conditions. The applicability limit of logic here, then, is this: how helpful, that is how applicable, is pure logic to the arrangement of political preferences at large scale? A huge question!
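A small worked case helps show what an inconsistent set of group preferences looks like. The sketch below is Condorcet's paradox, an older and much simpler relative of Arrow's result rather than the theorem itself. The three ballots and the helper name `majority_prefers` are hypothetical.

```python
# Each ballot ranks three options from most to least preferred.
ballots = [
    ("A", "B", "C"),
    ("B", "C", "A"),
    ("C", "A", "B"),
]


def majority_prefers(x, y, ballots):
    """True if a strict majority of ballots rank x above y."""
    wins = sum(1 for ballot in ballots if ballot.index(x) < ballot.index(y))
    return wins > len(ballots) / 2


options = "ABC"
pairwise = {
    (x, y): majority_prefers(x, y, ballots)
    for x in options
    for y in options
    if x != y
}

for (x, y), wins in sorted(pairwise.items()):
    if wins:
        print(f"majority prefers {x} over {y}")
```

Every voter here holds a perfectly consistent ranking, yet the majority prefers A to B, B to C, and C to A. The group preference is a cycle, so pairwise majority voting yields no coherent collective ordering for these ballots.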
Boltzmann, for his part, used a statistical approach to describe how entropy works. Basically, he showed that entropic degradation is a statistical inevitability, given the overwhelmingly greater number of states of disorder versus the extremely few and unstable states of energetic order in the universe.
But how useful is this? No deeper insight at all into the universe is yielded by this application of logic: it's just an apparently demonstrable fact without actual penetration into the fundamental science of the cosmos. I believe this is an applicability limit of logic: Boltzmann has explained nothing of why or how the universe behaves the way it does, only that it should.
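The counting argument can be sketched with a toy model: particles that each sit in either the left or the right half of a box. The model, the particle count `N = 100` and the 40 to 60 band used to define "disordered" are all assumptions for illustration, not Boltzmann's own formulation.

```python
from math import comb

N = 100                      # hypothetical number of particles
total = 2 ** N               # every particle is either left or right

# "Ordered" arrangements: all particles on one side of the box.
ordered = 2

# "Disordered" arrangements: between 40 and 60 particles on the left.
balanced = sum(comb(N, k) for k in range(40, 61))

ordered_share = ordered / total
balanced_share = balanced / total
print("share of fully ordered arrangements:", ordered_share)
print("share of roughly balanced arrangements:", balanced_share)
```

The roughly balanced arrangements account for almost all of the possibilities, while the fully ordered ones are a vanishing fraction. That is the whole content of the argument: it says which outcomes are numerous, which is the point made above about it describing that the universe behaves this way without saying why.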
Even if we did not have ample evidence of the curious texture of logic (and the above are just a very few examples of its demonstrably limited capacity to guide insight, its broken metronome character), we should note that even when it is working, it does not give up its secrets without a fight.
Andrew Wiles' proof of Fermat's Last Theorem took more than 350 years to materialize, and even then it is a titan of complexity, involving many subfields of mathematics, and so intricate that the mistake in its first proposed version took months of expert review to be uncovered.
Feynman routinely and gently mocked the proposals by theoretical physicists at Caltech about many extra hidden dimensions to nature, proposals that had not yet been, and largely still have not been, reinforced by empirical discoveries.
David Hilbert posed 23 mathematical problems in 1900, several of which have to this day not been fully, or in some cases even partly, resolved.
Chomsky's hyperlogical, essentially mathematical, framing of linguistics as basically a proto-computational challenge in viable phrase formulation has not only not been completed since its inauguration in 1956, but has arguably either gone backwards or perhaps even been destroyed. Competing approaches, including more contextual and thus socially-oriented linguistic pragmatics, and Boltzmann-flavoured statistical crunching, have taken over.
All these examples point to a single emergent conclusion: logic is often unsatisfactory, incomplete, misleading, or staggeringly hard and opaque as a guidance mechanism for foundational insight.
The idea that GPT-3, a statistical matching engine, is the thin end of a logical wedge, that will simply grow to leverage more truths, that we can simply expand our knowledge by superscaling logic, is clearly very misguided.
As a timeless reminder of how limited logic and logical competence can be, scientists and computational enthusiasts must always remember Isaac Newton's full range of study.
Surely one of the great logic-driven seekers in history, he applied his great laser-light of intellect not only to the foundations of mathematics, the calculus, cosmological and optical physics, but also…trying to discover a Philosopher's Stone that would turn base metal into gold, and predicting through 'scientific' interpretation of the Bible that the world would end in 2060.
In other words: if GPT-3 were as smart as Newton, it would have a lot of crazy to look forward to. Logic is no guarantee of intelligence.
2 THE DIMENSIONALITY OF INTELLIGENCE
The belief that all knowledge and intelligence is resolvable through logic is complemented by the idea that all knowledge lies in a single unified domain, and thus that a given intelligence has the property of commanding this domain.
This is in my view a possibly major error in the evolving understanding of intelligence. The existence of universals that span domains of knowledge does not mean that all these domains can be further unified into one superdomain. For example, the existence of mathematical regularities that span music theory and planetary motion does not necessarily mean that music and cosmology are just branches of the same knowledge tree.
To understand how this might be misguided, consider what dimensions really are. At the surface, a dimension is a measurable property of objects: colors of flowers, size of dogs, sounds of vehicles. Fundamentally, however, dimensions are the irreducible features of a domain of phenomena, none of which can be reduced to any other. The X axis cannot be expressed in any authentic way by the Y axis.
What may be the case is that much of human knowledge and intelligence is dimensionally-unique: it exists in ways that cannot be expressed in any authentic way by another dimension. What music is and what it is like, perhaps, is simply not comprehensible outside the domain of music; what is linguistic may not be truly comprehensible out of language.
This does not mean features cannot be shared across domains. The X and Y axes of any given domain cannot be expressed in terms of each other, but they can both share features. Three of X, three of Y, a pattern in X and a pattern in Y. This implies some connection and abstract commonalities, but not that each dimension is collapsible to the other, still less to shared abstractions.
This distinction is worth reinforcing to clarify just how easy it is to ignore. Three horses is not the same 'thing' as three rainbows, or three cycles of a sound wave: and it tells you nothing of real importance in the world, or about these things, that you can detect the character of three-ness in them.
A lot of intelligence-as-computation is caught up in this notion: that mathematical characterization of entities is the essential step to recognising and understanding, and the rest is mere details or metadata. That would only be true if you had decided at the outset that these abstractions were the essential truth or experience, but, again, this is to say that music in the air and music on the page really are the same thing. They just aren't.
If, however, information is truly dimensionally distinctive, what are its dimensions? How many dimensions are there? What does it mean that we live in a universe where information domains are not truly reducible to each other, at least not if one wants to capture the nature of human cognition and intelligence?
Here, I want to propose, as a way to park this whole speculation about knowledge dimensions, the extreme case as the likely one: that not just every cognition, but every cognitive moment, even every cognitive element (every aspect of any cognition) is a dimension of knowledge. This means that every element of cognition is an irreducible reference point for every other, across all of time and space, and that phenomena exist within this metacognitive field.
If we wanted to situate this speculation somewhat in the physical realm, we don't need to go so far as to posit multiple universes or timelines—although we could—instead all we need to do is to add a cognitive root to Einsteinian inertial reference frames, which are universally unique to their observational—you might say cognitive—standpoint.
Relativity implies that higher truths of the universe are contingent on the reference frame chosen, as is shown by time dilation. For sure, we don't know yet, but it seems very likely to me that the world is actually different, perhaps imperceptibly but still factually different, from different cognitive standpoints, and these are ultimately as irreducible to each other as observational reference frames.
If knowledge is dimensional, then it must be that intelligence is dimensional, and for a machine—or a human—to be intelligent, we must ask in which domains is it or he or she intelligent. There can be no general intelligence, if there is no general knowledge.
In my view what is emerging is that supposed artificial intelligence is actually intelligent not just in only one domain, but in one fragment of one domain: statistics and probability.
3 THE TRANSCENDENTAL PROPERTIES OF COGNITION
To round off these speculations on the computability of human intelligence—is it logical? is it within a single dimensional domain?—we ask if it is even accessible to computers at all. By which I mean: is human intelligence sui generis something that cannot be replicated?
I am moving on the basis of inarticulate and underdeveloped intuitions here; but also in the expectation that if there is any truth in the notion that human intelligence is materially non-fungible with machine intelligence, this is to do with embedded paradoxes and other irresolvables.
So I'll content myself with annotating a few recurring ideas that might uniquely characterize human intelligence.
Many of these, I claim, are transcendental phenomena; that is, they do not situate themselves in any simple objective framing of the universe. And if there is a root to the transcendental character of these ideas, it's associated with the human will and its relation to identity formation.
Latency: Latency here is meant in its root sense, not its technological one: the character of things that lie hidden in readiness for possible emergence. This concept I believe to be transcendental and particular to human intelligence. It involves a kind of cognitive tension between the surface state, where something is not present, and a potential state, where it has come forth.
Anticipation: Anticipation, like latency, is a sense premised on an emphasis on what is not yet here. But anticipation differs in that it holds the focus on what is coming, rather than what is here, and even on when or how soon it is coming. I believe the specific sensation of anticipation, exhibiting tension and intention, is a human cognitive feature that is not found in any other, more abstract mapping of likely future states to current states of a thing.
Generality: The concept of generality is a sense of some universal character of a thing, type of thing or set of instances of a thing. I think the human capacity to generalize is unique, and is not some kind of inductive statistical modelling of a set of phenomena. Chomsky alludes to this kind of intuition, knowledge about things without prior experience of them, in describing Plato's Problem in language: how do people know so much, from so little? In other words, how do humans generalize so much linguistic skill, that is, manipulation of grammar rules to formulate new meaningful sentences, so quickly, from so little exposure to language? It cannot be induction from assessed experience. My view is that, rather than there simply being an innate bank of facts that humans have some inner access to, human insight can synthesize general properties of newly experienced things due to some transcendental property of cognition.
Paradox: Paradox is a sense of contrariness or mismatch among components of some composite or context that do not or ought not to match, but in such a way that the mismatch has a particular non-negligible character, even defying impossibility in some sense. Paradox ought to be on the surface a good example of the human capacity for transcendental cognition, since computers by definition ought not to have any regard, or cognitive scope, for things that just don't make sense, that is cannot be computed. How would they even approach such a thing? Paradox seems to exist at the level of generality: where the general sense of two things clash in some way.
Cognitive Preeminence: Preeminence is a cognitive concept which we all experience but which is otherwise nameless, so I have named it. It is the permanent sense that our cognition at any one moment is the preeminent cognition we are having. We cannot situate ourselves in a past cognition, however deep it may be or deserving of extended experience, and park the present for later: it is always the present moment in the human experience, and we feel that what we know and experience of ourselves is always in it. If you imagine human cognition as a whole as a fountain springing forth conscious experience, it's as if cognition is somehow always the bit of water at the very tip of the fountain. We can sense what we are about to do, or think, or feel, or experience, and we can remember what we did or felt, but our cognition can never be anywhere else but preeminent in the present.
Agents: Fundamentally, I think that agents and agency, at least in the subjective sense, are aspects of uniquely human intelligence. This is because the mutual, cyclical referentiality between sense of self, willpower and projection of intent, circumstances and actions does not have a starting point: they all seem to relate to each other.
Objects: Most radically, I believe that not just the apprehension of objects but the independent status of objects has a transcendental linkage to human cognition. How much might this be so? Ultimately, I believe it is demonstrable that for objects to be objects, they must be both differentiable from their environments and always associated with their constituent parts. While this appears to be usually clear in space, it is far less clear in time: as with Theseus' ship, it is impossible to say, over time, when and how objects become objects, and when they are merely a collection of uncorrelated elements. I believe the status of thingness, or entityhood, is not something native to the universe, which in its truest state is merely a flow or shower of phenomena; it is something applied to the universe by agents.
These are samples of aspects of human cognition that I think can never be accessed by machine intelligence, because they rely on transcendental properties that cannot be abstracted from the human experience and thus replicated.
There seem to be tight and opaque loops among a sense of self, how the self situates itself inside a flow of experiences, how the sense of self relates to the past, how will and intention relate to the sense of the future, and how many concepts of this sort rely on a tension between whole and part that ought to be logically inconsistent (because circular). These linkages make human cognition unique, and in my view inaccessible via external abstraction and thus computation.
In the Buddhist analytical tradition, a concept was developed called dependent origination (प्रतीत्यसमुत्पाद or pratītyasamutpāda), which states that phenomena, in particular cognition and the subjective agent, only come into existence co-dependently with other entities. There are no selves without others, no others without selves. This concept inherently transcends logic, which relies on monadic irreducibles and linear sequences of elaboration. Dependent origination, if it is in some way true, implies not only that objectivity is not the correct way to capture the truest nature of experiences, but that humans have a capacity to navigate these kinds of apparently illogical, self-referential, transcendental concepts.
I truly believe that computation creates more or less accurate (generally not very accurate) analogues of these phenomena, by collapsing them to static structures that have an object-with-metadata-linking-to-object-with-metadata universal premise, but none of the dynamic tensions, paradoxes, and illogical foundations of the agentive unit.
So these have been some reflections on whether machine intelligence is like human intelligence on the basis of whether logic is limited, knowledge is dimensionally constrained, and human knowledge is accessible to analogues and abstractions or is transcendental and sui generis.
If I were to offer a summary comment, it would be this. Computation is not even trying to match or mirror human intelligence. The investigation of human cognition is almost alien to this project. Instead, the computational aspects of human intelligence are taken, themselves simplified further, and fed as models to computers for elaboration.
By analogy to linguistics, it's as if the quest for semantics - the nature of meaning and meaningful phrases - has been dumped and its implications taken as given, and all the work is on syntactics - the composition of meaningful phrases.
Indeed, this is exactly what has happened in machine translation. You might say this is arguably what GPT-3 itself is doing: one way of describing GPT-3 is as a way of translating some English into … other English. Less grand when viewed in those terms.
Semantics is the challenge of trying to climb the mountain of meaning. Computational syntax is just running races.
It's clear these tools will make profound changes to human society. But they are not autonomous intelligence, and not, in my opinion, even really on the way towards it.