Transformers and the Attention Schema Theory—Musings at the Intersection of Deep Learning and Consciousness Studies
The underlying architecture of large language models like GPT-4 has already upended the world of AI, and it's only six years old. What does it mean for the next six years?
Sometimes it really is wild to confront how rapidly individual developments in science and technology can balloon into world-altering innovations. One such development: the transformer, the deep learning network architecture upon which contemporary large language models (sometimes also called “foundation models”) like OpenAI’s GPT, DeepMind’s AlphaFold, and Google’s LaMDA are built.
Last Monday, the Transformer celebrated its 6th birthday.
What is a transformer model? A plain-English description from the Nvidia blog:
A transformer model is a neural network that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence.
Transformer models apply an evolving set of mathematical techniques, called attention or self-attention, to detect subtle ways even distant data elements in a series influence and depend on each other.
As AI scientist Dr. Jim Fan explains, the “Transformer did not invent attention, but pushed it to the extreme.” (That honor goes to “Neural Machine Translation by Jointly Learning to Align and Translate,” which Yoshua Bengio’s lab published in 2014 and which is regarded in machine learning circles as a major milestone in natural language processing. Bengio has made headlines recently for joining fellow AI OG Geoffrey Hinton in adopting an alarmist stance, citing the existential risk AI poses to human civilization.)
Transformers “push [attention] to the extreme” by building entirely on so-called “attention mechanisms,” which make it possible to track the connections among words, forward and backward, across even long texts. Coupled with their ability to process data in parallel rather than sequentially, transformers can map these connections faster and more effectively than prior deep learning models.
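If it helps to see the core mechanism in code rather than prose, here is a minimal sketch of single-head scaled dot-product self-attention, the operation at the heart of the transformer, written in plain NumPy. It is only an illustration of the idea: the projection matrices are random stand-ins for the weights a real model would learn, and multi-head attention, masking, and everything else a production implementation needs are omitted.

```python
import numpy as np

def self_attention(X, seed=0):
    """Minimal single-head scaled dot-product self-attention.

    X: array of shape (seq_len, d_model), one row per token embedding.
    Returns an array of the same shape in which every output row is a
    weighted mix of all input rows, near or far -- the "tracking of
    relationships across a sequence" described above.
    """
    d_model = X.shape[1]
    rng = np.random.default_rng(seed)

    # Stand-ins for the learned query/key/value projection matrices.
    W_q = rng.normal(size=(d_model, d_model))
    W_k = rng.normal(size=(d_model, d_model))
    W_v = rng.normal(size=(d_model, d_model))

    Q, K, V = X @ W_q, X @ W_k, X @ W_v

    # Compare every token's query with every token's key, regardless
    # of how far apart the tokens sit in the sequence.
    scores = Q @ K.T / np.sqrt(d_model)

    # Softmax turns scores into attention weights: each token decides
    # how much to "attend" to every other token.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)

    return weights @ V

# Toy usage: five "tokens" with 8-dimensional embeddings.
tokens = np.random.default_rng(1).normal(size=(5, 8))
print(self_attention(tokens).shape)  # -> (5, 8)
```

A full transformer stacks many of these attention operations alongside feed-forward layers, and because every row of the score matrix can be computed at once, the whole thing parallelizes across the sequence, which is the “parallel rather than sequential” advantage noted above.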
I’m not an AI engineer and won’t pretend to be—if you want to dive a bit deeper into transformers, I recommend this piece in TechTalks and the aforementioned piece on the Nvidia blog.
In the course of my research developing the manuscript for Postreality (more context for Postreality here), I’ve spent a lot of time in the weeds with the many and varied theories of consciousness. One that has been fruitful in my own thinking is the Attention Schema Theory (AST), developed in the lab of neuroscientist Michael S.A. Graziano. The theory is described as follows in its original proposal:
The theory begins with attention, the process by which signals compete for the brain's limited computing resources. This internal signal competition is partly under a bottom-up influence and partly under top-down control. We propose that the top-down control of attention is improved when the brain has access to a simplified model of attention itself. The brain therefore constructs a schematic model of the process of attention, the ‘attention schema,’ in much the same way that it constructs a schematic model of the body, the ‘body schema.’ The content of this internal model leads a brain to conclude that it has a subjective experience.
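Purely to make the vocabulary in that passage concrete, and emphatically not as a model of the brain, here is a toy numerical sketch of the ingredients it names: bottom-up signal competition, top-down control, and a crude summary of where processing went that the system can consult on the next step. Every name and number below is invented for illustration.

```python
import numpy as np

def compete(salience, top_down_gain, temperature=1.0):
    """Toy 'competition for limited processing resources'.

    salience:      bottom-up strength of each incoming signal.
    top_down_gain: boosts under the system's own control.
    Returns the normalized share of processing each signal wins.
    This is a loose analogy for the passage above, not neuroscience.
    """
    scores = salience * top_down_gain
    weights = np.exp(scores / temperature)
    return weights / weights.sum()

salience = np.array([0.2, 1.5, 0.4, 0.9])   # four competing signals

# Bottom-up only: the most salient signal dominates.
bottom_up = compete(salience, np.ones(4))

# A crude "schema": a coarse record of where processing just went,
# which top-down control can consult to redirect attention.
schema = bottom_up.round(2)
gain = np.ones(4)
gain[schema.argmin()] = 3.0   # deliberately boost the neglected signal
with_control = compete(salience, gain)

print("bottom-up only:       ", bottom_up.round(2))
print("with top-down control:", with_control.round(2))
```

The point of the toy is only that “attention” here is a story about allocating limited resources, which is what makes the comparison to a transformer’s attention weights tempting and, as discussed below, easy to overstate.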
I encountered the theory through Graziano’s 2016 essay in The Atlantic, and later through his book Rethinking Consciousness. I highly recommend reading at least The Atlantic piece, which charts the evolution of sensing mechanisms in animals:
The theory suggests that consciousness arises as a solution to one of the most fundamental problems facing any nervous system: Too much information constantly flows in to be fully processed. The brain evolved increasingly sophisticated mechanisms for deeply processing a few select signals at the expense of others, and in the AST, consciousness is the ultimate result of that evolutionary sequence. If the theory is right—and that has yet to be determined—then consciousness evolved gradually over the past half billion years and is present in a range of vertebrate species.
It’s possible this theory won’t prove true with regard to the emergence of consciousness (which remains a subject of fervent debate), but it’s hard to deny that viewing the development of our own brains and nervous systems through the logic of selective signal enhancement (the process by which certain sensing systems improve for purposes of evolutionary fitness) is a helpful way of grasping our sensory experience of reality. Moreover, if there is truth in it, it could hold important implications for the future of the humble transformer and AI more broadly.
It’s easy to get caught up in reductive semantic comparisons and wander away from anything useful; of course the “attention” in AST is a different formulation from the mathematical function and weights in a transformer. I want to be extremely careful about anthropomorphizing machine intelligence, as that kind of thinking is prone to category errors. Still, I can’t shake the sense that we’re getting at something critical for understanding the future of machine intelligence. The breakthrough of transformers was pushing attention to the extreme—in other words, granting machines enhanced senses for mapping relations across a given dataset and filtering for relevance.
Current evidence suggests machines do not have subjective awareness, and many argue they never could, but ongoing discussions of the existential risk posed by AI lead plenty of people to wonder what would happen if machines did develop consciousness. What would that look like? What would they think and desire? What would they ask us? Would we even know it had happened? If so, how?
The buzz around this question is an entry point into the broader conversation about complex systems, which are characterized by individual parts that interact with each other in a variety of ways, leading to randomness and emergence. According to Santa Fe Institute Co-Founder-in-Residence David Pines, emergence “refers to collective phenomena or behaviors in complex adaptive systems that are not present in their individual parts.” One way of thinking about emergence is as the aspect that makes the whole greater than the sum of its parts. Emergence can also describe properties of a system that are really properties of its relationship to the larger environment it is part of, such as the human immune system interacting with external pathogens.
With companies, universities, and nations all in an arms race to develop increasingly powerful AI systems, it’s safe to say a complex system has formed (and continues to evolve) around generative AI—and machine consciousness is only one of many possible properties that could conceivably emerge as a result. In fact, what emerges is more likely to be something we don’t predict, or to diverge from expectations in ways that only make sense in retrospect. As Scott J. Shapiro recently pointed out: “Predicting the future is difficult because the future must make sense in the present. It rarely does.”
Making the case for AST, Graziano describes how the evolution of the tectum during the Cambrian explosion, roughly 500 million years ago, gave vertebrates a centralized controller for attention that could coordinate among the senses. In his account, that capacity was later elaborated in the cerebral cortex, a development he believes was critical to the emergence of what we now call consciousness. Under this view, consciousness and other critical aspects of modern human beings emerged through the long and often arbitrary process of evolution. In machine learning, human beings are innovating using the vastly more time-efficient scientific method, cranked up to 11 by the incentives of late capitalism. Who knows what will emerge, but we would be wise to view innovations like the “attention mechanisms” of transformers as targeted cognitive enhancements that facilitate emergence.
Transformers, only six years old—and only really in the public eye for half that time—have already activated the energies of folks all over the world. And it’s entirely possible that a new proposal will come along in the next few years and introduce another order-of-magnitude leap. With so much commotion, would we even notice if a truly novel property emerged in a given model? What the explosion of energy around transformers tells me is that our best bet will be to keep paying attention to attention.