A giant, purple fly with two heads and six red eyes buzzes past the Empire State Building on a clear, blue day.
You, the reader, saw that image in your head despite the impossibility of its actual occurrence, despite the fact that all you actually saw was a specially organized collection of black squiggles on a white page. How did that just happen? It almost seems magical, right? This apparent telepathy, that my words can conjure an image in your mind, is the power, and mystery, of language. Scientists have long puzzled over this ability, and they have given us guiding questions to begin our exploration.
What happens in your brain as you read this, and in my brain as I write this? Where does this ability come from? Did our brains evolve specifically to have this linguistic ability? Or was the processing of language a beautiful side effect of other adapted features of the brain: something we can do, rather than something we are made to do? These questions have divided scientists for over a century and a half, most famously in the feud between Charles Darwin and his collaborator Alfred Wallace. Wallace was convinced a “Higher Intelligence” had sparked the evolution of language in humanity, to which Darwin bitterly remarked, regarding the theory of natural selection, “I hope you have not murdered too completely your own and my child”. Darwin refused to accept any non-empirical explanation for the evolution of language, and the quest for an empirical one has since become known as Wallace’s Problem.
Recent advancements in neuroscience, particularly the work of Michael Graziano and his team at Princeton, shed new light on one of the key debates surrounding the evolution of language: if language is a side effect of other abilities, a tool we invented based on more general cognitive functions, what exactly are the functions it is based on? Graziano’s work may point towards an answer. In outlining the Attention Schema Theory, he argues that our subjective awareness arises from our neurological ability to focus our attention on a specific object. You are reading this paper right now, but your mind may be on what you are about to eat for dinner. You know that your mind is on dinner, and now you can choose to refocus your attention on this paper. Our linguistic abilities may merely piggyback on this evolved, innate ability to focus and model our own attention and the world around us. After all, what was going through your head when you read the first sentence of this article other than simplified visual and linguistic models of a fly and the Empire State Building?
The implications of this debate are far-reaching. A satisfactory answer to Wallace’s Problem and an understanding of how language evolves and operates in the brain would fill one of the largest holes in evolutionary neuroscience, open the doors for advances in humanlike artificial intelligence, and provide a deeper understanding of the linguistic differences associated with Autism Spectrum Disorder (ASD). While this paper is far from proposing a definitive answer, the following sections outline several directions for possible concrete neuroscientific research into areas that may, someday, provide the answers we seek.
The classical view of language and the brain in cognitive science comes from Noam Chomsky’s description of a Universal Grammar and Steven Pinker’s assertion that we have a Language Instinct: both argue that our neurological structures evolved with a genetically encoded, specific system for language processing. In essence, their argument rests on the following assertion: “Language is not a cultural artifact that we learn the way we learn to tell time or how the federal government works. Instead, it is a distinct piece of the biological makeup of our brains . . . People know how to talk in more or less the sense that spiders know how to spin webs”. This theory asserts that grammar is the fundamental unit of language, and that over time and through natural selection, the human brain evolved a specific mental ability to process grammar. This grammar-processor, according to the theory, does not rely on other brain structures for the completion of its task and is universal, or shared by all members of the species.
This view, however, has come under critique from other evolutionary psychologists. Michael Tomasello, an evolutionary psychologist at Duke University, posits that language itself is not a specifically evolved functionality, but instead makes use of more general cognitive abilities. Tomasello contends that Pinker and Chomsky’s theory “though coherent, is wrong,” and that our linguistic abilities can be explained through the aggregate of simpler biological functions, thereby undermining the popular claim that there is some genetically encoded language-organ. The evolution of these simpler functions, Tomasello argues, is more aligned with our knowledge of the development of other cognitive functions.
Another critique of Chomsky and Pinker comes from Derek Bickerton, a linguist at the University of Hawaii, who argues that words are the fundamental unit of language, rather than grammar. According to Bickerton, humans first used words on the African savannah to refer to animal carcasses for scavenging, thereby facilitating both individual survival and the efficiency of early scavenging groups. This word-forming ability “requires the speaker-listener to have a mental representation, or concept, of the object talked about”; when reading the first sentence of this article, you understood it by taking the words like “fly,” “building,” and “sky,” and attaching them to individual mental representations. This more basic ability is one that Tomasello outlines in his argument as well [4,5].
There is neurological evidence that Tomasello and Bickerton’s theories are more accurate than the Chomsky-Pinker view; the evidence outlined in the rest of this paper suggests that words are the fundamental unit of language rather than grammar. Additionally, Michael Graziano’s Attention Schema Theory may aid in finding the evolutionary and neurological basis for language, potentially directing us even further back in evolutionary history than the African savannah.
The Mouth in the Mind: Broca’s Area and Mirror Neurons
The natural place to start with this exploration of language evolution is understanding what happens in our bodies when we process language. Spoken language is primarily an auditory phenomenon (as writing was only invented around 6,000 years ago), so we must start with the ears: how do we interpret what we hear? Surprisingly, according to research done on a phenomenon called the McGurk Effect, we may hear with our eyes as much as we do with our ears.
The McGurk Effect is the strange observation that a listener’s visual perception of a speaker’s mouth movements totally changes what the listener reports hearing the speaker say. The most famous example of this, shown in many introductory psychology and linguistics classes, is that the sound “ba” (a sheep bleating) is mistakenly heard as “fa” (Deck the Halls). This happens when participants see a video of a speaker making the lip movements of “fa” played on top of audio of the “ba” sound. This observation, that the mouth movements we see play such a large role in our linguistic processing, is surprising, and it challenges Pinker’s view of language as an independent, modular function of the brain. Clearly, a closer look at the neurological underpinnings of this effect deserves our attention.
Where exactly does this processing happen in the brain, and how could the McGurk Effect point us in the right direction? Let us look at Broca’s Area, a small section of the prefrontal cortex and the motor cortex that is known to be partially responsible for the production of speech. Paul Broca is credited with the discovery and identification of this neural architecture in the nineteenth century after observing that patients with damage to this specific area of the brain suddenly lost their ability to speak. This dramatic finding has fascinated psychologists, linguists, and neuroscientists for over a century. More recent neurological insights, however, may illuminate the role of Broca’s Area in the evolution of our linguistic ability.
Another piece of neural architecture implicated in linguistic processing is the mirror neuron system. Mirror neurons are a specific kind of neuron that appears to play a role in regulating bodily movement, first discovered in the motor cortex of monkeys. Interestingly, not only are these neurons activated when a monkey, for example, waves its arms up and down, but also when the monkey sees another monkey wave its own arms up and down, giving these neurons the apt name of “mirrors.” Significantly, these mirror neurons also fire when the monkeys hear sounds or witness communicative gestures made by other monkeys, activities closely related to speech in humans. These networks, moreover, are located in brain regions analogous to Broca’s Area in humans. What does this mean for our speech perception?
Significantly, Pulvermüller et al. show that these same areas of the brain, mirror neurons and Broca’s area, tend to activate when we move our lips and tongue, when we produce speech sounds, and even when we listen to the speech of others! Where does this activation happen? In parts of the motor cortex, specifically in the regions that overlap with Broca’s Area, exactly where similar behavior is exhibited by the mirror neurons observed in monkeys. Recall the surprising McGurk Effect: the perceived movements of a speaker’s lips and tongue affect a listener’s report of what they hear. This is no longer so surprising, as Pulvermüller et al. have shown that seeing someone speak activates the same areas in the brain that are activated when we ourselves speak.
What does all this mean for what occurs in our brain when we hear and produce language? The mirror neuron system in monkeys suggests that our brains, when observing or executing motion, create neural models: specific connections of neurons that represent each particular motion in our motor cortex. We constantly keep a running model, not only of the physical actions of others, but also of our own actions, like a camera constantly following an actor. These motor models of an action are anatomically distinct from the perception of that action: when we see or execute an action, hear or utter a syllable, not only do our visual or auditory systems activate, but so does our motor cortex, keeping a schematic log of all these movements. Our brain may encode language not only as a set of sounds or ideas, but also as a model of the mouth movements and gestures of others and ourselves. Working from the basis of these mirror neurons, “it is possible to trace an evolutionary pathway that, starting from some elements of the mirror-neuron system in the monkey, may have led to the emergence of human speech”.
This neurological discovery appears to align with Bickerton’s word-model based view of language explained above. Perhaps our key ability to create Bickerton’s “mental representation, or concept, of the object talked about,” (i.e. the previously discussed model) relies on the same neural architecture under examination in the studies of the McGurk Effect, Broca’s Area, and mirror neurons. Whereas Bickerton sees this evolutionary step as primarily happening after our divergence from other primates, and others see it starting from the mirror-neuron system in monkeys, Graziano’s Attention Schema Theory suggests this underlying ability may have its origins much farther back in evolutionary history [2, 5, 8].
From Kermit’s Brain to Yours: The Tectum and Internal Modeling
As the heading of this section foreshadows, we have come to frogs in our quest for understanding the neurological underpinnings of language. The most studied feature in the amphibian brain is the tectum, which takes in visual information and creates a “literal map,” where “each point . . . on the surface of the tectum corresponds to a point in space” around the frog. When a fly enters the left side of the frog’s visual field, the optic nerve sends a signal to the corresponding area on the right side of the tectum, the tectum sends a signal to the muscles of the tongue, and the tongue shoots out to snatch the fly with pinpoint precision. Furthermore, the tectum handles more than visual input; it also integrates auditory information from the frog’s ears and tactile information from the skin to build a complete model of the space on and surrounding the animal. In short, the tectum integrates and models the frog’s sensory experience.
This network of sensory organs, the tectum, and muscular control works essentially the same in all vertebrates: it takes in the most pressing stimulus and directs the body’s attention to that stimulus. However, an ability to simply have attention does little good if an animal is unable to control that attention. If there were no control, then an animal would quickly become immobilized by the onslaught of sensory input around it. This leads us to ask, how does the brain control its bodily attention?
In the brain, similar to the models created by mirror neurons, an internal model is constructed to monitor each part of the body. This model is essentially a simplified, or schematized, map of everything that needs to be controlled. Frogs, along with other vertebrates, evolved to have such a model. For example, humans have an internal model of our limbs, a body schema, that controls our physical movements. Damage to this schema, in the case of a stroke, for example, leads to patients struggling to use certain parts of their bodies in effective ways.
Similarly, the tectum must have an internal model of its attention. Graziano and his team have named this hypothetical internal model the “attention schema,” and empirical evidence for the processing of information about physical attention has been found in the tecta of monkeys and cats. It would follow that other animals with tecta, essentially all vertebrates, including humans, have a similar kind of attention schema. Is all of this talk of modeling starting to sound familiar? May our linguistic ability be piggybacking on this evolved functionality to control our attention?
Recall that we have seen evolutionary psychologists argue that the emergence of words, and thereby language, centers around the ability to create a “mental representation” of the object a word refers to, and that these psychologists suggest this ability arose first in early human ancestors. Also recall that neuroscientists have found evidence for mirror neurons in monkeys and in humans that create models of the actions of others and themselves, and that these scientists suggest that, because these mirror neurons reside in Broca’s area, the underpinnings of linguistic ability may have evolved in monkeys. Now we see how this ability to create models of our world and of ourselves actually may lie much farther back in evolutionary history: the evolution of the tectum in vertebrates.
In short, we may have been barking up the wrong trees for the last seventy-odd years. With the recent developments in evolutionary psychology and neuroscience, we can see that studying our abilities to model our own mind and the world around us may be the key to understanding how language works in the brain. Maybe the answer to all of this lies somewhere much farther in the past than we thought, and also in a particular area of the brain: the motor cortex and its connection to the tectum.
Bringing it All Together
To bring it all together, the next time you hear a sentence like, say, “a giant purple fly is being eaten by a dark green frog,” remember how our brains evolved to create mental representations of the world and use words to represent those models. Our new understanding may render this telepathy not as mysterious as it seemed a few minutes ago. Language is a notoriously messy issue from all angles: psychologists, biologists, and linguists can hardly agree on anything when it comes to how language works in the brain. It is still one of the biggest outstanding questions in all of these fields. In fact, this debate has been one of the defining features of linguistic and psychological research of the last hundred and fifty years, even causing the break between Darwin and Wallace. It is high time that neuroscience, through an examination of the neural architecture implicated in language, finally and fully addresses this question and potentially settles this debate once and for all.
I am far from proposing a conclusive answer to the evolution of the neural architecture implicated in language; this article does not even begin to approach the role of Wernicke’s Area in language comprehension. I am, however, suggesting a possible direction forward for future research. We should move past the Chomskian ideas of a language organ and Universal Grammar and look towards ways that our brain uses language as a social tool for modeling the world around us. We should look to these modeling capabilities and, in particular, the motor cortex and its connection to the tectum for possible answers. A study analyzing the activity in these regions during various linguistic activities may be just what we need.
These answers are valuable for several reasons. For instance, many of the features attributed to linguistic ability, of which the motor mirroring system is only one, correspond with many of the symptoms of patients with Autism Spectrum Disorder. Perhaps a more detailed neuroscientific analysis of the motor cortex’s connection to the tectum in the brains of patients with ASD can be a way forward for a deeper neurological understanding of the condition’s effects on linguistic ability. Additionally, creating an artificial intelligence (AI) is one of the greatest challenges in computer science, and the emergence of AI that speaks human language may be just around the corner. Linguistic processing will be a key component of this development, so understanding the neurological basis of language will certainly help the creation of any AI that hopes to understand human language. Perhaps replicating the functions of the tectum and its interplay with the motor cortex will illuminate a way forward in that field as well.
Finally, to really throw you for a loop, consider how all of this knowledge found its way to you through exactly what we have been discussing all along: language.