The AI craze has hit researchers trying to understand how our brains process language. The latest technique is to try to link the output of large language models (LLMs, such as ChatGPT) to the brain activity observed during different types of language tasks. At the Annual Meeting of the Society for the Neurobiology of Language, a symposium turned AI and language into a subject of considerable debate.
Models of Language
LLMs are very powerful. Where a single human accumulates at most 80 or 100 years of linguistic experience, an LLM’s training is equivalent to close to 400 years, four to five times the lifetime experience of the average person. Yet the symposium on using these powerful models to predict human brain activity drew serious questions from the audience. In particular, I was struck by how the use of these models ignores both biological and evolutionary information that can help us understand human language. Our species has existed in its current incarnation for over 40,000 years, and our ancestors spent several thousand more years before that trying to communicate with each other. Add to this the fact that, as humans, we have to learn language as children. Human brains and human culture have together produced incredible cognitive abilities, chief among them the ability to use language.

When the question period began, I stood up, puzzled by what LLMs might help us conclude about human language. I noted that Elizabeth Bates, my doctoral advisor and famed cognitive scientist, had said that networks needed to “get a body and get a life.” It is just not clear to me what these models tell us about human language. After the symposium ended, the discussion grew into a full-blown debate during the coffee break. It was at that point that I spontaneously pulled out an example from the current election for the president of the United States. There is an analogy here, one that I will use to try to shed light on the misgivings many of us have about using AI to understand how the human brain processes language. First, let’s consider polls and keys.
Polls vs. Keys
For the past 40 or so years, Allan Lichtman, professor, author, and historian, has been predicting US presidential elections. He developed his 13 Keys to the White House after meeting the geophysicist Vladimir Keilis-Borok while visiting Caltech in 1980. Keilis-Borok specialized in studying earthquakes, and in one aha moment the two of them realized that presidential elections were like earthquakes: basically, a referendum on the party in power. If the party in power is voted out, it is an earthquake; if it remains, stability. Using Lichtman’s knowledge of history, the two worked out which factors had decided previous elections. This led to 13 keys that revolve around, among other factors, whether the incumbent is running for reelection, how well the party in power did in the midterms, and the state of the economy. When six or more of the keys go against the party holding the White House, Lichtman predicts it will lose.
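For readers curious about just how simple this decision rule is, here is a minimal sketch in Python. The key names and their one-line glosses are my abbreviations of Lichtman’s published keys, and the six-false-keys threshold reflects his stated rule; treat this as an illustration of the idea, not his actual procedure.

```python
# A minimal sketch of a Lichtman-style "keys" prediction (my paraphrase).
# Each key is a true/false statement that favors the party in power when true.
# Lichtman's published rule: if six or more keys are false, that party loses.

KEYS = [
    "party_mandate",          # gained U.S. House seats in the midterms
    "no_primary_contest",     # no serious nomination fight in the party in power
    "incumbent_seeking",      # the sitting president is running again
    "no_third_party",         # no significant third-party campaign
    "strong_short_economy",   # no recession during the campaign
    "strong_long_economy",    # real growth matches or beats recent terms
    "major_policy_change",    # major national policy change this term
    "no_social_unrest",       # no sustained social unrest
    "no_scandal",             # no major scandal in the administration
    "no_foreign_failure",     # no major foreign or military failure
    "foreign_success",        # a major foreign or military success
    "charismatic_incumbent",  # incumbent-party candidate is charismatic
    "dull_challenger",        # challenger is not charismatic
]

def predict(answers: dict) -> str:
    """Predict the fate of the party in power from 13 yes/no answers."""
    false_keys = sum(1 for key in KEYS if not answers[key])
    return "loses (earthquake)" if false_keys >= 6 else "wins (stability)"
```

That is the whole model: thirteen booleans and one threshold.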
Going forward, Lichtman began to apply his model. He correctly predicted Ronald Reagan’s reelection in 1984, when many were doubtful that a president in his 70s would be reelected. He predicted the election of Donald Trump in 2016, when many thought Hillary Clinton would win. Four years later, he predicted that Trump would lose to Biden, and he made his prediction for the next election in early September 2024. Whatever your political inclinations, Lichtman does not care. His only job is to predict the winner of the next election based on his keys. Others, such as Nate Silver, have taken a different approach. Silver’s model aggregates many different polls across the states and takes other factors into account as well. He argued for the power of such models in his book, The Signal and the Noise. He uses very sophisticated math and statistics to estimate who will win; Lichtman uses 13 yes/no questions. In 2016, Silver’s final forecast gave Clinton roughly a 70 percent chance of winning the election. Lichtman had predicted that she would lose.
This brings us back to where I started. What approach should we use to understand humans? Are sophisticated models, whether Silver’s for elections or LLMs for language, inherently better? The analogy is an interesting one. Human language has been around for a VERY long time; we have been solving the problem of human communication for millennia, and we have done it quite well. Elections for the US president have not been around nearly as long, yet looking back through their history has yielded great results.
So my question is: Why do we think that a new toy somehow gets us more than an old trick?
In trying to understand how the human brain manages language, we might just take a page out of Lichtman’s book and look back into history. In the case of language, the history of our species and of each individual language learner can help us make sense of things. Now, you might wonder whether I am arguing that we should stop using LLMs for language or advanced poll-based models for elections. Of course not. They are models, and they tell us something about reality. The failure is not in using these advanced techniques; it is in thinking that they are somehow superior to the “old school” tricks we have long used to figure out how the world works. There are a lot of shiny new toys we can use today, but the old, boring tricks of the past are not so bad. If you think I am wrong, just ask Allan Lichtman.