Discover Magazine
We tend to think of our minds as, for better or worse, impenetrable fortresses. Other people see our internal thoughts only when we transform them into language and send them out into the world.
Earlier this month, however, researchers at the University of Texas at Austin chipped away at this barrier between internal and external.
By feeding brain imaging data into an artificial intelligence model, they captured the gist of what their subjects were hearing, seeing and thinking. As the technology’s accuracy improves, it could even enable communication with people who are paralyzed, or who have otherwise lost the ability to speak.
Speech decoding is nothing new. But, until now, it has relied on brain implants that detect a person’s attempts to form words and then convert those vocal signals into language.
This new, noninvasive technique operates in a different way: by predicting words based on patterns in brain activity that aren’t directly connected with speech. The decoder can’t guess each word precisely, but the overall similarity in meaning has still stunned its creators.
“What we got were actually really good paraphrases,” says senior author Alexander Huth, a computational neuroscientist. “That kind of took us aback.”
The study, published in Nature Neuroscience, focused on three subjects, each of whom spent 16 hours listening to narrative podcasts like The Moth and Modern Love while lying in a functional magnetic resonance imaging (fMRI) machine.
The scanner measured blood flow across their brains, showing which regions were active at specific points in the podcast episodes. A large language model (an older version of the one behind OpenAI’s ChatGPT) then matched the words the subjects heard with the corresponding patterns of brain activity.
The decoder that emerged from this process couldn’t eavesdrop on an inner monologue per se, but after all that training it became intimately familiar with the brain states evoked by particular language. In subsequent fMRI sessions, it was able to reverse-engineer a thought based solely on the neural signals that thought produced.
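The core idea can be illustrated with a toy sketch: an encoding model predicts what brain response a given word should evoke, and decoding then amounts to picking the candidate word whose predicted response best matches the observed one. Everything below is invented for illustration (the vectors, the word list, the similarity measure); the actual study used a large language model and far richer fMRI data.

```python
# Toy illustration of semantic decoding. A hypothetical "encoding model"
# maps each word to the brain-response pattern it is assumed to evoke;
# decoding picks the word whose predicted pattern is most similar to
# the observed response. All numbers here are made up.

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

# Invented word -> predicted-response mapping (stand-in for a trained model).
encoding_model = {
    "dog":   [0.9, 0.1, 0.3],
    "run":   [0.2, 0.8, 0.1],
    "house": [0.1, 0.2, 0.9],
}

def decode(observed_response, candidates):
    """Return the candidate whose predicted brain response best matches
    the observed one -- the essence of decoding by forward prediction."""
    return max(candidates,
               key=lambda w: cosine(encoding_model[w], observed_response))

# An observed response resembling the pattern assumed for "dog".
print(decode([0.85, 0.15, 0.25], ["dog", "run", "house"]))  # dog
```

Because the match is by overall similarity rather than exact lookup, this style of decoder naturally produces paraphrases: it lands on whatever candidate means roughly the right thing, which mirrors the behavior the researchers describe.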
It still routinely gets individual words and phrases wrong, and struggles with certain aspects of grammar, like pronouns and proper names. (Don’t we all?) But its ability to repackage the essence of a storyline is uncanny; it performs better than would be expected by pure chance 70 to 80 percent of the time.
Over the past decade, decoders have allowed people who appear unconscious to respond to “yes or no” questions, and have singled out what a person is hearing from a list of possible options.
“But what’s interesting about this paper is that it’s not a multiple choice, it’s a fill in the blank,” says Tom Mitchell, a computer scientist and professor at Carnegie Mellon University who was not involved with the study. “Here’s the brain activity, what’s the language sequence this brain is thinking about?”
In the following example, the first sentence shows what a research subject actually heard, while the second shows what the language model predicted they had heard:
“I didn’t know whether to scream, cry or run away. Instead I said, ‘Leave me alone, I don’t need your help.’ Adam disappeared and I cleaned up alone, crying.”
“[I] started to scream and cry, and then she just said, ‘I told you to leave me alone, you can’t hurt me anymore. I’m sorry.’ And then he stormed off. I thought he had left. I started to cry.”
With less (but still impressive) accuracy, the decoder could also guess the contents of stories when participants merely imagined telling them or, most surprisingly, when they watched short films without sound. Though the language model was trained exclusively on text, it seems to be digging deeper — to a realm of meaning that lies beyond language.
All of this points to the fact that something similar is happening in the brain whether you’re hearing the word dog, thinking of a dog or seeing a dog.
“That’s the kind of high-level representation we’re getting at here, underlying all these things,” Huth says. “Language is the gateway into looking at thought.”
For many experts, the fact that fMRI is capable of such feats came as a shock. Blood flow changes far more slowly than neurons fire, after all, which limits the temporal resolution of the data.
Words speed through your head quickly; each fMRI scan captures several of them at once. Yet the predictive power of language models can still pull fine-grained detail about the thoughts encoded in this coarse signal.
Still, fMRI comes with other limitations. Because it requires a massive machine, for one, it can’t easily be woven into everyday life. “If we want this to help people,” Huth says, “it really needs to move to some other methodology.”
And that means wearable tech.
The study points to the possibility that other brain imaging techniques could replicate fMRI’s success. One contender is functional near-infrared spectroscopy (fNIRS), which measures the same physiological response but is small enough to be incorporated into a hat.
Its resolution is worse than fMRI’s, but when the researchers blurred their results to the level of fNIRS, they found that decoding still works, albeit less accurately. What’s more, given the recent pace of large language model development, current and future versions may perform much better, even with lower-resolution imaging.
The GPT-1 model used in this study has already been superseded three times over. Perhaps GPT-4, which powers ChatGPT Plus, could achieve the same accuracy with lower-quality fNIRS data.
The most urgent application for decoding is to communicate with people who have lost their usual means of communication. But in the long term, some experts believe this technology could fundamentally transform how we all interact with our devices.
As Mitchell put it, “What if we had a computer interface that wasn’t a keyboard, it wasn’t a mouse, it was just your thinking?” You simply imagine where you want to eat tonight, and your phone makes the reservation for you.
Despite the technology’s potential for good, of course, there is a clear Orwellian thread in the idea of a future where bad actors can wiretap your brain. Anticipating these concerns, the researchers ran experiments on their model to test whether it could be abused.
One obvious danger is that interrogators or authoritarian regimes could use decoders to pry information from people against their will. But because these models must be trained extensively on each individual person, they can’t extract anything useful without cooperation.
Even when a decoder is primed for a specific person, that person can resist by, for example, doing mental math or listing animals in their head.
Huth and his colleagues asked their subjects to do just that, “and the decoding kind of turned to gobbledygook,” he says. “The person whose brain is being decoded does ultimately have control over what is coming out.”
However, Nita Farahany, a professor at Duke University who studies the ethical and legal implications of emerging technology, doesn’t think coercion is the greatest threat.
She envisions a more insidious future, in which people voluntarily sign away access to their thoughts — much like we do today with the personal information companies collect from our online activity.
“We’ve never really imagined a world in which a space for inner reflection doesn’t exist,” Farahany says. Yet it’s easy to imagine how advertisers could use it to make products all but irresistible, or how employers might use it to track productivity.
Before we cross that Rubicon, Farahany argues the international community should adopt a right to cognitive liberty. This could ensure that we default to personal, rather than corporate, ownership over brain data, restricting the commodification of our minds.
“I really believe that we are literally at the moment before,” she says, “where we could make choices that make this technology hopeful and helpful.”