Fed data from invasive brain recordings, algorithms reconstruct heard and spoken sounds
For many people who are paralyzed and unable to speak, signals of what they'd like to say hide in their brains. No one has been able to decipher those signals directly. But three research teams recently made progress in turning data from electrodes surgically placed on the brain into computer-generated speech. Using computational models known as neural networks, they reconstructed words and sentences that were, in some cases, intelligible to human listeners.
None of the efforts, described in papers posted in recent months on the preprint server bioRxiv, managed to re-create speech that people had merely imagined. Instead, the researchers monitored parts of the brain as people either read aloud, silently mouthed speech, or listened to recordings. But showing that the reconstructed speech is understandable is "definitely exciting," says Stephanie Martin, a neural engineer at the University of Geneva in Switzerland who was not involved in the new projects.
People who have lost the ability to speak after a stroke or disease can use their eyes or make other small movements to control a cursor or select on-screen letters. (Cosmologist Stephen Hawking tensed his cheek to trigger a switch mounted on his glasses.) But if a brain-computer interface could re-create their speech directly, they might regain much more: control over tone and inflection, for example, or the ability to interject in a fast-moving conversation.
The hurdles are high. "We are trying to work out the pattern of ... neurons that turn on and off at different time points, and infer the speech sound," says Nima Mesgarani, a computer scientist at Columbia University. "The mapping from one to the other is not very straightforward." How these signals translate to speech sounds varies from person to person, so computer models must be "trained" on each individual. And the models do best with extremely precise data, which requires opening the skull.
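To make that mapping concrete, the sketch below learns, for a single individual, a function from short windows of electrode activity to frames of an audio spectrogram. It is only an illustration: the data are simulated, the sizes are arbitrary, and a plain ridge regression stands in for the neural networks the teams actually used.

```python
# Illustrative sketch only: map windows of simulated "electrode" activity to
# simulated spectrogram frames with ridge regression, one model per person.
import numpy as np

rng = np.random.default_rng(0)
n_frames, n_electrodes, win, n_freq = 2000, 64, 10, 32   # arbitrary sizes

def windowed(x, win):
    # stack the most recent `win` frames of activity for each time point
    padded = np.vstack([np.zeros((win - 1, x.shape[1])), x])
    return np.stack([padded[t:t + win].ravel() for t in range(x.shape[0])])

# Simulated session: brain activity plus the speech spectrogram aligned to it.
neural = rng.standard_normal((n_frames, n_electrodes))
X = windowed(neural, win)
true_map = 0.1 * rng.standard_normal((n_electrodes * win, n_freq))
spectrogram = X @ true_map + 0.01 * rng.standard_normal((n_frames, n_freq))

# Ridge regression: the mapping differs from person to person, so it is fit
# separately on each individual's recordings.
lam = 1.0
W = np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ spectrogram)
reconstruction = X @ W

r = np.corrcoef(reconstruction.ravel(), spectrogram.ravel())[0, 1]
print(f"correlation between reconstructed and actual spectrogram: {r:.2f}")
```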
Researchers can do such invasive recording only in rare cases: during brain surgery, such as the removal of a tumor, or when people with epilepsy have electrodes implanted for days to pinpoint the origin of their seizures. The teams made the most of that scarce data by feeding it into neural networks; in the experiments, the networks were exposed to recordings of speech that a person produced or heard and to data on simultaneous brain activity.
Mesgarani's team relied on data from five people with epilepsy. Their network analyzed recordings from the auditory cortex (which is active during both speech and listening) as those patients heard recordings of stories and people naming digits from zero to nine. The computer then reconstructed spoken numbers from neural data alone; when the computer "spoke" the numbers, a group of listeners named them with 75% accuracy.
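The last step of such a pipeline, turning a reconstructed spectrogram back into sound that listeners can judge, can be sketched as well. Mesgarani's team used a vocoder; the snippet below substitutes the simpler Griffin-Lim phase recovery from librosa and feeds it a random placeholder spectrogram rather than real decoder output, so the result here would just be noise.

```python
# Sketch of spectrogram-to-audio synthesis. In a real pipeline the input would
# be the spectrogram reconstructed from neural data; here it is a placeholder.
import numpy as np
import librosa
import soundfile as sf

sr = 16000
reconstructed_spec = np.abs(np.random.default_rng(1).standard_normal((513, 200)))

# Griffin-Lim estimates the missing phase so a magnitude spectrogram can be
# inverted to a waveform (the team's vocoder plays this role in the study).
waveform = librosa.griffinlim(reconstructed_spec, n_iter=60, hop_length=256)
sf.write("reconstructed_digit.wav", waveform, sr)  # audio for listeners to name
```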
Another team, led by neuroscientists Miguel Angrick of the University of Bremen in Germany and Christian Herff at Maastricht University in the Netherlands, relied on data from six people undergoing brain tumor surgery. A microphone captured their voices as they read single-syllable words aloud. Meanwhile, electrodes recorded from the brain's speech-planning areas and motor areas, which send commands to the vocal tract to articulate words. The network mapped electrode readouts to the audio recordings, and then reconstructed words from previously unseen brain data. According to a computerized scoring system, about 40% of the computer-generated words were understandable.
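The crucial test in this kind of study is how the model does on brain data it never saw during training. The sketch below mimics that setup with simulated data: fit on earlier trials, reconstruct from held-out trials, and score the result with a simple objective measure (frame-wise correlation here, standing in for the computerized intelligibility scoring the team used).

```python
# Hedged sketch of held-out evaluation; all data are simulated.
import numpy as np

def train_decoder(X, Y, lam=1.0):
    # ridge regression from neural features X to audio features Y
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ Y)

rng = np.random.default_rng(2)
X = rng.standard_normal((1500, 320))            # simulated neural features
Y = 0.1 * (X @ rng.standard_normal((320, 32)))  # simulated aligned audio features
Y += 0.05 * rng.standard_normal(Y.shape)

split = 1200                                    # earlier trials train, later trials test
W = train_decoder(X[:split], Y[:split])
predicted = X[split:] @ W

scores = [np.corrcoef(p, y)[0, 1] for p, y in zip(predicted, Y[split:])]
print(f"mean correlation on previously unseen data: {np.mean(scores):.2f}")
```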
Finally, neurosurgeon Edward Chang and his team at the University of California, San Francisco, reconstructed entire sentences from brain activity captured from speech and motor areas while three epilepsy patients read aloud. In an online test, 166 people heard one of the sentences and had to select it from a list of options; sentences were correctly identified more than 80% of the time. The researchers also pushed the model further: they used it to re-create sentences from data recorded while people silently mouthed words. That's an important result, Herff says: "one step closer to the speech prosthesis that we all have in mind."
However, "What we're really waiting for is how [these methods] are going to do when the patient can’t speak” says Stephanie Ries, a neuroscientist at San Diego State University in California who studies language production. The brain signals when a person silently “speaks” or "hears” their voice in their head aren't identical to signals of speech or hearing. Without external sound to match to brain activity, it may be hard for a computer even to sort out where inner speech starts and ends. Decoding imagined speech will require "a huge jump," says Gerwin Schalk,a neuro-engineer at the National Center for Adaptive Neuro technologies at the New York State Department of Health in Albany."It's really unclear how to do that at all.”
One approach, Herff says, might be to give feedback to the user of the brain-computer interface: If they can hear the computer's speech interpretation in real time, they may be able to adjust their thoughts to get the result they want. With enough training of both users and neural networks, brain and computer might meet in the middle.
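In code, that closed loop might look roughly like the sketch below. Every component named here (the neural data stream, decoder, synthesizer, and speaker) is a hypothetical placeholder rather than an existing system; the point is only the structure: decode, play the result back immediately, and let user and model adapt.

```python
# Speculative sketch of a closed-loop speech interface; all objects passed in
# are hypothetical placeholders assumed to provide the methods used below.
import time

def closed_loop(stream, decoder, synthesizer, speaker, seconds=60.0):
    """Decode brain activity and play the result back to the user in real time."""
    start = time.time()
    while time.time() - start < seconds:
        neural_frame = stream.read()                 # latest window of brain activity
        audio = synthesizer(decoder(neural_frame))   # decoder output -> sound
        speaker.play(audio)                          # immediate feedback to the user
        # Hearing the output lets the user adjust their strategy; the decoder can
        # be retrained between sessions on the newly collected data.
```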