
The AI semantic decoder giving the paralysed a voice

14th November 2023
Harry Fowle

An artificial intelligence system developed this year, known as an AI semantic decoder, can translate brain activity into a continuous stream of text.

This brain activity can be generated while listening to a story or imagining telling a story. Created by researchers at The University of Texas at Austin, the system could be a boon for individuals who are mentally alert but physically unable to speak due to conditions like strokes.

The study, featured in Nature Neuroscience, was led by Jerry Tang, a doctoral student in computer science, and Alex Huth, an assistant professor of neuroscience and computer science at UT Austin. The system utilises a transformer model, akin to those behind OpenAI’s ChatGPT and Google’s Bard.

This semantic decoder stands out because it requires no surgical implants, offering a noninvasive approach, and participants aren’t restricted to a predefined list of words. Brain activity is measured with an fMRI scanner after the decoder has been extensively trained; during training, individuals listen to hours of podcasts while in the scanner.
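To make the training step concrete: broadly, this kind of decoder learns a mapping between semantic features of the words a participant hears and the fMRI responses those words evoke. The sketch below is a minimal, hypothetical illustration of that idea using ridge regression in Python; the feature extraction, hemodynamic delays, and preprocessing used in the actual study are all simplified away, and none of the names here come from the authors’ code.

```python
# Minimal sketch of the training step (illustrative only, not the authors' code).
# Assumes we already have, for a set of training stories:
#   word_features: semantic features of the words the participant heard,
#                  resampled to the fMRI rate (shape: n_timepoints x n_features)
#   bold:          the participant's fMRI responses (shape: n_timepoints x n_voxels)
import numpy as np
from sklearn.linear_model import Ridge

def fit_encoding_model(word_features: np.ndarray, bold: np.ndarray) -> Ridge:
    """Fit a per-voxel linear mapping from word features to BOLD responses."""
    model = Ridge(alpha=1.0)        # in practice the regularisation would be cross-validated
    model.fit(word_features, bold)  # learns one weight vector per voxel
    return model

def predict_bold(model: Ridge, word_features: np.ndarray) -> np.ndarray:
    """Predict the fMRI response expected if the participant heard these words."""
    return model.predict(word_features)
```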

Once trained, the decoder can interpret brain activity from listening to a new story or imagining one, producing corresponding text. Huth said, “For a noninvasive method, this is a real leap forward compared to what’s been done before, which is typically single words or short sentences. We’re getting the model to decode continuous language for extended periods of time with complicated ideas.”
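Once such a mapping exists, decoding can be framed, loosely, as a guided search: a language model proposes candidate continuations of the transcript so far, and each candidate is kept or discarded according to how well its predicted brain response matches the recorded fMRI signal. The sketch below illustrates that general idea with a simple beam search; propose_continuations() and features_of() are hypothetical helpers standing in for a language model and the study’s feature extraction, and this is not the study’s implementation.

```python
# Illustrative beam-search decoder (hypothetical helpers, not the study's code).
# propose_continuations(text) is assumed to return likely next words from a
# language model; features_of(text) returns the semantic features expected by
# the encoding model fitted above, covering the same time window as the recording.
import numpy as np

def score(predicted: np.ndarray, recorded: np.ndarray) -> float:
    """Similarity between predicted and recorded responses (here, correlation)."""
    return float(np.corrcoef(predicted.ravel(), recorded.ravel())[0, 1])

def decode(recorded_bold, model, propose_continuations, features_of,
           n_steps: int = 50, beam_width: int = 5) -> str:
    beams = [""]  # candidate transcripts so far
    for _ in range(n_steps):
        candidates = []
        for text in beams:
            for word in propose_continuations(text):
                extended = (text + " " + word).strip()
                predicted = model.predict(features_of(extended))
                candidates.append((score(predicted, recorded_bold), extended))
        # keep the transcripts whose predicted responses best match the recording
        candidates.sort(reverse=True, key=lambda c: c[0])
        beams = [text for _, text in candidates[:beam_width]]
    return beams[0]
```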

The output is more about capturing the essence of the spoken or thought words rather than providing a direct transcript. When trained, the decoder can often closely match the intended meanings. For instance, the system translated the phrase, “I don’t have my driver’s license yet” to “She has not even started to learn to drive yet.” Similarly, “I didn’t know whether to scream, cry or run away. Instead, I said, ‘Leave me alone!’” was decoded to “Started to scream and cry, and then she just said, ‘I told you to leave me alone.’”

Addressing concerns of potential misuse, the researchers noted that the technology works effectively only with willing participants who have trained the decoder. Attempts to use it on untrained individuals or those resisting its use result in unintelligible or unusable outputs. Tang commented, “We take very seriously the concerns that it could be used for bad purposes and have worked to avoid that. We want to make sure people only use these types of technologies when they want to and that it helps them.”

Additionally, the study involved having subjects watch silent videos while in the scanner. The semantic decoder was able to accurately describe events from the videos based on brain activity. Currently, the system's practical use is limited to laboratory settings due to the dependency on fMRI machines. However, the researchers believe this methodology could be adapted to more portable brain-imaging systems like functional near-infrared spectroscopy (fNIRS). Huth noted that while the resolution with fNIRS would be lower, it measures similar signals to those of fMRI.

The research received support from the Whitehall Foundation, the Alfred P. Sloan Foundation, and the Burroughs Wellcome Fund. Alongside Huth and Tang, the study’s co-authors include Amanda LeBel, a former research assistant in Huth’s lab, and Shailee Jain, a computer science graduate student at UT Austin. Huth and Tang have filed a PCT patent application related to this work.
