AI-Driven Speech Neuroprostheses for Restoring Naturalistic Communication and Embodiment
Kaylo Littlejohn
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2025-147
August 7, 2025
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-147.pdf
Can we rebuild the bridge between brain and voice, restoring human communication for people with paralysis? This thesis outlines our translational systems that restore speech to individuals with vocal-tract paralysis.
Speech neuroprostheses have the potential to restore communication and embodiment to individuals living with paralysis, but achieving naturalistic speed and expressivity has remained elusive. The advances presented in this thesis enabled a clinical-trial participant with severe limb and vocal paralysis to "speak again" for the first time in more than 18 years, using an AI "brain-to-voice" decoder that restores their pre-injury voice. We use high-density surface recordings of the speech cortex in this participant to achieve high-performance, large-vocabulary, real-time decoding across three complementary speech-related output modalities: text, speech audio, and facial-avatar animation. Leveraging advances in machine learning for automatic speech recognition and synthesis, we trained and evaluated deep-learning models on neural data collected as the participant attempted to silently speak sentences, enabling decoding speeds approaching natural conversational rates. We also demonstrate control of virtual orofacial movements for speech and non-speech communicative gestures via a high-fidelity "digital talking avatar" driven by the participant's brain activity.
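As an illustration of this multimodal pipeline, the following is a minimal sketch in PyTorch (not the thesis codebase): a shared recurrent encoder over high-density cortical features feeds three output heads, one per modality. Every name, layer size, and electrode count below is an assumption chosen for illustration.

import torch
import torch.nn as nn

class MultimodalSpeechDecoder(nn.Module):
    """Hypothetical decoder: one shared encoder, three modality-specific heads."""
    def __init__(self, n_electrodes=253, hidden=512,
                 n_chars=41, n_acoustic=100, n_avatar=37):
        super().__init__()
        # Bidirectional encoder over windowed neural features (non-streaming).
        self.encoder = nn.GRU(n_electrodes, hidden, num_layers=3,
                              batch_first=True, bidirectional=True)
        self.text_head = nn.Linear(2 * hidden, n_chars)      # logits for CTC-style text decoding
        self.audio_head = nn.Linear(2 * hidden, n_acoustic)  # acoustic features for a vocoder
        self.avatar_head = nn.Linear(2 * hidden, n_avatar)   # orofacial animation parameters

    def forward(self, neural):  # neural: (batch, time, n_electrodes)
        h, _ = self.encoder(neural)
        return self.text_head(h), self.audio_head(h), self.avatar_head(h)

# Usage with random stand-in features: roughly a few seconds of neural activity.
text_logits, acoustic, avatar = MultimodalSpeechDecoder()(torch.randn(1, 200, 253))

A shared encoder lets all three modalities draw on the same learned representation of attempted speech, which is one plausible way to realize simultaneous text, audio, and avatar outputs.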
Building on these advances in high-performance brain-to-speech decoding, we then demonstrate low-latency, continuously streaming brain-to-voice synthesis, with neural decoding performed in 80-ms increments. The recurrent neural network transducer (RNN-T) models exhibited implicit speech-detection capabilities and could decode speech continuously for indefinite durations, enabling uninterrupted use of the decoder and further increasing speed. Our framework also generalized to other silent-speech interfaces, including single-unit recordings and electromyography.
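The 80-ms streaming scheme can be sketched in the same spirit. Below is a minimal, assumption-laden PyTorch illustration of the principle only: a causal (unidirectional) network consumes neural features one 80-ms chunk at a time and carries its hidden state across chunks, so decoding can continue indefinitely. The transducer's prediction and joint networks, which give the RNN-T its implicit speech detection, are elided here, and the frame rate is an assumed value.

import torch
import torch.nn as nn

FRAMES_PER_CHUNK = 16  # assumed ~200-Hz features -> 16 frames per 80-ms increment

class StreamingEncoder(nn.Module):
    """Causal encoder whose hidden state persists across 80-ms chunks."""
    def __init__(self, n_electrodes=253, hidden=512):
        super().__init__()
        self.rnn = nn.GRU(n_electrodes, hidden, num_layers=2, batch_first=True)

    def step(self, chunk, state=None):  # chunk: (1, FRAMES_PER_CHUNK, n_electrodes)
        out, state = self.rnn(chunk, state)  # state carries context between chunks
        return out, state

def stream_decode(encoder, chunks, emit):
    # Consume chunks as they arrive; emit() receives per-chunk encodings that a
    # full transducer would map to speech units (emitting blanks during silence).
    state = None
    for chunk in chunks:
        enc, state = encoder.step(chunk, state)
        emit(enc)

# Usage with stand-in data: five consecutive 80-ms increments.
stream_decode(StreamingEncoder(),
              [torch.randn(1, FRAMES_PER_CHUNK, 253) for _ in range(5)],
              emit=lambda enc: None)

Because the encoder never looks ahead and its state is never reset, latency is bounded by the chunk size rather than the utterance length, which is what permits open-ended, uninterrupted decoding.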
Together, the findings in this thesis introduce a multimodal, low-latency speech-neuroprosthetic approach with substantial promise for restoring full, embodied communication to people with severe paralysis.
Advisor: Gopala Krishna Anumanchipalli
BibTeX citation:
@phdthesis{Littlejohn:EECS-2025-147,
    Author = {Littlejohn, Kaylo},
    Title = {AI-Driven Speech Neuroprostheses for Restoring Naturalistic Communication and Embodiment},
    School = {EECS Department, University of California, Berkeley},
    Year = {2025},
    Month = {Aug},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-147.html},
    Number = {UCB/EECS-2025-147}
}
EndNote citation:
%0 Thesis
%A Littlejohn, Kaylo
%T AI-Driven Speech Neuroprostheses for Restoring Naturalistic Communication and Embodiment
%I EECS Department, University of California, Berkeley
%D 2025
%8 August 7
%@ UCB/EECS-2025-147
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-147.html
%F Littlejohn:EECS-2025-147