Kaylo Littlejohn

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2025-147

August 7, 2025

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-147.pdf

Can we rebuild the bridge between brain and voice, restoring human communication for people with paralysis? This thesis outlines our translational systems that restore speech to individuals with vocal-tract paralysis.

Speech neuroprostheses have the potential to restore communication and embodiment to individuals living with paralysis, but achieving naturalistic speed and expressivity has remained elusive. The advances presented in this thesis enabled a clinical-trial participant with severe limb and vocal paralysis to "speak again" for the first time in more than 18 years, using an AI "brain-to-voice" decoder that restores their pre-injury voice. We used high-density surface recordings of the participant's speech cortex to achieve high-performance, large-vocabulary, real-time decoding across three complementary speech-related output modalities: text, speech audio, and facial-avatar animation. Leveraging advances in machine learning for automatic speech recognition and synthesis, we trained and evaluated deep-learning models on neural data collected as the participant attempted to silently speak sentences, enabling decoding speeds approaching natural conversational rates. We also demonstrated control of virtual orofacial movements for speech and non-speech communicative gestures via a high-fidelity "digital talking avatar" driven directly by the participant's brain activity.
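
To make the multimodal pipeline concrete, the sketch below shows one way a shared recurrent encoder over cortical feature sequences could feed three modality-specific heads: text-token logits, spectrogram frames for a vocoder, and avatar articulation parameters. This is a minimal PyTorch illustration; the architecture, layer sizes, channel count, and all names are assumptions for exposition, not the models evaluated in the thesis.

import torch
import torch.nn as nn

class MultimodalSpeechDecoder(nn.Module):
    """Shared neural encoder with one head per output modality (illustrative sketch)."""
    def __init__(self, n_channels=253, hidden=512,
                 vocab_size=1024, n_mels=80, n_avatar_params=32):
        super().__init__()
        # Shared encoder over high-density cortical feature sequences.
        self.encoder = nn.GRU(n_channels, hidden, num_layers=3,
                              batch_first=True, bidirectional=True)
        d = hidden * 2
        self.text_head = nn.Linear(d, vocab_size)        # token logits for text decoding
        self.audio_head = nn.Linear(d, n_mels)           # spectrogram frames for a vocoder
        self.avatar_head = nn.Linear(d, n_avatar_params) # orofacial articulation parameters

    def forward(self, neural):  # neural: (batch, time, channels)
        feats, _ = self.encoder(neural)
        return self.text_head(feats), self.audio_head(feats), self.avatar_head(feats)

# Fake data standing in for recorded neural features: 1 trial, 200 frames, 253 channels.
x = torch.randn(1, 200, 253)
text_logits, mel_frames, avatar_params = MultimodalSpeechDecoder()(x)
print(text_logits.shape, mel_frames.shape, avatar_params.shape)

A shared encoder with lightweight heads reflects the idea that a single neural representation can drive all three outputs; in a deployed system, each modality would sit in front of its own decoding and synthesis stack.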

Building on these advances in high-performance brain-to-speech decoding, we outline our findings demonstrating low-latency, continuously streaming brain-to-voice synthesis, with neural decoding in 80-ms increments. The recurrent neural network transducer (RNN-T) models demonstrated implicit speech detection and could decode speech continuously and indefinitely, enabling uninterrupted use of the decoder and further increasing speed. Our framework also generalized successfully to other silent-speech interfaces, including single-unit recordings and electromyography.
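
As a rough illustration of the streaming setup (not the full RNN-T, which adds a prediction network and joiner), the sketch below processes synthetic neural features in 80-ms chunks with a unidirectional recurrent encoder whose hidden state carries context across chunks. The frame rate, channel count, and unit inventory are hypothetical.

import torch
import torch.nn as nn

CHUNK_MS = 80          # decoding increment described in the thesis
FRAME_MS = 10          # assumed neural-feature frame rate (hypothetical)
FRAMES_PER_CHUNK = CHUNK_MS // FRAME_MS
N_CHANNELS = 253       # hypothetical electrode-channel count
N_UNITS = 41           # hypothetical acoustic-unit inventory

class StreamingEncoder(nn.Module):
    """Unidirectional encoder so each 80-ms chunk is decoded as it arrives."""
    def __init__(self):
        super().__init__()
        self.rnn = nn.GRU(N_CHANNELS, 256, num_layers=2, batch_first=True)
        self.head = nn.Linear(256, N_UNITS)

    def forward(self, chunk, state=None):
        out, state = self.rnn(chunk, state)  # state carries context across chunks
        return self.head(out), state         # unit logits per 10-ms frame

encoder = StreamingEncoder().eval()
state = None
with torch.no_grad():
    for step in range(5):  # simulate 5 incoming chunks (400 ms of neural data)
        chunk = torch.randn(1, FRAMES_PER_CHUNK, N_CHANNELS)  # fake neural features
        logits, state = encoder(chunk, state)
        units = logits.argmax(dim=-1).squeeze(0).tolist()
        print(f"chunk {step}: decoded units {units}")

Carrying the recurrent state forward between chunks is what lets a decoder of this kind run continuously on an open-ended stream rather than on pre-segmented trials.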

Together, the findings in this thesis introduce a multimodal, low-latency speech-neuroprosthetic approach with substantial promise for restoring full, embodied communication to people with severe paralysis.

Advisor: Gopala Krishna Anumanchipalli


BibTeX citation:

@phdthesis{Littlejohn:EECS-2025-147,
    Author= {Littlejohn, Kaylo},
    Title= {AI-Driven Speech Neuroprostheses for Restoring Naturalistic Communication and Embodiment},
    School= {EECS Department, University of California, Berkeley},
    Year= {2025},
    Month= {Aug},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-147.html},
    Number= {UCB/EECS-2025-147},
    Abstract= {Can we rebuild the bridge between brain and voice, restoring human communication for
people with paralysis? This thesis outlines our translational systems that restore speech to
individuals with vocal-tract paralysis.

Speech neuroprostheses have the potential to restore communication and embodiment to
individuals living with paralysis, but achieving naturalistic speed and expressivity has
remained elusive. The advances presented in this thesis enabled a clinical-trial participant
with severe limb and vocal paralysis to "speak again" for the first time in more than 18 years,
using an AI "brain-to-voice" decoder that restores their pre-injury voice. We used high-density
surface recordings of the participant's speech cortex to achieve high-performance,
large-vocabulary, real-time decoding across three complementary speech-related output
modalities: text, speech audio, and facial-avatar animation. Leveraging advances in machine
learning for automatic speech recognition and synthesis, we trained and evaluated deep-learning
models on neural data collected as the participant attempted to silently speak sentences,
enabling decoding speeds approaching natural conversational rates. We also demonstrated control
of virtual orofacial movements for speech and non-speech communicative gestures via a
high-fidelity "digital talking avatar" driven directly by the participant's brain activity.

Building on these advances in high-performance brain-to-speech decoding, we outline our
findings demonstrating low-latency, continuously streaming brain-to-voice synthesis, with
neural decoding in 80-ms increments. The recurrent neural network transducer (RNN-T) models
demonstrated implicit speech detection and could decode speech continuously and indefinitely,
enabling uninterrupted use of the decoder and further increasing speed. Our framework also
generalized successfully to other silent-speech interfaces, including single-unit recordings
and electromyography.

Together, the findings in this thesis introduce a multimodal, low-latency speech-neuroprosthetic
approach with substantial promise for restoring full, embodied communication to people with
severe paralysis.},
}

EndNote citation:

%0 Thesis
%A Littlejohn, Kaylo 
%T AI-Driven Speech Neuroprostheses for Restoring Naturalistic Communication and Embodiment
%I EECS Department, University of California, Berkeley
%D 2025
%8 August 7
%@ UCB/EECS-2025-147
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2025/EECS-2025-147.html
%F Littlejohn:EECS-2025-147