Systems for Machine Learning on Edge Devices

Shishir Patil

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2024-1

January 3, 2024

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-1.pdf

Modern edge applications increasingly rely on Machine Learning (ML) based predictions. ML models deployed in the real world rapidly degrade in quality because data, and our interpretation of it, evolve over time. Cloud applications combat model staleness by retraining models frequently. However, this is challenging on edge devices such as smartphones and wearables, which are constrained in memory and energy even as ML models continue to grow larger. To overcome this, we propose two complementary systems: POET and Minerva.

POET is an algorithm designed to facilitate the training of large neural networks on memory-scarce, battery-operated edge devices. It optimizes the integrated search spaces of rematerialization and paging, two techniques that significantly reduce the memory requirements of backpropagation. By formulating a mixed-integer linear program (MILP), POET achieves energy-optimal training within given memory and runtime constraints. This approach not only allows training substantially larger models on embedded devices but also improves energy efficiency without compromising the mathematical correctness of backpropagation.

Complementing POET, Minerva offers an end-to-end ML model update system tailored for microcontroller-based devices. At its core is a novel system abstraction, capsules, which enables efficient and flexible ML model updates without disruptive full device firmware updates (DFUs). Capsules exploit the pure-function nature of ML model inference, overcoming the challenges typically associated with updating parts of a running program. Because deployed devices lack ground-truth labels, Minerva also introduces a novel technique to evaluate deployed models. Minerva's efficient updates and ability to shadow-test enable a wide array of embedded devices to receive regular model updates.
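Minerva's label-free evaluation can be sketched in miniature: run a candidate model in shadow alongside the deployed one on the same live inputs and track how often their predictions agree. The model functions and thresholds below are hypothetical stand-ins, not Minerva's actual API; the point is only that agreement rate serves as a proxy signal when ground-truth labels are unavailable.

```python
# Hypothetical shadow-testing sketch: both "models" are toy threshold
# classifiers standing in for the deployed and candidate ML models.

def deployed_model(x):
    return 1 if x > 0.5 else 0   # stand-in for the current on-device model

def candidate_model(x):
    return 1 if x > 0.4 else 0   # stand-in for the updated model under test

def shadow_test(inputs):
    """Return the fraction of live inputs on which both models agree."""
    agree = sum(deployed_model(x) == candidate_model(x) for x in inputs)
    return agree / len(inputs)   # agreement rate as a label-free proxy metric

rate = shadow_test([0.1, 0.45, 0.6, 0.9])
print(rate)
```

Because inference is a pure function of its input, the candidate can be evaluated this way on-device without affecting the outputs the application actually consumes.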
Together, POET enables training models locally, and in scenarios where training still requires the cloud, Minerva enables efficient deployment and testing, thereby opening up new domains of computing to ML models on the edge.
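The tradeoff POET's MILP optimizes can be illustrated with a toy version: for each layer's activation, choose to keep it in memory, rematerialize it (recompute it during the backward pass), or page it to secondary storage, minimizing extra energy under a peak-memory budget. The numbers below are hypothetical and brute force stands in for a real MILP solver; POET's actual formulation is far richer (runtime deadlines, per-operator costs).

```python
from itertools import product

# Toy illustration (not POET's actual MILP). Each tuple is one layer's
# activation: (memory if kept, energy to rematerialize, energy to page).
# All values are made up for the example.
layers = [
    (4, 3.0, 5.0),
    (8, 2.0, 6.0),
    (6, 7.0, 4.0),
    (10, 5.0, 9.0),
]

def optimize(memory_budget):
    """Brute-force search over per-layer choices; an MILP solver would
    explore the same space far more efficiently."""
    best = None
    for choice in product(("keep", "remat", "page"), repeat=len(layers)):
        # Peak memory is the sum over activations we keep resident.
        mem = sum(m for (m, _, _), c in zip(layers, choice) if c == "keep")
        if mem > memory_budget:
            continue  # infeasible: kept activations exceed the budget
        # Kept activations cost no extra energy; the others cost their
        # rematerialization or paging energy.
        energy = sum(
            r if c == "remat" else p if c == "page" else 0.0
            for (_, r, p), c in zip(layers, choice)
        )
        if best is None or energy < best[0]:
            best = (energy, choice)
    return best

energy, plan = optimize(memory_budget=14)
print(energy, plan)
```

Tightening the budget forces more layers out of the "keep" column, trading energy (recomputation or paging traffic) for memory, which is exactly the space POET searches energy-optimally.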

Advisors: Joseph Gonzalez and Prabal Dutta


BibTeX citation:

@mastersthesis{Patil:EECS-2024-1,
    Author= {Patil, Shishir},
    Title= {Systems for Machine Learning on Edge Devices},
    School= {EECS Department, University of California, Berkeley},
    Year= {2024},
    Month= {Jan},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-1.html},
    Number= {UCB/EECS-2024-1},
    Abstract= {Modern edge applications increasingly rely on Machine Learning (ML) based predictions. ML models deployed in the real world rapidly degrade in quality due to the evolution of data and our interpretation of data with time. Cloud applications combat model staleness by retraining models frequently. However, this is challenging on Edge devices such as smartphones and wearables, since they are characterized by memory and energy constraints, while the ML models continue to grow bigger. To overcome this, we propose two complementary systems POET and Minerva.
POET is an algorithm designed to facilitate the training of large neural networks on memory-scarce, battery-operated edge devices. It optimizes the integrated search spaces of rematerialization and paging, two techniques that significantly reduce the memory requirements of backpropagation. By formulating a mixed-integer linear program (MILP), POET achieves energy-optimal training within given memory and runtime constraints. This approach not only allows for training substantially larger models on embedded devices but also enhances energy efficiency without compromising the mathematical correctness of backpropagation.
Complementing POET, Minerva offers an end-to-end ML model update system specifically tailored for microcontroller-based devices. At its core is a novel system abstraction known as capsules, which facilitates efficient and flexible ML model updates without necessitating disruptive full device firmware updates (DFUs). These capsules exploit the pure function nature of ML model inference, overcoming the challenges typically associated with updating parts of a running program. Recognizing the absence of ground-truth labels, Minerva presents a novel technique to evaluate deployed models. Minerva’s efficient updates and the ability to shadow-test enable a wide array of embedded devices to receive regular model updates.
Together, POET enables training models locally, and in those scenarios where training requires the cloud, Minerva enables efficient deployment and testing thereby opening up newer domains of computing to exploit ML models on the edge.},
}

EndNote citation:

%0 Thesis
%A Patil, Shishir 
%T Systems for Machine Learning on Edge Devices
%I EECS Department, University of California, Berkeley
%D 2024
%8 January 3
%@ UCB/EECS-2024-1
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-1.html
%F Patil:EECS-2024-1