Creating a Video Classification Neural Network Architecture to Map American Sign Language Gestures to Computer Cursor Controls

Rohan Hajela

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2022-131

May 15, 2022

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-131.pdf

Although hand gesture image classification techniques have become fairly powerful, operating with high accuracy and low latency, video classification of dynamic gestures lags behind. To explore this gap, we investigated classifying American Sign Language, whose gestures combine handshape, palm orientation, movement, location, and expression signals. A model that can classify these gestures in real time enables many useful applications, such as a real-time sign language translator or an ASL-gesture-to-English reverse dictionary.

We have built a prototype of a vision-based assistant that uses video classification of American Sign Language (ASL) gestures as shortcuts for common computer commands. We have trained our video classification ML model to an industry-standard level of accuracy and precision and have configured several common command mappings in this prototype. The system is designed to be scalable and easy to download and use, and it can support the addition of new commands or integration into different operating systems. The goal of this research was to explore techniques and architectures that enable real-time video classification of hand gestures, and to build the model in a modular way so that it can be easily applied to other use cases as well.

Advisors: Brian A. Barsky

BibTeX citation:

@mastersthesis{Hajela:EECS-2022-131,
    Author= {Hajela, Rohan},
    Title= {Creating a Video Classification Neural Network Architecture to Map American Sign Language Gestures to Computer Cursor Controls},
    School= {EECS Department, University of California, Berkeley},
    Year= {2022},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-131.html},
    Number= {UCB/EECS-2022-131},
    Abstract= {Although hand gesture image classification techniques have become fairly powerful, operating with high accuracy and low latency, video classification of dynamic gestures lags behind. To explore this gap, we investigated classifying American Sign Language, whose gestures combine handshape, palm orientation, movement, location, and expression signals. A model that can classify these gestures in real time enables many useful applications, such as a real-time sign language translator or an ASL-gesture-to-English reverse dictionary.

We have built a prototype of a vision-based assistant that uses video classification of American Sign Language (ASL) gestures as shortcuts for common computer commands. We have trained our video classification ML model to an industry-standard level of accuracy and precision and have configured several common command mappings in this prototype. The system is designed to be scalable and easy to download and use, and it can support the addition of new commands or integration into different operating systems. The goal of this research was to explore techniques and architectures that enable real-time video classification of hand gestures, and to build the model in a modular way so that it can be easily applied to other use cases as well.},
}

EndNote citation:

%0 Thesis
%A Hajela, Rohan
%T Creating a Video Classification Neural Network Architecture to Map American Sign Language Gestures to Computer Cursor Controls
%I EECS Department, University of California, Berkeley
%D 2022
%8 May 15
%@ UCB/EECS-2022-131
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-131.html
%F Hajela:EECS-2022-131