Discriminative Acoustic Features for Deployable Speech Recognition
Arlo Faria
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2016-199
December 13, 2016
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-199.pdf
This work explores discriminative acoustic features: audio signal representations produced by multilayer perceptrons, such as Tandem and bottleneck features used in state-of-the-art automatic speech recognition systems. Experimental results highlight the factors that influence performance in terms of accuracy and speed; novel approaches are introduced to provide improvement in both regards. The overall emphasis is on discovering techniques that are suitable for practical deployment, translating effectiveness beyond a traditional research setting. Applications include real-time low-latency audio stream processing on mobile devices – as well as systems that are built with low-quality training data and must encounter the diverse complications of "real-world" use cases.
Advisor: Nelson Morgan
BibTeX citation:
@phdthesis{Faria:EECS-2016-199,
    Author = {Faria, Arlo},
    Title = {Discriminative Acoustic Features for Deployable Speech Recognition},
    School = {EECS Department, University of California, Berkeley},
    Year = {2016},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-199.html},
    Number = {UCB/EECS-2016-199},
    Abstract = {This work explores discriminative acoustic features: audio signal representations produced by multilayer perceptrons, such as Tandem and bottleneck features used in state-of-the-art automatic speech recognition systems. Experimental results highlight the factors that influence performance in terms of accuracy and speed; novel approaches are introduced to provide improvement in both regards. The overall emphasis is on discovering techniques that are suitable for practical deployment, translating effectiveness beyond a traditional research setting. Applications include real-time low-latency audio stream processing on mobile devices – as well as systems that are built with low-quality training data and must encounter the diverse complications of "real-world" use cases.}
}
EndNote citation:
%0 Thesis
%A Faria, Arlo
%T Discriminative Acoustic Features for Deployable Speech Recognition
%I EECS Department, University of California, Berkeley
%D 2016
%8 December 13
%@ UCB/EECS-2016-199
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2016/EECS-2016-199.html
%F Faria:EECS-2016-199