A High Accuracy, Low-latency, Scalable Microphone-array System for Conversation Analysis

David Qin Sun

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2012-266
December 16, 2012

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-266.pdf

Understanding and facilitating real-life social interaction is a high-impact goal for ubiquitous computing research. Microphone arrays offer the unique capability to provide continuous, calm capture of verbal interaction in large physical spaces, such as homes and especially open-plan offices. Most microphone array work has focused on arrays of custom sensors in small spaces, and a few recent works have tested small arrays of commodity sensors in single rooms. This thesis describes the first working scalable and cost-effective array infrastructure that offers high precision localization of conversational speech, and hence enables ongoing studies of verbal interactions in large semi-structured spaces. This work represents significant improvements over prior work in three facets – cost, scale and accuracy. It also achieves high throughput for real-time updates of tens of active sources using off-the-shelf components. This thesis describes the design rationale behind our system, the software and hardware modules, key localization algorithms, and a systematic performance evaluation. Finally, we discuss some preliminary conversation analysis results by showing that source location data can be usefully aggregated to reveal interesting patterns in group conversations, such as dominance and engagement.

Advisor: John F. Canny


BibTeX citation:

@mastersthesis{Sun:EECS-2012-266,
    Author = {Sun, David Qin},
    Title = {A High Accuracy, Low-latency, Scalable Microphone-array System for Conversation Analysis},
    School = {EECS Department, University of California, Berkeley},
    Year = {2012},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-266.html},
    Number = {UCB/EECS-2012-266},
    Abstract = {Understanding and facilitating real-life social interaction is a high-impact goal for ubiquitous computing research. Microphone arrays offer the unique capability to provide continuous, calm capture of verbal interaction in large physical spaces, such as homes and especially open-plan offices. Most microphone array work has focused on arrays of custom sensors in small spaces, and a few recent works have tested small arrays of commodity sensors in single rooms. This thesis describes the first working scalable and cost-effective array infrastructure that offers high precision localization of conversational speech, and hence enables ongoing studies of verbal interactions in large semi-structured spaces. This work represents significant improvements over prior work in three facets – cost, scale and accuracy. It also achieves high throughput for real-time updates of tens of active sources using off-the-shelf components. This thesis describes the design rationale behind our system, the software and hardware modules, key localization algorithms, and a systematic performance evaluation. Finally, we discuss some preliminary conversation analysis results by showing that source location data can be usefully aggregated to reveal interesting patterns in group conversations, such as dominance and engagement.}
}

EndNote citation:

%0 Thesis
%A Sun, David Qin
%T A High Accuracy, Low-latency, Scalable Microphone-array System for Conversation Analysis
%I EECS Department, University of California, Berkeley
%D 2012
%8 December 16
%@ UCB/EECS-2012-266
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-266.html
%F Sun:EECS-2012-266