Modeling Social Interactions from Multimodal Signals
Evonne Ng
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2024-101
May 14, 2024
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-101.pdf
As social agents, humans possess a social intelligence that enables them to engage with others through complex signals such as facial expressions, body motion, and speech, among other communication modalities. Building systems that can perceive and model these fine-grained interaction dynamics is therefore essential for advancing human-machine interaction in everyday life. In this dissertation, I present three recent advances in this research area. The first models the nonverbal dyadic conversational dynamics between the facial expressions of a speaker and a listener. The second builds on this work, expanding to full-body dynamics. The final work explores how we can incorporate higher-level syntactic understanding by leveraging large language models.
Advisors: Trevor Darrell and Angjoo Kanazawa
BibTeX citation:
@phdthesis{Ng:EECS-2024-101,
    Author = {Ng, Evonne},
    Editor = {Darrell, Trevor and Kanazawa, Angjoo and Malik, Jitendra and Gopnik, Alison},
    Title = {Modeling Social Interactions from Multimodal Signals},
    School = {EECS Department, University of California, Berkeley},
    Year = {2024},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-101.html},
    Number = {UCB/EECS-2024-101},
    Abstract = {As social agents, humans possess a social intelligence that enables them to engage with others through complex signals such as facial expressions, body motion, and speech, among other communication modalities. Building systems that can perceive and model these fine-grained interaction dynamics is therefore essential for advancing human-machine interaction in everyday life. In this dissertation, I present three recent advances in this research area. The first models the nonverbal dyadic conversational dynamics between the facial expressions of a speaker and a listener. The second builds on this work, expanding to full-body dynamics. The final work explores how we can incorporate higher-level syntactic understanding by leveraging large language models.},
}
EndNote citation:
%0 Thesis
%A Ng, Evonne
%E Darrell, Trevor
%E Kanazawa, Angjoo
%E Malik, Jitendra
%E Gopnik, Alison
%T Modeling Social Interactions from Multimodal Signals
%I EECS Department, University of California, Berkeley
%D 2024
%8 May 14
%@ UCB/EECS-2024-101
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2024/EECS-2024-101.html
%F Ng:EECS-2024-101