Shruti Agarwal

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2021-171

August 4, 2021

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-171.pdf

The creation of sophisticated fake videos had long been relegated to Hollywood studios or state actors. Recent advances in deep learning, however, have democratized the creation of sophisticated and compelling fake images, videos, and audio. These synthetically generated media -- so-called deep fakes -- continue to capture the imagination of the computer-graphics and computer-vision communities. At the same time, easy access to technology that can create deep fakes of anybody saying anything remains a concern because of its power to disrupt democratic elections, commit small- to large-scale fraud, fuel dis- and mis-information campaigns, and create non-consensual pornography.

To contend with this growing threat, I describe a diverse set of techniques for detecting state-of-the-art deep-fake videos. One set of techniques is identity-specific, exploiting soft- and hard-biometric cues such as dynamic facial motion and static facial appearance. Another set is identity-independent, exploiting the dynamics of lip and ear motion.

Given the large-scale presence of deep fakes on the internet and the poor scalability of forensic techniques, reliance on human perception to detect deep fakes is inevitable. Therefore, I also present several perceptual studies to understand the human visual system's ability to detect synthetic faces.

Advisor: Hany Farid


BibTeX citation:

@phdthesis{Agarwal:EECS-2021-171,
    Author= {Agarwal, Shruti},
    Title= {Detecting Synthetic Faces by Understanding Real Faces},
    School= {EECS Department, University of California, Berkeley},
    Year= {2021},
    Month= {Aug},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-171.html},
    Number= {UCB/EECS-2021-171},
    Abstract= {The creation of sophisticated fake videos had long been relegated to Hollywood studios or state actors. Recent advances in deep learning, however, have democratized the creation of sophisticated and compelling fake images, videos, and audio. These synthetically generated media -- so-called deep fakes -- continue to capture the imagination of the computer-graphics and computer-vision communities. At the same time, easy access to technology that can create deep fakes of anybody saying anything remains a concern because of its power to disrupt democratic elections, commit small- to large-scale fraud, fuel dis- and mis-information campaigns, and create non-consensual pornography.

To contend with this growing threat, I describe a diverse set of techniques for detecting state-of-the-art deep-fake videos. One set of techniques is identity-specific, exploiting soft- and hard-biometric cues such as dynamic facial motion and static facial appearance. Another set is identity-independent, exploiting the dynamics of lip and ear motion.

Given the large-scale presence of deep fakes on the internet and the poor scalability of forensic techniques, reliance on human perception to detect deep fakes is inevitable. Therefore, I also present several perceptual studies to understand the human visual system's ability to detect synthetic faces.},
}

EndNote citation:

%0 Thesis
%A Agarwal, Shruti 
%T Detecting Synthetic Faces by Understanding Real Faces
%I EECS Department, University of California, Berkeley
%D 2021
%8 August 4
%@ UCB/EECS-2021-171
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-171.html
%F Agarwal:EECS-2021-171