Representation Learning in Video and Text - A Social Media Misinformation Perspective

Kehan Wang

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2022-140

May 18, 2022

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-140.pdf

Short videos have become the most popular form of social media in recent years. Popular short videos easily gather billions of views, and recent studies find that misinformation spreads faster through video and that viewers have trouble identifying misinformation on social media. In this work, we develop self-supervised and unsupervised methods to identify misinformation by detecting inconsistency across multiple modalities, namely video and text. We explore both contrastive learning and masked language modeling (MLM) on a dataset of one million Twitter posts spanning 2021 to 2022. Our best-performing method outperforms state-of-the-art methods by over 9% in accuracy. We further show that the performance of random mismatch detection transfers to actual misinformation on a manually labeled dataset of 401 posts. For this dataset, our method outperforms state-of-the-art methods by over 14% in accuracy.
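To illustrate the contrastive-learning direction named in the abstract, below is a minimal sketch of video-text consistency scoring in PyTorch. It is not the report's actual architecture: the class name, feature dimensions, temperature, and the use of pooled frame/tweet features are illustrative assumptions. The idea it shows is the standard one the abstract describes, namely projecting both modalities into a shared space, training with a symmetric contrastive (InfoNCE) loss on matched video-text pairs, and flagging low similarity at inference as a potential mismatch.

# Hedged sketch of contrastive video-text matching (assumed setup, not the
# report's implementation). Matched pairs in a batch are positives; all
# other pairings are negatives.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VideoTextContrastive(nn.Module):
    def __init__(self, video_dim=1024, text_dim=768, embed_dim=256, temperature=0.07):
        super().__init__()
        # Projection heads; a real system would sit on top of pretrained
        # video and text encoders (e.g. pooled frame features, BERT tokens).
        self.video_proj = nn.Linear(video_dim, embed_dim)
        self.text_proj = nn.Linear(text_dim, embed_dim)
        self.temperature = temperature

    def forward(self, video_feats, text_feats):
        # L2-normalize so dot products are cosine similarities.
        v = F.normalize(self.video_proj(video_feats), dim=-1)
        t = F.normalize(self.text_proj(text_feats), dim=-1)
        logits = v @ t.T / self.temperature           # (B, B) similarity matrix
        targets = torch.arange(v.size(0), device=v.device)
        # Symmetric InfoNCE over video-to-text and text-to-video directions.
        loss = (F.cross_entropy(logits, targets) +
                F.cross_entropy(logits.T, targets)) / 2
        return loss, logits

# Toy usage with random features standing in for encoder outputs.
model = VideoTextContrastive()
video_feats = torch.randn(8, 1024)   # hypothetical pooled video features per post
text_feats = torch.randn(8, 768)     # hypothetical pooled text features per post
loss, sims = model(video_feats, text_feats)
# At test time, the diagonal entry sims[i, i] can be thresholded: a low
# video-text similarity suggests post i's video and text are inconsistent.

Training this objective needs only matched posts (self-supervision by random mismatching), which is consistent with the abstract's framing of detecting random mismatches and then transferring to labeled misinformation.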

Advisor: Avideh Zakhor


BibTeX citation:

@mastersthesis{Wang:EECS-2022-140,
    Author= {Wang, Kehan},
    Editor= {Zakhor, Avideh},
    Title= {Representation Learning in Video and Text - A Social Media Misinformation Perspective},
    School= {EECS Department, University of California, Berkeley},
    Year= {2022},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-140.html},
    Number= {UCB/EECS-2022-140},
    Abstract= {Short videos have become the most popular form of social media in recent years. Popular short videos easily gather billions of views, and recent studies find that misinformation spreads faster through video and that viewers have trouble identifying misinformation on social media. In this work, we develop self-supervised and unsupervised methods to identify misinformation by detecting inconsistency across multiple modalities, namely video and text. We explore both contrastive learning and masked language modeling (MLM) on a dataset of one million Twitter posts spanning 2021 to 2022. Our best-performing method outperforms state-of-the-art methods by over 9% in accuracy. We further show that the performance of random mismatch detection transfers to actual misinformation on a manually labeled dataset of 401 posts. For this dataset, our method outperforms state-of-the-art methods by over 14% in accuracy.},
}

EndNote citation:

%0 Thesis
%A Wang, Kehan 
%E Zakhor, Avideh 
%T Representation Learning in Video and Text - A Social Media Misinformation Perspective
%I EECS Department, University of California, Berkeley
%D 2022
%8 May 18
%@ UCB/EECS-2022-140
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2022/EECS-2022-140.html
%F Wang:EECS-2022-140