Detecting Backdoored Neural Networks with Structured Adversarial Attacks
Charles Yang
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2021-90
May 14, 2021
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-90.pdf
Deep learning models are becoming increasingly enmeshed in our digital infrastructure, unlocking our phones and powering our social media feeds. It is critical to understand the security vulnerabilities of these black-box models before they are deployed in safety-critical applications, such as self-driving cars and biometric authentication. In particular, recent literature has demonstrated the ability to install backdoors in deep learning models. This thesis uses structured optimization constraints to find adversarial attacks in order to determine whether a model is backdoored. We use the TrojAI dataset to benchmark our approach and achieve an AUC of 0.83 on the challenging round 3 dataset.
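The sketch below illustrates the general idea of structured adversarial optimization for backdoor detection; it follows trigger-reconstruction detectors in the style of Neural Cleanse rather than the thesis's exact algorithm, and all names (model, loader, target_class, lambda_l1) and hyperparameters are illustrative assumptions. A small mask and pattern are optimized to flip clean inputs to a candidate target class, with an L1 constraint keeping the mask sparse; a class that admits an anomalously small trigger is evidence of a backdoor.

    # Minimal sketch (assumed names/hyperparameters, not the thesis's exact method):
    # reconstruct a sparse trigger for a candidate target class via constrained
    # adversarial optimization, then compare trigger sizes across classes.
    import torch
    import torch.nn.functional as F

    def reconstruct_trigger(model, loader, target_class, image_shape=(3, 224, 224),
                            steps=200, lr=0.1, lambda_l1=1e-3, device="cpu"):
        """Optimize a mask + pattern that flips clean inputs to `target_class`.

        A backdoored class tends to admit an unusually small (low L1-norm) mask,
        so an outlier mask norm for one class can flag a trojaned model.
        """
        model.eval().to(device)
        # Unconstrained parameters; a sigmoid keeps mask and pattern in [0, 1].
        mask_param = torch.zeros(1, *image_shape[1:], device=device, requires_grad=True)
        pattern_param = torch.zeros(*image_shape, device=device, requires_grad=True)
        optimizer = torch.optim.Adam([mask_param, pattern_param], lr=lr)

        for _ in range(steps):
            for images, _labels in loader:
                images = images.to(device)
                mask = torch.sigmoid(mask_param)
                pattern = torch.sigmoid(pattern_param)
                # Blend the candidate trigger into each clean image.
                patched = (1 - mask) * images + mask * pattern
                logits = model(patched)
                target = torch.full((images.size(0),), target_class,
                                    dtype=torch.long, device=device)
                # Misclassification loss plus an L1 sparsity constraint on the mask.
                loss = F.cross_entropy(logits, target) + lambda_l1 * mask.abs().sum()
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()

        return torch.sigmoid(mask_param).detach(), torch.sigmoid(pattern_param).detach()

In practice one would run this per candidate target class and apply an outlier test (e.g., median absolute deviation) to the resulting mask norms to decide whether the model is backdoored.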
Advisor: Michael William Mahoney
BibTeX citation:
@mastersthesis{Yang:EECS-2021-90,
    Author = {Yang, Charles},
    Title = {Detecting Backdoored Neural Networks with Structured Adversarial Attacks},
    School = {EECS Department, University of California, Berkeley},
    Year = {2021},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-90.html},
    Number = {UCB/EECS-2021-90},
    Abstract = {Deep learning models are becoming increasingly enmeshed in our digital infrastructure, unlocking our phones and powering our social media feeds. It is critical to understand the security vulnerabilities of these black-box models before they are deployed in safety-critical applications, such as self-driving cars and biometric authentication. In particular, recent literature has demonstrated the ability to install backdoors in deep learning models. This thesis uses structured optimization constraints to find adversarial attacks in order to determine whether a model is backdoored. We use the TrojAI dataset to benchmark our approach and achieve an AUC of 0.83 on the challenging round 3 dataset.},
}
EndNote citation:
%0 Thesis
%A Yang, Charles
%T Detecting Backdoored Neural Networks with Structured Adversarial Attacks
%I EECS Department, University of California, Berkeley
%D 2021
%8 May 14
%@ UCB/EECS-2021-90
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-90.html
%F Yang:EECS-2021-90