# Towards Characterizing Model Extraction Queries and How to Detect Them

### http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-126.pdf

Machine Learning as a Service (MLaaS) has become popular in cloud services as Deep Neural Networks (DNNs) demonstrate high performance in many domains and as cloud computing grows rapidly. Meanwhile, developing enterprise MLaaS remains costly, since training machine learning models typically requires large-scale data collection and labeling. However, researchers have shown that model extraction attacks can *steal* the functionality of models deployed in the cloud using only black-box access to the victim model, by sending adversarial queries to its application programming interface (API). This information leakage threatens enterprise machine learning models that are protected as intellectual property. In this paper, we present two lines of research on model extraction attacks: characterizing adversarial queries and building detectors against them. In our first line of research, we find that although adversarial queries help the adversary explore the victim's decision regions to some extent, they fail to extract properties of the decision boundaries, which most existing algorithms claim to be capable of. In our second line of research, we propose two ways to detect Jacobian-based and data-free model extraction attacks: 1) a similarity-based detector, which shows the possibility of building a robust detector against model extraction attacks by adapting detectors for adversarial examples, and 2) a VAE-based detector, which uses a Variational Autoencoder to estimate whether queries are benign.
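The similarity-based detection idea can be illustrated with a minimal sketch: Jacobian-based extraction attacks repeatedly apply small perturbations to seed inputs, so an attack query stream tends to contain many near-duplicate points, whereas benign queries are spread out. The function names, the Gaussian toy data, and the distance threshold below are all illustrative assumptions, not the thesis's actual detector.

```python
import numpy as np

def nearest_neighbor_distances(queries):
    """For each query after the first, the L2 distance to the
    closest earlier query in the stream."""
    dists = []
    for i in range(1, len(queries)):
        diffs = queries[:i] - queries[i]
        dists.append(np.min(np.linalg.norm(diffs, axis=1)))
    return np.array(dists)

def is_extraction_like(queries, threshold=1.0):
    """Flag a query stream whose median nearest-neighbor distance is
    suspiciously small (threshold is an illustrative assumption;
    a real deployment would calibrate it on benign traffic)."""
    d = nearest_neighbor_distances(np.asarray(queries, dtype=float))
    return bool(np.median(d) < threshold)

rng = np.random.default_rng(0)
# Benign traffic: well-separated natural inputs.
benign = rng.normal(size=(50, 32))
# Attack traffic: small perturbations around a single seed,
# mimicking Jacobian-based query synthesis.
seed = rng.normal(size=32)
attack = seed + 0.05 * rng.normal(size=(50, 32))

print(is_extraction_like(benign))  # False: queries are far apart
print(is_extraction_like(attack))  # True: queries cluster tightly
```

A limitation the thesis's first line of research points to is that such clustering statistics characterize where the adversary queries, not what the queries reveal about the decision boundary itself.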

BibTeX citation:

@mastersthesis{Zhang:EECS-2021-126,
Author = {Zhang, Zhanyuan and Chen, Yizheng and Wagner, David A.},
Title = {Towards Characterizing Model Extraction Queries and How to Detect Them},
School = {EECS Department, University of California, Berkeley},
Year = {2021},
Month = {May},
URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-126.html},
Number = {UCB/EECS-2021-126},
Abstract = {Machine Learning as a Service (MLaaS) has become popular in cloud services as Deep Neural Networks (DNNs) are demonstrating high-performance in many domains and as the rapid growth in cloud computing. Meanwhile, developing enterprise MLaaS remains costly since training machine learning models typically requires large-scale data collection and labeling. However, researchers have shown that model extraction attacks are able to \textit{steal} functionality of models deployed on Cloud only through black-box access to victim's models and sending adversarial queries to application programming interface (API). This information leakage indicates potential threats to protecting enterprise machine learning models as a part of intellectual property. In this paper, we present two lines of our research on model extraction attacks: characterizing adversarial queries and building detectors against them. In our first line of research, we find that although adversarial queries help adversary explore victim's decision regions to some extent, they fail to extract properties of decision boundaries, which is most of the existing algorithms claim to be capable of. In our second line of research, we propose two ways to detect Jacobian-based and Data-free model extraction attacks: 1) a similarity-based detector to show the possibility of building a robust detector against model extraction attacks by adopting detectors for adversarial examples, and 2) a VAE-based detector that uses Variational Autoencoder to estimate whether queries are benign or not.}
}


EndNote citation:

%0 Thesis
%A Zhang, Zhanyuan
%A Chen, Yizheng
%A Wagner, David A.
%T Towards Characterizing Model Extraction Queries and How to Detect Them
%I EECS Department, University of California, Berkeley
%D 2021
%8 May 14
%@ UCB/EECS-2021-126
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2021/EECS-2021-126.html
%F Zhang:EECS-2021-126