Compromising PCA-based Anomaly Detectors for Network-Wide Traffic

Benjamin I. P. Rubinstein and Blaine Nelson and Ling Huang and Anthony D. Joseph and Shing-hon Lau and Nina Taft and Doug Tygar

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2008-73

May 29, 2008

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-73.pdf

The use of machine learning techniques to improve network design is gaining popularity. When these techniques are applied to security problems, a fundamental problem arises: they are susceptible to adversaries who poison the learning phase. In this paper we focus on PCA-based anomaly detectors used to identify anomalies in backbone networks via a comprehensive view of the network's traffic. We present four data poisoning schemes and evaluate their effectiveness in increasing an attacker's chance of evading detection. Because machine learning techniques often require retraining when used on evolving data, attackers can also employ stealthy poisoning methods that perturb the PCA detector slowly and covertly over time. We demonstrate that some of these attacks on PCA-based detectors can increase the adversary's chance of success sixfold under relatively moderate poisoning, and we comment on possible directions for combating these types of attacks.
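To make the idea in the abstract concrete, the following is a minimal sketch of a PCA residual-subspace detector and of how poisoned training data can weaken it. Everything in it is an illustrative assumption rather than material from the report: the synthetic traffic model, the helper names `fit_pca_detector` and `spe`, the empirical-quantile threshold, and the simple "add extra traffic to one link during training" poisoning, which is not claimed to be one of the report's four schemes. The poisoning is deliberately exaggerated so the effect is easy to see.

```python
import numpy as np

def spe(X, mean, P):
    """Squared prediction error: squared norm of the part of each row
    that lies outside the normal subspace spanned by the columns of P."""
    Xc = np.atleast_2d(X) - mean
    residual = Xc - Xc @ P @ P.T
    return np.sum(residual**2, axis=1)

def fit_pca_detector(Y, k, quantile=0.995):
    """Fit a detector on a (time bins x links) traffic matrix Y.
    The threshold is an empirical quantile of the training SPE
    (an assumption made for this sketch)."""
    mean = Y.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Y - mean, full_matrices=False)
    P = Vt[:k].T  # (links x k) basis of the normal subspace
    threshold = np.quantile(spe(Y, mean, P), quantile)
    return mean, P, threshold

# Synthetic "normal" traffic: two latent factors plus small noise (hypothetical).
rng = np.random.default_rng(0)
T, N, k = 1000, 8, 2
w1 = np.ones(N) / np.sqrt(N)
w2 = np.array([1.0, -1.0] * (N // 2)) / np.sqrt(N)
factors = rng.normal(0.0, 3.0, size=(T, 2))
clean = 50.0 + factors @ np.vstack([w1, w2]) + rng.normal(0.0, 0.1, size=(T, N))

# Poisoning: extra traffic of varying volume on link 0 during training.
u = np.zeros(N)
u[0] = 1.0
poisoned = clean + np.outer(rng.uniform(0.0, 20.0, size=T), u)

# The traffic spike the attacker wants to slip past the detector later.
attack = 50.0 + 30.0 * u

results = {}
for name, Y in [("clean", clean), ("poisoned", poisoned)]:
    mean, P, threshold = fit_pca_detector(Y, k)
    score = spe(attack, mean, P)[0]
    results[name] = {"spe": score, "threshold": threshold,
                     "flagged": bool(score > threshold)}
    print(name, results[name])
```

In this toy setting the detector trained on clean data flags the spike, while the detector trained on poisoned data does not: the added variance pulls the target direction into the learned normal subspace, shifts the training mean, and inflates the threshold. Gradual poisoning across retraining rounds, as described in the abstract, is not modeled here.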

BibTeX citation:

@techreport{Rubinstein:EECS-2008-73,
    Author= {Rubinstein, Benjamin I. P. and Nelson, Blaine and Huang, Ling and Joseph, Anthony D. and Lau, Shing-hon and Taft, Nina and Tygar, Doug},
    Title= {Compromising PCA-based Anomaly Detectors for Network-Wide Traffic},
    Year= {2008},
    Month= {May},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-73.html},
    Number= {UCB/EECS-2008-73},
    Abstract= {The use of machine learning techniques to improve network design is gaining popularity. When these techniques are applied to security problems, a fundamental problem arises: they are susceptible to adversaries who poison the learning phase. In this paper we focus on PCA-based anomaly detectors used to identify anomalies in backbone networks via a comprehensive view of the network's traffic. We present four data poisoning schemes and evaluate their effectiveness in increasing an attacker's chance of evading detection. Because machine learning techniques often require retraining when used on evolving data, attackers can also employ stealthy poisoning methods that perturb the PCA detector slowly and covertly over time. We demonstrate that some of these attacks on PCA-based detectors can increase the adversary's chance of success sixfold under relatively moderate poisoning, and we comment on possible directions for combating these types of attacks.},
}

EndNote citation:

%0 Report
%A Rubinstein, Benjamin I. P.
%A Nelson, Blaine
%A Huang, Ling
%A Joseph, Anthony D.
%A Lau, Shing-hon
%A Taft, Nina
%A Tygar, Doug
%T Compromising PCA-based Anomaly Detectors for Network-Wide Traffic
%I EECS Department, University of California, Berkeley
%D 2008
%8 May 29
%@ UCB/EECS-2008-73
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-73.html
%F Rubinstein:EECS-2008-73