In-Network PCA and Anomaly Detection

Ling Huang, Xuanlong Nguyen, Minos Garofalakis, Michael Jordan, Anthony D. Joseph and Nina Taft

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2007-10
January 11, 2007

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-10.pdf

We consider the problem of network anomaly detection in large distributed systems. In this setting, Principal Component Analysis (PCA) has been proposed as a method for discovering anomalies by continuously tracking the projection of the data onto a residual subspace. This method was shown to work well empirically in highly aggregated networks, that is, those with a limited number of large nodes and at coarse time scales. This approach, however, has scalability limitations. To overcome these limitations, we develop a PCA-based anomaly detector in which adaptive local data filters send to a coordinator just enough data to enable accurate global detection. Our method is based on a stochastic matrix perturbation analysis that characterizes the tradeoff between the accuracy of anomaly detection and the amount of data communicated over the network.
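
As a rough illustration of the residual-subspace idea mentioned above, the following is a minimal, centralized sketch in plain Python. It is not the in-network algorithm of the report: it has no local filters, no coordinator, and no perturbation analysis. The function names (`fit_subspace`, `residual_energy`), the use of power iteration to find the principal components, and the synthetic data are all assumptions made for this example. Rows are taken to be time steps and columns to be per-node measurements.

```python
import math
import random


def fit_subspace(rows, k, iters=100, seed=0):
    """Return (means, comps): column means and the top-k principal directions.

    Hypothetical helper: uses power iteration with Gram-Schmidt deflation,
    which is adequate for a small illustrative example.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    rng = random.Random(seed)
    comps = []
    for _ in range(k):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        for _ in range(iters):
            w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
            # Remove any part lying along components already found.
            for c in comps:
                dot = sum(a * b for a, b in zip(w, c))
                w = [a - dot * b for a, b in zip(w, c)]
            norm = math.sqrt(sum(a * a for a in w))
            v = [a / norm for a in w]
        comps.append(v)
    return means, comps


def residual_energy(x, means, comps):
    """Squared norm of x after centering and removing its projection
    onto the principal subspace spanned by comps."""
    xc = [a - m for a, m in zip(x, means)]
    resid = list(xc)
    for c in comps:
        dot = sum(a * b for a, b in zip(xc, c))
        resid = [a - dot * b for a, b in zip(resid, c)]
    return sum(a * a for a in resid)


def make_traffic(n_steps=200, seed=1):
    """Synthetic data: three nodes that share one common signal plus small noise."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_steps):
        s = rng.gauss(0.0, 10.0)
        rows.append([s + rng.gauss(0.0, 0.1) for _ in range(3)])
    return rows


if __name__ == "__main__":
    means, comps = fit_subspace(make_traffic(), k=1)
    # A measurement consistent with the shared signal, and one where a
    # single node deviates from it.
    print(residual_energy([3.0, 3.0, 3.0], means, comps))
    print(residual_energy([3.0, 3.0, 9.0], means, comps))
```

A detector built on this sketch would flag a time step when its residual energy exceeds a threshold. How that threshold is chosen is left open here, as is the question the report addresses: how little data each node needs to send for the coordinator to reach the same decision.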


BibTeX citation:

@techreport{Huang:EECS-2007-10,
    Author = {Huang, Ling and Nguyen, Xuanlong and Garofalakis, Minos and Jordan, Michael and Joseph, Anthony D. and Taft, Nina},
    Title = {In-Network PCA and Anomaly Detection},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2007},
    Month = {Jan},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-10.html},
    Number = {UCB/EECS-2007-10},
    Abstract = {We consider the problem of network anomaly detection in large distributed systems. In this setting, Principal Component Analysis (PCA) has been proposed as a method for discovering anomalies by continuously tracking the projection of the data onto a residual subspace. This method was shown to work well empirically in highly aggregated networks, that is, those with a limited number of large nodes and at coarse time scales. This approach, however, has scalability limitations. To overcome these limitations, we develop a PCA-based anomaly detector in which adaptive local data filters send to a coordinator just enough data to enable accurate global detection. Our method is based on a stochastic matrix perturbation analysis that characterizes the tradeoff between the accuracy of anomaly detection and the amount of data communicated over the network.}
}

EndNote citation:

%0 Report
%A Huang, Ling
%A Nguyen, Xuanlong
%A Garofalakis, Minos
%A Jordan, Michael
%A Joseph, Anthony D.
%A Taft, Nina
%T In-Network PCA and Anomaly Detection
%I EECS Department, University of California, Berkeley
%D 2007
%8 January 11
%@ UCB/EECS-2007-10
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-10.html
%F Huang:EECS-2007-10