In-Network PCA and Anomaly Detection

Ling Huang, Xuanlong Nguyen, Minos Garofalakis, Michael Jordan, Anthony D. Joseph, and Nina Taft

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2007-10

January 11, 2007

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-10.pdf

We consider the problem of network anomaly detection in large distributed systems. In this setting, Principal Component Analysis (PCA) has been proposed as a method for discovering anomalies by continuously tracking the projection of the data onto a residual subspace. This method was shown to work well empirically in highly aggregated networks, that is, those with a limited number of large nodes and at coarse time scales. This approach, however, has scalability limitations. To overcome these limitations, we develop a PCA-based anomaly detector in which adaptive local data filters send to a coordinator just enough data to enable accurate global detection. Our method is based on a stochastic matrix perturbation analysis that characterizes the tradeoff between the accuracy of anomaly detection and the amount of data communicated over the network.
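To make the residual-subspace idea in the abstract concrete, the following is a minimal centralized sketch, not the report's distributed detector. It fits PCA to a measurement matrix, projects each new measurement onto the residual subspace (the complement of the top-k principal directions), and flags a measurement whose residual energy exceeds a threshold. The function names, the choice of k, the synthetic data, and the percentile-based threshold are all illustrative assumptions and are not taken from the report.

```python
import numpy as np

def fit_residual_projector(X, k):
    """Return (mean, P_res), where P_res projects onto the residual subspace.

    X is a (timesteps x nodes) measurement matrix. k is the number of
    principal components treated as the "normal" subspace (assumed here).
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V_k = Vt[:k].T                              # (nodes x k) normal-subspace basis
    P_res = np.eye(X.shape[1]) - V_k @ V_k.T    # projector onto the complement
    return mean, P_res

def residual_energy(x, mean, P_res):
    """Squared norm of the projection of x onto the residual subspace."""
    r = P_res @ (x - mean)
    return float(r @ r)

# Synthetic data (assumption): 8 nodes driven by 2 latent factors plus small noise.
rng = np.random.default_rng(0)
n_steps, n_nodes, k = 500, 8, 2
latent = rng.normal(size=(n_steps, k))
mixing = rng.normal(size=(k, n_nodes))
X = latent @ mixing + 0.01 * rng.normal(size=(n_steps, n_nodes))

mean, P_res = fit_residual_projector(X, k)
scores = np.array([residual_energy(x, mean, P_res) for x in X])

# Hypothetical empirical threshold; a stand-in for a principled statistical test.
threshold = np.percentile(scores, 99.9)

# Inject a spike at a single node and score the perturbed measurement.
anomalous_point = X[0].copy()
anomalous_point[3] += 5.0
anomaly_score = residual_energy(anomalous_point, mean, P_res)
is_anomaly = anomaly_score > threshold
```

In this sketch every measurement is available at one place, which is the setting whose scalability the abstract questions. The report's contribution is to let local filters withhold most of the data while bounding the effect on the detector, and that part is not modeled here.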

BibTeX citation:

@techreport{Huang:EECS-2007-10,
    Author= {Huang, Ling and Nguyen, Xuanlong and Garofalakis, Minos and Jordan, Michael and Joseph, Anthony D. and Taft, Nina},
    Title= {In-Network PCA and Anomaly Detection},
    Year= {2007},
    Month= {Jan},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-10.html},
    Number= {UCB/EECS-2007-10},
    Abstract= {We consider the problem of network anomaly detection in large distributed systems. In this setting, Principal Component Analysis (PCA) has been proposed as a method for discovering anomalies by continuously tracking the projection of the data onto a residual subspace. This method was shown to work well empirically in highly aggregated networks, that is, those with a limited number of large nodes and at coarse time scales. This approach, however, has scalability limitations. To overcome these limitations, we develop a PCA-based anomaly detector in which adaptive local data filters send to a coordinator just enough data to enable accurate global detection. Our method is based on a stochastic matrix perturbation analysis that characterizes the tradeoff between the accuracy of anomaly detection and the amount of data communicated over the network.},
}

EndNote citation:

%0 Report
%A Huang, Ling 
%A Nguyen, Xuanlong 
%A Garofalakis, Minos 
%A Jordan, Michael 
%A Joseph, Anthony D. 
%A Taft, Nina 
%T In-Network PCA and Anomaly Detection
%I EECS Department, University of California, Berkeley
%D 2007
%8 January 11
%@ UCB/EECS-2007-10
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-10.html
%F Huang:EECS-2007-10