System Problem Detection by Mining Console Logs

Wei Xu

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2010-112
August 1, 2010

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-112.pdf

The console logs generated by an application contain information that the developers believed would be useful in debugging or monitoring the application. Despite the ubiquity and large size of these logs, they are rarely exploited because they are not readily machine-parsable. We propose a fully automatic methodology for mining console logs using a combination of program analysis, information retrieval, data mining, and machine learning techniques. We use source code analysis to recover the structure of the console logs. We then extract features, such as execution traces, from the logs and use data mining and machine learning methods to detect problems. We also use a decision tree to distill the detection results into a format readily understandable by operators, who need not be familiar with the anomaly detection algorithms. The whole process requires no human intervention and scales to large volumes of log data. We extend the methods to perform online analysis on console log streams. We evaluate the technique on several real-world systems and detect problems that provide insight to system operators.
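
To make the pipeline described above more concrete, here is a minimal sketch, not the thesis implementation. It assumes the message templates have already been recovered (in the thesis this comes from source code analysis; here they are hand-written regular expressions). It also assumes messages can be grouped by a shared identifier, and it replaces the statistical anomaly detector with a plain comparison against an expected count pattern. The template names, log lines, and helper functions are all hypothetical.

```python
import re
from collections import Counter, defaultdict

# Hypothetical message templates, standing in for what source code analysis
# might recover from logging statements in an application.
TEMPLATES = {
    "receiving": re.compile(r"Receiving block (?P<block>\S+) from (?P<src>\S+)"),
    "received": re.compile(r"Received block (?P<block>\S+) of size (?P<size>\d+)"),
    "deleting": re.compile(r"Deleting block (?P<block>\S+)"),
}


def parse_line(line):
    """Return (message_type, fields) for the first matching template, else None."""
    for name, pattern in TEMPLATES.items():
        match = pattern.search(line)
        if match:
            return name, match.groupdict()
    return None


def count_vectors(lines, key="block"):
    """Group parsed messages by an identifier and count message types per group."""
    vectors = defaultdict(Counter)
    for line in lines:
        parsed = parse_line(line)
        if parsed is None:
            continue  # free text that matches no known template is skipped
        name, fields = parsed
        if key in fields:
            vectors[fields[key]][name] += 1
    return vectors


def flag_unusual(vectors, expected):
    """Flag identifiers whose counts differ from an expected pattern.

    Assumption: a deliberately simple stand-in for a real anomaly detector.
    """
    return sorted(k for k, v in vectors.items() if dict(v) != expected)


# Made-up log lines for illustration only.
SAMPLE_LOG = [
    "INFO Receiving block blk_1 from 10.0.0.1",
    "INFO Received block blk_1 of size 1024",
    "INFO Receiving block blk_2 from 10.0.0.2",
    "INFO Receiving block blk_2 from 10.0.0.3",
    "INFO Received block blk_2 of size 2048",
    "WARN unrelated message",
]

vectors = count_vectors(SAMPLE_LOG)
expected = {"receiving": 1, "received": 1}
print(flag_unusual(vectors, expected))  # prints ['blk_2']
```

In this toy run, blk_2 is flagged because it was reported as being received from two sources, which differs from the expected one-receive, one-acknowledge pattern. A learned detector would infer what counts as normal from the data instead of taking it as an argument.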

Advisors: David A. Patterson and Armando Fox


BibTeX citation:

@phdthesis{Xu:EECS-2010-112,
    Author = {Xu, Wei},
    Title = {System Problem Detection by Mining Console Logs},
    School = {EECS Department, University of California, Berkeley},
    Year = {2010},
    Month = {Aug},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-112.html},
    Number = {UCB/EECS-2010-112},
    Abstract = {The console logs generated by an application contain information that the developers believed would be useful in debugging or monitoring the application. Despite the ubiquity and large size of these logs, they are rarely exploited because they are not readily machine-parsable. We propose a fully automatic methodology for mining console logs using a combination of program analysis, information retrieval, data mining, and machine learning techniques. We use source code analysis to understand the structures from the console logs. We then extract features, such as execution traces, from logs and use data mining and machine learning methods to detect problems. We also use a decision tree to distill the detection results to a format readily understandable by operators who need not be familiar with the anomaly detection algorithms. The whole process requires no human intervention and can scale to large scale log data. We extend the methods to perform online analysis on console log streams. We evaluate the technique on several real-world systems and detected problems that are insightful to systems operators.}
}

EndNote citation:

%0 Thesis
%A Xu, Wei
%T System Problem Detection by Mining Console Logs
%I EECS Department, University of California, Berkeley
%D 2010
%8 August 1
%@ UCB/EECS-2010-112
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-112.html
%F Xu:EECS-2010-112