Modeling Events in Time Using Cascades Of Poisson Processes

Aleksandr Simma

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2010-109
July 19, 2010

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-109.pdf

For many applications, the data of interest can be best thought of as events -- entities that occur at a particular moment in time, have features and may in turn trigger the occurrence of other events. This thesis presents techniques for modeling the temporal dynamics of events by making each event induce an inhomogeneous Poisson process of others following it. The collection of all events observed is taken to be a draw from the superposition of the induced Poisson processes, as well as a baseline process for some of the initial triggers. The magnitude and shape of the induced Poisson processes controls the number, timing and features of the triggered events. We provide techniques for parameterizing these processes and present efficient, scalable techniques for inference.

The framework is then applied to three different domains that demonstrate the power of the approach. First, we consider the problem of identifying dependencies in a computer network through passive observation and provide a technique based on hypothesis testing for accurately discovering interactions between machines. Then, we look at the relationships between Twitter messages about stocks, using the application as a test-bed to experiment with different parameterizations of induced processes. Finally, we apply these tools to build a model of the revision history of Wikipedia, identifying how the community propagates edits from a page to its neighbors and demonstrating the scalability of our approach to very large datasets.

Advisor: Michael Jordan


BibTeX citation:

@phdthesis{Simma:EECS-2010-109,
    Author = {Simma, Aleksandr},
    Title = {Modeling Events in Time Using Cascades Of Poisson Processes},
    School = {EECS Department, University of California, Berkeley},
    Year = {2010},
    Month = {Jul},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-109.html},
    Number = {UCB/EECS-2010-109},
    Abstract = {For many applications, the data of interest can be best thought of as events -- entities that occur at a particular moment in time, have features and may in turn trigger the occurrence of other events.  This thesis presents techniques for modeling the temporal dynamics of events by making each event induce an inhomogeneous Poisson process of others following it.  The collection of all events observed is taken to be a draw from the superposition of the induced Poisson processes, as well as a baseline process for some of the initial triggers.  The magnitude and shape of the induced Poisson processes controls the number, timing and features of the triggered events.  We provide techniques for parameterizing these processes and present efficient, scalable techniques for inference.

The framework is then applied to three different domains that demonstrate the power of the approach.  First, we consider the problem of identifying dependencies in a computer network through passive observation and provide a technique based on hypothesis testing for accurately discovering interactions between machines.  Then, we look at the relationships between  Twitter messages about stocks, using the application as a test-bed to experiment with different parameterizations of induced processes.  Finally, we apply these tools to build a model of the revision history of Wikipedia, identifying how the community propagates edits from a page to its neighbors and demonstrating the scalability of our approach to very large datasets.}
}

EndNote citation:

%0 Thesis
%A Simma, Aleksandr
%T Modeling Events in Time Using Cascades Of Poisson Processes
%I EECS Department, University of California, Berkeley
%D 2010
%8 July 19
%@ UCB/EECS-2010-109
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-109.html
%F Simma:EECS-2010-109