Rising Stars 2020:

Kexin Rong

PhD Candidate

Stanford University


Areas of Interest

  • Database Management Systems

Poster

Prioritizing Computation Over Input Data with Locality Sensitive Hashing

Abstract

The exponential growth of data, fueled in large part by machine-generated data, significantly outpaces the computational power available to process it. To enable analytics on large data with limited computation, I build systems and design algorithms that use synopses and sampling techniques to prioritize computation over the inputs that have the most impact on downstream analytics tasks.

I present two results from one area of my work: prioritizing computation via novel uses of Locality Sensitive Hashing (LSH). LSH hashes similar inputs into the same "buckets" with high probability. First, I describe an end-to-end earthquake detection system I built that uses LSH to exploit the high waveform similarity of repeating earthquakes. Off-the-shelf LSH-based similarity search does not scale, so I incorporated seismology domain knowledge into the pipeline to improve efficiency and result quality. The system has directly enabled the discovery of 597 new earthquakes near a nuclear power plant in California and continues to attract interest from the seismology community. Second, I describe a theoretical result that shows how to reduce the computational cost of kernel density estimation. Using LSH as a smart sampler, my collaborators and I developed the first practical algorithm that provably improves upon random sampling for the Gaussian kernel in high dimensions.
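To make the bucketing idea concrete, here is a minimal sketch of one well-known LSH family, random-hyperplane hashing for cosine similarity. It is an illustration only: the abstract does not say which hash family the earthquake detection system or the kernel density estimator uses, and the function names (`make_hyperplanes`, `signature`, `build_buckets`) are hypothetical.

```python
import random
from collections import defaultdict


def make_hyperplanes(num_bits, dim, seed=0):
    """Draw num_bits random hyperplanes (Gaussian normal vectors) in dim dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """Return one bit per hyperplane: which side of the hyperplane vec falls on."""
    return tuple(
        1 if sum(h_i * x_i for h_i, x_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def build_buckets(vectors, hyperplanes):
    """Group vector indices by signature; each distinct signature is one bucket."""
    buckets = defaultdict(list)
    for idx, vec in enumerate(vectors):
        buckets[signature(vec, hyperplanes)].append(idx)
    return buckets


if __name__ == "__main__":
    planes = make_hyperplanes(num_bits=8, dim=3, seed=1)
    # Illustrative data: the first two vectors point in the same direction.
    data = [[1.0, 2.0, -0.5], [2.0, 4.0, -1.0], [-3.0, 0.1, 2.0]]
    for sig, members in build_buckets(data, planes).items():
        print(sig, members)
```

In this family, two vectors agree on any single bit with probability 1 - θ/π, where θ is the angle between them, so vectors that point in nearly the same direction tend to share a bucket. A similarity search can then compare only items within a bucket rather than all pairs.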

Bio

Kexin Rong is a Ph.D. student in Computer Science at Stanford University, co-advised by Professor Peter Bailis and Professor Philip Levis. She designs and builds systems to enable data analytics at scale, supporting applications including scientific analysis, infrastructure monitoring, and analytical queries on big-data clusters. Prior to Stanford, she received her bachelor’s degree from the California Institute of Technology.
