Focus Replay Debugging Effort On the Control Plane
Gautam Altekar and Ion Stoica
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2010-88
May 29, 2010
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-88.pdf
Replay debugging systems enable the reproduction and debugging of non-deterministic failures in production application runs. However, no existing replay system is suitable for datacenter applications like Cassandra, Hadoop, and Hypertable. On these large-scale, distributed, and data-intensive programs, existing replay methods either incur excessive production recording overheads or are unable to provide high-fidelity replay.
In this position paper, we hypothesize and empirically verify that control plane determinism is the key to record-efficient and high-fidelity replay of datacenter applications. The key idea behind control plane determinism is that debugging does not always require a precise replica of the original application run. Instead, it often suffices to produce some run that exhibits the original behavior of the control plane, the application code responsible for controlling and managing data flow through a datacenter system.
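As a rough illustration of why restricting fidelity to the control plane can reduce recording cost, the following toy sketch logs control-plane messages verbatim but keeps only the length of data-plane payloads. It is not the authors' system: the names `Message`, `ControlPlaneRecorder`, and `replay`, the "control"/"data" channel labels, and the zero-filled stand-in for data-plane contents are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    channel: str    # "control" or "data" (hypothetical labels)
    payload: bytes


@dataclass
class ControlPlaneRecorder:
    """Logs control payloads verbatim; keeps only the length of data payloads."""
    log: list = field(default_factory=list)

    def record(self, msg: Message) -> None:
        if msg.channel == "control":
            self.log.append(("control", msg.payload))
        else:
            # Assumption: data-plane contents are not needed to reproduce
            # control-plane behavior, so only the size is retained.
            self.log.append(("data", len(msg.payload)))

    def logged_bytes(self) -> int:
        """Total payload bytes stored in the log (control plane only)."""
        return sum(len(entry) for kind, entry in self.log if kind == "control")


def replay(log):
    """Rebuild a message stream: control payloads are exact, data payloads
    are zero-filled placeholders of the original length."""
    return [
        Message(kind, entry if kind == "control" else bytes(entry))
        for kind, entry in log
    ]


# A tiny trace: two small control messages around one large data transfer.
trace = [
    Message("control", b"OPEN block-7"),
    Message("data", b"x" * 1_000_000),
    Message("control", b"CLOSE block-7"),
]

rec = ControlPlaneRecorder()
for m in trace:
    rec.record(m)

replayed = replay(rec.log)
print(rec.logged_bytes(), "payload bytes logged out of",
      sum(len(m.payload) for m in trace))
```

In this sketch the replayed run matches the original on every control-plane message, while the data-plane payload differs in content but not in size. A real replay system would have to choose or infer data-plane values that remain consistent with the recorded control-plane behavior; the zero-filled placeholder above is only a stand-in for that step.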
BibTeX citation:
@techreport{Altekar:EECS-2010-88,
    Author = {Altekar, Gautam and Stoica, Ion},
    Title = {Focus Replay Debugging Effort On the Control Plane},
    Year = {2010},
    Month = {May},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-88.html},
    Number = {UCB/EECS-2010-88},
    Abstract = {Replay debugging systems enable the reproduction and debugging of non-deterministic failures in production application runs. However, no existing replay system is suitable for datacenter applications like Cassandra, Hadoop, and Hypertable. On these large-scale, distributed, and data-intensive programs, existing replay methods either incur excessive production recording overheads or are unable to provide high-fidelity replay. In this position paper, we hypothesize and empirically verify that control plane determinism is the key to record-efficient and high-fidelity replay of datacenter applications. The key idea behind control plane determinism is that debugging does not always require a precise replica of the original application run. Instead, it often suffices to produce some run that exhibits the original behavior of the control plane, the application code responsible for controlling and managing data flow through a datacenter system.},
}
EndNote citation:
%0 Report
%A Altekar, Gautam
%A Stoica, Ion
%T Focus Replay Debugging Effort On the Control Plane
%I EECS Department, University of California, Berkeley
%D 2010
%8 May 29
%@ UCB/EECS-2010-88
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-88.html
%F Altekar:EECS-2010-88