FATE and DESTINI: A Framework for Cloud Recovery Testing
Haryadi S. Gunawi, Thanh Do, Pallavi Joshi, Peter Alvaro, Jungmin Yun, Jin-su Oh, Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Koushik Sen and Dhruba Borthakur
EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2010-127
September 27, 2010
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.pdf
As the cloud era begins, the fate and destiny of availability, reliability and performance are in the hands of failure recovery. Unfortunately, recovery problems still take place, causing downtimes, data loss, and many other problems. We propose a new testing framework for cloud recovery: FATE (Failure Testing Service) and DESTINI (Declarative Testing Specifications). With FATE, recovery is systematically tested against multiple failures. With DESTINI, recovery is specified clearly, concisely, and precisely. We have deployed our framework to three cloud systems (HDFS, ZooKeeper, and Cassandra), explored over 40,000 failure scenarios, written 74 specifications, found 16 new bugs, and reproduced 51 old bugs.
BibTeX citation:
@techreport{Gunawi:EECS-2010-127,
    Author = {Gunawi, Haryadi S. and Do, Thanh and Joshi, Pallavi and Alvaro, Peter and Yun, Jungmin and Oh, Jin-su and Hellerstein, Joseph M. and Arpaci-Dusseau, Andrea C. and Arpaci-Dusseau, Remzi H. and Sen, Koushik and Borthakur, Dhruba},
    Title = {FATE and DESTINI: A Framework for Cloud Recovery Testing},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2010},
    Month = {Sep},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.html},
    Number = {UCB/EECS-2010-127},
    Abstract = {As the cloud era begins, the fate and destiny of availability, reliability and performance are in the hands of failure recovery. Unfortunately, recovery problems still take place, causing downtimes, data loss, and many other problems. We propose a new testing framework for cloud recovery: FATE (Failure Testing Service) and DESTINI (Declarative Testing Specifications). With FATE, recovery is systematically tested against multiple failures. With DESTINI, recovery is specified clearly, concisely, and precisely. We have deployed our framework to three cloud systems (HDFS, ZooKeeper, and Cassandra), explored over 40,000 failure scenarios, written 74 specifications, found 16 new bugs, and reproduced 51 old bugs.}
}
EndNote citation:
%0 Report
%A Gunawi, Haryadi S.
%A Do, Thanh
%A Joshi, Pallavi
%A Alvaro, Peter
%A Yun, Jungmin
%A Oh, Jin-su
%A Hellerstein, Joseph M.
%A Arpaci-Dusseau, Andrea C.
%A Arpaci-Dusseau, Remzi H.
%A Sen, Koushik
%A Borthakur, Dhruba
%T FATE and DESTINI: A Framework for Cloud Recovery Testing
%I EECS Department, University of California, Berkeley
%D 2010
%8 September 27
%@ UCB/EECS-2010-127
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.html
%F Gunawi:EECS-2010-127