FATE and DESTINI: A Framework for Cloud Recovery Testing
Haryadi S. Gunawi, Thanh Do, Pallavi Joshi, Peter Alvaro, Jungmin Yun, Jin-su Oh, Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Koushik Sen, and Dhruba Borthakur
EECS Department, University of California, Berkeley
Technical Report No. UCB/EECS-2010-127
September 27, 2010
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.pdf
As the cloud era begins, the fate and destiny of availability, reliability, and performance are in the hands of failure recovery. Unfortunately, recovery problems still take place, causing downtime, data loss, and many other problems. We propose a new testing framework for cloud recovery: FATE (Failure Testing Service) and DESTINI (Declarative Testing Specifications). With FATE, recovery is systematically tested against multiple failures. With DESTINI, recovery is specified clearly, concisely, and precisely. We have deployed our framework to three cloud systems (HDFS, ZooKeeper, and Cassandra), explored over 40,000 failure scenarios, written 74 specifications, found 16 new bugs, and reproduced 51 old bugs.
BibTeX citation:
@techreport{Gunawi:EECS-2010-127,
    Author = {Gunawi, Haryadi S. and Do, Thanh and Joshi, Pallavi and Alvaro, Peter and Yun, Jungmin and Oh, Jin-su and Hellerstein, Joseph M. and Arpaci-Dusseau, Andrea C. and Arpaci-Dusseau, Remzi H. and Sen, Koushik and Borthakur, Dhruba},
    Title = {FATE and DESTINI: A Framework for Cloud Recovery Testing},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2010},
    Month = {Sep},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.html},
    Number = {UCB/EECS-2010-127},
    Abstract = {As the cloud era begins, the fate and destiny of availability, reliability, and performance are in the hands of failure recovery. Unfortunately, recovery problems still take place, causing downtime, data loss, and many other problems. We propose a new testing framework for cloud recovery: FATE (Failure Testing Service) and DESTINI (Declarative Testing Specifications). With FATE, recovery is systematically tested against multiple failures. With DESTINI, recovery is specified clearly, concisely, and precisely. We have deployed our framework to three cloud systems (HDFS, ZooKeeper, and Cassandra), explored over 40,000 failure scenarios, written 74 specifications, found 16 new bugs, and reproduced 51 old bugs.},
}
EndNote citation:
%0 Report
%A Gunawi, Haryadi S.
%A Do, Thanh
%A Joshi, Pallavi
%A Alvaro, Peter
%A Yun, Jungmin
%A Oh, Jin-su
%A Hellerstein, Joseph M.
%A Arpaci-Dusseau, Andrea C.
%A Arpaci-Dusseau, Remzi H.
%A Sen, Koushik
%A Borthakur, Dhruba
%T FATE and DESTINI: A Framework for Cloud Recovery Testing
%I EECS Department, University of California, Berkeley
%D 2010
%8 September 27
%@ UCB/EECS-2010-127
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.html
%F Gunawi:EECS-2010-127