FATE and DESTINI: A Framework for Cloud Recovery Testing

Haryadi S. Gunawi, Thanh Do, Pallavi Joshi, Peter Alvaro, Jungmin Yun, Jin-su Oh, Joseph M. Hellerstein, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Koushik Sen and Dhruba Borthakur

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2010-127
September 27, 2010

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.pdf

As the cloud era begins, the fate and destiny of availability, reliability and performance are in the hands of failure recovery. Unfortunately, recovery problems still take place, causing downtimes, data loss, and many other problems. We propose a new testing framework for cloud recovery: FATE (Failure Testing Service) and DESTINI (Declarative Testing Specifications). With FATE, recovery is systematically tested against multiple failures. With DESTINI, recovery is specified clearly, concisely, and precisely. We have deployed our framework to three cloud systems (HDFS, ZooKeeper, and Cassandra), explored over 40,000 failure scenarios, written 74 specifications, found 16 new bugs, and reproduced 51 old bugs.


BibTeX citation:

@techreport{Gunawi:EECS-2010-127,
    Author = {Gunawi, Haryadi S. and Do, Thanh and Joshi, Pallavi and Alvaro, Peter and Yun, Jungmin and Oh, Jin-su and Hellerstein, Joseph M. and Arpaci-Dusseau, Andrea C. and Arpaci-Dusseau, Remzi H. and Sen, Koushik and Borthakur, Dhruba},
    Title = {FATE and DESTINI: A Framework for Cloud Recovery Testing},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2010},
    Month = {Sep},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.html},
    Number = {UCB/EECS-2010-127},
    Abstract = {As the cloud era begins, the fate and destiny of availability,
reliability and performance are in the hands of failure recovery.
Unfortunately, recovery problems still take place, causing downtimes,
data loss, and many other problems.  We propose a new testing
framework for cloud recovery: FATE (Failure Testing Service) and
DESTINI (Declarative Testing Specifications).  With FATE, recovery is
systematically tested against multiple failures.  With DESTINI, recovery
is specified clearly, concisely, and precisely.  We have deployed our
framework to three cloud systems (HDFS, ZooKeeper, and Cassandra),
explored over 40,000 failure scenarios, written 74 specifications, found
16 new bugs, and reproduced 51 old bugs.}
}

EndNote citation:

%0 Report
%A Gunawi, Haryadi S.
%A Do, Thanh
%A Joshi, Pallavi
%A Alvaro, Peter
%A Yun, Jungmin
%A Oh, Jin-su
%A Hellerstein, Joseph M.
%A Arpaci-Dusseau, Andrea C.
%A Arpaci-Dusseau, Remzi H.
%A Sen, Koushik
%A Borthakur, Dhruba
%T FATE and DESTINI: A Framework for Cloud Recovery Testing
%I EECS Department, University of California, Berkeley
%D 2010
%8 September 27
%@ UCB/EECS-2010-127
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2010/EECS-2010-127.html
%F Gunawi:EECS-2010-127