Minuet: Rethinking Concurrency Control in Storage Area Networks

Andrey Ermolinskiy, Daekyeong Moon, Byung-Gon Chun and Scott Shenker

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2008-57
May 19, 2008

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-57.pdf

Clustered applications in storage area networks (SANs), widely adopted in enterprise datacenters, have traditionally relied on distributed locking protocols to coordinate concurrent access to shared storage devices. In this report, we examine the semantics of traditional lock services for SAN environments and ask whether they are sufficient to guarantee data safety at the application level. We argue that a traditional lock service design that enforces strict mutual exclusion and a globally consistent view of locking state is neither sufficient nor strictly necessary for ensuring application-level correctness in the presence of asynchrony and failures. We also argue that in some cases, strongly consistent locking imposes an additional and unnecessary constraint on application availability. Armed with these observations, we develop a set of novel concurrency control and recovery protocols for clustered SAN applications that achieve safety and liveness in the face of arbitrary asynchrony, process failures, and network partitions. Finally, we present and evaluate our implementation of Minuet, a new synchronization primitive based on these protocols that can serve as a foundational building block for safe and highly available SAN applications.


BibTeX citation:

@techreport{Ermolinskiy:EECS-2008-57,
    Author = {Ermolinskiy, Andrey and Moon, Daekyeong and Chun, Byung-Gon and Shenker, Scott},
    Title = {Minuet: Rethinking Concurrency Control in Storage Area Networks},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2008},
    Month = {May},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-57.html},
    Number = {UCB/EECS-2008-57},
    Abstract = {Clustered applications in storage area networks (SANs), widely adopted in enterprise datacenters, have traditionally relied on distributed locking protocols to coordinate concurrent access to shared storage devices. In this report, we examine the semantics of traditional lock services for SAN environments and ask whether they are sufficient to guarantee data safety at the application level. We argue that a traditional lock service design that enforces strict mutual exclusion and a globally consistent view of locking state is neither sufficient nor strictly necessary for ensuring application-level correctness in the presence of asynchrony and failures. We also argue that in some cases, strongly consistent locking imposes an additional and unnecessary constraint on application availability. Armed with these observations, we develop a set of novel concurrency control and recovery protocols for clustered SAN applications that achieve safety and liveness in the face of arbitrary asynchrony, process failures, and network partitions. Finally, we present and evaluate our implementation of Minuet, a new synchronization primitive based on these protocols that can serve as a foundational building block for safe and highly available SAN applications.}
}

EndNote citation:

%0 Report
%A Ermolinskiy, Andrey
%A Moon, Daekyeong
%A Chun, Byung-Gon
%A Shenker, Scott
%T Minuet: Rethinking Concurrency Control in Storage Area Networks
%I EECS Department, University of California, Berkeley
%D 2008
%8 May 19
%@ UCB/EECS-2008-57
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2008/EECS-2008-57.html
%F Ermolinskiy:EECS-2008-57