Zebra: A Striped Network File System

John H. Hartman and John K. Ousterhout

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-92-683
April 1992

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1992/CSD-92-683.pdf

This paper presents the design of Zebra, a striped network file system. Zebra applies ideas from log-structured file system (LFS) and RAID research to network file systems, yielding a file system that has scalable performance, uses its servers efficiently even when applications use small files, and provides high availability. Zebra stripes file data across multiple servers, so that the file transfer rate is not limited by the performance of a single server. High availability is achieved by maintaining parity information for the file system: if a server fails, its contents can be reconstructed from the contents of the remaining servers and the parity information. Zebra differs from existing striped file systems in the way it stripes file data. It does not stripe on a per-file basis; instead, it stripes the stream of bytes written by each client. Clients write to the servers in units called stripe fragments, which are analogous to segments in an LFS. Stripe fragments contain file blocks that were written recently, without regard to which file they belong to. This method of striping has numerous advantages over per-file striping, including increased server efficiency, efficient parity computation, and elimination of parity updates.
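The striping and reconstruction described above can be illustrated with a minimal Python sketch. It is not taken from the report. It assumes RAID-style bytewise XOR parity, a tiny fixed fragment size, and one fragment per server. The names `FRAGMENT_SIZE`, `make_stripe`, and `reconstruct` are hypothetical, introduced only for illustration.

```python
from functools import reduce
from typing import List

FRAGMENT_SIZE = 8  # hypothetical fragment size in bytes, kept tiny for illustration


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(log: bytes, num_data_servers: int) -> List[bytes]:
    """Cut one stripe from a client's write log and append a parity fragment.

    Assumption: parity is the XOR of the data fragments, as in RAID.
    The log is zero-padded so every fragment has the same length.
    """
    stripe_size = FRAGMENT_SIZE * num_data_servers
    padded = log[:stripe_size].ljust(stripe_size, b"\0")
    fragments = [padded[i:i + FRAGMENT_SIZE]
                 for i in range(0, stripe_size, FRAGMENT_SIZE)]
    parity = reduce(xor_bytes, fragments)
    return fragments + [parity]


def reconstruct(stripe: List[bytes], lost: int) -> bytes:
    """Rebuild the fragment held by a failed server from the survivors."""
    survivors = [frag for i, frag in enumerate(stripe) if i != lost]
    return reduce(xor_bytes, survivors)


# The client's log holds recently written blocks from different files,
# appended in write order without regard to which file they belong to.
log = b"fileA-blk0" + b"fileB-blk0" + b"fileA-b1"
stripe = make_stripe(log, num_data_servers=4)

# Simulate the failure of server 2 and recover its fragment.
recovered = reconstruct(stripe, lost=2)
```

Because the client computes parity over a whole stripe of newly written data, this sketch never has to read and rewrite an existing parity fragment, which is the property the abstract describes as elimination of parity update.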


BibTeX citation:

@techreport{Hartman:CSD-92-683,
    Author = {Hartman, John H. and Ousterhout, John K.},
    Title = {Zebra: A Striped Network File System},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1992},
    Month = {Apr},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1992/6138.html},
    Number = {UCB/CSD-92-683},
    Abstract = {This paper presents the design of Zebra, a striped network file system. Zebra applies ideas from log-structured file system (LFS) and RAID research to network file systems, resulting in a network file system that has scalable performance, uses its servers efficiently even when its applications are using small files, and provides high availability. Zebra stripes file data across multiple servers, so that the file transfer rate is not limited by the performance of a single server. High availability is achieved by maintaining parity information for the file system. If a server fails its contents can be reconstructed using the contents of the remaining servers and the parity information. Zebra differs from existing striped file systems in the way it stripes file data: Zebra does not stripe on a per-file basis; instead it stripes the stream of bytes written by each client. Clients write to the servers in units called stripe fragments, which are analogous to segments in an LFS. Stripe fragments contain file blocks that were written recently, without regard to which file they belong. This method of striping has numerous advantages over per-file striping, including increased server efficiency, efficient parity computation, and elimination of parity update.}
}

EndNote citation:

%0 Report
%A Hartman, John H.
%A Ousterhout, John K.
%T Zebra: A Striped Network File System
%I EECS Department, University of California, Berkeley
%D 1992
%@ UCB/CSD-92-683
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1992/6138.html
%F Hartman:CSD-92-683