Efficient Analysis of Caching Systems

James Gordon Thompson

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-87-374
September 1987

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-374.pdf

This dissertation describes innovative techniques for efficiently analyzing a wide variety of cache designs, and uses these techniques to study caching in a network file system. The techniques are significant extensions to the stack analysis technique (Mattson et al., 1970) which computes the read miss ratio for all cache sizes in a single trace-driven simulation. Stack analysis is extended to allow the one-pass analysis of:
1) writes in a write-back cache, including periodic write-back and deletions, important factors in file system cache performance.
2) sub-block or sector caches, including load-forward prefetching.
3) multi-processor caches in a shared-memory system, for an entire class of consistency protocols, including all of the well-known protocols.
4) client caches in a network file system, using a new class of consistency protocols.

The techniques are completely general and apply to all levels of the memory hierarchy, from processor caches to disk and file system caches. The dissertation also discusses the use of hash tables and binary trees within the simulator to further improve performance for some types of traces. Using these techniques, the performance of all cache sizes can be computed in little more than twice the time required to simulate a single cache size, and often in just 10% more time.

In addition to presenting techniques, this dissertation also demonstrates their use by studying client caching in a network file system. It first reports the extent of file sharing in a UNIX environment, showing that a few shared files account for two-thirds of all accesses, and nearly half of these are to files which are both read and written.

It then studies different cache consistency protocols, write policies, and fetch policies, reporting the miss ratio and file server utilization for each. Four cache consistency protocols are considered: a polling protocol that uses the server for all consistency controls; a protocol designed for single-user files; one designed for read-only files; and one using write-broadcast to maintain consistency. It finds that the choice of consistency protocol has a substantial effect on performance; both the read-only and write-broadcast protocols showed half the misses and server load of the polling protocol. The choice of write or fetch policy made a much smaller difference.

Advisor: Alan J. Smith


BibTeX citation:

@phdthesis{Thompson:CSD-87-374,
    Author = {Thompson, James Gordon},
    Title = {Efficient Analysis of Caching Systems},
    School = {EECS Department, University of California, Berkeley},
    Year = {1987},
    Month = {Sep},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/6224.html},
    Number = {UCB/CSD-87-374},
    Abstract = {This dissertation describes innovative techniques for efficiently analyzing a wide variety of cache designs, and uses these techniques to study caching in a network file system.  The techniques are significant extensions to the stack analysis technique (Mattson et al., 1970) which computes the read miss ratio for all cache sizes in a single trace-driven simulation.  Stack analysis is extended to allow the one-pass analysis of:  <br /> 1) writes in a write-back cache, including periodic write-back and deletions, important factors in file system cache performance.   <br /> 2) sub-block or sector caches, including load-forward prefetching.   <br /> 3) multi-processor caches in a shared-memory system, for an entire class of consistency protocols, including all of the well-known protocols.   <br /> 4) client caches in a network file system, using a new class of consistency protocols.  <p>The techniques are completely general and apply to all levels of the memory hierarchy, from processor caches to disk and file system caches.  The dissertation also discusses the use of hash tables and binary trees within the simulator to further improve performance for some types of traces.  Using these techniques, the performance of all cache sizes can be computed in little more than twice the time required to simulate a single cache size, and often in just 10% more time.    <p>In addition to presenting techniques, this dissertation also demonstrates their use by studying client caching in a network file system.  It first reports the extent of file sharing in a UNIX environment, showing that a few shared files account for two-thirds of all accesses, and nearly half of these are to files which are both read and written.   <p>It then studies different cache consistency protocols, write policies, and fetch policies, reporting the miss ratio and file server utilization for each.  Four cache consistency protocols are considered: a polling protocol that uses the server for all consistency controls; a protocol designed for single-user files; one designed for read-only files; and one using write-broadcast to maintain consistency.  It finds that the choice of consistency protocol has a substantial effect on performance; both the read-only and write-broadcast protocols showed half the misses and server load of the polling protocol.  The choice of write or fetch policy made a much smaller difference.}
}

EndNote citation:

%0 Thesis
%A Thompson, James Gordon
%T Efficient Analysis of Caching Systems
%I EECS Department, University of California, Berkeley
%D 1987
%@ UCB/CSD-87-374
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/6224.html
%F Thompson:CSD-87-374