The Effect of Sharing on the Cache and Bus Performance of Parallel Programs

Susan J. Eggers and Randy H. Katz

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-88-475
December 1988

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-475.pdf

Bus bandwidth ultimately limits the performance, and therefore the scale, of bus-based, shared-memory multiprocessors. Previous studies have extrapolated from uniprocessor measurements and simulations to estimate the performance of these machines. In this study, we use traces of parallel programs to evaluate the cache and bus performance of shared-memory multiprocessors in which coherency is maintained by a write-invalidate protocol. In particular, we analyze the effect of sharing overhead on cache miss ratio and bus utilization.

Our studies show that parallel programs incur substantially higher miss ratios and bus utilization than comparable uniprocessor programs. The sharing component of these metrics increases proportionally with both cache and block size, and for some cache configurations determines both their magnitude and trend. The amount of overhead depends on the memory reference pattern to the shared data. Programs that exhibit good per-processor locality perform better than those with fine-grain sharing. This suggests that parallel software writers and better compiler technology can improve program performance through better memory organization of shared data.
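The contrast between per-processor locality and fine-grain sharing can be illustrated with a toy model. The sketch below is not the simulator or the traces used in the report; it assumes infinite per-processor caches, a simplified write-invalidate rule, and two synthetic write-only traces. The names `count_misses`, `partitioned_trace`, and `interleaved_trace` are hypothetical and introduced only for illustration.

```python
def count_misses(trace, num_procs, block_size):
    """Count cache misses under a simplified write-invalidate protocol.

    Assumptions (illustrative only): caches are infinite, so every miss is
    either a cold miss or an invalidation miss; a write removes the block
    from every other processor's cache.
    """
    caches = [set() for _ in range(num_procs)]
    misses = 0
    for proc, addr, is_write in trace:
        block = addr // block_size
        if block not in caches[proc]:
            misses += 1
            caches[proc].add(block)
        if is_write:
            for other in range(num_procs):
                if other != proc:
                    caches[other].discard(block)
    return misses


def partitioned_trace(num_procs, words_per_proc, rounds):
    """Each processor repeatedly writes its own contiguous region."""
    trace = []
    for _ in range(rounds):
        for i in range(words_per_proc):
            for p in range(num_procs):
                trace.append((p, p * words_per_proc + i, True))
    return trace


def interleaved_trace(num_procs, words_per_proc, rounds):
    """Processors write adjacent words, so their data shares blocks."""
    trace = []
    for _ in range(rounds):
        for i in range(words_per_proc):
            for p in range(num_procs):
                trace.append((p, i * num_procs + p, True))
    return trace


if __name__ == "__main__":
    for block_size in (1, 4):
        for name, make in (("partitioned", partitioned_trace),
                           ("interleaved", interleaved_trace)):
            trace = make(2, 8, 4)
            misses = count_misses(trace, 2, block_size)
            print(f"block={block_size} {name}: "
                  f"{misses}/{len(trace)} references miss")
```

In this toy setting, both layouts miss equally often when each block holds one word. With four-word blocks, the partitioned layout incurs only cold misses, while the interleaved layout misses on every reference because each write invalidates the other processor's copy of the block.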


BibTeX citation:

@techreport{Eggers:CSD-88-475,
    Author = {Eggers, Susan J. and Katz, Randy H.},
    Title = {The Effect of Sharing on the Cache and Bus Performance of Parallel Programs},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {1988},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/6056.html},
    Number = {UCB/CSD-88-475},
    Abstract = {Bus bandwidth ultimately limits the performance, and therefore the scale, of bus-based, shared-memory multiprocessors. Previous studies have extrapolated from uniprocessor measurements and simulations to estimate the performance of these machines. In this study, we use traces of parallel programs to evaluate the cache and bus performance of shared-memory multiprocessors in which coherency is maintained by a write-invalidate protocol. In particular, we analyze the effect of sharing overhead on cache miss ratio and bus utilization. Our studies show that parallel programs incur substantially higher miss ratios and bus utilization than comparable uniprocessor programs. The sharing component of these metrics increases proportionally with both cache and block size, and for some cache configurations determines both their magnitude and trend. The amount of overhead depends on the memory reference pattern to the shared data. Programs that exhibit good per-processor locality perform better than those with fine-grain sharing. This suggests that parallel software writers and better compiler technology can improve program performance through better memory organization of shared data.}
}

EndNote citation:

%0 Report
%A Eggers, Susan J.
%A Katz, Randy H.
%T The Effect of Sharing on the Cache and Bus Performance of Parallel Programs
%I EECS Department, University of California, Berkeley
%D 1988
%@ UCB/CSD-88-475
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/6056.html
%F Eggers:CSD-88-475