Experimental Evaluation of On-Chip Microprocessor Cache Memories
Mark D. Hill and Alan Jay Smith
EECS Department, University of California, Berkeley
Technical Report No. UCB/CSD-84-175, 1984
http://www2.eecs.berkeley.edu/Pubs/TechRpts/1984/CSD-84-175.pdf
Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processor. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace-driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; sub-blocks are the transfer unit. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well; typical miss and traffic ratios for a 1024-byte (net size) cache, 4-way set associative with 8-byte blocks, are: PDP-11: .039, .156, .060, VAX-11: .080, .160, Sys/370: .244, .489. (These figures are based on traces of user programs, and the performance obtained in practice is likely to be less good.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.
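The two metrics named in the abstract can be illustrated with a minimal trace-driven simulation sketch. This is not the authors' simulator: the function name `simulate`, its parameters, the LRU replacement policy, the read-only trace, and the definition of traffic ratio as bytes fetched by the cache divided by bytes that would cross the bus with no cache (one word of `word_bytes` per reference) are all assumptions made for illustration. Load forward is not modeled, and the trace is assumed to be non-empty.

```python
from collections import OrderedDict


def simulate(trace, cache_bytes=1024, block_bytes=8, sub_block_bytes=8,
             ways=4, word_bytes=2):
    """Return (miss_ratio, traffic_ratio) for a read-only byte-address trace.

    Hypothetical sketch: tags are kept per block, but data is fetched one
    sub-block at a time. Setting sub_block_bytes == block_bytes gives an
    ordinary set-associative cache. Replacement is LRU (an assumption).
    """
    num_sets = cache_bytes // (block_bytes * ways)
    # One OrderedDict per set, mapping tag -> set of valid sub-block indices.
    # Insertion order doubles as LRU order (oldest entry first).
    sets = [OrderedDict() for _ in range(num_sets)]
    misses = 0
    bytes_fetched = 0

    for addr in trace:
        block_no = addr // block_bytes
        index = block_no % num_sets
        tag = block_no // num_sets
        sub = (addr % block_bytes) // sub_block_bytes

        lines = sets[index]
        valid = lines.get(tag)
        if valid is not None and sub in valid:
            lines.move_to_end(tag)  # hit: mark block most recently used
            continue

        # Miss: either the tag is absent or the sub-block is not yet valid.
        misses += 1
        bytes_fetched += sub_block_bytes  # only one sub-block is transferred
        if valid is None:
            if len(lines) >= ways:
                lines.popitem(last=False)  # evict the least recently used block
            valid = set()
            lines[tag] = valid
        valid.add(sub)
        lines.move_to_end(tag)

    refs = len(trace)
    return misses / refs, bytes_fetched / (refs * word_bytes)


# Same trace, two sub-block sizes: equal miss ratio, different bus traffic.
sparse_trace = [0, 16, 32, 0, 16, 32]
print(simulate(sparse_trace))                     # whole 8-byte blocks fetched
print(simulate(sparse_trace, sub_block_bytes=2))  # 2-byte sub-blocks fetched
```

Under these assumptions, a cache without sub-blocks has a traffic ratio equal to the miss ratio times `block_bytes / word_bytes`, which is consistent with the pairs quoted in the abstract (for example .039 and .156 for the PDP-11 with 8-byte blocks and 2-byte words). Smaller sub-blocks lower the bytes moved per miss at the cost of more misses on sequential references, which is the tradeoff the abstract describes.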
BibTeX citation:
@techreport{Hill:CSD-84-175,
    Author = {Hill, Mark D. and Smith, Alan Jay},
    Title = {Experimental Evaluation of On-Chip Microprocessor Cache Memories},
    Year = {1984},
    Month = {Apr},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1984/5964.html},
    Number = {UCB/CSD-84-175},
    Abstract = {Advances in integrated circuit density are permitting the implementation on a single chip of functions and performance enhancements beyond those of a basic processor. One performance enhancement of proven value is a cache memory; placing a cache on the processor chip can reduce both mean memory access time and bus traffic. In this paper we use trace-driven simulation to study design tradeoffs for small (on-chip) caches. Miss ratio and traffic ratio (bus traffic) are the metrics for cache performance. Particular attention is paid to sub-block caches (also known as sector caches), in which address tags are associated with blocks, each of which contains multiple sub-blocks; sub-blocks are the transfer unit. Using traces from two 16-bit architectures (Z8000, PDP-11) and two 32-bit architectures (VAX-11, System/370), we find that general-purpose caches of 64 bytes (net size) are marginally useful in some cases, while 1024-byte caches perform fairly well; typical miss and traffic ratios for a 1024-byte (net size) cache, 4-way set associative with 8-byte blocks, are: PDP-11: .039, .156, .060, VAX-11: .080, .160, Sys/370: .244, .489. (These figures are based on traces of user programs, and the performance obtained in practice is likely to be less good.) The use of sub-blocks allows tradeoffs between miss ratio and traffic ratio for a given cache size. Load forward is quite useful. Extensive simulation results are presented.},
}
EndNote citation:
%0 Report
%A Hill, Mark D.
%A Smith, Alan Jay
%T Experimental Evaluation of On-Chip Microprocessor Cache Memories
%I EECS Department, University of California, Berkeley
%D 1984
%@ UCB/CSD-84-175
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1984/5964.html
%F Hill:CSD-84-175