High-Bandwidth/Low-Latency Temporary Storage for Supercomputers

John Alan Swensen

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-87-383
December 1987

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-383.pdf

The traditional use of memory and a symmetrical set of registers for storage of temporary results of scientific programs requires more execution time, hardware, and instruction-stream bandwidth than necessary. Novel register organizations that can be easily integrated into traditional supercomputer architectures can reduce all of these requirements.

Execution speed can be more than doubled by storing temporary results in an asymmetrical set of general-purpose registers or an asymmetrical set of vector registers, instead of in memory and a small register-set. Faster access and a hardware cost one fourth that of traditional vector registers can be had by using a vector register that incorporates a pipelined, random-access-memory chip. If a large enough set of registers is used, the need to store temporary results in memory and then reload them for later use can be eliminated; this saves both instruction-stream bandwidth and execution time.

Advisors: Alvin M. Despain and Yale N. Patt


BibTeX citation:

@phdthesis{Swensen:CSD-87-383,
    Author = {Swensen, John Alan},
    Title = {High-Bandwidth/Low-Latency Temporary Storage for Supercomputers},
    School = {EECS Department, University of California, Berkeley},
    Year = {1987},
    Month = {Dec},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/6219.html},
    Number = {UCB/CSD-87-383},
    Abstract = {The traditional use of memory and a symmetrical set of registers for storage of temporary results of scientific programs requires more execution time, hardware, and instruction-stream bandwidth than necessary. Novel register organizations that can be easily integrated into traditional supercomputer architectures can reduce all of these requirements. Execution speed can be more than doubled by storing temporary results in an asymmetrical set of general-purpose registers or an asymmetrical set of vector registers, instead of in memory and a small register-set. Faster access and a hardware cost one fourth that of traditional vector registers can be had by using a vector register that incorporates a pipelined, random-access-memory chip. If a large enough set of registers is used, the need to store temporary results in memory and then reload them for later use can be eliminated; this saves both instruction-stream bandwidth and execution time.}
}

EndNote citation:

%0 Thesis
%A Swensen, John Alan
%T High-Bandwidth/Low-Latency Temporary Storage for Supercomputers
%I EECS Department, University of California, Berkeley
%D 1987
%@ UCB/CSD-87-383
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/6219.html
%F Swensen:CSD-87-383