High-Bandwidth/Low-Latency Temporary Storage for Supercomputers
John Alan Swensen
EECS Department, University of California, Berkeley
Technical Report No. UCB/CSD-87-383, 1987
http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/CSD-87-383.pdf
The traditional use of memory and a symmetrical set of registers for storage of temporary results of scientific programs requires more execution time, hardware, and instruction-stream bandwidth than necessary. Novel register organizations that can be easily integrated into traditional supercomputer architectures can reduce all of these requirements.

Execution speed can be more than doubled by storing temporary results in an asymmetrical set of general-purpose registers or an asymmetrical set of vector registers, instead of in memory and a small register-set. Faster access and a hardware cost one fourth that of traditional vector registers can be had by using a vector register that incorporates a pipelined, random-access-memory chip. If a large enough set of registers is used, the need to store temporary results in memory and then reload them for later use can be eliminated; this saves both instruction-stream bandwidth and execution time.
Advisors: Alvin M. Despain and Yale N. Patt
BibTeX citation:
@phdthesis{Swensen:CSD-87-383,
    Author = {Swensen, John Alan},
    Title = {High-Bandwidth/Low-Latency Temporary Storage for Supercomputers},
    School = {EECS Department, University of California, Berkeley},
    Year = {1987},
    Month = {Dec},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/6219.html},
    Number = {UCB/CSD-87-383},
    Abstract = {The traditional use of memory and a symmetrical set of registers for storage of temporary results of scientific programs requires more execution time, hardware, and instruction-stream bandwidth than necessary. Novel register organizations that can be easily integrated into traditional supercomputer architectures can reduce all of these requirements. Execution speed can be more than doubled by storing temporary results in an asymmetrical set of general-purpose registers or an asymmetrical set of vector registers, instead of in memory and a small register-set. Faster access and a hardware cost one fourth that of traditional vector registers can be had by using a vector register that incorporates a pipelined, random-access-memory chip. If a large enough set of registers is used, the need to store temporary results in memory and then reload them for later use can be eliminated; this saves both instruction-stream bandwidth and execution time.},
}
EndNote citation:
%0 Thesis
%A Swensen, John Alan
%T High-Bandwidth/Low-Latency Temporary Storage for Supercomputers
%I EECS Department, University of California, Berkeley
%D 1987
%@ UCB/CSD-87-383
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1987/6219.html
%F Swensen:CSD-87-383