Generating Permutation Instructions from a High-Level Description
Manikandan Narayanan and Katherine A. Yelick
EECS Department, University of California, Berkeley
Technical Report No. UCB/CSD-03-1287
2003
This publication is archived. It is kept only for reference purposes, so it is no longer being updated and may not meet accessibility standards. If you need this content in a different format, please email webteam@eecs.berkeley.edu.
http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/Archive/CSD-03-1287.pdf
Fine-grained data parallelism, from media extensions to full streaming or vector instruction sets, offer enormous performance potential, if they can be effectively used from the application level. One critical aspect of their design is the organization of the registers and the generality of operations that move data between registers. In this paper we focus on this data-movement problem and demonstrate that starting with a high-level description of a data-parallel application, we can automatically map certain data-movements in the program onto a regular set of vector permutation instructions. Our language and compiler are based on StreamIt from MIT, and our target machine is the VIRAM processor from Berkeley. We devise new intermediate representations and operators for analysing data-movements, and demonstrate our technique on two benchmarks. We show that data-movement operations give an enormous performance boost for the benchmarks, and the performance of our technique is close to, and sometimes better than, hand-coded assembly.
BibTeX citation:
@techreport{Narayanan:CSD-03-1287,
Author= {Narayanan, Manikandan and Yelick, Katherine A.},
Title= {Generating Permutation Instructions from a High-Level Description},
Year= {2003},
Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/5695.html},
Number= {UCB/CSD-03-1287},
Abstract= {Fine-grained data parallelism, from media extensions to full streaming or vector instruction sets, offer enormous performance potential, if they can be effectively used from the application level. One critical aspect of their design is the organization of the registers and the generality of operations that move data between registers. In this paper we focus on this data-movement problem and demonstrate that starting with a high-level description of a data-parallel application, we can automatically map certain data-movements in the program onto a regular set of vector permutation instructions. Our language and compiler are based on StreamIt from MIT, and our target machine is the VIRAM processor from Berkeley. We devise new intermediate representations and operators for analysing data-movements, and demonstrate our technique on two benchmarks. We show that data-movement operations give an enormous performance boost for the benchmarks, and the performance of our technique is close to, and sometimes better than, hand-coded assembly.},
}
EndNote citation:
%0 Report %A Narayanan, Manikandan %A Yelick, Katherine A. %T Generating Permutation Instructions from a High-Level Description %I EECS Department, University of California, Berkeley %D 2003 %@ UCB/CSD-03-1287 %U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/5695.html %F Narayanan:CSD-03-1287