Generating Permutation Instructions from a High-Level Description

Manikandan Narayanan and Katherine A. Yelick

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-03-1287
2003

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/CSD-03-1287.pdf

Fine-grained data parallelism, from media extensions to full streaming or vector instruction sets, offer enormous performance potential, if they can be effectively used from the application level. One critical aspect of their design is the organization of the registers and the generality of operations that move data between registers. In this paper we focus on this data-movement problem and demonstrate that starting with a high-level description of a data-parallel application, we can automatically map certain data-movements in the program onto a regular set of vector permutation instructions. Our language and compiler are based on StreamIt from MIT, and our target machine is the VIRAM processor from Berkeley. We devise new intermediate representations and operators for analysing data-movements, and demonstrate our technique on two benchmarks. We show that data-movement operations give an enormous performance boost for the benchmarks, and the performance of our technique is close to, and sometimes better than, hand-coded assembly.


BibTeX citation:

@techreport{Narayanan:CSD-03-1287,
    Author = {Narayanan, Manikandan and Yelick, Katherine A.},
    Title = {Generating Permutation Instructions from a High-Level Description},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2003},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/5695.html},
    Number = {UCB/CSD-03-1287},
    Abstract = {Fine-grained data parallelism, from media extensions to full streaming or vector instruction sets, offer enormous performance potential, if they can be effectively used from the application level. One critical aspect of their design is the organization of the registers and the generality of operations that move data between registers. In this paper we focus on this data-movement problem and demonstrate that starting with a high-level description of a data-parallel application, we can automatically map certain data-movements in the program onto a regular set of vector permutation instructions. Our language and compiler are based on StreamIt from MIT, and our target machine is the VIRAM processor from Berkeley. We devise new intermediate representations and operators for analysing data-movements, and demonstrate our technique on two benchmarks. We show that data-movement operations give an enormous performance boost for the benchmarks, and the performance of our technique is close to, and sometimes better than, hand-coded assembly.}
}

EndNote citation:

%0 Report
%A Narayanan, Manikandan
%A Yelick, Katherine A.
%T Generating Permutation Instructions from a High-Level Description
%I EECS Department, University of California, Berkeley
%D 2003
%@ UCB/CSD-03-1287
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/5695.html
%F Narayanan:CSD-03-1287