Generating Permutation Instructions from a High-Level Description

Manikandan Narayanan and Katherine A. Yelick

EECS Department, University of California, Berkeley

Technical Report No. UCB/CSD-03-1287

, 2003

Fine-grained data parallelism, from media extensions to full streaming or vector instruction sets, offer enormous performance potential, if they can be effectively used from the application level. One critical aspect of their design is the organization of the registers and the generality of operations that move data between registers. In this paper we focus on this data-movement problem and demonstrate that starting with a high-level description of a data-parallel application, we can automatically map certain data-movements in the program onto a regular set of vector permutation instructions. Our language and compiler are based on StreamIt from MIT, and our target machine is the VIRAM processor from Berkeley. We devise new intermediate representations and operators for analysing data-movements, and demonstrate our technique on two benchmarks. We show that data-movement operations give an enormous performance boost for the benchmarks, and the performance of our technique is close to, and sometimes better than, hand-coded assembly.

BibTeX citation:

@techreport{Narayanan:CSD-03-1287,
    Author= {Narayanan, Manikandan and Yelick, Katherine A.},
    Title= {Generating Permutation Instructions from a High-Level Description},
    Year= {2003},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/5695.html},
    Number= {UCB/CSD-03-1287},
    Abstract= {Fine-grained data parallelism, from media extensions to full streaming or vector instruction sets, offer enormous performance potential, if they can be effectively used from the application level. One critical aspect of their design is the organization of the registers and the generality of operations that move data between registers. In this paper we focus on this data-movement problem and demonstrate that starting with a high-level description of a data-parallel application, we can automatically map certain data-movements in the program onto a regular set of vector permutation instructions. Our language and compiler are based on StreamIt from MIT, and our target machine is the VIRAM processor from Berkeley. We devise new intermediate representations and operators for analysing data-movements, and demonstrate our technique on two benchmarks. We show that data-movement operations give an enormous performance boost for the benchmarks, and the performance of our technique is close to, and sometimes better than, hand-coded assembly.},
}

EndNote citation:

%0 Report
%A Narayanan, Manikandan 
%A Yelick, Katherine A. 
%T Generating Permutation Instructions from a High-Level Description
%I EECS Department, University of California, Berkeley
%D 2003
%@ UCB/CSD-03-1287
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2003/5695.html
%F Narayanan:CSD-03-1287