Compiling Lenient Languages for Parallel Asynchronous Execution

Klaus Erik Schauser

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-94-832
1994

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/CSD-94-832.pdf

High-level programming languages and exotic architectures have often been developed together, because the languages seem to require complex mechanisms realized by specialized hardware. The implicitly parallel language Id and dataflow architectures are a prime example. The language allows an elegant formulation of a broad class of problems while exposing substantial parallelism, however, its non-strict semantics require fine-grain dynamic scheduling and synchronization making an efficient implementation on conventional parallel machines challenging.

This thesis presents novel compilation techniques for implicitly parallel languages, focusing on techniques addressing dynamic scheduling. It shows that it is possible to implement the lenient functional language Id efficiently by partitioning the program into regions that can be scheduled statically as sequential threads, and realizing the remaining dynamic scheduling through compiler-controlled multithreading. Partitioning the program into sequential threads requires substantial compiler analysis because the evaluation order is not specified by the programmer.

The main contribution of the thesis is the development and implementation of powerful thread partitioning algorithms. In addition, a new theoretical framework is formulated for proving the correctness of the partitioning algorithms. The thesis presents a new basic block partitioning algorithm, separation constraint partitioning, which groups nodes in situations where previous algorithms failed. It presents a new solution for interprocedural partitioning which works in the presence of recursion. All of the partitioning algorithms have been implemented and form the basis of a compiler of Id for workstations and several parallel machines. This execution vehicle is used to quantify the effectiveness of the different partitioning schemes on whole applications. The experimental data, collected from our implementation of Id for workstations and the CM-5 multiprocessor, shows the effectiveness of the partitioning algorithm and compiler-controlled multithreading.

Advisor: David E. Culler


BibTeX citation:

@phdthesis{Schauser:CSD-94-832,
    Author = {Schauser, Klaus Erik},
    Title = {Compiling Lenient Languages for Parallel Asynchronous Execution},
    School = {EECS Department, University of California, Berkeley},
    Year = {1994},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/5850.html},
    Number = {UCB/CSD-94-832},
    Abstract = {High-level programming languages and exotic architectures have often been developed together, because the languages seem to require complex mechanisms realized by specialized hardware. The implicitly parallel language Id and dataflow architectures are a prime example. The language allows an elegant formulation of a broad class of problems while exposing substantial parallelism, however, its non-strict semantics require fine-grain dynamic scheduling and synchronization making an efficient implementation on conventional parallel machines challenging. <p>This thesis presents novel compilation techniques for implicitly parallel languages, focusing on techniques addressing dynamic scheduling. It shows that it is possible to implement the lenient functional language Id efficiently by partitioning the program into regions that can be scheduled statically as sequential threads, and realizing the remaining dynamic scheduling through compiler-controlled multithreading. Partitioning the program into sequential threads requires substantial compiler analysis because the evaluation order is not specified by the programmer. <p>The main contribution of the thesis is the development and implementation of powerful thread partitioning algorithms. In addition, a new theoretical framework is formulated for proving the correctness of the partitioning algorithms.  The thesis presents a new basic block partitioning algorithm, separation constraint partitioning, which groups nodes in situations where previous algorithms failed. It presents a new solution for interprocedural partitioning which works in the presence of recursion. All of the partitioning algorithms have been implemented and form the basis of a compiler of Id for workstations and several parallel machines. This execution vehicle is used to quantify the effectiveness of the different partitioning schemes on whole applications. The experimental data, collected from our implementation of Id for workstations and the CM-5 multiprocessor, shows the effectiveness of the partitioning algorithm and compiler-controlled multithreading.}
}

EndNote citation:

%0 Thesis
%A Schauser, Klaus Erik
%T Compiling Lenient Languages for Parallel Asynchronous Execution
%I EECS Department, University of California, Berkeley
%D 1994
%@ UCB/CSD-94-832
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/5850.html
%F Schauser:CSD-94-832