Descriptional Composition of Compiler Components

John Tang Boyland

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-96-916
September 1996

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1996/CSD-96-916.pdf

New machine architectures and new programming languages are always appearing, and thus the need for new compilers continues unabated. Even experimental languages and machines need compilers. Compiler writers developing new and/or experimental compilers face competing pressures when designing their large-scale structure. On the one hand, a more modular structure will make it easier to maintain, modify or reuse pieces of the compiler. A more modular compiler is more likely to be correct, and reusable compiler components lead to consistent semantics among the compilers using them. On the other hand, a highly modular structure may lead to inefficiencies in implementation.

Suppose one uses an intermediate representation and divides up the compiler into two parts, one which compiles the source to the intermediate representation and another which translates a program in the intermediate representation to the target machine language. Doing so may make the compiler easier to understand, and furthermore, a well-chosen intermediate representation may prove a suitable target for other source languages, or a suitable source for translating to other machines. On the other hand, the need to create and then traverse this intermediate representation may slow a compiler significantly. If the two parts are described in a high-level declarative formalism, descriptional composition can be used to combine the two parts automatically so as to avoid creating and traversing the intermediate structure.

This dissertation presents a declarative compiler description language, APS, and a new method for descriptional composition. The language, based on a variant of attribute grammars, contains a number of features that aid compiler writers in factoring descriptions so that each concept can be expressed separately. Both a compiler for Oberon2 and a front-end for APS itself have been written in APS. The back-end of the Oberon2 compiler consists of a translation to a form of the "GCC tree" intermediate representation. Another module gives the translation from this form to source-level C text.

A prototype compiler has been developed for APS that supports descriptional composition. The descriptionally composed version of the Oberon2 back-end with the translation to C text is no larger than the sum of the sizes of the modules from which it is composed, yet it runs almost twice as fast. The Oberon2 compiler is the first successful use of descriptional composition for a realistically complex system, and demonstrates the effectiveness of combining the new APS description language and the new algorithm for descriptional composition presented in this dissertation.

Advisor: Susan L. Graham


BibTeX citation:

@phdthesis{Boyland:CSD-96-916,
    Author = {Boyland, John Tang},
    Title = {Descriptional Composition of Compiler Components},
    School = {EECS Department, University of California, Berkeley},
    Year = {1996},
    Month = {Sep},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1996/5816.html},
    Number = {UCB/CSD-96-916},
    Abstract = {New machine architectures and new programming languages are always appearing, and thus the need for new compilers continues unabated. Even experimental languages and machines need compilers. Compiler writers developing new and/or experimental compilers face competing pressures when designing their large-scale structure. On the one hand, a more modular structure will make it easier to maintain, modify or reuse pieces of the compiler. A more modular compiler is more likely to be correct, and reusable compiler components lead to consistent semantics among the compilers using them. On the other hand, a highly modular structure may lead to inefficiencies in implementation. <p>Suppose one uses an intermediate representation and divides up the compiler into two parts, one which compiles the source to the intermediate representation and another which translates a program in the intermediate representation to the target machine language. Doing so may make the compiler easier to understand, and furthermore, a well-chosen intermediate representation may prove a suitable target for other source languages, or a suitable source for translating to other machines. On the other hand, the need to create and then traverse this intermediate representation may slow a compiler significantly. If the two parts are described in a high-level declarative formalism, descriptional composition can be used to combine the two parts automatically so as to avoid creating and traversing the intermediate structure. <p>This dissertation presents a declarative compiler description language, APS, and a new method for descriptional composition. The language, based on a variant of attribute grammars, contains a number of features that aid compiler writers in factoring descriptions so that each concept can be expressed separately. Both a compiler for Oberon2 and a front-end for APS itself have been written in APS. The back-end of the Oberon2 compiler consists of a translation to a form of the "GCC tree" intermediate representation. Another module gives the translation from this form to source-level C text. <p>A prototype compiler has been developed for APS that supports descriptional composition. The descriptionally composed version of the Oberon2 back-end with the translation to C text is no larger than the sum of the sizes of the modules from which it is composed, yet it runs almost twice as fast. The Oberon2 compiler is the first successful use of descriptional composition for a realistically complex system, and demonstrates the effectiveness of combining the new APS description language and the new algorithm for descriptional composition presented in this dissertation.}
}

EndNote citation:

%0 Thesis
%A Boyland, John Tang
%T Descriptional Composition of Compiler Components
%I EECS Department, University of California, Berkeley
%D 1996
%@ UCB/CSD-96-916
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1996/5816.html
%F Boyland:CSD-96-916