Rewrite Systems, Pattern Matching, and Code Generation

Eduardo Pelegri-Llopart

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-88-423
June 1988

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-423.pdf

Trees are convenient representations because of their hierarchical structure, which models many situations, and the ease with which they can be manipulated. A rewrite system is a collection of rewrite rules of the form Alpha->Beta where Alpha and Beta are tree patterns. A rewrite system defines a transformation between trees by the repeated application of its rewrite rules.

Two research directions are pursued in this dissertation: augmenting the expressive power of individual rewrite rules by using new types of patterns, and analyzing the interaction of the rewrite rules. The dissertation contains new algorithms for linear and non-linear patterns, for a new type of non-local pattern, and for typed patterns in which the variables are restricted to tree languages.

The REACHABILITY problem for a rewrite system R is, given an input tree T and a fixed goal tree G, to determine whether there exists a rewrite sequence in R, rewriting T into G and, if so, to obtain one such sequence. REACHABILITY can be used to solve problems related to the mapping between concrete and abstract syntax trees, to construct a pattern matching algorithm for typed non-local patterns, and to provide algorithms for compiler code generation. A new class of rewrite system called finite bottom-up rewrite system (finite-BURS) is introduced for which the REACHABILITY problem can be solved efficiently with a table-driven algorithm.

The C-REACHABILITY problem is similar to REACHABILITY except that rewrite sequences are assigned costs, and the obtained sequence is required to have minimum cost over all candidates. If the cost of a rewrite sequence is defined as the sum of the costs of its rewrite rules, the algorithm for REACHABILITY can be modified for a subclass of finite-BURS to solve C-REACHABILITY in such a way that all cost manipulation is done at table-creation time. The subclass extends the machine grammars used by Graham and Glanville for code generation. A code generator based on this approach has been implemented and tested with several machine descriptions. The code generators obtained produce locally optimal code, are faster than comparable ones based on Graham-Glanville techniques, and are significantly faster than other recent proposals that manipulate costs explicitly at code generation time. Table size is comparable to the Graham-Glanville code generator.

Advisor: Susan L. Graham


BibTeX citation:

@phdthesis{Pelegri-Llopart:CSD-88-423,
    Author = {Pelegri-Llopart, Eduardo},
    Title = {Rewrite Systems, Pattern Matching, and Code Generation},
    School = {EECS Department, University of California, Berkeley},
    Year = {1988},
    Month = {Jun},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5507.html},
    Number = {UCB/CSD-88-423},
    Abstract = {Trees are convenient representations because of their hierarchical structure, which models many situations, and the ease with which they can be manipulated. A rewrite system is a collection of rewrite rules of the form Alpha->Beta where Alpha and Beta are tree patterns. A rewrite system defines a transformation between trees by the repeated application of its rewrite rules. <p>Two research directions are pursued in this dissertation: augmenting the expressive power of individual rewrite rules by using new types of patterns, and analyzing the interaction of the rewrite rules. The dissertation contains new algorithms for linear and non-linear patterns, for a new type of non-local pattern, and for typed patterns in which the variables are restricted to tree languages. <p>The REACHABILITY problem for a rewrite system <i><b>R</b></i> is, given an input tree <i>T</i> and a fixed goal tree <i>G</i>, to determine whether there exists a rewrite sequence in <i><b>R</b></i>, rewriting <i>T</i> into <i>G</i> and, if so, to obtain one such sequence. REACHABILITY can be used to solve problems related to the mapping between concrete and abstract syntax trees, to construct a pattern matching algorithm for typed non-local patterns, and to provide algorithms for compiler code generation. A new class of rewrite system called finite bottom-up rewrite system (finite-BURS) is introduced for which the REACHABILITY problem can be solved efficiently with a table-driven algorithm. <p>The C-REACHABILITY problem is similar to REACHABILITY except that rewrite sequences are assigned costs, and the obtained sequence is required to have minimum cost over all candidates. If the cost of a rewrite sequence is defined as the sum of the costs of its rewrite rules, the algorithm for REACHABILITY can be modified for a subclass of finite-BURS to solve C-REACHABILITY in such a way that all cost manipulation is done at table-creation time. The subclass extends the machine grammars used by Graham and Glanville for code generation. A code generator based on this approach has been implemented and tested with several machine descriptions. The code generators obtained produce locally optimal code, are faster than comparable ones based on Graham-Glanville techniques, and are significantly faster than other recent proposals that manipulate costs explicitly at code generation time. Table size is comparable to the Graham-Glanville code generator.}
}

EndNote citation:

%0 Thesis
%A Pelegri-Llopart, Eduardo
%T Rewrite Systems, Pattern Matching, and Code Generation
%I EECS Department, University of California, Berkeley
%D 1988
%@ UCB/CSD-88-423
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5507.html
%F Pelegri-Llopart:CSD-88-423