A Multiple-Representation Paradigm for Document Development

Pehong Chen

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-88-436
July 1988

http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/CSD-88-436.pdf

Powerful personal workstations with high-resolution displays, pointing devices, and windowing environments have created many new possibilities in presenting information, accessing data, and efficient computing in general. In the context of document preparation, this workstation-based technology has made it possible for the user to directly manipulate a document in its final form. The central idea is that a document is immediately reprocessed as it is edited; no syntactic constructs are explicitly used to express the desired operations. This so-called direct-manipulation approach differs substantially from the traditional source-language model, in which document semantics (structures and appearances) are specified with interspersed markup commands. In the source-language model, a document is first prepared with a text editor, its formatting and other related processors are then executed, usually in batch mode, and the result is obtained.

A complete document development process involves a number of subtasks, ranging from authoring, reading, and filing to printing. Certain aspects of document development are best suited to a source-language approach, while others are easier to handle with direct-manipulation techniques. A hybrid paradigm combining the best of both approaches seems most desirable. In such a hybrid system, a document has at least two representations: a source representation with embedded commands that yields flexible high-level abstractions, and a target representation displaying an object's final appearance that gives precise placement and orientation in response to direct manipulation.

Simultaneously maintaining more than one user-manipulable representation of the same document is not an easy task. In particular, the historically batch-oriented processors that correspond to source-to-target transformations would have to be made incremental. Furthermore, there must be a systematic way of mapping changes from the target representation back to the source representation. Finally, an effective intermediate representation needs to be derived in order to make transformations in both directions possible.

Another interesting issue concerns the integration of system components. Because a complete document-development environment involves many tools and processors, it is important to make the system "seamless". A coherent set of user interfaces is also imperative so that context switches between different subtasks can be reduced to a minimum.

In this dissertation, the concept of multiple representations is first examined. The task domain of a complete document development environment is then identified, and several aspects of such an environment under both the source-language and direct-manipulation paradigms are compared and analyzed. A simple but robust framework is introduced to model multiple-representation systems in general. Based upon this framework, a top-down design methodology is derived. As a case study of this methodology, the design of VORTEX (Visually-ORiented TEX), a multiple-representation environment for document development, is described. The focus is on the design options and decisions involved in solving the problems mentioned above.

Specifically, the design and implementation of VORTEX's underlying representation transformation mechanisms in both the forward and backward directions (i.e., the incremental formatter in the forward direction and the reverse mapping engine in the backward direction) and the integration techniques used are discussed in detail. A working prototype of the VORTEX system has been implemented. Finally, this multiple-representation paradigm for document development is evaluated, and the underlying principles, with their implications for other application domains, are discussed. Some research directions are pointed out at the end.

Advisor: Michael A. Harrison


BibTeX citation:

@phdthesis{Chen:CSD-88-436,
    Author = {Chen, Pehong},
    Title = {A Multiple-Representation Paradigm for Document Development},
    School = {EECS Department, University of California, Berkeley},
    Year = {1988},
    Month = {Jul},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5485.html},
    Number = {UCB/CSD-88-436},
    Abstract = {Powerful personal workstations with high-resolution displays, pointing devices, and windowing environments have created many new possibilities in presenting information, accessing data, and efficient computing in general. In the context of document preparation, this workstation-based technology has made it possible for the user to directly manipulate a document in its final form. The central idea is that a document is immediately reprocessed as it is edited; no syntactic constructs are explicitly used to express the desired operations. This so-called direct-manipulation approach differs substantially from the traditional source-language model, in which document semantics (structures and appearances) are specified with interspersed markup commands. In the source-language model, a document is first prepared with a text editor, its formatting and other related processors are then executed, usually in batch mode, and the result is obtained. <p>A complete document development process involves a number of subtasks, ranging from authoring, reading, and filing to printing. Certain aspects of document development are best suited to a source-language approach, while others are easier to handle with direct-manipulation techniques. A hybrid paradigm combining the best of both approaches seems most desirable. In such a hybrid system, a document has at least two representations: a source representation with embedded commands that yields flexible high-level abstractions, and a target representation displaying an object's final appearance that gives precise placement and orientation in response to direct manipulation. <p>Simultaneously maintaining more than one user-manipulable representation of the same document is not an easy task. In particular, the historically batch-oriented processors that correspond to source-to-target transformations would have to be made incremental.
Furthermore, there must be a systematic way of mapping changes from the target representation back to the source representation. Finally, an effective intermediate representation needs to be derived in order to make transformations in both directions possible. <p>Another interesting issue concerns the integration of system components. Because a complete document-development environment involves many tools and processors, it is important to make the system "seamless". A coherent set of user interfaces is also imperative so that context switches between different subtasks can be reduced to a minimum. <p>In this dissertation, the concept of multiple representations is first examined. The task domain of a complete document development environment is then identified, and several aspects of such an environment under both the source-language and direct-manipulation paradigms are compared and analyzed. A simple but robust framework is introduced to model multiple-representation systems in general. Based upon this framework, a top-down design methodology is derived. As a case study of this methodology, the design of VORTEX (Visually-ORiented TEX), a multiple-representation environment for document development, is described. The focus is on the design options and decisions involved in solving the problems mentioned above. <p>Specifically, the design and implementation of VORTEX's underlying representation transformation mechanisms in both the forward and backward directions (i.e., the incremental formatter in the forward direction and the reverse mapping engine in the backward direction) and the integration techniques used are discussed in detail. A working prototype of the VORTEX system has been implemented. Finally, this multiple-representation paradigm for document development is evaluated, and the underlying principles, with their implications for other application domains, are discussed. Some research directions are pointed out at the end.}
}

EndNote citation:

%0 Thesis
%A Chen, Pehong
%T A Multiple-Representation Paradigm for Document Development
%I EECS Department, University of California, Berkeley
%D 1988
%@ UCB/CSD-88-436
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1988/5485.html
%F Chen:CSD-88-436