Turbo Recognition: An Approach to Decoding Page Layout

Taku Andrew Tokuyasu

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-02-1172
January 2002

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/CSD-02-1172.pdf

Turbo recognition (TR) is an approach to layout analysis of scanned document images inspired by turbo decoding from communication theory. The TR algorithm is based on a generative model of image production in which two regular grammars simultaneously describe structure in horizontal and vertical directions. The TR model thus embodies non-local constraints while retaining many of the features of local statistical methods. This grammatical basis allows TR to be quickly retargeted to new domains. While TR, like turbo decoding, is not guaranteed to recover the statistically optimal solution, we present experimental evidence of its ability to produce near-optimal results for a non-trivial synthetic problem. We explore the expressiveness of TR for describing abstract structure in two dimensions, and develop a hierarchy of grammars of increasing complexity. We demonstrate the application of the TR framework to the analysis of simple text documents. We discuss how TR can be applied to the analysis of composite documents and images corrupted with extreme amounts of noise, and show how it can be applied to problems such as the layout analysis of journal article title pages.

Advisor: Richard A. Fateman

BibTeX citation:

@phdthesis{Tokuyasu:CSD-02-1172,
    Author = {Tokuyasu, Taku Andrew},
    Title = {Turbo Recognition: An Approach to Decoding Page Layout},
    School = {EECS Department, University of California, Berkeley},
    Year = {2002},
    Month = {Jan},
    URL = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/5417.html},
    Number = {UCB/CSD-02-1172},
    Abstract = {Turbo recognition (TR) is an approach to layout analysis of scanned document images inspired by turbo decoding from communication theory. The TR algorithm is based on a generative model of image production in which two regular grammars simultaneously describe structure in horizontal and vertical directions. The TR model thus embodies non-local constraints while retaining many of the features of local statistical methods. This grammatical basis allows TR to be quickly retargeted to new domains. While TR, like turbo decoding, is not guaranteed to recover the statistically optimal solution, we present experimental evidence of its ability to produce near-optimal results for a non-trivial synthetic problem. We explore the expressiveness of TR for describing abstract structure in two dimensions, and develop a hierarchy of grammars of increasing complexity. We demonstrate the application of the TR framework to the analysis of simple text documents. We discuss how TR can be applied to the analysis of composite documents and images corrupted with extreme amounts of noise, and show how it can be applied to problems such as the layout analysis of journal article title pages.}
}

EndNote citation:

%0 Thesis
%A Tokuyasu, Taku Andrew
%T Turbo Recognition: An Approach to Decoding Page Layout
%I EECS Department, University of California, Berkeley
%D 2002
%@ UCB/CSD-02-1172
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2002/5417.html
%F Tokuyasu:CSD-02-1172