Contextualizing Retrieval of Full-Length Documents
Marti A. Hearst
EECS Department, University of California, Berkeley
Technical Report No. UCB/CSD-94-789, 1994
http://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/CSD-94-789.pdf
We address some issues relating to retrieval from unfamiliar text collections consisting of full-length documents. We claim that displaying query results in terms of inter-document similarity is inappropriate with long texts, and suggest instead that the results of simple initial queries should be contextualized according to category sets that correspond to the main topics of the texts. We argue that main topics of long texts should be represented by multiple categories, since in most cases one category cannot adequately classify a text. We describe a new automatic categorization algorithm that does not require pre-labeled texts and a prototype browsing interface that presents a simple mechanism for displaying multi-dimensional information.
BibTeX citation:
@techreport{Hearst:CSD-94-789,
    Author = {Hearst, Marti A.},
    Title = {Contextualizing Retrieval of Full-Length Documents},
    Year = {1994},
    Month = {Jan},
    Url = {http://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/5388.html},
    Number = {UCB/CSD-94-789},
    Abstract = {We address some issues relating to retrieval from unfamiliar text collections consisting of full-length documents. We claim that displaying query results in terms of inter-document similarity is inappropriate with long texts, and suggest instead that the results of simple initial queries should be contextualized according to category sets that correspond to the main topics of the texts. We argue that main topics of long texts should be represented by multiple categories, since in most cases one category cannot adequately classify a text. We describe a new automatic categorization algorithm that does not require pre-labeled texts and a prototype browsing interface that presents a simple mechanism for displaying multi-dimensional information.}
}
EndNote citation:
%0 Report
%A Hearst, Marti A.
%T Contextualizing Retrieval of Full-Length Documents
%I EECS Department, University of California, Berkeley
%D 1994
%@ UCB/CSD-94-789
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/1994/5388.html
%F Hearst:CSD-94-789