Text Mining Methods for Hierarchical Document Indexing

Han-Joon Kim (The University of Seoul, Korea)
Copyright: © 2005 | Pages: 7
DOI: 10.4018/978-1-59140-557-3.ch209

Abstract

We have recently seen tremendous growth in the volume of online text documents from networked resources such as the Internet, digital libraries, and company-wide intranets. One of the most common and successful ways of organizing such large numbers of documents is to categorize them hierarchically by topic (Agrawal, Bayardo, & Srikant, 2000; Kim & Lee, 2003). Documents indexed according to a hierarchical structure (termed a ‘topic hierarchy’ or ‘taxonomy’) are kept in internal categories as well as in leaf categories, with documents in lower-level categories being increasingly specific. With a topic hierarchy, users can quickly navigate to any portion of a document collection without being overwhelmed by a large document space. As is evident from the popularity of Web directories such as Yahoo (http://www.yahoo.com/) and the Open Directory Project (http://dmoz.org/), topic hierarchies have grown in importance as a tool for organizing or browsing large volumes of electronic text documents.
