Deriving Taxonomy from Documents at Sentence Level

Ying Liu (Hong Kong Polytechnic University, Hong Kong SAR, China), Han Tong Loh (National University of Singapore, Singapore) and Wen Feng Lu (National University of Singapore, Singapore)
Copyright: © 2008 | Pages: 21
DOI: 10.4018/978-1-59904-373-9.ch005

This chapter introduces an approach to deriving a taxonomy from documents using a novel document profile model, which represents documents with semantic information generated systematically at the sentence level. A frequent word sequence method is proposed to identify salient semantic information and is integrated into the document profile model. In an experimental study of taxonomy generation using hierarchical agglomerative clustering, the document profile model yielded a significant improvement in F-score. A closer examination shows that the integrated semantic information contributes clearly to this improvement over the classic bag-of-words approach. These results encourage further investigation of the document profile model across a wide range of text mining tasks.
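
To make the idea of sentence-level frequent word sequences more concrete, the following Python sketch shows one simplified way such features could be mined and combined with bag-of-words counts. It is an illustration only and not the chapter's algorithm: it assumes contiguous word sequences, a naive punctuation-based sentence splitter, and support counted per sentence. The names `frequent_word_sequences`, `profile`, `max_len`, and `min_support` are hypothetical.

```python
import re
from collections import Counter


def sentences(text):
    # Naive sentence splitter (assumption): break on ., ! and ?
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def tokens(sentence):
    # Lowercased alphanumeric tokens
    return re.findall(r"[a-z0-9]+", sentence.lower())


def frequent_word_sequences(docs, max_len=3, min_support=2):
    """Return contiguous word sequences of length 2..max_len that occur
    in at least min_support sentences across all documents."""
    support = Counter()
    for doc in docs:
        for sent in sentences(doc):
            words = tokens(sent)
            seen = set()  # count each sequence once per sentence
            for n in range(2, max_len + 1):
                for i in range(len(words) - n + 1):
                    seen.add(tuple(words[i:i + n]))
            support.update(seen)
    return {seq: c for seq, c in support.items() if c >= min_support}


def profile(doc, sequences):
    """Simplified document profile: single-word counts plus counts of
    the frequent sequences found within each sentence of the document."""
    feats = Counter()
    for sent in sentences(doc):
        words = tokens(sent)
        feats.update(words)
        for seq in sequences:
            n = len(seq)
            for i in range(len(words) - n + 1):
                if tuple(words[i:i + n]) == seq:
                    feats[" ".join(seq)] += 1
    return feats
```

Profiles built this way could then be turned into vectors and passed to any hierarchical agglomerative clustering routine. The choice of similarity measure and linkage is left open here, since the abstract does not specify them.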
