Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps

Hsin-Chang Yang, Chung-Hong Lee
Copyright: © 2003 | Pages: 21
ISBN13: 9781591400516 | ISBN10: 1591400511 | ISBN13 Softcover: 9781931777834 | EISBN13: 9781591400950
DOI: 10.4018/978-1-59140-051-6.ch008
Cite Chapter

MLA

Yang, Hsin-Chang, and Chung-Hong Lee. "Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps." Data Mining: Opportunities and Challenges, edited by John Wang, IGI Global, 2003, pp. 199-219. https://doi.org/10.4018/978-1-59140-051-6.ch008

APA

Yang, H. & Lee, C. (2003). Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps. In J. Wang (Ed.), Data Mining: Opportunities and Challenges (pp. 199-219). IGI Global. https://doi.org/10.4018/978-1-59140-051-6.ch008

Chicago

Yang, Hsin-Chang, and Chung-Hong Lee. "Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps." In Data Mining: Opportunities and Challenges, edited by John Wang, 199-219. Hershey, PA: IGI Global, 2003. https://doi.org/10.4018/978-1-59140-051-6.ch008

Abstract

Many approaches have recently been devised for mining various kinds of knowledge from texts. One important application of text mining is identifying themes, and the semantic relations among them, for text categorization. Traditionally, such themes are arranged hierarchically to support effective searching and indexing and to make them easy for people to comprehend. The category themes and their hierarchical structure have mostly been determined by human experts. In this work, we developed an approach that automatically generates category themes and reveals the hierarchical structure among them. We also used the generated structure to categorize text documents. A self-organizing map was trained on the document collection to form two feature maps. We then analyzed these maps to obtain the category themes and their structure. Although the test corpus contains documents written in Chinese, the proposed approach can be applied to documents written in any language, provided that such documents can be transformed into a list of separated terms.
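
The pipeline outlined in the abstract (documents as term lists, a self-organizing map trained on them, neurons then labeled to form feature maps) can be illustrated with a minimal sketch. This is not the chapter's implementation: the abstract does not state the encoding, the training schedule, or how the two maps are derived. Binary term vectors, the linear decay schedule, the 0.5 labeling threshold, the toy corpus, and every function name below are assumptions made for illustration only.

```python
import math
import random

def build_vocabulary(docs):
    """Map each distinct term to a vector index (docs are already term lists)."""
    vocab = sorted({term for doc in docs for term in doc})
    return {term: i for i, term in enumerate(vocab)}

def to_binary_vector(doc, vocab):
    """Assumed encoding: 1.0 if the term occurs in the document, else 0.0."""
    vec = [0.0] * len(vocab)
    for term in doc:
        vec[vocab[term]] = 1.0
    return vec

def best_matching_unit(weights, vec):
    """Index of the neuron whose weight vector is closest to vec."""
    def sq_dist(w):
        return sum((a - b) ** 2 for a, b in zip(w, vec))
    return min(range(len(weights)), key=lambda j: sq_dist(weights[j]))

def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols map; learning rate and radius decay linearly (assumed)."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(rows * cols)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 0.5)
        for vec in rng.sample(vectors, len(vectors)):  # shuffled pass
            br, bc = divmod(best_matching_unit(weights, vec), cols)
            for j, w in enumerate(weights):
                r, c = divmod(j, cols)
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))  # Gaussian neighborhood
                for k in range(dim):
                    w[k] += lr * h * (vec[k] - w[k])
    return weights

def label_neurons_with_documents(weights, vectors):
    """One possible feature map: neuron index -> ids of documents it wins."""
    labels = {}
    for doc_id, vec in enumerate(vectors):
        labels.setdefault(best_matching_unit(weights, vec), []).append(doc_id)
    return labels

def label_neurons_with_terms(weights, vocab, threshold=0.5):
    """Another possible feature map: neuron index -> terms with large weights."""
    index_to_term = {i: term for term, i in vocab.items()}
    return {j: [index_to_term[k] for k, v in enumerate(w) if v > threshold]
            for j, w in enumerate(weights)}

# Toy corpus (hypothetical); any language works once text is split into terms.
docs = [
    ["som", "map", "neuron"],
    ["som", "neuron", "training"],
    ["stock", "market", "price"],
    ["market", "price", "trade"],
]
vocab = build_vocabulary(docs)
vectors = [to_binary_vector(doc, vocab) for doc in docs]
weights = train_som(vectors, rows=2, cols=2)
doc_map = label_neurons_with_documents(weights, vectors)
term_map = label_neurons_with_terms(weights, vocab)
print(doc_map)
print(term_map)
```

The only language-specific step in this sketch is producing the term lists. Everything after `build_vocabulary` operates on term indices, which is consistent with the abstract's claim that the approach carries over to other languages.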
