Text Mining-Machine Learning on Documents

Dunja Mladenic (Jozef Stefan Institute, Slovenia)
Copyright: © 2005 | Pages: 4
DOI: 10.4018/978-1-59140-557-3.ch208


Intensive use and growth of the World Wide Web, together with the daily increase in the amount of text available in electronic form, have created a growing need for computer-supported ways of dealing with text data. One of the most popular problems addressed with text mining methods is document categorization, which aims to classify documents into predefined categories based on their content. Other important problems addressed in text mining include content-based document search, automatic document summarization, automatic document clustering and construction of document hierarchies, document authorship detection, plagiarism identification, topic identification and tracking, information extraction, hypertext analysis, and user profiling. If we regard text mining as a fairly broad area dealing with computer-supported analysis of text, then the list of problems it can address is long and open-ended. Here we adopt this open view but concentrate on the parts related to automatic data analysis and data mining.
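
To make the idea of document categorization concrete, the following is a minimal sketch of one common approach: a bag-of-words naive Bayes classifier. The abstract does not name a specific method, so the choice of naive Bayes, the function names (`tokenize`, `train`, `categorize`), and the toy training documents are all assumptions made purely for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def train(labeled_docs):
    """Count words per category from (text, category) pairs."""
    word_counts = defaultdict(Counter)  # category -> word frequencies
    doc_counts = Counter()              # category -> number of documents
    for text, category in labeled_docs:
        word_counts[category].update(tokenize(text))
        doc_counts[category] += 1
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, doc_counts, vocabulary


def categorize(text, model):
    """Return the category with the highest log-probability for the text."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_category, best_score = None, -math.inf
    for category in doc_counts:
        counts = word_counts[category]
        total_words = sum(counts.values())
        # Start from the log prior of the category.
        score = math.log(doc_counts[category] / total_docs)
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            score += math.log((counts[word] + 1) / (total_words + len(vocabulary)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category


# Hypothetical toy training set, invented for illustration only.
TRAINING_DOCS = [
    ("the team won the football match", "sports"),
    ("the striker scored a late goal", "sports"),
    ("the new processor doubles memory bandwidth", "technology"),
    ("software update fixes a security bug", "technology"),
]

model = train(TRAINING_DOCS)
print(categorize("a goal in the final match", model))
print(categorize("security bug in the processor software", model))
```

The predefined categories here are simply the labels found in the training pairs. A realistic system would typically add steps such as stop-word removal and feature weighting and would train on a far larger labeled collection.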
