Handbook of Research on Text and Web Mining Technologies (2 Volumes)

Min Song (New Jersey Institute of Technology, USA) and Yi-Fang Brook Wu (New Jersey Institute of Technology, USA)
Release Date: September 2008 | Copyright: © 2009 | Pages: 901
ISBN13: 9781599049908 | ISBN10: 1599049902 | EISBN13: 9781599049915 | DOI: 10.4018/978-1-59904-990-8

Description

The massive daily overflow of electronic data to information seekers creates the need for better ways to digest and organize this information to make it understandable and useful. Text mining, a variation of data mining, extracts desired information from large, unstructured text collections stored in electronic forms.

The Handbook of Research on Text and Web Mining Technologies is the first comprehensive reference to the state of research in the field of text mining, serving a pivotal role in educating practitioners in the field. This compendium of pioneering studies from leading experts is essential to academic reference collections and introduces researchers and students to cutting-edge techniques for knowledge discovery from unstructured text.

Topics Covered

The many academic areas covered in this publication include, but are not limited to:

  • Core text mining operations
  • Data mining techniques in text mining
  • Data preprocessing techniques
  • Emerging directions of text mining
  • Evaluation techniques of text mining
  • Information Extraction
  • Information Retrieval
  • Link analysis
  • Natural Language Processing (NLP)
  • Taxonomy of text mining
  • Text categorization
  • Text clustering
  • Text mining applications
  • Text mining case studies
  • Visualization approaches

Reviews and Testimonials

This handbook presents the most recent advances and a survey of applications in text and web mining, which should be of interest to researchers and end-users alike.

– Min Song, Old Dominion University, USA

In addition to providing an in-depth examination of core text and Web mining algorithms and operations, this book examines advanced pre-processing techniques, knowledge representation considerations, and visual approaches.

– Book News Inc. (March 2009)

Table of Contents and List of Contributors

Preface

With the abundance of textual information on the web, text and web mining is increasingly important. Although search technologies have matured and finding relevant documents or web pages is no longer difficult, information overload has never ceased to be a roadblock for users. More advanced text applications are needed to bring out novel and useful information or knowledge hidden in the sea of documents. The purpose of this handbook is to present the most recent advances and a survey of applications in text and web mining, which should be of interest to researchers and end-users alike. With that in mind, we invited submissions to Handbook of Research in Text and Web Mining. Based on the content, we organized this handbook into five sections, which represent the major topic areas in text and web mining.

Section titles and their highlights:

Section 1, Document Preprocessing, concerns the steps for obtaining key textual elements and their weights before mining occurs. This section covers the operations that prepare text for mining, including lexical analysis, elimination of function words, stemming, identification of key terms and phrases, and document representation.
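As a rough illustration of the preprocessing steps named above, the following Python sketch chains lexical analysis, function-word removal, stemming, and a weighted document representation. It is not taken from any chapter of the handbook: the stop list, the `crude_stem` suffix stripper (a stand-in for a real stemmer such as Porter's), and the plain TF-IDF weighting are all assumptions chosen to keep the example short.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop list (an assumption, not from the handbook).
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "is", "to", "for", "from"}


def tokenize(text):
    """Lexical analysis: lowercase the text and extract alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def crude_stem(token):
    """Naive suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "es", "s"):
        # Only strip when at least four characters would remain.
        if token.endswith(suffix) and len(token) - len(suffix) >= 4:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop function words, then stem what is left."""
    return [crude_stem(t) for t in tokenize(text) if t not in STOP_WORDS]


def tf_idf(documents):
    """Represent each document as a {term: weight} dictionary.

    Weight = raw term count * log(N / document frequency).
    """
    token_lists = [preprocess(d) for d in documents]
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for tokens in token_lists for term in set(tokens))
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append(
            {term: count * math.log(n_docs / doc_freq[term])
             for term, count in counts.items()}
        )
    return vectors


docs = ["Text mining", "Text clustering"]
for vector in tf_idf(docs):
    print(vector)
```

In this sketch a term that occurs in every document, such as "text" above, receives a weight of zero, while terms specific to one document receive positive weights. Real systems typically use a fuller stop list, a proper stemmer, and a smoothed weighting scheme.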

Section 2, Classification and Clustering, discusses two popular mining methods and their applications in text and web mining. In this section, we present state-of-the-art classification and clustering techniques applied to several interesting problem domains, such as syllabus, protein, and image classification.

Section 3, Database, Ontology and the Web, presents topics relating to three types of objects and their use in the mining process, either as data or as supplemental information to improve mining performance. This section presents a variety of research issues and problems associated with databases, ontologies, and the web from a text and web mining perspective.

Section 4, Information Retrieval and Extraction, illustrates how mining techniques can be used to enhance the performance of information retrieval and extraction. This section shows how text and web mining techniques help resolve difficult problems in information retrieval and extraction.

Section 5, Applications and Survey, concludes the book with surveys of the latest research and end-user applications.

Every chapter opens with an overview and concludes with references and author biographies. Readers of this handbook will benefit from a basic understanding of natural language processing and college statistics, since we consider both subjects the foundation of text and web mining. Research in this area is developing rapidly, so this handbook is by no means the final word on what text and web mining can achieve. We hope that it will serve as a catalyst for innovative ideas and thus make exciting research in this and complementary research areas fruitful in the near future.

Author(s)/Editor(s) Biography

Min Song is an assistant professor in the Department of Information Systems at NJIT. He received his M.S. from the School of Information Science at Indiana University in 1996 and his Ph.D. in Information Systems from Drexel University in 2005. Min has a background in Text Mining, Bioinformatics, Information Retrieval and Information Visualization.

Min received the Drexel Dissertation Award in 2005. In 2006, Min’s work received an honorable mention award at the Greater Philadelphia Bioinformatics Symposium. In addition, the paper entitled “Extracting and Mining Protein-protein interaction Network from Biomedical Literature” received the best paper award at the 2004 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology, held in San Diego, USA, Oct. 7-8, 2004. Another paper, entitled “Ontology-based Scalable and Portable Information Extraction System to Extract Biological Knowledge from Huge Collection of Biomedical Web Documents”, was nominated for the best paper award at the 2004 IEEE/ACM Web Intelligence Conference, held in Beijing, China, Sept. 20-24, 2004.

Dr. Brook Wu is an Associate Professor in the Information Systems Department at New Jersey Institute of Technology. Her current research interests include text mining, information extraction, knowledge organization, information retrieval, and natural language processing. Her projects have been supported by the National Science Foundation and the Institute of Museum and Library Services. Her research has appeared in journals such as the Journal of the American Society for Information Science and Technology, the Journal of Biomedical Informatics, and Information Retrieval.

Indices

Editorial Board

  • Zoran Obradovic, Temple University, USA
  • Alexander Yates, Temple University, USA
  • Il-Yeol Song, Drexel University, USA
  • Xiaohua Tony Hu, Drexel University, USA
  • Hongfang Liu, Georgetown University Medical Center, USA
  • Illhoi Yoo, University of Missouri-Columbia School of Medicine, USA
  • Jason T.L. Wang, New Jersey Institute of Technology, USA