Information Extraction: Methodologies and Applications


Jie Tang (Tsinghua University, China), Mingcai Hong (Tsinghua University, China), Duo Liang Zhang (Tsinghua University, China) and Juanzi Li (NEC Labs, China)
Copyright: © 2008 | Pages: 33
DOI: 10.4018/978-1-59904-373-9.ch001


This chapter is concerned with the methodologies and applications of information extraction. Useful information is buried in the large volume of web pages, and the task of pulling it out of web content is called information extraction. In information extraction, given a sequence of instances, we identify and pull out a sub-sequence of the input that represents the information we are interested in. Recent years have seen a rapid expansion of activity in the information extraction area, and many methods have been proposed for automating the extraction process. However, due to the heterogeneity and lack of structure of Web data, automated discovery of targeted or unexpected knowledge still presents many challenging research problems. In this chapter, we investigate the problems of information extraction and survey existing methodologies for solving them. We then introduce several real-world applications of information extraction and discuss emerging challenges.
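The task definition above (pulling a sub-sequence of interest out of an input sequence) can be illustrated with a minimal sketch. It assumes each token has already been given a B-/I-/O style label by some upstream model; that labeling scheme, the function name `extract_spans`, and the sample sentence are illustrative assumptions and are not taken from the chapter.

```python
from typing import List, Optional, Tuple


def extract_spans(tokens: List[str], labels: List[str]) -> List[Tuple[str, str]]:
    """Collect contiguous runs tagged B-X / I-X into (type, text) pairs.

    Hypothetical helper: the B-/I-/O labels are assumed to come from
    some upstream tagger, which is not shown here.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    spans: List[Tuple[str, str]] = []
    current_type: Optional[str] = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the span collected so far, if any, and reset the buffer.
        nonlocal current_type, current_tokens
        if current_tokens:
            spans.append((current_type, " ".join(current_tokens)))
        current_type, current_tokens = None, []

    for token, label in zip(tokens, labels):
        if label.startswith("B-"):
            # A B- tag always opens a new span.
            flush()
            current_type, current_tokens = label[2:], [token]
        elif label.startswith("I-") and label[2:] == current_type:
            # An I- tag of the same type continues the open span.
            current_tokens.append(token)
        elif label.startswith("I-"):
            # A stray I- tag (no matching open span) is treated as a start.
            flush()
            current_type, current_tokens = label[2:], [token]
        else:
            # Any other label (e.g. "O") closes the open span.
            flush()
    flush()
    return spans


tokens = ["Jie", "Tang", "works", "at", "Tsinghua", "University"]
labels = ["B-PER", "I-PER", "O", "O", "B-ORG", "I-ORG"]
print(extract_spans(tokens, labels))
```

Here the extracted sub-sequences are the person and organization mentions; in practice the labels would be predicted by whichever extraction method is in use, rather than written by hand.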

Complete Chapter List

Table of Contents
Cláudio Chauke Nehme
Hercules Antonio do Prado, Edilson Ferneda
Hercules Antonio do Prado, Edilson Ferneda
Chapter 1
Jie Tang, Mingcai Hong, Duo Liang Zhang, Juanzi Li
This chapter is concerned with the methodologies and applications of information extraction. Information is hidden in the large volume of web pages...
Information Extraction: Methodologies and Applications
Chapter 2
Roberto Penteado, Eric Boutin
The information overload demands that organizations set up new capabilities concerning the analysis of data and texts to create the necessary...
Creating Strategic Information for Organizations with Structured Text
Chapter 3
Christian Aranha, Emmanuel Passos
This chapter integrates elements from Natural Language Processing, Information Retrieval, Data Mining and Text Mining to support competitive...
Automatic NLP for Competitive Intelligence
Chapter 4
Horacio Saggion
Free text is a main repository of human knowledge, therefore methods and techniques to access this unstructured source of knowledge are of paramount...
Mining Profiles and Definitions with Natural Language Processing
Chapter 5
Ying Liu, Han Tong Loh, Wen Feng Lu
This chapter introduces an approach of deriving taxonomy from documents using a novel document profile model that enables document representations...
Deriving Taxonomy from Documents at Sentence Level
Chapter 6
Shigeaki Sakurai
This chapter introduces knowledge discovery methods based on a fuzzy decision tree from textual data. It argues that the methods extract features of...
Rule Discovery from Textual Data
Chapter 7
Edson Takashi Matsubara, Maria Carolina Monard, Ronaldo Cristiano Prati
This chapter presents semi-supervised multi-view learning in the context of text mining. Semi-supervised learning uses both labelled and unlabelled...
Exploring Unclassified Texts Using Multiview Semisupervised Learning
Chapter 8
Lean Yu, Shouyang Wang, Kin Keung Lai
With the rapid increase of the huge amount of online information, there is a strong demand for Web text mining which helps people discover some...
A Multi-Agent Neural Network System for Web Text Mining
Chapter 9
Jon Atle Gulla, Hans Olaf Borch, Jon Espen Ingvaldsen
Due to the large amount of information on the web and the difficulties of relating user’s expressed information needs to document content...
Contextualized Clustering in Exploratory Web Search
Chapter 10
Li Weigang, Wu Man Qi
This chapter presents a study of Ant Colony Optimization (ACO) applied to the Interlegis Web portal, a Brazilian legislation Website. The approach of AntWeb is...
AntWeb—Web Search Based on Ant Behavior: Approach and Implementation in Case of Interlegis
Chapter 11
Leandro Krug Wives, José Palazzo Moreira de Oliveira, Stanley Loh
This chapter introduces a technique to cluster textual documents using concepts. Document clustering is a technique capable of organizing large...
Conceptual Clustering of Textual Documents and Some Insights for Knowledge Discovery
Chapter 12
Domonkos Tikk, György Biro, Attila Törcsvári
Patent categorization (PC) is a typical application area of text categorization (TC). TC can be applied in different scenarios at the work...
A Hierarchical Online Classifier for Patent Categorization
Chapter 13
Patricia Bintzler Cerrito
The purpose of this chapter is to demonstrate how text mining can be used to reduce the number of levels in a categorical variable to then use the...
Text Mining to Define a Validated Model of Hospital Rankings
Chapter 14
Wagner Francisco Castilho, Gentil José de Lucena Filho, Hércules Antonio do Prado, Edilson Ferneda
Clustering analysis (CA) techniques consist in, given a set of objects, estimating dense regions of points separated by sparse regions, according to...
An Interpretation Process for Clustering Analysis Based on the Ontology of Language
About the Contributors