Mining Free Text for Structure

Vladimir A. Kulyukin (Utah State University, USA) and Robin Burke (DePaul University, USA)
Copyright: © 2003 | Pages: 23
DOI: 10.4018/978-1-59140-051-6.ch012

Knowledge of the structural organization of information in documents can be of significant assistance to information systems that use documents as their knowledge bases. In particular, such knowledge is of use to information retrieval systems that retrieve documents in response to user queries. This chapter presents an approach to mining free-text documents for structure that is qualitative in nature. It complements the statistical and machine-learning approaches, insomuch as the structural organization of information in documents is discovered through mining free text for content markers left behind by document writers. The ultimate objective is to find scalable data mining (DM) solutions for free-text documents in exchange for modest knowledge-engineering requirements. The problem of mining free text for structure is addressed in the context of finding structural components of files of frequently asked questions (FAQs) associated with many USENET newsgroups. The chapter describes a system that mines FAQs for structural components. The chapter concludes with an outline of possible future trends in the structural mining of free text.
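The idea of looking for content markers left by FAQ writers can be illustrated with a minimal sketch. It is not the system described in the chapter: the marker patterns ("Q:" prefixes and numbered headings such as "1.2)") and the names MARKER_PATTERNS and find_question_markers are assumptions made up for this example.

```python
import re

# Hypothetical marker inventory (an assumption for illustration only);
# the chapter's actual set of content markers is not given in this abstract.
MARKER_PATTERNS = [
    # Lines such as "Q: ..." or "Question. ..."
    ("q_prefix", re.compile(r"^\s*Q(?:uestion)?\s*[:.)]\s*(?P<text>.+)$", re.IGNORECASE)),
    # Lines such as "1.2) ..." or "[3] ..."
    ("numbered", re.compile(r"^\s*\[?\d+(?:\.\d+)*[\].):]\s+(?P<text>.+)$")),
]


def find_question_markers(faq_text):
    """Return (line_number, marker_kind, text) for each line that starts with a known marker."""
    hits = []
    for line_number, line in enumerate(faq_text.splitlines(), start=1):
        for kind, pattern in MARKER_PATTERNS:
            match = pattern.match(line)
            if match:
                hits.append((line_number, kind, match.group("text").strip()))
                break  # the first matching marker kind wins for this line
    return hits


if __name__ == "__main__":
    sample = (
        "Subject: FAQ\n"
        "\n"
        "Q: What is a newsgroup?\n"
        "A: A discussion forum.\n"
        "\n"
        "1.2) How do I post?\n"
        "Use a newsreader.\n"
    )
    for hit in find_question_markers(sample):
        print(hit)
```

A rule-based pass like this needs a hand-written pattern list, which corresponds to the modest knowledge-engineering cost the abstract mentions. A real FAQ would likely need more patterns and some check that the markers form a consistent sequence.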

Complete Chapter List

Table of Contents
John Wang
Chapter 1
Stefan Arnborg
This chapter reviews the fundamentals of inference, and gives a motivation for Bayesian analysis. The method is illustrated with dependency tests in...
A Survey of Bayesian Data Mining
Chapter 2
William H. Hsu
In this chapter, I discuss the problem of feature subset selection for supervised inductive learning approaches to knowledge discovery in databases...
Control of Inductive Bias in Supervised Learning Using Evolutionary Computation: A Wrapper-Based Approach
Chapter 3
Herna Viktor, Eric Paquet, Gys le Roux
Data mining concerns the discovery and extraction of knowledge chunks from large data repositories. In a cooperative data mining environment, more...
Cooperative Learning and Virtual Reality-Based Visualization for Data Mining
Chapter 4
Yong Seong Kim, W. Nick Street, Filippo Menczer
Feature subset selection is an important problem in knowledge discovery, not only for the insight gained from determining relevant modeling...
Feature Selection in Data Mining
Chapter 5
Massimo Coppola, Marco Vanneschi
We consider the application of parallel programming environments to develop portable and efficient high performance data mining (DM) tools. We first...
Parallel and Distributed Data Mining through Parallel Skeletons and Distributed Objects
Chapter 6
Jerzy W. Grzymala-Busse, Wojciech Ziarko
The chapter is focused on the data mining aspect of the applications of rough set theory. Consequently, the theoretical part is minimized to...
Data Mining Based on Rough Sets
Chapter 7
Marvin L. Brown, John F. Kros
Data mining is based upon searching the concatenation of multiple databases that usually contain some amount of missing data along with a variable...
The Impact of Missing Data on Data Mining
Chapter 8
Hsin-Chang Yang, Chung-Hong Lee
Recently, many approaches have been devised for mining various kinds of knowledge from texts. One important application of text mining is to...
Mining Text Documents for Thematic Hierarchies Using Self-Organizing Maps
Chapter 9
John Wang, Alan Oppenheim
Although Data Mining (DM) may often seem a highly effective tool for companies to be using in their business endeavors, there are a number of...
The Pitfalls of Knowledge Discovery in Databases and Data Mining
Chapter 10
Marvin D. Troutt, Donald W. Gribbin, Murali S. Shanker, Aimao Zhang
Data mining is increasingly being used to gain competitive advantage. In this chapter, we propose a principle of maximum performance efficiency...
Maximum Performance Efficiency Approaches for Estimating Best Practice Costs
Chapter 11
Eitel J.M. Lauria, Giri Kumar Tayi
One of the major problems faced by data-mining technologies is how to deal with uncertainty. The prime characteristic of Bayesian methods is their...
Bayesian Data Mining and Knowledge Discovery
Chapter 12
Vladimir A. Kulyukin, Robin Burke
Knowledge of the structural organization of information in documents can be of significant assistance to information systems that use documents as...
Mining Free Text for Structure
Chapter 13
Michael Johnson, Farshad Fotouhi, Sorin Draghici
This chapter presents three systems that incorporate document structure information into a search of the Web. These systems extend existing Web...
Query-By-Structure Approach for the Web
Chapter 14
Tomas Eklund, Barbro Back, Hannu Vanharanta, Ari Visa
Performing financial benchmarks in today’s information-rich society can be a daunting task. With the evolution of the Internet, access to massive...
Financial Benchmarking Using Self-Organizing Maps - Studying the International Pulp and Paper Industry
Chapter 15
Fay Cobb Payton
Recent attention has turned to the healthcare industry and its use of voluntary community health information network (CHIN) models for e-health and...
Data Mining in Health Care Applications
Chapter 16
Lori K. Long, Marvin D. Troutt
This chapter focuses on the potential contributions that Data Mining (DM) could make within the Human Resource (HR) function in organizations. We...
Data Mining for Human Resource Information Systems
Chapter 17
Yao Chen, Joe Zhu
Information technology (IT) has become the key enabler of business process expansion if an organization is to survive and continue to prosper in a...
Data Mining in Information Technology and Banking Performance
Chapter 18
Jack S. Cook, Laura L. Cook
This chapter highlights both the positive and negative aspects of Data Mining (DM). Specifically, the social, ethical, and legal implications of DM...
Social, Ethical and Legal Issues of Data Mining
Chapter 19
Christian Bohm, Maria R. Galli, Omar Chiotti
The aim of this work is to present a data-mining application to software engineering. Particularly, we describe the use of data mining in different...
Data Mining in Designing an Agent-Based DSS
Chapter 20
Jeffrey Hsu
Every day, enormous amounts of information are generated from all sectors, whether it be business, education, the scientific community, the World...
Critical and Future Trends in Data Mining: A Review of Key Data Mining Technologies/Applications
About the Authors