Semi-Structured Data Extraction from Heterogeneous Sources

Xiaoying Gao (University of Melbourne, Australia) and Leon Sterling (University of Melbourne, Australia)
Copyright: © 2000 | Pages: 20
DOI: 10.4018/978-1-878289-82-7.ch005


The World Wide Web is known as the “universe of network-accessible information, the embodiment of human knowledge” (W3C, 1999). Internet-based knowledge management aims to use the Internet as a worldwide environment for publishing, searching, sharing, reusing, and integrating knowledge, and for supporting collaboration and decision making. However, knowledge on the Internet is buried in documents, most of which are written in languages intended for human readers, so computer programs such as knowledge management systems cannot easily access it. Making the Internet “machine readable” therefore turns information extraction from Web pages into a crucial research problem.
