Utilizing Past Web for Knowledge Discovery

Adam Jatowt (Kyoto University, Japan), Yukiko Kawai (Kyoto Sangyo University, Japan) and Katsumi Tanaka (Kyoto University, Japan)
Copyright: © 2010 | Pages: 19
DOI: 10.4018/978-1-60566-982-3.ch132

Abstract

The Web is a useful data source for knowledge extraction, as it provides diverse content on virtually any topic. Consequently, much recent research has aimed at improving Web mining. However, relatively little of this research directly takes the temporal aspects of the Web into account. In this chapter, we analyze data stored in Web archives, which preserve the content of the Web, and investigate the methodology required for successful knowledge discovery from this data. We call the collection of such Web archives the past Web: a temporal structure composed of past copies of Web pages. First, we discuss the character of the data and explain some concepts related to utilizing the past Web, such as data collection, analysis, and processing. Next, we introduce two example applications: temporal summarization and a browser for the past Web.
