Web Mining: Creating Structure out of Chaos

Roderick L. Lee (Pennsylvania State University at Harrisburg, USA)
DOI: 10.4018/978-1-59140-057-8.ch013
This chapter presents an overview of Web mining and identifies its three areas: Web content mining, Web usage mining, and Web structure mining. Specific attention is paid to Web structure mining, which is the study of the Web's link topology. The link topology is analyzed in the context of a cyber-community in order to explore the connection between link topology and the conferral of authority. Millions, soon to be billions, of people are annotating Web documents, which results in an abundance of information. Herein lies the problem of topic distillation: searching through the sea of documents for relevant information. To address the problem of overabundance and relevance, models are needed that can assist in creating order at the local level. The hub-and-spoke model identified in this chapter takes a proactive approach to creating an online community in a centralized, planned fashion, and provides control over the architecture of the Web graph. In the end, users can have a reasonable level of confidence that the Web content contained in a hyperlinked community is both accurate and relevant.

Table of Contents
