Combining Data Warehousing and Data Mining Techniques for Web Log Analysis

Torben Bach Pedersen (Aalborg University, Denmark), Jesper Thorhauge (Conzentrate, Denmark) and Søren E. Jespersen (Linkage, Denmark)
DOI: 10.4018/978-1-59904-951-9.ch212

Abstract

Enormous amounts of information about Web site user behavior are collected in Web server logs. However, this information is useful only if it can be queried and analyzed to yield high-level knowledge about user navigation patterns, a task that requires powerful techniques. This chapter presents a number of approaches that combine data warehousing and data mining techniques to analyze Web logs. After introducing the well-known click and session data warehouse (DW) schemas, the chapter presents the subsession schema, which allows fast queries on sequences of page visits. The chapter then presents the so-called “hybrid” technique, which combines DW Web log schemas with a data mining technique called Hypertext Probabilistic Grammars, thereby providing fast and flexible constraint-based Web log analysis. Finally, the chapter presents a “post-check enhanced” improvement of the hybrid technique.
