Indexing Techniques for Web Access Logs

Yannis Manolopoulos (Aristotle University of Thessaloniki, Greece), Mikolaj Morzy (Poznan University of Technology, Poland), Tadeusz Morzy (Poznan University of Technology, Poland), Alexandros Nanopoulos (Aristotle University of Thessaloniki, Greece), Marek Wojciechowski (Poznan University of Technology, Poland) and Maciej Zakrzewicz (Poznan University of Technology, Poland)
Copyright: © 2004 | Pages: 30
DOI: 10.4018/978-1-59140-208-4.ch009


Web servers automatically record the access histories of their visitors in web access logs. Conceptually, web-log data can be regarded as a collection of clients' access sequences, where each sequence is the list of pages accessed by a single user in a single session. This chapter presents novel indexing techniques that support efficient processing of pattern queries, that is, queries that find all access sequences containing a given subsequence. Pattern queries are a key element of advanced analyses of web-log data, especially analyses of typical navigation schemes. We discuss what makes processing ordered access sequences with pattern queries different from searching unordered sets. Extensive experimental results examine a variety of factors and illustrate the superiority of the proposed methods over indexing techniques for unordered data that have been adapted to access sequences.
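
To make the notion of a pattern query concrete, here is a minimal sketch in Python of the naive baseline: a full scan with no index. It is not the chapter's indexing technique. It assumes that "contains a given subsequence" means the pattern's pages occur in the same order, possibly with other pages in between. The session data, page names, and function names are hypothetical.

```python
from typing import Iterable, List, Sequence


def contains_subsequence(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if the pages of `pattern` occur in `sequence` in order.

    Assumption: gaps between matched pages are allowed.
    """
    it = iter(sequence)
    # `page in it` consumes the iterator up to the first match,
    # so each later page must be found after the previous one.
    return all(page in it for page in pattern)


def pattern_query(log: Iterable[Sequence[str]], pattern: Sequence[str]) -> List[Sequence[str]]:
    """Naive full scan: every access sequence that contains `pattern`."""
    return [seq for seq in log if contains_subsequence(seq, pattern)]


# Hypothetical sessions, one list of visited pages per user session.
log = [
    ["home", "products", "cart", "checkout"],
    ["home", "cart", "products"],
    ["products", "home", "checkout"],
]

pattern = ["products", "cart"]

# Ordered (sequence) containment: only the first session qualifies.
ordered_matches = pattern_query(log, pattern)

# Unordered (set) containment ignores order, so the second session also qualifies.
set_matches = [seq for seq in log if set(pattern) <= set(seq)]
```

The second session visits both pages but in the opposite order, so it satisfies the set test and fails the sequence test. Under these assumptions, an index built for unordered data could return such sessions as candidates, and each would then need an order check.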
