This chapter provides an overview of the statistical methods that support Web Usage Mining. The first part describes the framework of Web Usage Mining as a branch of Web Mining devoted to the study of how a Web site is used. The data that are the object of the analysis are then detailed, together with the problems linked to their pre-processing. Once the origin of the data and the treatment required for a sound Web Usage analysis have been clarified, the focus shifts to the statistical techniques that can be applied in this setting, with particular reference to binary segmentation methods. The latter discriminate among users by means of a response variable that determines each user's membership in a group, on the basis of characteristics observed on those same users.
Web Usage Mining
Web Usage Mining has been defined as the application of data mining techniques to large Web data repositories in order to extract usage patterns, that is, patterns of visitor behavior. As a further step, pattern discovery and pattern analysis make it possible to profile users and their preferences. Statistical methods play a fundamental role in this task: they make it possible to identify suitable attributes and the main features characterizing a typology of users, and thus to support Web personalization.
This chapter deals with Web Usage Mining, focusing on statistical methods for user profiling; among them, binary segmentation, or tree-based modeling, is considered in detail.
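As a rough illustration of the idea behind binary segmentation, the following minimal sketch searches for the single binary split of a set of users that best separates the groups defined by a response variable. It is not the procedure developed in this chapter: the Gini impurity criterion, the function names, the feature names (`pages_viewed`, `minutes_on_site`), and the toy session data are all assumptions introduced here for illustration only.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_binary_split(rows, labels):
    """Return (impurity, feature, threshold) for the best single binary split.

    rows:   list of dicts mapping a feature name to a numeric value
    labels: list of group labels (the response variable), one per row
    The best split minimizes the weighted Gini impurity of the two child groups.
    """
    n = len(rows)
    best = None
    for feature in rows[0]:
        values = sorted({row[feature] for row in rows})
        # Candidate thresholds are midpoints between consecutive observed values.
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or impurity < best[0]:
                best = (impurity, feature, threshold)
    return best


# Hypothetical user sessions: characteristics observed on each user.
sessions = [
    {"pages_viewed": 2, "minutes_on_site": 1.0},
    {"pages_viewed": 3, "minutes_on_site": 6.0},
    {"pages_viewed": 4, "minutes_on_site": 2.0},
    {"pages_viewed": 9, "minutes_on_site": 5.0},
    {"pages_viewed": 12, "minutes_on_site": 3.0},
    {"pages_viewed": 15, "minutes_on_site": 7.0},
]
# Hypothetical response variable: the group each user belongs to.
groups = ["browser", "browser", "browser", "buyer", "buyer", "buyer"]

impurity, feature, threshold = best_binary_split(sessions, groups)
print(feature, threshold, impurity)
```

In this toy data set the number of pages viewed separates the two groups perfectly, whereas the time spent on the site does not, so the split is made on the former. A tree-based model would repeat the same search recursively within each of the two resulting subgroups.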
Web Mining Branches
In the framework of Web Mining, User Profiling represents a fundamental application. Nowadays, a huge amount of information flows through the Internet, as well as through Web servers, in the course of their various activities. Knowledge discovery and the extraction of useful information on the Internet depend on both the ability of the Web navigator and the performance of the search engine. Recently, scientific research has focused on suitable procedures for profiling different typologies of users by analyzing similarities and dissimilarities in their Internet behavior (Berendt, Hotho, Mladenic, van Someren, Spiliopoulou, & Stumme, 2003).
User Profiling is, conceptually, the act of building up a profile of who the users are and what they want to do. These profiles are used to group users and to identify the priorities in their activities. Knowing who the users are and what they want is a vital step in meeting their needs.
From the methodological point of view, User Profiling is one of the main purposes of Web Usage Mining, which is the process of applying data mining techniques to the discovery of usage patterns from Web data in various application contexts. Web Usage Mining is one branch of Web Mining, namely data mining on Web data. The other branches are Web Content Mining, which analyzes various aspects of the contents of a Web site, such as text and graphics, and Web Structure Mining, which analyzes the structure of the Web site in terms of the organization of its Web pages and their design.