Applied Sequence Clustering Techniques for Process Mining

Diogo R. Ferreira (Technical University of Lisbon, Portugal)
Copyright: © 2009 | Pages: 22
DOI: 10.4018/978-1-60566-288-6.ch022

Abstract

This chapter introduces the principles of sequence clustering and presents two case studies where the technique is used to discover behavioral patterns in event logs. In the first case study, the goal is to understand the way members of a software team perform their daily work, and the application of sequence clustering reveals a set of behavioral patterns that are related to some of the main processes being carried out by that team. In the second case study, the goal is to analyze the event history recorded in a technical support database in order to determine whether the recorded behavior complies with a predefined issue handling process. In this case, the application of sequence clustering confirms that all behavioral patterns share a common trend that resembles the original process. Throughout the chapter, special attention is given to the need for data preprocessing in order to obtain results that provide insight into the typical behavior of business processes.

1. Introduction

The field of process mining (van der Aalst & Weijters, 2004) is a new and exciting area of research, whose purpose is to develop techniques to gain insight into business processes based on the behavior recorded in event logs. There are a number of process mining techniques already available and most of them focus on discovering control-flow models (van der Aalst et al, 2003). There are also techniques that take into account data dependencies (Rozinat et al, 2006), and techniques to discover other kinds of models such as social networks among workflow participants (van der Aalst et al, 2005).

Process mining techniques such as the α-algorithm (van der Aalst et al, 2004), the inference methods proposed by (Cook & Wolf, 1995), the directed acyclic graphs of (Agrawal et al, 1998), the inductive workflow acquisition by (Herbst & Karagiannis, 1998), the hierarchical clustering of (Greco et al, 2005), the genetic algorithms of (Alves de Medeiros et al, 2007) and the instance graphs of (van Dongen & van der Aalst, 2004), to cite only a few, are all techniques that aim at extracting the control-flow behavior of a business process and representing it according to different kinds of models. All of these techniques take an event log as input and as the starting point for the discovery of the underlying process.

In many practical applications, however, the events that belong to a particular process can only be found among the events of other processes that are running within the same system. For example, events recorded in a CRM (Customer Relationship Management) system may belong to different processes such as creating a new customer or handling a claim submitted by an existing customer. Furthermore, even when focusing on a single process, the behavior in a set of instances may be so diverse that it becomes appropriate to study different behaviors as separate workflows. Either way, the amount and diversity of activities recorded in an event log may be such that it becomes necessary to sort out the different existing processes before applying one of the above process mining techniques.

Sequence clustering is a particularly useful technique for this purpose, as it provides the means to partition a number of sequences into a set of clusters or groups of similar sequences. Although the development of sequence clustering techniques has been an active field of research especially in the area of bioinformatics—see for example (Enright et al, 2002), (Jaroszewski & Godzik, 2002) and (Chen et al, 2006)—its principles are equally applicable to other kinds of sequence data. For example, in applications such as user click-stream analysis it is possible to use sequence clustering to discover the typical navigation patterns on a Web site (Cadez et al, 2003). The same approach can be used to discover the typical behavior of different processes, or to distinguish between different behaviors within a single process, for example to identify what is considered to be the normal flow and what is deemed to be exceptional behavior.
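To make the idea of partitioning sequences into groups of similar sequences more concrete, the following is a minimal sketch and not the algorithm used in the case studies. It assumes, purely for illustration, that each cluster is modeled as a first-order Markov chain over activity names and that every sequence is assigned to the chain under which it is most likely (a hard-assignment simplification). The names `fit_chain`, `log_likelihood`, `cluster_sequences`, the `START` marker and the smoothing parameter `alpha` are hypothetical.

```python
from collections import defaultdict
from math import log

START = "<start>"  # hypothetical marker for the beginning of a sequence


def fit_chain(sequences, states, alpha=1.0):
    """Estimate transition probabilities with add-alpha smoothing."""
    counts = defaultdict(lambda: defaultdict(float))
    for seq in sequences:
        prev = START
        for state in seq:
            counts[prev][state] += 1
            prev = state
    chain = {}
    for prev in [START] + states:
        total = sum(counts[prev].values()) + alpha * len(states)
        chain[prev] = {s: (counts[prev][s] + alpha) / total for s in states}
    return chain


def log_likelihood(seq, chain):
    """Log-probability of a sequence under one cluster's chain."""
    prev, total = START, 0.0
    for state in seq:
        total += log(chain[prev][state])
        prev = state
    return total


def cluster_sequences(sequences, initial_assignment, max_iter=20):
    """Alternate between fitting one chain per cluster and reassigning
    each sequence to its most likely chain, until nothing changes."""
    states = sorted({s for seq in sequences for s in seq})
    k = max(initial_assignment) + 1
    assignment = list(initial_assignment)
    for _ in range(max_iter):
        chains = [
            fit_chain([q for q, c in zip(sequences, assignment) if c == j], states)
            for j in range(k)
        ]
        new_assignment = [
            max(range(k), key=lambda j: log_likelihood(q, chains[j]))
            for q in sequences
        ]
        if new_assignment == assignment:
            break
        assignment = new_assignment
    return assignment


# Toy event sequences (invented for illustration only).
sequences = [
    ["a", "b", "c"], ["a", "b", "c"], ["a", "b", "b", "c"],
    ["x", "y"], ["x", "y", "y"], ["x", "y"],
]
# The third sequence starts in the "wrong" cluster on purpose.
result = cluster_sequences(sequences, [0, 0, 1, 1, 1, 1])
```

In this toy example the misplaced sequence migrates to the cluster whose transitions it shares. The outcome of such a procedure depends on the initial assignment and on the number of clusters, both of which are assumptions here.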

The use of clustering algorithms in association with process mining techniques has received increased attention in recent years: in (Greco et al, 2004), the authors represent each trace in a vectorial space in order to make use of the k-means algorithm to cluster workflow traces; (Alves de Medeiros et al, 2008) make use of a similar approach in order to perform hierarchical clustering; (Jung et al, 2008) also address hierarchical clustering by means of a special-purpose algorithm based on a cosine similarity measure; in (Song et al, 2008) the authors make use of several clustering algorithms, including k-means and self-organizing maps; (Ceglowski et al, 2005) make use of self-organizing maps in order to cluster hospital emergency data. This means that there are several techniques available for clustering workflow traces. In this chapter we focus specifically on the use of sequence clustering techniques.
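As a rough illustration of what representing a trace in a vectorial space can look like, the sketch below maps each trace to a vector of activity counts and compares traces with a cosine similarity measure. The feature choice (plain activity frequencies), the helper names `trace_to_vector` and `cosine_similarity`, and the example traces are assumptions made for this sketch; the cited works may use different features and measures.

```python
from collections import Counter
from math import sqrt


def trace_to_vector(trace, activities):
    """Count how often each known activity occurs in the trace."""
    counts = Counter(trace)
    return [counts[a] for a in activities]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = sqrt(sum(x * x for x in u))
    norm_v = sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy traces (invented for illustration only).
traces = [
    ["open", "assign", "resolve", "close"],
    ["open", "assign", "assign", "resolve", "close"],
    ["register", "verify", "approve"],
]
activities = sorted({a for t in traces for a in t})
vectors = [trace_to_vector(t, activities) for t in traces]

similar = cosine_similarity(vectors[0], vectors[1])    # traces sharing activities
unrelated = cosine_similarity(vectors[0], vectors[2])  # traces with no common activity
```

Vectors of this kind can then be passed to a general-purpose clustering algorithm such as k-means. Note that a pure frequency representation discards the order of activities, which is precisely the information that sequence clustering retains.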

Key Terms in this Chapter

Cluster Model: The model that represents the dominant behavior within a cluster.

Process Mining: Field of research that studies techniques to discover business process models automatically from recorded behavior.

Behavioral Pattern: A behavior that has been observed to be common to multiple sequences.

Preprocessing: A series of steps applied to a dataset in order to facilitate its analysis.

Event Log: A file that contains recorded run-time behavior.

Sequence Clustering: A data mining technique that groups sequences into clusters according to their similarity.

Parameters: A set of variables that can be configured in order to change the behavior of an algorithm.
