An Information-Theoretic Framework for Process Structure and Data Mining
Gianluigi Greco (University of Calabria, Italy), Antonella Guzzo (University of Calabria, Italy) and Luigi Pontieri (Institute of High Performance Computing and Networks, Italy)
Copyright: © 2008
Mining process logs has attracted growing attention in the data mining community, because process mining techniques open new opportunities for the analysis and design of complex processes. Currently, these techniques focus on “structural” aspects, considering only which activities were executed and in which order, and disregard the other kinds of data that real systems usually record (e.g., activity executors, parameter values, and time-stamps). In this article, we aim to discover different process variants by clustering process logs. To this end, an information-theoretic framework is used to simultaneously cluster the logged process traces, which encode structural information, and a number of performance metrics associated with them. Each cluster is equipped with a specific model, thus providing the analyst with a compact and handy description of the major execution scenarios for the process.
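As a rough illustration of what an information-theoretic view of trace clustering can look like, the sketch below measures how much information about a performance metric survives when traces are grouped into clusters. The abstract does not state the framework's actual objective, so the use of mutual information here is an assumption, and the toy log, the `fast`/`slow` duration bins, and all function and variable names are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(pairs):
    """Mutual information (in bits) of the empirical joint distribution of (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    # I(X;Y) = sum_xy p(x,y) * log2( p(x,y) / (p(x) p(y)) ), written with raw counts
    return sum((c / n) * log2(c * n / (px[x] * py[y])) for (x, y), c in joint.items())


# Hypothetical toy log: (trace as an activity sequence, discretized duration).
log = [
    ("A>B>C", "fast"), ("A>B>C", "fast"),
    ("A>C>B", "fast"), ("A>C>B", "fast"),
    ("A>B>D", "slow"), ("A>B>D", "slow"),
    ("A>D>B", "slow"), ("A>D>B", "slow"),
]

# Two candidate assignments of trace variants to two clusters (both hypothetical).
aligned = {"A>B>C": 0, "A>C>B": 0, "A>B>D": 1, "A>D>B": 1}
mixed = {"A>B>C": 0, "A>C>B": 1, "A>B>D": 0, "A>D>B": 1}


def clustered_mi(assignment):
    """Mutual information between cluster labels and the metric."""
    return mutual_information([(assignment[trace], metric) for trace, metric in log])


mi_full = mutual_information(log)   # information before clustering
mi_aligned = clustered_mi(aligned)  # clusters that follow the metric
mi_mixed = clustered_mi(mixed)      # clusters that ignore the metric

print(mi_full, mi_aligned, mi_mixed)
```

In this toy setting, the clustering that separates the two groups of variants keeps all of the information the traces carry about duration, while the one that mixes them keeps none. A criterion of this kind is one plausible way to score candidate clusterings on structure and performance data at the same time.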