Efficient Implementation of Hadoop MapReduce-Based Dataflow

Ishak H. A. Meddah (Oran University of Science and Technology – Mohamed Boudiaf, Algeria) and Khaled Belkadi (Oran University of Science and Technology – Mohamed Boudiaf, Algeria)
DOI: 10.4018/978-1-5225-3004-6.ch020


MapReduce is a programming model for processing large data sets by distributing the computation across a large set of machines. Process mining provides an important bridge between data mining and business process analysis; it allows information to be extracted from event logs. First, the chapter mines small patterns from log traces; these patterns represent the execution traces of a business process. The authors use existing techniques: the patterns are represented by finite state automata, and the final model is a combination of only two types of patterns, each expressed as a regular expression. Second, the authors compute these patterns in parallel and then combine them using MapReduce. The approach has two parts: in the map step, the authors mine patterns from execution traces; in the reduce step, they combine these small patterns. The results are promising: they show that the approach is scalable, general, and precise, and that using MapReduce reduces execution time.
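The map/reduce split described above can be illustrated with a minimal in-process sketch. It is not the authors' implementation and does not use Hadoop: the function names (map_trace, reduce_pair, run), the choice of the (ab)* template, and the sample traces are all assumptions made for illustration. The map step checks each ordered pair of events in one trace against (ab)*; the reduce step here only keeps pairs that held in every trace, and does not show the composition of patterns into a larger automaton.

```python
import re
from collections import defaultdict
from itertools import permutations

# Assumed pattern template for this sketch: strict alternation (ab)*.
ALTERNATION = re.compile(r"(ab)*")

def map_trace(trace):
    """Map step (hypothetical): for one trace, emit ((x, y), holds) for each
    ordered pair of distinct events, where holds says whether the trace,
    projected onto x and y, matches (ab)*."""
    events = sorted(set(trace))
    for x, y in permutations(events, 2):
        projected = "".join("a" if e == x else "b" for e in trace if e in (x, y))
        yield (x, y), ALTERNATION.fullmatch(projected) is not None

def reduce_pair(pair, flags):
    """Reduce step (hypothetical): a pair survives only if it held in every
    trace that emitted it."""
    return pair, all(flags)

def run(traces):
    """Simulate map, shuffle (group by key), and reduce in a single process."""
    groups = defaultdict(list)
    for trace in traces:
        for key, value in map_trace(trace):
            groups[key].append(value)
    reduced = (reduce_pair(key, values) for key, values in groups.items())
    return sorted(pair for pair, ok in reduced if ok)

# Illustrative traces, not taken from the chapter.
traces = [["open", "close", "open", "close"], ["open", "close"]]
print(run(traces))
```

Because each trace is processed independently in the map step, the traces could in principle be spread over many machines; only the grouped per-pair results need to meet in the reduce step.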
Chapter Preview

Many techniques have been proposed in the domain of process mining. Among them:

Gabel and Su (2008) present a general technique for mining temporal specifications. Their work proceeds in two steps: first they discover simple patterns using existing techniques, then they combine these patterns through composition, using rules such as the Branching and Sequencing rules.

A temporal specification expresses formal correctness requirements on an application’s ordering of specific actions and events during execution. The authors discover patterns from execution traces or from program source code. The simple patterns are written as regular expressions such as (ab)* or (ab*c)* and represented as finite state automata; these simple patterns are then combined to construct a temporal specification, itself represented as a finite state automaton.
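As a minimal sketch of how the two pattern templates can be checked against a trace, the example below projects a trace onto the events of interest, renames them to the placeholder letters a, b, and c, and tests the result against the regular expression. The helper names (project, satisfies), the event names, and the sample trace are assumptions for illustration and do not come from Gabel and Su's work.

```python
import re

# The two pattern templates named in the text, as regular expressions.
ALTERNATION = re.compile(r"(ab)*")    # (ab)*
RESOURCE = re.compile(r"(ab*c)*")     # (ab*c)*

def project(trace, mapping):
    """Drop events that are not in mapping and rename the rest to the
    placeholder letters used by the pattern (hypothetical helper)."""
    return "".join(mapping[event] for event in trace if event in mapping)

def satisfies(trace, mapping, pattern):
    """True if the projected trace matches the pattern in full."""
    return pattern.fullmatch(project(trace, mapping)) is not None

# Illustrative trace: "log" is ignored because it is not in either mapping.
trace = ["open", "read", "read", "close", "log", "open", "close"]

print(satisfies(trace, {"open": "a", "read": "b", "close": "c"}, RESOURCE))
print(satisfies(trace, {"open": "a", "close": "b"}, ALTERNATION))
```

Python's re module compiles each expression into a matcher internally, so fullmatch plays the role of running the corresponding finite state automaton over the projected trace.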
