An Efficient Pruning and Filtering Strategy to Mine Partial Periodic Patterns from a Sequence of Event Sets

An Efficient Pruning and Filtering Strategy to Mine Partial Periodic Patterns from a Sequence of Event Sets

Kung-Jiuan Yang (Department of Information Management, Fortune Institute of Technology, Daliao District, Kaohsiung, Taiwan), Tzung-Pei Hong (Department of Computer Science and Information Engineering, National University of Kaohsiung, Nan-Tzu District, Kaohsiung, Taiwan), Yuh-Min Chen (Institute of Manufacturing Information and Systems, National Cheng Kung University, Tainan City, Taiwan) and Guo-Cheng Lan (Computational Intelligence Technology Center, Industrial Technology Research Institute, Hsinchu County, Taiwan)
Copyright: © 2014 |Pages: 21
DOI: 10.4018/ijdwm.2014040102
OnDemand PDF Download:
$30.00
List Price: $37.50

Abstract

Partial periodic patterns are commonly seen in real-world applications. The major problem of mining partial periodic patterns is the efficiency problem due to a huge set of partial periodic candidates. Although some efficient algorithms have been developed to tackle the problem, the performance of the algorithms significantly drops when the mining parameters are set low. In the past, the authors have adopted the projection-based approach to discover the partial periodic patterns from single-event time series. In this paper, the authors extend it to mine partial periodic patterns from a sequence of event sets which multiple events concurrently occur at the same time stamp. Besides, an efficient pruning and filtering strategy is also proposed to speed up the mining process. Finally, the experimental results on a synthetic dataset and real oil price dataset show the good performance of the proposed approach.
Article Preview

Introduction

In the past, periodic pattern mining has been one of the important issues in sequential pattern mining due to the discovery of regularity in time series or event sequences. Periodic pattern mining techniques have widely been applied to various business applications, such as analyzing web usage every day, tracing the regularity of company’s stock prices every week, analyzing the product re-order points every month, tracing product sales volume every season, etc (Aref et al., 2004; Elfeky et al., 2004). However, the limitation of mining frequent full periodic patterns is a strict constraint since all events in a full periodic pattern have to be known, and their positions in the pattern are fixed and frequently appeared in the long-term time-series data with a specific periodic length. To solve this problem, Han et al. (1998) thus proposed another kind of periodic pattern mining called partial periodic pattern mining, which considered most but not all points in a period contributing to an approximate cyclic behavior of a time series. For example, “The price of the stock goes up on Monday” is a partial periodicity because it only considers Mondays and says nothing about the price fluctuations. In this example, a unit of the period length is set as a week (seven days), and “the stock price goes up” can be defined as an “event”. To effectively find such patterns, Han et al. subsequently proposed a partial periodic mining algorithm called a max-subpattern hit set (Han et al., 1999), which mainly used a max-subpattern tree structure to store the hit counts and to represent candidate patterns, and then all partial periodic patterns could be found from the built tree structure. Afterward, most of existing studies related to partial periodic pattern mining directly adopted Han et al.’s max-subpattern hit set algorithm to various data applications.

To efficiently find partial periodic patterns, Yang et al. (2013) proposed a projection-based partial periodic pattern mining algorithm (abbreviated as PPA) to find partial periodic patterns with the consideration of only single event in a time stamp. Chen et al. (2011) applied FP-tree’s concept (Han et al., 2000) to proposed PFP-growth partial periodic pattern mining algorithm with multiple minimum supports. In both Yang et al.’s study and Chen et al.’s study, their algorithms adopted the ‘cycle’ encoding method used in Han et al.’s study (1998) to re-encode events into new event representation by the positions of the events in the corresponding periodic segments. However, in terms of finding partial periodic patterns with consideration of multiple events in a time stamp, a great deal of unpromising candidates was still generated in mining. Thus, it is desirable to effectively handle the problem of partial periodic pattern mining with the consideration of multiple events in a time stamp.

Based on the above reasons, this work thus presents an efficient projection-based candidate reduction algorithm (abbreviated as PRA) to find partial periodic patterns in a sequence of event sets (instead of a single event). Different from the PPA algorithm in Yang et al.’s study (2013), two effective strategies, pruning and filtering, are designed to reduce unpromising candidates in mining. With the help of the two strategies, all partial periodic patterns can efficiently be explored by the proposed PRA algorithm from the set of period sub-sequences. Finally, the experimental results on synthetic and real datasets reveal that the proposed PRA algorithm outperforms the other compared algorithms, traditional max-subpattern algorithm and the PFP-growth algorithm, in terms of execution efficiency.

Complete Article List

Search this Journal:
Reset
Open Access Articles: Forthcoming
Volume 15: 4 Issues (2019): Forthcoming, Available for Pre-Order
Volume 14: 4 Issues (2018): 1 Released, 3 Forthcoming
Volume 13: 4 Issues (2017)
Volume 12: 4 Issues (2016)
Volume 11: 4 Issues (2015)
Volume 10: 4 Issues (2014)
Volume 9: 4 Issues (2013)
Volume 8: 4 Issues (2012)
Volume 7: 4 Issues (2011)
Volume 6: 4 Issues (2010)
Volume 5: 4 Issues (2009)
Volume 4: 4 Issues (2008)
Volume 3: 4 Issues (2007)
Volume 2: 4 Issues (2006)
Volume 1: 4 Issues (2005)
View Complete Journal Contents Listing