A System for Predictive Data Analytics Using Sequential Rule Mining

Sandipkumar Chandrakant Sagare (D.K.T.E. Society's Textile and Engineering Institute, Ichalkaranji, India), Suresh Kallu Shirgave (D.K.T.E. Society's Textile and Engineering Institute, Ichalkaranji, India) and Dattatraya Vishnu Kodavade (D.K.T.E. Society's Textile and Engineering Institute, Ichalkaranji, India)
Copyright: © 2020 | Pages: 17
DOI: 10.4018/IJSI.2020100107

Abstract

Data analytics plays a significant role in today's business world, where it supports the decision-making process. Sequential rule mining can be used to extract useful knowledge from sequence data in a variety of applications, such as e-commerce and stock market analysis. Predictive data analytics using sequential rule mining consists of analyzing input sequences and finding sequential rules that can help businesses make decisions. This article presents an approach called M_TRuleGrowth that generates partially-ordered sequential rules efficiently. The authors conducted an experimental evaluation on a real-world dataset, which provides strong evidence that M_TRuleGrowth performs better in terms of execution time.

1. Introduction

In many application areas of data analytics and machine learning, the data to be analyzed is organized as sequences of events. Identifying the relationships between event occurrences hidden in a database is useful because it provides a good understanding of how events relate to one another, which supports prediction of the next event (Mannila et al., 1999). In data mining, sequential pattern mining is one of the useful techniques for discovering temporal relations between events in discrete time series (Mannila et al., 1999; Agrawal & Srikant, 1995; Pei et al., 2004). It discovers sequences of events that appear frequently in a sequence database, that is, the subsequences whose support is greater than or equal to a threshold set by the user (Mannila et al., 1999; Agrawal & Srikant, 1995; Pei et al., 2004).
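The support threshold described above can be illustrated with a minimal Python sketch. It is not taken from the article: the representation of a sequence as a list of itemsets, the function names `contains` and `support`, and the toy database are all assumptions made for illustration.

```python
from typing import List, Set

# Assumed representation: a sequence is an ordered list of itemsets.
Sequence = List[Set[str]]


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if `pattern` is a subsequence of `sequence`.

    Each itemset of the pattern must be included in some itemset of the
    sequence, and the matched itemsets must appear in the same order.
    """
    pos = 0  # index of the next pattern itemset to match
    for itemset in sequence:
        if pos < len(pattern) and pattern[pos] <= itemset:
            pos += 1
    return pos == len(pattern)


def support(database: List[Sequence], pattern: Sequence) -> float:
    """Fraction of sequences in the database that contain the pattern."""
    return sum(contains(s, pattern) for s in database) / len(database)


# Hypothetical toy database of four sequences.
database = [
    [{"a"}, {"b"}, {"c"}],
    [{"a", "b"}, {"c"}],
    [{"b"}, {"a"}, {"c"}],
    [{"c"}, {"a"}],
]
pattern = [{"a"}, {"c"}]
min_support = 0.5  # user-defined threshold

sup = support(database, pattern)
is_frequent = sup >= min_support
print(sup, is_frequent)
```

Here the pattern "a followed by c" is contained in three of the four sequences, so it counts as frequent for any threshold up to 0.75.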

A wide variety of algorithms have been developed for mining standard sequential rules. CMRules (Fournier-Viger et al., 2012) mines sequential rules that are common to several sequences in a sequence database. The algorithm is based on association rule mining and is very efficient. It can be used to find both sequential rules and association rules in a database.

For Partially Ordered Sequential Rules (POSR), the RuleGrowth algorithm (Fournier-Viger et al., 2011) uses a pattern-growth approach to find POSR that are common to several sequences. RuleGrowth does not follow the established technique of generating candidate rules and then testing them. Instead, it discovers rules incrementally: rule discovery starts with rules containing two items, and the rules then grow as the database is scanned to expand their left and right parts. The TRuleGrowth algorithm (Fournier-Viger et al., 2015) takes one extra parameter compared to RuleGrowth, the window size, and uses it to discover only the rules that occur within a sliding window. The constraint is enforced on rules of size 1*1 (one item on each side), and the left and right sides of the sequential rule are modified accordingly. TRuleGrowth is therefore an extension of RuleGrowth (Fournier-Viger et al., 2011) that ensures the sliding-window constraint is taken into consideration while generating rules. Finding rules that occur within a sliding window has several advantages. First, it can reduce execution time by reducing the search space. Second, it can generate a much smaller set of sequential rules, which reduces the disk space required to store the rules and makes the results easier to analyze (Fournier-Viger et al., 2015).
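To make the sliding-window constraint concrete, the following Python sketch checks whether a partially-ordered rule X ⇒ Y occurs within a window and computes its support and confidence by brute force. It is not RuleGrowth or TRuleGrowth, which avoid this kind of exhaustive checking. The function names, the toy database, and the choice to compute confidence against all sequences containing X are assumptions made for illustration.

```python
from typing import List, Set, Tuple

# Assumed representation: a sequence is an ordered list of itemsets.
Sequence = List[Set[str]]


def occurs_in_window(seq: Sequence, x: Set[str], y: Set[str],
                     window_size: int) -> bool:
    """True if some window of at most `window_size` consecutive itemsets
    contains every item of X before every item of Y.

    Items inside X (and inside Y) may appear in any order, which is what
    makes the rule partially ordered.
    """
    for start in range(len(seq)):
        window = seq[start:start + window_size]
        # Try every split point: X must be covered by window[:k],
        # Y by window[k:].
        for k in range(1, len(window)):
            left = set().union(*window[:k])
            right = set().union(*window[k:])
            if x <= left and y <= right:
                return True
    return False


def rule_measures(database: List[Sequence], x: Set[str], y: Set[str],
                  window_size: int) -> Tuple[float, float]:
    """Return (support, confidence) of X => Y under the window constraint.

    Assumption: confidence divides by the number of sequences that
    contain all items of X anywhere in the sequence.
    """
    n_rule = sum(occurs_in_window(s, x, y, window_size) for s in database)
    n_x = sum(x <= set().union(*s) for s in database)
    support = n_rule / len(database)
    confidence = n_rule / n_x if n_x else 0.0
    return support, confidence


# Hypothetical toy database.
database = [
    [{"a"}, {"b"}, {"c"}, {"d"}],
    [{"a"}, {"d"}, {"b"}, {"c"}],
    [{"b"}, {"a"}, {"c"}],
    [{"c"}, {"a"}, {"b"}],
]

print(rule_measures(database, {"a", "b"}, {"c"}, window_size=3))
print(rule_measures(database, {"a", "b"}, {"c"}, window_size=4))
```

In this toy database, the second sequence satisfies the rule only when the window spans four itemsets, so a smaller window yields fewer rule occurrences. This is consistent with the reduced search space and smaller rule set described above.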

The system for mining POSR presented here is thus an extension of the TRuleGrowth algorithm (Fournier-Viger et al., 2015). It uses the M_TRuleGrowth approach, which is a multithreaded version of the preprocessing part of the existing TRuleGrowth algorithm. This approach analyzes the input and applies multithreading to it. Multithreading reduces the time required for preprocessing and, in turn, the overall execution time. The generated sequential rules can then be used for decision making in applications such as e-commerce and stock market analysis.
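This excerpt does not describe how M_TRuleGrowth divides the preprocessing work among threads. The Python sketch below shows one plausible partition-and-merge structure, under the assumption that preprocessing is a database scan recording the positions at which each item occurs in each sequence. The names `index_chunk` and `build_index`, the round-robin chunking, and the toy data are hypothetical and are not the authors' implementation.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set, Tuple

# Assumed representation: a sequence is an ordered list of itemsets.
Sequence = List[Set[str]]
# item -> sequence id -> positions of the itemsets containing the item
Index = Dict[str, Dict[int, List[int]]]


def index_chunk(chunk: List[Tuple[int, Sequence]]) -> Index:
    """Scan one chunk of (sequence id, sequence) pairs and record
    where each item occurs."""
    index: Index = defaultdict(dict)
    for sid, seq in chunk:
        for pos, itemset in enumerate(seq):
            for item in itemset:
                index[item].setdefault(sid, []).append(pos)
    return index


def build_index(database: List[Sequence], n_threads: int = 4) -> Index:
    """Build the occurrence index by scanning chunks in parallel threads."""
    numbered = list(enumerate(database))
    # Round-robin split: each sequence id lands in exactly one chunk.
    chunks = [numbered[i::n_threads] for i in range(n_threads)]
    merged: Index = defaultdict(dict)
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        for partial in pool.map(index_chunk, chunks):
            # Chunks hold disjoint sequence ids, so merging never
            # overwrites an existing entry.
            for item, occurrences in partial.items():
                merged[item].update(occurrences)
    return merged


# Hypothetical toy database.
database = [
    [{"a"}, {"b"}, {"a", "c"}],
    [{"b"}, {"c"}],
    [{"a"}, {"c"}],
]
index = build_index(database, n_threads=2)
print(dict(index))
```

Because each thread builds a private partial index and the merge runs in the calling thread, no locking is needed. In CPython the global interpreter lock limits the speedup of CPU-bound threads, so the sketch shows the structure of the approach rather than its performance.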
