Discovery of Characteristic Sequential Patterns Based on Two Types of Constraints

Shigeaki Sakurai (Toshiba Digital Solutions, Kawasaki, Japan)
DOI: 10.4018/IJEACH.2019010105

Abstract

This article proposes a method for discovering characteristic sequential patterns from sequential data by using background knowledge. In the case of tabular structured data, each item is composed of an attribute and an attribute value. The article focuses on two types of constraints that describe background knowledge. The first type is time constraints, which flexibly describe temporal relationships between items. The second type is item constraints, which select the items that may be included in sequential patterns. Together, these constraints can represent background knowledge that reflects the interests of analysts. Analysts can therefore easily discover sequential patterns that coincide with their interests as characteristic sequential patterns. Lastly, this article verifies the effect of the pattern discovery method, which is based on both the evaluation criteria of sequential patterns and the background knowledge. The method can be applied to the analysis of healthcare data.

Introduction

Owing to the progress of computer and network environments, it has become easy to collect data with time information, such as daily business reports, weblog data, and physiological information. Against this background, methods for analyzing data with time information have been studied. This paper focuses on a method for discovering sequential patterns from discrete sequential data. The research expands the pattern discovery task (Agrawal & Srikant, 1994). The methods proposed by Garofalakis et al. (2010), Pei et al. (2001), Srikant and Agrawal (1996), and Zaki (2001) efficiently discover frequent patterns as characteristic patterns. However, the discovered patterns do not always correspond to the interests of analysts, because frequent patterns tend to be common knowledge and are not a source of new knowledge for the analysts.

This problem has also been pointed out in connection with the discovery of association rules. Blanchard et al. (2005), Brin et al. (1997), Silberschatz et al. (1996), and Suzuki et al. (2005) propose other evaluation criteria in order to discover other kinds of characteristic patterns. The patterns discovered by these criteria are not always frequent but are characteristic from some viewpoint. The criteria may be applicable to discovery methods for sequential patterns. However, they do not satisfy the Apriori property, so it is difficult for methods based on them to discover the patterns efficiently. In addition, methods that use background knowledge supplied by analysts have been proposed in order to discover sequential patterns corresponding to their interests (Garofalakis et al., 1999), (Pei et al., 2002), (Sakurai et al., 2008b), (Yen, 2005). Furthermore, methods that limit the number of sequential patterns (Fournier-Viger et al., 2013), (Hathi & Ambasana, 2015), (Maciag, 2017), (Sakurai & Nisihizawa, 2015), (Tzvetkov et al., 2003) have been proposed in order to avoid discovering large numbers of patterns.
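
The role of the Apriori property mentioned above can be illustrated with a small sketch. It is not the method proposed in this article. It assumes a simplified setting in which each sequence is a list of single items, and the names `contains`, `support`, and `database` are hypothetical. The sketch shows only that the support of a sequential pattern never increases when the pattern is extended, which is what allows frequency-based methods to prune candidates.

```python
from typing import Sequence


def contains(sequence: Sequence[str], pattern: Sequence[str]) -> bool:
    """Return True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    # `item in it` advances the iterator, so each match must come after the last.
    return all(item in it for item in pattern)


def support(database: Sequence[Sequence[str]], pattern: Sequence[str]) -> float:
    """Fraction of sequences in `database` that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in database) / len(database)


# Hypothetical toy database of four sequences (illustrative data only).
database = [
    ["a", "b", "c"],
    ["a", "c"],
    ["b", "a", "c"],
    ["a", "b"],
]

# Extending a pattern can only keep or lower its support (Apriori property).
for pattern in (["a"], ["a", "c"], ["a", "b", "c"]):
    print(pattern, support(database, pattern))
```

Once a pattern falls below a minimum support threshold, none of its extensions need to be examined. Criteria that lack this property give no such guarantee, which is why methods built on them cannot prune candidates in the same way.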
