The explosive growth of collected and stored data has created a need for new techniques that transform these large amounts of data into useful, comprehensible knowledge. Among these techniques, collectively referred to as data mining, sequential pattern approaches handle sequence databases and extract frequently occurring patterns related to time. Since most real-world databases contain historical and quantitative data, several approaches have been proposed for mining the quantitative information stored in such sequence databases, uncovering fuzzy sequential patterns. In this chapter, we first introduce the various fuzzy sequential pattern approaches and the general principles on which they are based. Then, we focus on a complete framework for mining fuzzy sequential patterns that handles different levels of consideration of quantitative information. This framework is then applied to two real-life data sets: Web access logs and a textual database. We conclude with a discussion of future trends in fuzzy pattern mining.
Key Terms in this Chapter
Knowledge Discovery in Databases (KDD): KDD is the automated process of turning raw data into useful information, in which intelligent computer systems sift and sort through data to look for patterns or to predict trends. It is generally considered to be the nontrivial extraction of implicit, previously unknown, and potentially useful information from data.
Sequence Database: It is any database that consists of sequences of ordered events, with or without concrete notions of time.
Sequential Pattern Mining: It is the mining of frequently occurring patterns related to time or other sequences.
Attribute: An attribute is a single characteristic of records in a database. An attribute domain is the set of possible values of this attribute. The attribute type characterizes the type of values in the attribute domain. This type may be, for instance, Boolean, textual, or numerical.
Data Preprocessing: It is one of the first steps of the KDD process. It refers to data preparation, where variable samples are selected, new variables are constructed, and existing ones are transformed, depending on the goal of the discovery process. During this step, data are converted to the input format of the mining algorithm.
Record: A record (also a tuple) is the collection of attribute values that represents one object in a database.
Association Rule Mining: It is the mining of rules showing attribute value conditions frequently occurring together in a given set of data.
Quantitative Data: It is data whose attributes are described on numerical domains.
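To make the terms Sequence Database and Sequential Pattern Mining concrete, the following minimal sketch counts the support of a candidate pattern, that is, the fraction of sequences in which the pattern's itemsets occur in order. It illustrates the classical (non-fuzzy) notion only and is not the framework presented in this chapter; the function names `contains` and `support`, the toy database, and the pattern are assumptions made for illustration.

```python
from typing import List, Set

Sequence = List[Set[str]]  # an ordered list of itemsets (hypothetical representation)


def contains(sequence: Sequence, pattern: Sequence) -> bool:
    """Return True if each itemset of `pattern` is included, in order,
    in a distinct itemset of `sequence`."""
    pos = 0
    for wanted in pattern:
        # Advance to the first remaining itemset that includes `wanted`.
        while pos < len(sequence) and not wanted <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1  # the next pattern itemset must occur strictly later
    return True


def support(database: List[Sequence], pattern: Sequence) -> float:
    """Fraction of sequences in `database` that contain `pattern`."""
    if not database:
        return 0.0
    return sum(contains(s, pattern) for s in database) / len(database)


# Toy sequence database: one sequence per object (illustrative data only).
database = [
    [{"a"}, {"b", "c"}, {"d"}],
    [{"a"}, {"d"}],
    [{"b"}, {"a"}, {"c"}],
    [{"a", "b"}, {"d"}],
]
pattern = [{"a"}, {"d"}]
print(support(database, pattern))
```

Under this sketch, a pattern would be called frequent when its support reaches a user-chosen minimum support threshold; the threshold itself is an assumption here and is not specified in the text above.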