Association Rules-Based Analysis in Multidimensional Clusters

Neelu Khare, Dharmendra S. Rajput, Preethi D
Copyright: © 2017 |Pages: 17
DOI: 10.4018/978-1-5225-1776-4.ch003

Abstract

Many approaches identify potentially interesting items by exploiting commonly used techniques of multidimensional data analysis. There is a great need for association-rule mining algorithms that scale not only with the number of records (rows) in a cluster but also with the size of its domain (number of dimensions). The items that belong to a domain are correlated with each other in such a way that the domain can be clustered into classes with maximum intra-class similarity and minimum inter-class similarity. This property can be used to prune the search space significantly and so make association-rule mining efficient. To find the hidden correlations in the obtained clusters effectively, without losing important relationships in a large database, clustering can be followed by association rule mining to provide better-evaluated clusters.

Introduction

Clustering partitions data into clusters that are meaningful and useful for analyzing and describing the real world. It is the process of grouping physical or abstract objects into conceptually meaningful classes of similar objects. If meaningful clusters are the objective, the clusters should capture the natural structure of the data. In many cases, cluster analysis is only a useful starting point for other purposes, such as data summarization.

Clustering has played an important role in discovering interesting data distributions and patterns in areas such as information retrieval, pattern recognition in biological or other sequence data, climate, psychology and medicine, business, machine learning, and data mining, where it also serves as a preprocessing step. In terms of utility, clustering is a technique for finding the most appropriate cluster prototypes; these prototypes can serve as the basis for data analysis or data processing techniques such as summarization, compression, and efficient nearest-neighbour search. Existing clustering approaches are divided into four categories:

  • Partitioning,

  • Hierarchical,

  • Grid-Based, and

  • Density-Based.

All of these approaches suffer from rapid degradation of performance as the number of dimensions increases, particularly those designed for low-dimensional data, and from ineffective cluster evaluation and analysis of multidimensional data owing to inherent uncertainties.

Association rule mining is a useful technique for discovering interesting relationships hidden in large data sets. Such hidden relationships can be extracted in the form of association rules or sets of frequent patterns. These association rules (AR) lead to potential knowledge by detecting the presence of regularities and paths in large databases. Rules represent relations (in terms of co-occurrence) between pairs of items or among items from different dimensions of large databases. The strength of a rule is measured by its support and confidence. The minimum support and minimum confidence thresholds that rules must satisfy should be fixed so that trivial rules are removed and attention is focused on interesting rules.
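
The support and confidence measures can be illustrated with a minimal Python sketch. It is not the chapter's algorithm: the toy transactions, the item names, and the helper functions `support`, `confidence`, and `strong_rules` are all assumptions made for illustration, and the rule search is a brute-force pass over single-item rules rather than a scalable miner.

```python
from itertools import permutations

# Hypothetical toy transactions; the item names are illustrative only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base

def strong_rules(transactions, min_support, min_confidence):
    """Brute-force search for single-item rules a -> b meeting both thresholds."""
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in permutations(items, 2):
        s = support({a, b}, transactions)
        c = confidence({a}, {b}, transactions)
        if s >= min_support and c >= min_confidence:
            rules.append((a, b, s, c))
    return rules

rules = strong_rules(transactions, min_support=0.5, min_confidence=0.7)
```

With these toy transactions, the rule bread -> milk has support 2/4 but confidence of only 2/3, so it is dropped at a confidence threshold of 0.7, while butter -> bread (support 2/4, confidence 1) is retained. Raising or lowering the two thresholds is what separates trivial rules from interesting ones.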

Multidimensional Data Model

The multidimensional data model comes in three types:

  • Logical Multidimensional Data Model,

  • Relational Multidimensional Data Model,

  • Analytic Workspace Implementation Multidimensional Data Model.

Relational Multidimensional Data Model

The relational implementation of the multidimensional data model is typically a star schema or a snowflake schema. A star schema is a convention for organizing the data into dimension tables, fact tables, and materialized views. Ultimately, all of the data is stored in columns, and metadata is required to identify which columns function as multidimensional objects.
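
As a rough illustration of the star-schema layout, the sketch below builds one fact table and two dimension tables in an in-memory SQLite database. The table names, column names, and sample rows are hypothetical and not taken from the chapter. SQLite has no materialized views, so an ordinary aggregate query stands in for one here.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE dim_time    (time_id    INTEGER PRIMARY KEY, year     INTEGER);

-- The fact table holds the measure and one foreign key per dimension.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    time_id    INTEGER REFERENCES dim_time(time_id),
    amount     REAL
);

INSERT INTO dim_product VALUES (1, 'dairy'), (2, 'bakery');
INSERT INTO dim_time    VALUES (1, 2016), (2, 2017);
INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 5.0), (2, 2, 7.5);
""")

def sales_by_category_and_year(conn):
    """Aggregate the measure along two dimensions by joining fact to dimensions."""
    cursor = conn.execute("""
        SELECT p.category, t.year, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        JOIN dim_time    AS t ON t.time_id    = f.time_id
        GROUP BY p.category, t.year
        ORDER BY p.category, t.year
    """)
    return cursor.fetchall()

summary = sales_by_category_and_year(conn)
```

Every value lives in an ordinary column. Only the naming convention and the foreign keys mark `fact_sales` as the fact table and the `dim_` tables as dimensions, which is why a relational implementation needs extra metadata to identify the multidimensional objects.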
