Clustering and Query Optimization in Fuzzy Object-Oriented Database

Thuan Tan Nguyen (Duy Tan University, Da Nang, Viet Nam), Ban Van Doan (Institute of Information Technology – VAST, Hà Nội, Viet Nam), Chau Ngoc Truong (Information Technology Department – Danang University of Technology, Da Nang, Viet Nam) and Trinh Thi Thuy Tran (Information Technology Department – Duy Tan University, Da Nang, Viet Nam)
Copyright: © 2019 | Pages: 17
DOI: 10.4018/IJNCR.2019010101

Abstract

The purpose of a clustering method is to provide a meaningful partitioning of a data set. In general, it is essential to find well-separated clusters whose members are similar. One problem in clustering is how to determine the optimal number of clusters for the data set. Most clustering algorithms generate a partition based on input parameters (for example, the number of clusters or the minimum density), which limits the number of clusters that can be found. Therefore, this article proposes an improved EMC clustering algorithm that handles and manipulates clusters more flexibly, in which input parameter values are assumed to differ between clusters for different partitions of a data set. In addition, based on the resulting partitions, this article proposes a new approach to processing and optimizing fuzzy queries in order to improve the efficiency of data manipulation and processing, for example by reducing time and resource consumption.

Introduction

In data clustering, especially for data with many different attributes that fall into multiple groups, the Gaussian mixture model (GMM) is considered an appropriate choice. For this reason, Wang et al. (2017) optimized the clustering parameters of this model. Moreover, because of its flexibility in clustering and in modeling the distribution of data, database models also use this model to cluster data.

The aim is to increase the efficiency of query processing in relational and object-oriented database models. Most of these models perform pre-processing steps, such as data clustering and query optimization, before executing a query and returning results to the user. For example, to increase the efficiency of query processing in object-oriented databases, Darabant et al. (2005) proposed a method for horizontal data fragmentation based on the fuzzy C-means clustering algorithm. A fuzzy object-oriented database (FOODB) should perform the same steps (Shrivastava, 2013; Yan & Ma, 2013; Alhaji & Arkun, 1993; Kumar et al., 2014; Isran & Israni, 2017; Wedashwara et al., 2015; Pons & Vila, 2013). This paper proposes the following new approaches:

  • Flexible clustering optimization using the improved EMC algorithm. The Expectation Maximization Coefficient (EMC) algorithm improves on the Expectation Maximization (EM) algorithm (Vila & Schniter, 2013; Ahmed et al., 2017; Hao et al., 2014; Jung et al., 2014; Long et al., 2014) by adding a (C) step. In this (C) step, the authors use the coefficient of variation to make the clustering process softer. More specifically, the authors partition the clusters and calculate the density distribution of the elements in each cluster based on the coefficient of variation of the distances between elements in a cluster. In addition, the EMC algorithm reduces the risk of converging to a local optimum and improves global optimization; it is covered in section 1.

  • The output of the EMC algorithm serves as the input to an algorithm for identifying fuzzy intervals. Applying statistical methods, the authors use both the standard deviation and the mean to calculate the upper and lower bounds of each fuzzy interval.

  • Finally, this paper proposes methods for optimizing and processing queries based on fuzzy object algebra (FOA) (Nguyen et al., 2018; Alhaji & Arkun, 1993; Yan et al., 2014; Kumar et al., 2014) and equivalent transformation rules, in order to increase the efficiency of extracting data based on the fuzzy intervals proposed above.
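
The two statistical quantities named in the list above, the coefficient of variation of within-cluster distances and a fuzzy interval built from the mean and standard deviation, can be illustrated with a minimal sketch. This is not the authors' EMC algorithm: the preview does not give the exact formulas, so the function names, the use of Euclidean distance, the population standard deviation, and the width parameter `k` are all assumptions made for illustration.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev


def coefficient_of_variation(points):
    """Return std / mean of the pairwise Euclidean distances in one cluster.

    Assumption: distances are Euclidean and the population standard
    deviation is used; the source does not specify either choice.
    """
    distances = [dist(p, q) for p, q in combinations(points, 2)]
    if not distances:
        return 0.0  # a cluster with fewer than two points has no spread
    avg = mean(distances)
    return pstdev(distances) / avg if avg > 0 else 0.0


def fuzzy_interval(values, k=1.0):
    """Return (lower, upper) = (mean - k * std, mean + k * std).

    Assumption: `k` is a hypothetical width parameter; the source only
    states that the mean and standard deviation define the bounds.
    """
    mu = mean(values)
    sigma = pstdev(values)
    return (mu - k * sigma, mu + k * sigma)


if __name__ == "__main__":
    # Hypothetical cluster of 2-D points and one numeric attribute.
    cluster = [(0, 0), (0, 2), (2, 0), (2, 2)]
    attribute_values = [2, 4, 4, 4, 5, 5, 7, 9]
    print(coefficient_of_variation(cluster))
    print(fuzzy_interval(attribute_values))
```

A low coefficient of variation indicates that the elements of a cluster are spaced evenly, while a larger `k` widens the interval and admits more objects into a fuzzy query result.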

The paper is organized as follows: part 2 presents clustering optimization and fuzzy foundations, part 3 presents a query optimization approach based on equivalent transformations together with a proposed heuristic algorithm, and part 4 states the conclusion.
