Clustering and Bayesian Networks

Bhanu Chander (Pondicherry University, India)
Copyright: © 2020 | Pages: 24
DOI: 10.4018/978-1-7998-0106-1.ch004


This chapter presents an outline of the clustering and Bayesian schemes used in the data mining and machine learning communities. Organizing data into sensible groups is one of the most fundamental modes of understanding and learning. A cluster is a set of entities that are alike, while entities from different clusters are not alike. Representing data by fewer clusters inevitably loses some fine detail but achieves simplification. Clustering has no training stage and is mostly used when the classes are not known in advance. The Bayesian network is a frequently used and well-regarded classification method. In general, a Bayesian network is a graphical probabilistic model consisting of a set of interconnected nodes, where each node represents a variable and each link between nodes represents a causal relationship between those variables. Belief networks are graphical models that represent knowledge by propagating probabilistic information across a variety of hypotheses.
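As a rough illustration of the node-and-link description above, the following minimal sketch builds a three-variable network by hand and answers one query by enumeration. The network (Rain and Sprinkler both pointing to WetGrass), all variable names, and all probability values are hypothetical and chosen only for illustration; they do not come from the chapter.

```python
from itertools import product

# Hypothetical network: Rain -> WetGrass <- Sprinkler.
# All probabilities below are made-up illustration values.
P_RAIN = {True: 0.2, False: 0.8}
P_SPRINKLER = {True: 0.1, False: 0.9}

# P(WetGrass=True | Rain, Sprinkler), indexed by (rain, sprinkler).
P_WET_GIVEN = {
    (True, True): 0.99,
    (True, False): 0.90,
    (False, True): 0.80,
    (False, False): 0.05,
}


def joint(rain, sprinkler, wet):
    """Joint probability from the network factorization
    P(R, S, W) = P(R) * P(S) * P(W | R, S)."""
    p_wet = P_WET_GIVEN[(rain, sprinkler)]
    p_w = p_wet if wet else 1.0 - p_wet
    return P_RAIN[rain] * P_SPRINKLER[sprinkler] * p_w


def posterior_rain_given_wet():
    """P(Rain=True | WetGrass=True), obtained by summing out Sprinkler."""
    numerator = sum(joint(True, s, True) for s in (True, False))
    evidence = sum(
        joint(r, s, True) for r, s in product((True, False), repeat=2)
    )
    return numerator / evidence


if __name__ == "__main__":
    print(f"P(Rain | WetGrass) = {posterior_rain_given_wet():.3f}")
```

Each node stores only a probability table conditioned on its parents, and the links determine how those tables multiply into the joint distribution. Observing wet grass raises the probability of rain above its prior value of 0.2, which is the kind of probabilistic information transfer the abstract refers to.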
Chapter Preview


Clustering is a classification task defined on a finite set of objects. It is an unsupervised learning model; like other unsupervised machine learning models, it discovers structure in a collection of unlabelled data. Put simply, clustering is the process of organizing objects into groups whose members are similar in some way. Organizing data points into sensible groups is a fundamental task of understanding and learning. In 1970 Tryon described clustering as “understanding our real-world requires conceptualizing the similarities and difference among entities that compose it”. In 1974 Everitt defined a cluster as follows: “A cluster is an aggregation of points in the test space such that the distance between any two points in the cluster is less than the distance between any point in the cluster and any point not in it”, or “Clusters may be described as connected regions of a multi-dimensional space containing a relatively high density of points, separated from other such regions by a region containing a relatively low density of points”. Under these definitions, the objects to be clustered are represented as points in the measurement space. A cluster plotted on paper is easy to recognize, but it is not obvious how it was formed, because objects are grouped into clusters with particular purposes or measurement metrics in mind. It is easy to give a formal definition of a cluster, but difficult to give an operational one. The shapes and sizes of clusters differ depending on the method used to reveal them, and cluster membership may also change over time.
The number of clusters depends on the resolution at which we analyze the data (Rokash, 2005; Pavel et al., 2002; Satya et al., 2013; Kaur et al., 2014; Suman et al., 2014; Mehrotra et al., 2017; Anil K. et al., 1990).
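The dependence on resolution, and the distance-based definition quoted above, can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a simple distance-threshold grouping (single linkage), which is one of many possible clustering schemes and is not claimed to be the chapter's method. The function names and the data points are hypothetical.

```python
from math import dist


def threshold_clusters(points, threshold):
    """Group points into connected components, linking any two points
    whose Euclidean distance is at most `threshold` (single linkage)."""
    parent = list(range(len(points)))

    def find(i):
        # Follow parent links to the component root, shortening the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if dist(points[i], points[j]) <= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return list(groups.values())


def satisfies_distance_definition(clusters):
    """True if, for every cluster, each within-cluster distance is smaller
    than each distance from a member to a point outside the cluster."""
    for k, cluster in enumerate(clusters):
        outside = [p for m, c in enumerate(clusters) if m != k for p in c]
        intra = [dist(a, b) for i, a in enumerate(cluster) for b in cluster[i + 1:]]
        inter = [dist(a, b) for a in cluster for b in outside]
        if intra and inter and max(intra) >= min(inter):
            return False
    return True


# Made-up 2-D data: two nearby pairs of points and one distant pair.
POINTS = [(0.0, 0.0), (0.0, 1.0), (3.0, 0.0), (3.0, 1.0), (20.0, 0.0), (20.0, 1.0)]

if __name__ == "__main__":
    for t in (0.5, 1.5, 5.0, 25.0):
        clusters = threshold_clusters(POINTS, t)
        print(t, len(clusters), satisfies_distance_definition(clusters))
```

With these made-up points, the same data yield 6, 3, 2, and finally 1 cluster as the threshold grows from 0.5 to 25.0, so the "right" number of clusters depends on the scale at which the data are examined.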
