Scaled Fuzzy Graph for Cluster Analysis in DNA Sequence of Olfactory Receptors

Satya Ranjan Dash (School of Computer Application, KIIT University, Bhubaneswar, India), Satchidananda Dehuri (Department of Systems Engineering, Ajou University, Suwon, Korea) and Uma Kant Sahoo (School of Computer Application, KIIT University, Bhubaneswar, India)
DOI: 10.4018/jhisi.2013010104

Olfactory receptors (ORs) are responsible for the recognition of odor molecules. The deoxyribonucleic acid (DNA) sequences of these receptors are severely affected by local mutations. Therefore, to study the changes between affected and non-affected ORs, the authors use an unsupervised learning (clustering) algorithm. In this paper, they use a scaled fuzzy graph model for clustering to study the changes before and after local mutation in the DNA sequences of ORs. Their simulation study at the fractional dimensional level confirms its accuracy.

1. Introduction

Knowledge Discovery in Databases (KDD) is the non-trivial process of identifying valid, novel, potentially useful, and ultimately understandable patterns in data (Cao, 2012). Data mining algorithms find patterns in large amounts of data by fitting appropriate models, which are not necessarily statistical models. In general, data mining extracts information from a data set and converts it into an understandable structure for further use. The key term is discovery, normally defined as “extracting something new”. Discovered patterns should hold on fresh data with some degree of confidence, that is, they should generalize to other data in future use. Important data mining tasks include association rule mining, classification, clustering, and regression.

On the other hand, life science research has witnessed a paradigm shift over the last decade toward a data-rich environment (Frawley, Piatetsky-Shapiro, & Matheus, 1992). The ongoing influx of these data, the inherent uncertainties in data collection processes, and the gap between data collection and knowledge curation have created ample scope for data mining researchers. In fact, most recent research in life-science disciplines such as personalized genomics, functional genomics, proteomics, structural genomics, and DNA sequence analysis is data driven, with knowledge discovery and data mining processes playing increasingly important roles.

While tremendous progress has been made over the years, many of the fundamental problems in bioinformatics, such as protein structure prediction, gene–environment interaction, and regulatory pathway mapping are still open. Data mining will continue to play an essential role in understanding these fundamental problems and in the development of novel therapeutic/diagnostic solutions in post-genome medicine.

The olfactory system (Firestein, 2001) has a notable capability to discriminate among a wide range of odor molecules. In humans, smell is often considered an aesthetic sense, in contrast to most other species, which rely on olfaction to detect food, predators, and mates. Terrestrial animals, including humans, smell air-borne molecules, whereas aquatic animals smell water-soluble molecules with low volatility, such as amino acids. Humans are thought to have a poor olfactory ability compared with other animals such as dogs or rodents, and yet they can perceive a vast number of volatile chemicals. Of the millions of volatile molecular species that have been catalogued by chemists, hundreds of thousands of distinct odors can be detected by the human nose.

Odorants, typically small organic molecules of less than 400 Da, can vary in size, shape, functional groups and charge (Sosinsky, Glusman, & Lancet, 2000). They include a set of various alcohols, aliphatic acids, aldehydes, ketones and esters; chemicals with aromatic, alicyclic, polycyclic or heterocyclic ring structures; and innumerable substituted chemicals of each of these types, as well as combinations of them.

However, subtle differences in the structure of an odorant, even between two enantiomers, can lead to pronounced modifications in odor quality. In this paper, our focus is on studying the changes in ORs before and after local mutations: differences in the clustering results obtained on the DNA sequences of ORs are used to detect these changes.
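To make the idea of comparing a sequence before and after a local mutation concrete, the following minimal sketch computes a similarity score between a toy sequence and a point-mutated copy. It is an illustration only, not the paper's method: the k-mer profile, the cosine similarity measure, the helper names (`kmer_profile`, `cosine_similarity`, `point_mutate`), and the toy sequence itself are all assumptions introduced here.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq: str, k: int = 3) -> Counter:
    """Count overlapping k-mers in a DNA sequence (hypothetical helper)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two k-mer count profiles, in [0, 1]."""
    dot = sum(a[key] * b[key] for key in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def point_mutate(seq: str, position: int, base: str) -> str:
    """Return a copy of seq with one base substituted at the given index."""
    return seq[:position] + base + seq[position + 1:]


# Toy sequence made up for illustration; it is not a real OR gene.
original = "ATGGCCTTAGGCATCCTGAACGTTAGC"
mutated = point_mutate(original, 10, "T")  # G -> T at index 10

similarity = cosine_similarity(kmer_profile(original), kmer_profile(mutated))
```

A pairwise score of this kind, computed over a set of sequences, could serve as the edge membership of a fuzzy graph. Whether the authors build their graph this way is not stated in this excerpt.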

Clustering is a widely used knowledge discovery technique. It helps to uncover previously unknown structures in data. The clustering of huge data sets has attracted much attention in recent years; however, it remains a difficult task, since many existing algorithms scale poorly with the data size and the number of dimensions that describe the points. In this paper, we present new clustering techniques, based on the fractal properties of the data sets, using fuzzy graphs and their cohesive matrices.

Clustering has been a fundamental problem in areas like bioinformatics (Jiang, Jang, & Zhang, 2004), data mining, pattern recognition (Jain, Murty, & Flynn, 1999), and image analysis. The clustering techniques used in many applications are dominated by either distance-based or connectivity-based clustering. A few similar algorithms were used in Yeh and Bang (1975). However, another popular category, clustering based on graph theory, is also attracting many researchers for applications in bioinformatics.
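As a rough illustration of connectivity-based clustering on a fuzzy graph, the sketch below takes a symmetric matrix of edge memberships in [0, 1], computes its max-min transitive closure, and reads off clusters at a chosen threshold (a λ-cut). This is a generic textbook construction, not the scaled fuzzy graph model of this paper. The function names and the example matrix are hypothetical.

```python
def max_min_closure(mu):
    """Max-min transitive closure of a symmetric fuzzy relation.

    mu[i][j] in [0, 1] is the edge membership between vertices i and j.
    The result holds, for each pair, the strength of the strongest path,
    where a path is as strong as its weakest edge.
    """
    n = len(mu)
    closure = [list(row) for row in mu]
    for i in range(n):
        closure[i][i] = 1.0  # every vertex is fully connected to itself
    # Floyd-Warshall style update with (max, min) in place of (min, +).
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = min(closure[i][k], closure[k][j])
                if via_k > closure[i][j]:
                    closure[i][j] = via_k
    return closure


def lambda_cut_clusters(mu, lam):
    """Group vertices whose closure membership is at least lam."""
    closure = max_min_closure(mu)
    n = len(mu)
    assigned = [False] * n
    clusters = []
    for i in range(n):
        if assigned[i]:
            continue
        # The lam-cut of the closure is an equivalence relation, so this
        # row lists exactly the members of vertex i's cluster.
        members = [j for j in range(n) if closure[i][j] >= lam]
        for j in members:
            assigned[j] = True
        clusters.append(members)
    return clusters


# Hypothetical edge memberships among four sequences.
mu = [
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.1],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.1, 0.8, 1.0],
]
clusters = lambda_cut_clusters(mu, 0.5)  # [[0, 1], [2, 3]]
```

Raising the threshold splits clusters and lowering it merges them, so sweeping λ yields a hierarchy of partitions. Comparing the partitions obtained before and after a mutation is one way the change in clustering results could be observed.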
