A Combinational Fuzzy Clustering Approach for Microarray Spot Segmentation

Ong Pauline (Universiti Tun Hussein Onn Malaysia (UTHM), Malaysia) and Zarita Zainuddin (Universiti Sains Malaysia (USM), Malaysia)
Copyright: © 2017 |Pages: 19
DOI: 10.4018/978-1-5225-1776-4.ch012

Because microarray experiments are imperfect, spots with various artifacts are often found in microarray images. A more rigorous spot recognition approach is therefore crucial to successful image analysis. In this paper, a novel hybrid algorithm is proposed: a wavelet approach is applied simultaneously with intensity-based shape detection to locate the contours of the microarray spots. The proposed algorithm segmented all the imperfect spots accurately. A performance comparison with the classical methods, i.e., fixed circle, adaptive circle, adaptive shape, and histogram segmentation, showed that the proposed hybrid approach outperformed these methods.

A microarray, consisting of a glass slide that contains samples with thousands of genes arranged in a rectangular grid, makes it possible to monitor the expression levels of thousands of genes simultaneously (Zainuddin & Ong, 2011). This technique has emerged as a cutting-edge technology in bioinformatics. It is particularly important in distinguishing between the subtypes of heterogeneous tumors, whose cells express different genes. By studying and contrasting the gene expression profiles from a microarray experiment, information about the types and amounts of mRNA present in the tumors can be obtained. This variation in expression makes it possible to discriminate among the tumor subtypes. Gaining insight into the cellular mechanism and determining the pathway of the biological reaction thus become feasible.

The typical microarray experiment runs as follows (Amaratunga & Cabrera, 2004):

  1. Microarray Experiment Preparation: The mRNA extracted from the control samples and the experimental samples is labeled with the fluorescent dyes Cy3 (green) and Cy5 (red), respectively. Both labeled samples are mixed and applied to the microarray slide, where hybridization takes place based on base-pair complementarity.

  2. Microarray Image Scanning: After hybridization, the slides are scanned by a laser, which excites the fluorescent dye in the labeled samples. The emitted light is captured by a scanner. Samples with more bound, labeled probes fluoresce more intensely. A gene expression matrix, in which each row corresponds to a single gene and each column to a single sample, is obtained by using image processing software to quantify the fluorescence intensities.

  3. Microarray Data Normalization: The microarray data are transformed and normalized to improve the comparability of gene expression values between different microarrays within an experiment.

  4. Gene Selection: The gene expression matrix contains an overwhelming number of genes relative to the number of samples. Most of these genes are probably irrelevant to discriminating between the subclasses of heterogeneous cancers. To overcome the problem of over-fitting, various statistical and clustering approaches have been proposed to select the most discriminative genes as the input features to the classifier (Algamal & Lee, 2015a, 2015b; Park, Jung, Lee, & Lim, 2015; Ray, Ganivada, & Pal, 2015; Zainuddin & Ong, 2011).

  5. Classification: Different classifiers, based on machine learning, decision trees, and statistical approaches, have been developed to separate the subclasses of heterogeneous cancers (Algamal & Lee, 2015a; Garro, Rodríguez, & Vázquez, 2016; Tsai, Wang, Lee, Lin, & Chiu, 2015; Zainuddin & Ong, 2011).
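
The scanning and normalization steps in the list above can be illustrated with a minimal sketch. It assumes that each spot is summarized as a log2 ratio of red-channel to green-channel intensity and that normalization is a global median centering; the text does not name a specific normalization method, so this choice, the function names, and the sample intensities are all hypothetical.

```python
import math
from statistics import median

def log_ratios(cy5, cy3):
    """Return log2(cy5 / cy3) for each spot; inputs must be positive intensities."""
    return [math.log2(r / g) for r, g in zip(cy5, cy3)]

def median_center(values):
    """Shift values so that their median is zero (assumed normalization scheme)."""
    m = median(values)
    return [v - m for v in values]

# Hypothetical intensities for four spots on one array (not data from the chapter).
cy5 = [800.0, 200.0, 400.0, 1600.0]  # experimental sample, red channel
cy3 = [400.0, 400.0, 400.0, 400.0]   # control sample, green channel

raw = log_ratios(cy5, cy3)       # one column of the gene expression matrix
normalized = median_center(raw)  # same column after median centering
```

Centering each array on its own median is only one of several possible ways to make expression values comparable across arrays; other schemes would replace `median_center` without changing the rest of the sketch.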

It can thus be seen that both the preparation of the microarray experiment and the data analysis involve many interdependent phases. Each processing stage relies heavily on the precision of the earlier steps. Erroneous measurements and vagueness at any of these phases are undesirable because they may degrade the quality of a microarray experiment. In particular, precision in finding the representative gene expression values for the spots in a microarray image is the most crucial, because it is difficult to compensate at later stages for spot intensities miscalculated during image analysis. Therefore, the primary task in microarray image processing is to extract the information from the image, including the spot size and shape, image orientation, spot intensities, and distance between the spots. In this study, segmentation of the gene spots is the main concern. An intuitive means of extracting representative quantifiers for the regular spots as well as the faulty spots is proposed.
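
To make the notion of a spot intensity concrete, the following is a minimal sketch of fixed circle segmentation, one of the classical baselines named in the abstract, and not the hybrid algorithm proposed in this chapter. It assumes the spot center and radius are already known and that the image patch is a small grid of pixel intensities; the function name and the patch values are hypothetical.

```python
from statistics import mean, median

def fixed_circle_quantify(image, cx, cy, radius):
    """Return (mean foreground, median background) for one spot patch.

    Pixels within `radius` of (cx, cy) count as foreground; all other
    pixels in the patch count as local background.
    """
    fg, bg = [], []
    for y, row in enumerate(image):
        for x, value in enumerate(row):
            inside = (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2
            (fg if inside else bg).append(value)
    return mean(fg), median(bg)

# Hypothetical 5x5 patch containing one roughly circular spot.
patch = [
    [10, 10, 10, 10, 10],
    [10, 10, 90, 10, 10],
    [10, 90, 90, 90, 10],
    [10, 10, 90, 10, 10],
    [10, 10, 10, 10, 10],
]
foreground, background = fixed_circle_quantify(patch, cx=2, cy=2, radius=1)
signal = foreground - background  # background-corrected spot intensity
```

Because the circle is fixed in advance, a spot that is irregular, oversized, or off-center would have part of its signal counted as background, which is the kind of miscalculation the paragraph above warns about.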

The objectives of this study can be summarized as follows:
