Article Preview
Top1. Introduction
Image segmentation is to simplify and/or change the representation of an image into something that is more meaningful and easier to analyze (Shapiro & Stockman, 2011). Pixels within a segment have high coherence with respect to certain features, such as color, texture, intensity, and spatial information, while pixels from different segments have significant difference in the same feature space or subspace of the space. Image segmentation has been widely applied in a wide spectrum of applications. For example, in the medical field, digital images of histological slides can be used to classify skin biopsies as either melanoma or nevi (Osborne, Gao, Chen, Andea & Zhang, 2011). Image segmentation as the first step is performed to recognize different tissues in the slide. Another example is the Content-based Image Retrieval (CBIR). Users can submit a query based on an uploaded image, or further filter the results, which are returned by the keywords searching, according to their contents. Image segmentation partitions an image into more meaningful contents as “visual keywords” to allow for more powerful queries.
Clustering (Everitt, 2012; Jain, Murty & Flynn, 1999) provides a good way for image segmentation since pixels within the same segment share some common characteristics. The most well-known and classical partitioning algorithm is K-means (Han & Kamber, 2005; MacKay, 2003). Given a set of data points X = (X1, X2,…, Xn), where each data point is a d-dimensional vector. The goal of K-means is to partition the n data points into K groups (K ≤ n) G = (G1, G2, …, GK) so as to minimize the within-cluster sum of squares (criterion function), namely , where mi is the mean of cluster Gi. First, it randomly selects K data points as the initial means or centers of K clusters. Then, the rest of data points are assigned to the most similar clusters, based on a similarity measure (e.g., Euclidean distance) between data points and cluster means. The new mean (center) of each cluster is updated based on assigned data points to the cluster. This process iterates until the criterion function converges or is smaller than a threshold. Most clustering algorithms in image segmentation domain work on the full feature space including single or multiple modalities. Spectral clustering (Ng, Jordan, & Weiss, 2001; Shi & Malik, 2000) starts from transforming the original data matrix into the eigenvectors of matrix, and then detects k well-separated clusters on the surface of the k-sphere. Hill-climbing K-means algorithm (HC) (Ohashi, Aghbari & Makinouchi, 2003) initially discretizes the full feature space (e.g., LUV or HSV color space) in order to find the local maxima by using the hill-climbing technique (Russell & Norvig, 2003). Each local maximum is treated as one of the initial cluster seeds which are the input for the later K-means clustering. On one hand, HC can automatically determine the number of clusters, and effectively generate the segmentation result. On the other hand, HC is a global method, so it cannot find clusters that are best represented in subspaces as shown in Figure 1.
Figure 1. Optimal segmentation results from certain subspace of HSV by using HC algorithm