Kernel Clustering for Knowledge Discovery in Clinical Microarray Data Analysis
Nathalie L.M.M. Pochet (Katholieke Universiteit Leuven, Belgium), Fabian Ojeda (Katholieke Universiteit Leuven, Belgium and National Alliance of Christian Mutualities, Belgium), Frank De Smet (Katholieke Universiteit Leuven, Belgium), Tijl De Bie (Katholieke Universiteit Leuven, Belgium) and Johan A.K. Suykens (Katholieke Universiteit Leuven, Belgium)
Copyright: © 2007
Clustering techniques such as k-means and hierarchical clustering have been shown to be useful when applied to microarray data for the identification of clinical classes, for example, in oncology. This chapter discusses the application of nonlinear techniques such as kernel k-means and spectral clustering, which are based on kernel functions such as the linear and radial basis function (RBF) kernels. External validation techniques (e.g., the Rand index and the adjusted Rand index) can be applied directly to these methods to assess clustering results. Internal validation methods such as the global silhouette index, the distortion score, and the Calinski-Harabasz index (F-statistic), which have commonly been used in the input space, are reformulated in this chapter for use in a kernel-induced feature space.
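As an illustration of the ideas summarized above, the following is a minimal Python sketch, not the chapter's own implementation. It assumes NumPy, and all function names (`rbf_kernel`, `feature_space_sq_dist`, `kernel_kmeans`, `rand_index`), the toy data, and the kernel width `sigma` are hypothetical choices made for this example. It shows how the squared distance between a sample and a cluster centroid in the kernel-induced feature space can be computed from kernel evaluations alone, how that distance drives kernel k-means, and how the (unadjusted) Rand index compares a clustering with known labels.

```python
import numpy as np

def rbf_kernel(X, sigma=1.0):
    """RBF kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def feature_space_sq_dist(K, labels, k):
    """Squared feature-space distance of every sample to every cluster centroid.

    Uses only kernel values:
    ||phi(x_i) - m_c||^2 = K_ii - 2 * mean_j K_ij + mean_{j,l} K_jl,
    where j and l range over the members of cluster c.
    """
    n = K.shape[0]
    D = np.full((n, k), np.inf)  # empty clusters stay at infinite distance
    for c in range(k):
        idx = np.flatnonzero(labels == c)
        if idx.size == 0:
            continue
        D[:, c] = (np.diag(K)
                   - 2.0 * K[:, idx].mean(axis=1)
                   + K[np.ix_(idx, idx)].mean())
    return D

def kernel_kmeans(K, init_labels, k, max_iter=100):
    """Reassign samples to the nearest feature-space centroid until stable."""
    labels = np.asarray(init_labels).copy()
    for _ in range(max_iter):
        new = feature_space_sq_dist(K, labels, k).argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels

def rand_index(a, b):
    """Fraction of sample pairs on which two partitions agree (unadjusted)."""
    a, b = np.asarray(a), np.asarray(b)
    same_a = a[:, None] == a[None, :]
    same_b = b[:, None] == b[None, :]
    iu = np.triu_indices(len(a), k=1)  # each unordered pair once
    return float(np.mean(same_a[iu] == same_b[iu]))

# Toy data (hypothetical): two well-separated groups of two samples each.
X = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]])
true_labels = np.array([0, 0, 1, 1])

K = rbf_kernel(X, sigma=1.0)
found = kernel_kmeans(K, init_labels=[0, 0, 0, 1], k=2)
print("labels:", found, "Rand index:", rand_index(true_labels, found))
```

With a linear kernel (`K = X @ X.T`), `feature_space_sq_dist` reduces to the ordinary squared Euclidean distance to the cluster mean, which is one way to check the formula. The same kernel-only distance is the kind of quantity on which feature-space versions of internal indices such as the distortion score can be built.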