Kevin E. Voges (University of Canterbury, New Zealand)

Copyright: © 2009
Pages: 5

DOI: 10.4018/978-1-60566-026-4.ch091

Chapter Preview

In the *k*-means approach, the number of clusters (*k*) in each partition of the data set is decided *prior to* the analysis, and data points are randomly selected as the initial estimates of the cluster centers (referred to as centroids). The remaining data points are assigned to the closest centroid on the basis of the distance between them, usually using a Euclidean distance measure. The aim is to obtain maximal homogeneity within clusters (i.e., members of the same cluster are most similar to each other) and maximal heterogeneity between clusters (i.e., members of different clusters are most dissimilar to each other).
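The procedure described above can be illustrated with a minimal Python sketch. It is not the chapter's own implementation: the function name `k_means`, its parameters, and the repeated recomputation of centroids (the common Lloyd-style iteration) are assumptions added for illustration, since the text only describes the random seeding and the distance-based assignment.

```python
import math
import random


def k_means(points, k, max_iter=100, seed=0):
    """Illustrative k-means on a list of equal-length numeric tuples.

    Hypothetical helper: the name, arguments, and stopping rule are
    assumptions, not taken from the chapter.
    """
    rng = random.Random(seed)
    # k is fixed in advance; k data points are randomly chosen as centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its closest centroid (Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Assumed update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(data, k=2)
```

On this toy data the two well-separated groups should end up with different labels. In general, the result can depend on which points are drawn as initial centroids.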

*K*-means cluster analysis has been shown to be quite robust (Punj & Stewart, 1983). Despite this, the approach suffers from many of the problems associated with all traditional multivariate statistical analysis methods. These methods were developed for use with variables that are normally distributed and have an equal variance-covariance matrix in all groups. In most realistic data sets, neither of these conditions necessarily holds.

Rough Set: The concept of rough, or approximation, sets was introduced by Pawlak and is based on the single assumption that information is associated with every object in an information system. This information is expressed through attributes that describe the objects; objects that cannot be distinguished on the basis of a selected attribute are referred to as indiscernible. A rough set is defined by two sets, the lower approximation and the upper approximation.

K-Means Clustering: A cluster analysis technique in which clusters are formed by randomly selecting k data points as initial seeds or centroids, and the remaining data points are assigned to the closest cluster on the basis of the distance between the data point and the cluster centroid.

Cluster Analysis: A data analysis technique involving the grouping of objects into sub-groups or clusters so that objects in the same cluster are more similar to one another than they are to objects in other clusters.

Market Segmentation: A central concept in marketing theory and practice; involves identifying homogeneous sub-groups of buyers within a heterogeneous market. It is most commonly conducted using cluster analysis of the measured demographic or psychographic characteristics of consumers. The market is segmented by forming groups that are homogeneous with respect to these measured characteristics.
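To make the Rough Set entry above more concrete, the following Python sketch computes the two approximations for a small, made-up information system. The definition above does not spell out how the approximations are formed, so the sketch assumes the usual Pawlak definitions: the lower approximation collects the indiscernibility classes lying entirely inside the target set, and the upper approximation collects those that overlap it. The function name `approximations` and the example data are hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target`.

    objects: dict mapping object name -> dict of attribute values.
    attributes: the selected attributes used to compare objects.
    target: set of object names to approximate.
    Hypothetical helper, assuming the standard Pawlak definitions.
    """
    # Objects with identical values on the selected attributes are indiscernible.
    classes = defaultdict(set)
    for name, description in objects.items():
        classes[tuple(description[a] for a in attributes)].add(name)

    lower, upper = set(), set()
    for members in classes.values():
        if members <= target:   # class lies entirely inside the target
            lower |= members
        if members & target:    # class shares at least one object with it
            upper |= members
    return lower, upper


# Made-up information system: "a" and "b" are indiscernible on "color".
objects = {
    "a": {"color": "red"},
    "b": {"color": "red"},
    "c": {"color": "blue"},
    "d": {"color": "green"},
}
lower, upper = approximations(objects, ["color"], {"a", "c"})
# lower == {"c"}; upper == {"a", "b", "c"}
```

Here `a` cannot be told apart from `b`, which is outside the target set, so `a` belongs only to the upper approximation.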
