Cluster Analysis: A Statistical Approach for E-Governance for Better Policy Decisions

Pankaj Nagar (University of Rajasthan, India)
DOI: 10.4018/978-1-4666-5146-3.ch006


Cluster analysis, also known as grouping, clumping, or unsupervised classification, is one of the multivariate analysis techniques. It is useful in a wide range of problems: managerial decisions, psychology, the categorization of business organizations by performance so that separate policies can be framed for each cluster, the health sector, societal problems, and so on. Good governance requires the proper statistical tools applied together with ICT. Even today, however, statistical tools are rarely used in e-governance for policy development. This chapter discusses the use of cluster analysis to classify a large amount of data into sub-groups (known as clusters) that are homogeneous in a certain sense, and to analyze each sub-group separately to find solutions for each of them. The method is explained with the help of an illustration using the SPSS software.
Chapter Preview


Model building is one of the central issues in developing new policies based on quantitative facts and figures, so that an organization (in the private or public sector) can provide appropriate information to the public. The public, in turn, should be able to obtain useful information for societal development. In this connection, statistical tools and ICT (Information and Communication Technology) play an important role in resolving sociological, economic, and similar issues. E-commerce, geo-informatics, and e-governance are some of the important areas that depend mainly on subject knowledge and on ICT. At present, however, statistical tools are rarely applied in these areas.

According to the UN E-Government Survey 2012, India ranks well behind other countries in good governance. The importance of statistics is well known all over the world. To the best of my knowledge, there is no field in which statistical tools are not applied or applicable. Similarly, statistical modeling of the data (or databases) made available through e-governance can improve the system. In this light, a neologism, ‘Governometrics’ (Sharma & Nagar, 2012), will be a new dimension of good policy syntax for various governmental dynamics. Some applications of statistical tools are mentioned in that article. A Governometrician should start applying statistical tools with ICT for good governance.

When no assumptions have been developed and no information is available about the behavior of a large body of statistical facts and figures, the first problem facing a statistician is to classify the data into a number of groups based on one or more characteristics of the population.

In multivariate data analysis, a statistical classification technique is applicable when an investigator makes a number of measurements on the data of a problem or system and wants to classify each observation into one of several categories on the basis of these measurements (T.W. Anderson, 1972). In many situations, it can be assumed that there is a finite number of categories or sub-populations from which the individual observations may come, and that each sub-population is characterized by the values of these measurements. In fact, a statistical classification problem is a statistical decision problem, but the converse is not true. Classifying an observation into one of a finite number of groups is nothing but deciding the membership of that observation in a sub-population. The following are some of the areas where statistical classification techniques have already been applied for a better understanding of a system:

  • 1. Identification of whether a human skull is that of a male or a female.

  • 2. Identification of whether an excavated fossil belongs to an extinct species or to an existing one.

  • 3. Identification of whether a rare piece of art belongs to the Moghul era, the pre-Moghul era, or the post-Moghul era.

  • 4. Authorship identification of a piece of literary work, etc.

At present, statistical classification techniques combined with ICT tools are useful in all walks of analytical work. Statistical classification techniques are basically of the following two types:

  • 1. Supervised Classification

  • 2. Unsupervised Classification or Cluster Analysis.

In supervised classification, the researcher already knows the classes, and training data labeled with class membership are available to train or build a model. In unsupervised classification (cluster analysis), on the other hand, the researcher does not know what classes exist in the multi-dimensional data and aims to group the data into meaningful groups (known as clusters). In other words, clustering is the statistical operation of grouping objects (individuals or variables) into a limited number of groups known as clusters or segments.
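
The contrast between the two types can be illustrated with a small sketch. The code below is not the chapter's SPSS procedure. It is a minimal Python illustration under stated assumptions: the six data points are hypothetical (imagine two standardized indicators measured on six districts), the labels "low" and "high" are invented for the supervised case, and the function names are chosen here for illustration. The supervised part uses a simple nearest-centroid rule, and the unsupervised part uses a basic k-means loop with a naive deterministic start, which is only one of many clustering methods.

```python
import math

def mean_point(points):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def fit_supervised(points, labels):
    """Supervised case: class labels are known, so compute one centroid per class."""
    classes = sorted(set(labels))
    centroids = [
        mean_point([p for p, lab in zip(points, labels) if lab == c])
        for c in classes
    ]
    return classes, centroids

def k_means(points, k, max_iter=100):
    """Unsupervised case: no labels; alternate assignment and centroid update."""
    centers = list(points[:k])  # naive start (assumption): the first k points
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centers) for p in points]
        if new_assignment == assignment:  # no change, so the loop has converged
            break
        assignment = new_assignment
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous center
                centers[j] = mean_point(members)
    return assignment, centers

# Hypothetical data: two indicators for six districts (illustrative values only).
points = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (4.8, 5.2), (0.9, 1.1), (5.1, 4.9)]

# Supervised: labels are given in advance (hypothetical labels).
labels = ["low", "high", "low", "high", "low", "high"]
classes, centroids = fit_supervised(points, labels)
predicted = classes[nearest((4.5, 4.7), centroids)]  # classify a new observation
print(predicted)   # high

# Unsupervised: the same data without labels, grouped into k = 2 clusters.
assignment, centers = k_means(points, k=2)
print(assignment)  # [0, 1, 0, 1, 0, 1]
```

In the supervised case the group names exist before the analysis and the model only decides membership. In the unsupervised case the algorithm returns anonymous cluster numbers, and it is left to the analyst to interpret what each cluster means.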
