
Cluster Analysis Using Rough Clustering and K-Means Clustering

Kevin E. Voges

Source Title: Encyclopedia of Information Science and Technology, Third Edition

DOI: 10.4018/978-1-4666-5888-2.ch160

Introduction

Cluster analysis is a fundamental data reduction technique used in both the physical and social sciences. The extension of Rough Sets theory into cluster analysis through the techniques of Rough Clustering provides an important and potentially useful addition to the range of cluster analysis techniques available to the manager and the researcher.

Cluster analysis is defined as the grouping of “individuals or objects into clusters so that objects in the same cluster are more similar to one another than they are to objects in other clusters” (Hair, Black, Babin, & Anderson, 2009). There are a number of comprehensive introductions to cluster analysis (Abonyi & Feil, 2007; Arabie, Hubert, & De Soete, 1996; Cramer, 2003; Everitt, Landau, Leese, & Stahl, 2011; Gan, Ma, & Wu, 2007). Clustering techniques are often classified as hierarchical or nonhierarchical (Hair et al., 2009), and the most commonly used nonhierarchical technique is the k-means approach developed by MacQueen (1967). Over the past few decades, techniques based on developments in computational intelligence have also been used as clustering algorithms. For example, the theory of fuzzy sets developed by Zadeh (1965), who introduced the concept of partial set membership, has been applied to clustering (Abonyi & Feil, 2007; Dumitrescu, Lazzerini, & Jain, 2000).

Fuzzy clustering has generated an extensive literature, too broad to review thoroughly here. However, two extensions will be briefly considered to demonstrate the flexibility of the technique. Atanassov (1986) extended Zadeh’s fuzzy set to a general form called an intuitionistic fuzzy set (IFS), which has been found to be more useful in dealing with uncertainty than a standard fuzzy set. Xu, Chen, and Wu (2008) report an application of this IFS concept to clustering. In a second extension, Dunn (1973) and Bezdek (1981) proposed the Fuzzy C-means technique (FCM), which is one of the most commonly used objective function-based clustering techniques. Instead of assigning each object to a single cluster, class membership is relaxed by computing membership grades in the unit interval [0, 1]. As will be seen below, this has similarities to clustering using rough sets. Izakian and Pedrycz (2014) developed an extension of FCM in which the distance function is given adjustable weight parameters, quantifying the impact coming from blocks of features rather than from individual features. Their work also illustrates the increasing use of hybridization techniques (explored later in this article), using particle swarm optimization to optimize the weights. Genetic algorithms have also been applied to clustering tasks (Maulik, Bandyopadhyay, & Mukhopadhyay, 2011).
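The relaxed class membership used by FCM can be illustrated with a minimal sketch. It assumes the standard FCM membership formula with a fuzzifier parameter m (set to 2 here) and fixed, already known cluster centres; the function name `fcm_memberships` and the example coordinates are hypothetical and are not taken from the article.

```python
import math

def fcm_memberships(point, centroids, m=2.0):
    """Membership grades of one point in each cluster (assumed standard FCM rule)."""
    dists = [math.dist(point, c) for c in centroids]
    # A point lying exactly on one or more centres belongs fully to those centres.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0 for i in range(len(dists))]
    exponent = 2.0 / (m - 1.0)
    # u_i = 1 / sum_k (d_i / d_k)^(2 / (m - 1)); the grades sum to 1.
    return [1.0 / sum((d_i / d_k) ** exponent for d_k in dists) for d_i in dists]

# Hypothetical example: a point close to the first of two cluster centres.
grades = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print(grades)
```

In this example the point receives grades of about 0.9 and 0.1, so it belongs mostly, but not exclusively, to the nearer cluster. A full FCM run would alternate this step with recomputing the centres, which is omitted here.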

Another technique receiving considerable attention is the theory of rough sets (Pawlak, 1982), which has led to clustering algorithms referred to as rough clustering (do Prado, Engel, & Filho, 2002; Kumar, Krishna, Bapi, & De, 2007; Lingras & Peters, 2011; Parmar, Wu, & Blackhurst, 2007; Voges, Pope, & Brown, 2002).

This article provides brief introductions to k-means cluster analysis, rough sets theory, and rough clustering, and compares k-means clustering with rough clustering. The article shows that rough clustering provides a more flexible solution to the clustering problem, and can be conceptualized as extracting concepts from the data, rather than strictly delineated subgroupings (Pawlak, 1991). Traditional clustering methods generate extensional descriptions of groups (i.e., which objects are members of each cluster), whereas clustering techniques based on rough sets theory generate intensional descriptions (i.e., what the main characteristics of each cluster are) (do Prado et al., 2002). These different goals suggest that both k-means clustering and rough clustering have their place in the data analyst’s and the information manager’s toolbox.

Key Terms in this Chapter

Market Segmentation: Market segmentation is a central concept in marketing theory and practice, and involves identifying homogeneous sub-groups of buyers within a heterogeneous market. It is most commonly conducted using cluster analysis of the measured demographic or psychographic characteristics of consumers. The market is segmented by forming groups that are homogeneous with respect to these measured characteristics.

Cluster Analysis: A data analysis technique involving the grouping of objects into sub-groups or clusters so that objects in the same cluster are more similar to one another than they are to objects in other clusters.

K-Means Clustering: A cluster analysis technique in which clusters are formed by randomly selecting k data points as initial seeds or centroids, and the remaining data points are assigned to the closest cluster on the basis of the distance between the data point and the cluster centroid.
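The k-means definition above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: it assumes squared Euclidean distance and adds the usual iterative step of recomputing each centroid as the mean of its members, which the definition does not spell out. The function name `k_means` and the sample points are hypothetical.

```python
import random

def k_means(points, k, iterations=100, seed=0):
    """Cluster a list of equal-length numeric tuples into k groups."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k data points chosen at random as seeds
    assignment = None
    for _ in range(iterations):
        # Assign each point to the closest centroid (squared Euclidean distance).
        new_assignment = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        if new_assignment == assignment:
            break  # no point changed cluster, so the solution is stable
        assignment = new_assignment
        # Assumed refinement step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return centroids, assignment

# Hypothetical data: two well-separated groups of three points each.
data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = k_means(data, k=2)
print(labels)
```

Each object ends up in exactly one cluster, which is the strict partition that rough clustering relaxes. Because the seeds are random, the cluster numbering can differ between runs even when the grouping is the same.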

Rough Set: The concept of rough, or approximation, sets was introduced by Pawlak, and is based on the single assumption that information is associated with every object in an information system. This information is expressed through attributes that describe the objects, and objects that cannot be distinguished on the basis of a selected attribute are referred to as indiscernible. A rough set is defined by two sets, the lower approximation and the upper approximation.
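The two approximations can be made concrete with a short sketch. It assumes the standard rough-set construction: objects with identical values on the selected attributes form indiscernibility classes, the lower approximation collects the classes lying entirely inside the target set, and the upper approximation collects the classes that overlap it. The function name `approximations` and the small customer table are hypothetical.

```python
from collections import defaultdict

def approximations(table, attributes, target):
    """Return (lower, upper) approximations of `target` for the chosen attributes."""
    # Group objects that cannot be distinguished on the selected attributes.
    classes = defaultdict(set)
    for obj, row in table.items():
        classes[tuple(row[a] for a in attributes)].add(obj)
    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # every member is in the target: certainly belongs
            lower |= block
        if block & target:    # at least one member is in the target: possibly belongs
            upper |= block
    return lower, upper

# Hypothetical information system: four customers described by two attributes.
customers = {
    "c1": {"age": "young", "income": "low"},
    "c2": {"age": "young", "income": "low"},
    "c3": {"age": "old", "income": "high"},
    "c4": {"age": "old", "income": "low"},
}
buyers = {"c1", "c3"}
lower, upper = approximations(customers, ["age", "income"], buyers)
print(sorted(lower))  # ['c3']
print(sorted(upper))  # ['c1', 'c2', 'c3']
```

Here c1 and c2 are indiscernible, yet only c1 is a buyer, so both fall in the upper approximation but not in the lower one. The difference between the two sets is the region where membership is uncertain.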
