Clustering Hybrid Data Using a Neighborhood Rough Set Based Algorithm and Expounding its Application

Akarsh Goyal, Rahul Chowdhury

Source Title: International Journal of Fuzzy System Applications (IJFSA) 8(4)

DOI: 10.4018/IJFSA.2019100105

This article was retracted

Abstract

In recent times, a large number of clustering algorithms have been developed whose main function is to group objects that have almost the same features. The presence of categorical data values, however, makes these algorithms difficult to apply. In addition, some algorithms that can handle categorical data cannot process uncertainty in the values and therefore have stability issues. Handling categorical data together with uncertainty has thus become necessary. To this end, the MMR algorithm, based on basic rough set theory, was developed in 2007. MMeR, proposed in 2009, surpassed the results of MMR in handling categorical data but cannot be used robustly for hybrid data. In this article, the authors generalize the MMeR algorithm with neighborhood relations, turning it into a neighborhood rough set model that this article calls MMeNR (Min Mean Neighborhood Roughness), which handles heterogeneous data. The authors have also extended the MMeNR method to make it suitable for applications such as geospatial data analysis and epidemiology.

Introduction

We are living in a world full of data generated from several sources. Data describe the characteristics of living species, depict the properties of a natural phenomenon, summarize the results of a scientific experiment, and record the dynamics of a running machinery system. More importantly, data provide a basis for further analysis, reasoning, decisions, and ultimately, for the understanding of all kinds of objects and phenomena. Clustering is one of the most important data analysis activities, which helps to classify or group data having similar properties into a set of categories or clusters. It has been observed in (Anderberg, 1973; Everitt et al., 2001) that classification is one of the most primitive activities of human beings and plays an important and indispensable role in their long history. In order to learn a new object or understand a new phenomenon, people always try to identify descriptive features and then compare these features with those of known objects or phenomena, based on their similarity or dissimilarity, generalized as proximity, according to some standards or rules. Indeed, naming and classifying are essentially synonymous, according to Everitt et al. (2001). With such classification information at hand, we can infer the properties of a specific object based on the category to which it belongs.

Clustering (Huang, 1998) segments large hybrid data sets into small subsets that can be easily managed and analysed, and it finds the groupings that come naturally to the objects. Many areas make use of clustering techniques. For instance, Wu et al. used clustering to handle the complexity of gene data. Jiang et al. developed clustering techniques for the analysis of gene expression data (Jiang, Tang, & Zhang, 2004). Wong et al. applied clustering to positron emission tomography (PET), a nuclear medical imaging method, in order to segment tissues (Wong, Feng, Meikle, & Fulham, 2002). In 1989, cluster analysis was used to segment radar signals while scanning land and marine objects (Haimov et al., 1989). Large-scale research and development planning using cluster analysis was developed in (Mathieu & Gibson, 1993).

These techniques mostly handle only numerical datasets. Hence, they cannot be used for data sets whose domains are categorical (Gibson, Kleinberg, & Raghavan, 2000; Guha, Rastogi, & Shim, 2000). Earlier works in the field of clustering developed algorithms that could only handle numerical data (Dempster, Laird, & Rubin, 1977), because it was very easy to formulate similarity functions between numerical values. Categorical data are more difficult because their features are multi-valued. Similarity appears both as values that are the same within a given attribute and as objects that are similar to one another, so it has to be examined along both the rows and the columns of the data.
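The contrast drawn above between numerical and categorical attributes can be made concrete with a small sketch. This is not the MMeNR method of the article, whose details are not given in this preview. It is only a generic illustration of a hybrid dissimilarity, assuming simple matching for categorical attributes and a range-normalized absolute difference for numerical ones. The function name `hybrid_distance` and the `numeric_ranges` parameter are hypothetical.

```python
from typing import Optional, Sequence, Union

Value = Union[float, str]


def hybrid_distance(
    a: Sequence[Value],
    b: Sequence[Value],
    numeric_ranges: Sequence[Optional[float]],
) -> float:
    """Average per-attribute dissimilarity between two hybrid objects.

    numeric_ranges[i] is the (max - min) range of attribute i if it is
    numerical, or None if attribute i is categorical (an assumption of
    this sketch, not something taken from the article).
    """
    if not a or len(a) != len(b) or len(a) != len(numeric_ranges):
        raise ValueError("objects and ranges must be non-empty and equal in length")

    total = 0.0
    for x, y, value_range in zip(a, b, numeric_ranges):
        if value_range is None:
            # Categorical attribute: simple matching (0 if equal, 1 otherwise).
            total += 0.0 if x == y else 1.0
        elif value_range > 0:
            # Numerical attribute: absolute difference scaled by the range.
            total += abs(float(x) - float(y)) / value_range
        # A numerical attribute with zero range contributes nothing.
    return total / len(a)


# Two attributes: a numerical one with range 4.0 and a categorical one.
ranges = (4.0, None)
same_colour = hybrid_distance((1.0, "red"), (3.0, "red"), ranges)
same_number = hybrid_distance((1.0, "red"), (1.0, "blue"), ranges)
```

In this sketch the numerical attribute contributes a graded difference, whereas the categorical attribute can only report whether two values match, which is one reason similarity functions for categorical data are harder to formulate.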

Some of the earliest clustering methods for categorical datasets are due to (Dempster et al., 1977; Guha et al., 2000; Gibson et al., 2000). But these methods are not capable of handling uncertainty in data. Thus, these algorithms have stability issues, which render them ineffective for real-world databases that have uncertainty inherent in them.
