Uncertainty in Concept Hierarchies for Generalization in Data Mining

Theresa Beaubouef, Frederick Petry

Source Title: Efficiency and Scalability Methods for Computational Intellect

DOI: 10.4018/978-1-4666-3942-3.ch003

Abstract

Attribute oriented induction is an approach used in data mining to provide summaries of data in a database by the process of generalization that can be used for knowledge discovery in the form of rules or patterns. This is accomplished through the use of a concept hierarchy. When uncertainty is involved in the development and use of the concept hierarchy, the theory behind the uncertainty models in use must first be established. This chapter focuses on providing the foundations for defining imprecise hierarchies and the generalization process with crisp and rough data and hierarchies. Scaling and efficiency issues here involve the problems of creation of appropriate concept hierarchies and the scaling of the generalization process to deal with large databases.

Introduction

The world abounds in data, and as technology advances, opportunities for collecting, storing, and using this data increase. The magnitude of such data, as well as its typical lack of organization, however, can prove daunting without some means of automatically generating useful information from it. The process of data mining has developed into a useful tool for discovering interesting patterns and relationships in data, and these techniques have benefited information systems and users in a wide variety of fields.

One of the more widely known uses of data mining is for marketing purposes. Often the goal is to predict customer behavior (Chopra, Bhanbri, & Krishan, 2011) or to target selected groups for advertising purposes. Managers can use information from data mining to determine strategies for maximizing results without investing in strategies determined to have lower impact on the bottom line.

In the healthcare industry, data mining can help patients obtain better and less expensive healthcare while providing better information for both healthcare providers and patients (Koh & Tan, 2005; Rafalsky, 2002). It can be used to evaluate treatment practices, help with customer relationship management, and detect fraud and insurance abuse. Data mining in healthcare can also alert providers and authorities to possible epidemics and bioterrorism threats (Piazza, 2002).

Data mining is also well established in a variety of scientific and engineering applications (Grossman, Kamath, Kegelmeyer, & Kumar, 2001). In spatial databases and geographic information systems, data contains positional information that often allows for the discovery of patterns involving spatial relationships (Miller & Han, 2001; Kopersky & Han, 1995). Spatial data mining has been used in numerous ways, including the study of demographics (Malerba, 2002).

With new technologies, Web usage, social media, and smart devices come additional opportunities for data mining applications. Radio frequency identification (RFID), for example, can generate huge volumes of data, and there is a great need for data mining techniques to assist with tracking, business processes, and organization (Kim, Kim, Jung, Kang, & Noh, 2009). Data mining techniques for Web usage have also been developed (Pohle & Spiliopoulou, 2002).

In data mining applications it is often the case that data or concepts can be generalized in an effort to discover useful patterns or rules in the data. This generalization must be done in some systematic and meaningful way. One approach is attribute oriented induction, which provides summaries of data in a database by generalization. Generalization is achieved by using a concept hierarchy: specific attribute values in a database tuple are replaced by more general values higher in the hierarchy. The resulting tuples may then be merged, ultimately producing a reduced number of tuples that represent a summarization of the data. This is related to, but more general than, the roll-up operation on a data cube. The result is data summarization, a grouping of data that transforms similar item sets, stored originally in a database at the low (primitive) level, into more abstract conceptual representations. When either the data or the generalization process incorporates uncertainty, however, there can be many ways to determine how data is generalized.
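
The crisp (certain) case of the generalization step described above can be illustrated with a minimal Python sketch. It is not the chapter's own implementation: the hierarchy contents, the sample tuples, and the function names `generalize_value` and `generalize_relation` are all hypothetical, chosen only to show attribute values being replaced by their parent concepts and the resulting identical tuples being merged with a count.

```python
from collections import Counter

# Hypothetical concept hierarchy: each value maps to its more general parent.
# Values with no entry (here "Gulf Coast") are treated as roots.
CITY_HIERARCHY = {
    "New Orleans": "Louisiana",
    "Baton Rouge": "Louisiana",
    "Houston": "Texas",
    "Dallas": "Texas",
    "Louisiana": "Gulf Coast",
    "Texas": "Gulf Coast",
}


def generalize_value(value, hierarchy, levels=1):
    """Climb up to `levels` steps in the hierarchy, stopping early at a root."""
    for _ in range(levels):
        if value not in hierarchy:
            break
        value = hierarchy[value]
    return value


def generalize_relation(tuples, attr_index, hierarchy, levels=1):
    """Generalize one attribute of every tuple, then merge identical tuples.

    Returns a dict mapping each generalized tuple to the number of
    original tuples it summarizes.
    """
    counts = Counter()
    for row in tuples:
        row = list(row)
        row[attr_index] = generalize_value(row[attr_index], hierarchy, levels)
        counts[tuple(row)] += 1
    return dict(counts)


# Illustrative data only: (city, business sector) tuples.
rows = [
    ("New Orleans", "retail"),
    ("Baton Rouge", "retail"),
    ("Houston", "retail"),
    ("Dallas", "energy"),
]

one_level = generalize_relation(rows, 0, CITY_HIERARCHY, levels=1)
two_levels = generalize_relation(rows, 0, CITY_HIERARCHY, levels=2)
```

With one level of generalization the four tuples reduce to three (the two Louisiana retail tuples merge); with two levels they reduce to two. Keeping a count per merged tuple is one common design choice, since it preserves how much of the original data each summary tuple represents. The sketch assumes each value has exactly one parent, so the uncertain hierarchies discussed in this chapter would require a richer representation.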
