
A Hybrid Classification Approach Based on Decision Tree and Naïve Bays Methods

Saed A. Muqasqas, Qasem A. Al Radaideh, Bilal A. Abul-Huda

Source Title: International Journal of Information Retrieval Research (IJIRR) 4(4)

DOI: 10.4018/IJIRR.2014100104

Abstract

Data classification, one of the main tasks of data mining, plays an important role in many fields. Classification techniques differ mainly in the accuracy of their models, which depends on the method adopted during the learning phase. Several researchers have attempted to enhance classification accuracy by combining different classification methods in the same learning process, resulting in a hybrid classifier. In this paper, the authors propose and build a hybrid classification technique based on the Naïve Bayes and C4.5 classifiers. The main goal of the proposed model is to reduce the complexity of the NBTree technique, a well-known hybrid classification technique, and to improve overall classification accuracy. Thirty-six UCI datasets were used in the evaluation. Results show that the proposed technique significantly outperforms the NBTree technique and some other classifiers proposed in the literature in terms of classification accuracy. The proposed approach yields an overall average accuracy of 85.70% over the 36 datasets.

1. Introduction

Data mining is the field concerned with extracting useful knowledge from large amounts of data. It employs several tasks and techniques to extract this knowledge, including classification, clustering, and association. Data classification is considered one of the most important of these techniques: a model is generated by a learning process and can then be used for prediction. Data classification has contributed to many fields, such as medical diagnosis, remote sensing, and radar (Sarkar and Sana, 2009; Haouari, et al., 2009; Friedman, et al., 1997).

Several techniques have been proposed and used for data classification, such as decision tree based techniques, Naïve Bayes, neural networks, genetic algorithms, and many others (Han and Kamber, 2006).

Naïve Bayes is a simple probabilistic classifier based on applying Bayes’ theorem, named after Thomas Bayes, with strong independence assumptions. The Naïve Bayes classifier is widely used for its simplicity and traceability, and it is considered a fast learner in comparison to more complex classification techniques (Langley, et al., 1992). Because of its simplicity and linear run time, it has become a popular classifier for many data mining applications (Hall, 2007).

In the Naïve Bayes classifier, to predict the class label (C_i) of a given instance (X), the classifier needs to compute the posterior probability P(C_i|X) that an instance X = (x₁, x₂, x₃, ..., x_n) belongs to the class C_i. The probability is computed using the following formula, where x_i is the value of attribute A_i and x_n is the value of attribute A_n.
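Under the independence assumption, the posterior described here takes the standard Naïve Bayes form. The expression below is a reconstruction from the surrounding definitions, not the authors' typeset equation; the attribute index k is an assumption introduced to avoid clashing with the class index i.

```latex
P(C_i \mid X) = \frac{P(C_i)\,\prod_{k=1}^{n} P(x_k \mid C_i)}{P(X)}
```

Since P(X) is the same for every class, the predicted label is the class C_i that maximizes the numerator.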

Here P(C_i) is the prior probability, P(C_i) = |C_i,D|/|D|, where |C_i,D| is the number of instances of class C_i in the training dataset and |D| is the total number of instances in the training dataset.
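The prior and posterior computation can be sketched in a few lines of Python. This is a minimal illustration for nominal attributes only, assuming the standard Naïve Bayes product of per-attribute conditional probabilities and no smoothing; the function names `train_naive_bayes` and `posterior` and the toy weather data are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def train_naive_bayes(instances, labels):
    """Count class frequencies and per-attribute value frequencies."""
    class_counts = Counter(labels)          # |C_i,D| for each class C_i
    value_counts = defaultdict(Counter)     # (class, attribute index) -> value counts
    for x, c in zip(instances, labels):
        for k, value in enumerate(x):
            value_counts[(c, k)][value] += 1
    return class_counts, value_counts


def posterior(x, class_counts, value_counts):
    """Return P(C_i | X) for every class, normalised to sum to 1."""
    total = sum(class_counts.values())      # |D|
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total                 # prior P(C_i) = |C_i,D| / |D|
        for k, value in enumerate(x):
            # Relative frequency estimate of P(x_k | C_i); no smoothing (assumption)
            score *= value_counts[(c, k)][value] / n_c
        scores[c] = score
    evidence = sum(scores.values())         # plays the role of P(X)
    if evidence == 0:
        return {c: 0.0 for c in scores}     # instance unseen under every class
    return {c: s / evidence for c, s in scores.items()}


# Toy, invented training data: (outlook, temperature) -> play
instances = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"),
             ("rainy", "mild"), ("sunny", "mild")]
labels = ["no", "no", "no", "yes", "yes"]

class_counts, value_counts = train_naive_bayes(instances, labels)
probs = posterior(("sunny", "mild"), class_counts, value_counts)
print(max(probs, key=probs.get), probs)    # "yes" wins with a posterior of about 0.6
```

Training is a single counting pass over the data, which is where the linear run time of Naïve Bayes comes from.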

The Naïve Bayes algorithm can deal with both continuous and nominal values. In addition, Naïve Bayes is well suited to complex and incomplete datasets (Soria, et al., 2011). It also handles large numbers of features or classes easily, and it is a fast learning algorithm that examines all of its training data (Ratanamahatana & Gunopulos, 2003).

Decision tree based algorithms such as C4.5, ID3, and CART are well-known methods that can handle real-world datasets efficiently (Han and Kamber, 2006). The C4.5 algorithm was designed by Quinlan in the 1990s, roughly a decade after ID3 (Quinlan, 1986). C4.5 builds the decision tree recursively: it computes the gain ratio measure for each attribute in the dataset and then selects the attribute with the maximum gain ratio as the root node of the decision tree. That attribute is chosen to split the dataset because it most reduces the information needed to classify an instance in the resulting partitions.
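The attribute selection step can be illustrated with a short Python sketch. It assumes the usual definition of gain ratio (information gain divided by split information) and nominal attributes only; the full C4.5 algorithm includes further refinements such as continuous attributes, missing values, and pruning, which are omitted here. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting on the nominal attribute at index `attr`."""
    total = len(labels)
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Expected entropy after the split, weighted by partition size
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder
    # Split information penalises attributes with many small partitions
    split_info = -sum(len(p) / total * math.log2(len(p) / total)
                      for p in partitions.values())
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels):
    """Index of the attribute with the maximum gain ratio."""
    return max(range(len(rows[0])), key=lambda a: gain_ratio(rows, labels, a))


# Toy, invented data: (outlook, temperature) -> play
rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "hot"), ("rainy", "mild")]
labels = ["no", "no", "yes", "yes"]
print(best_attribute(rows, labels))  # outlook (index 0) separates the classes perfectly
```

A recursive tree builder would call `best_attribute` on each partition in turn until the partitions are pure or no attributes remain.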

Kohavi (1996) proposed the NBTree (Naïve Bayes Tree) algorithm, which combines the Naïve Bayes and decision tree methods. Jiang and Li (2011) proposed another algorithm, called C4.5-NB, which is an enhancement of the NBTree algorithm.

NBTree and C4.5-NB have proven their efficiency on different datasets. However, the NBTree learning process is considered complex, because a Naïve Bayes classifier is built on each leaf node of the resulting decision tree. C4.5-NB, on the other hand, uses a simple learning approach but achieves lower accuracy than NBTree. Therefore, there is a need for a hybrid classifier that is simple and more accurate than both the C4.5-NB and NBTree algorithms.
