A Framework for Homogeneous Cross-Project Defect Prediction

Lipika Goel, Mayank Sharma, Sunil Kumar Khatri, D. Damodaran

Source Title: International Journal of Software Innovation (IJSI) 9(1)

DOI: 10.4018/IJSI.2021010105

Abstract

Prior defect data for a project is often unavailable, which led researchers to ask whether defect data from other projects can be used for prediction. This made cross-project defect prediction an open research issue. In this approach, the training data often suffers from the class imbalance problem. This work focuses on homogeneous cross-project defect prediction. A novel ensemble model that operates in two stages is proposed. First, it handles the class imbalance problem of the dataset. Second, it predicts the target class. To handle the imbalance problem, the training dataset is divided into data frames, and each data frame is balanced. An ensemble model using the maximum voting of all random forest classifiers is implemented. The proposed model shows better performance than the baseline models. A Wilcoxon signed-rank test is performed to validate the proposed model.

1. Introduction

Software testing is one of the most vital, yet costly, activities in the software development process. Limited resources such as workforce, time, and money must be managed carefully. Software defect prediction models are useful here because they identify the parts of the software that are more likely to produce errors and therefore need more attention. Software defect prediction is currently one of the most active topics in the software engineering domain. Studies of prediction models state that past data on bugs in a particular software project can predict defects in its upcoming improved versions. This approach is termed Within-Project Defect Prediction (WPDP). The training data and the machine learning techniques together drive the predictive power of the model. WPDP models rely on historical data, but only a few companies maintain clear past records. WPDP therefore has a drawback when a project has only limited historical bug-related data. Because of its wider applicability in such cases, Cross-Project Defect Prediction has attracted researchers, as it assembles its training set from existing projects.

To solve this problem, researchers have applied defect prediction across projects by building a model on one project and using it to predict defects in another. This approach is known as Cross-Project Defect Prediction (CPDP). The main aim of CPDP is to predict bug-prone instances (such as classes) in a project based on data collected from other projects. CPDP is broadly classified into homogeneous and heterogeneous CPDP. When the source (training) project has the same set of features as the target project, it is known as homogeneous CPDP; when the source and target projects have different metrics or features, it is termed heterogeneous CPDP. The feasibility and potential usefulness of CPDP built with a number of software metrics have been validated, but how to improve the performance of CPDP models is still an open issue. Various studies have also concluded that selecting a suitable training data set can improve the performance of a defect prediction model. Hence, training data selection from widely available public repositories is an important research area in CPDP. Because data in CPDP is collected from different projects, the numbers of defective and non-defective instances are imbalanced. This leads to improper training of the classification model. Such models are usually biased, which degrades prediction performance. Figure 1 shows the difference between WPDP, homogeneous CPDP, and heterogeneous CPDP.

Figure 1. Within-project, homogeneous, and heterogeneous cross-project defect prediction
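
The homogeneous/heterogeneous distinction comes down to whether the source and target projects expose the same feature set. A minimal sketch of that check follows; the function name `cpdp_kind` and the metric names are illustrative assumptions, not taken from the paper.

```python
def cpdp_kind(source_metrics, target_metrics):
    """Classify a source/target pair by comparing their metric (feature) sets.

    Hypothetical helper: 'homogeneous' when both projects are described by
    the same set of metrics, 'heterogeneous' otherwise.
    """
    if set(source_metrics) == set(target_metrics):
        return "homogeneous"
    return "heterogeneous"


# Illustrative metric names only (assumed, not from the paper).
same_features = cpdp_kind(["loc", "wmc", "cbo"], ["cbo", "loc", "wmc"])
different_features = cpdp_kind(["loc", "wmc", "cbo"], ["loc", "halstead_volume"])
```

Column order does not matter here, only the set of metrics; a real pipeline would likely also need to confirm that equally named metrics are measured the same way.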

The objective of this work is to propose a novel defect prediction ensemble model that works in two stages. First, it handles the imbalanced nature of the dataset: it partitions the training data into seven data frames, each with an approximately equal number of defect-prone and non-defect-prone classes. A Random Forest classifier is trained on each of the seven data frames, and an ensemble model based on the maximum voting of the seven Random Forests is built. Second, the proposed model performs cross-project defect prediction in addition to handling the class imbalance problem. Seven-fold cross-validation is performed to evaluate the training accuracy of the proposed model. Finally, a Wilcoxon signed-rank test is performed to validate the model. The research question addressed in this paper is:

RQ. Does the proposed ensemble model outperform the existing models?

The significant contributions of this paper are:

1. To develop a model to handle the class imbalance problem in CPDP.
2. To develop a standalone ensemble framework for cross project defect prediction.
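
The two-stage idea described in the objective (balanced data frames, one classifier per frame, maximum voting) can be sketched as below. This is a rough illustration under stated assumptions, not the authors' implementation: the preview does not say how the frames are built, so the sketch assumes the majority class is shuffled and split into seven chunks, each paired with an equal-sized sample of the minority class. To stay dependency-free, a simple nearest-centroid learner stands in for the Random Forest used in the paper; all function and class names are hypothetical.

```python
import random
from collections import Counter


def balanced_frames(X, y, k=7, seed=0):
    """Split (X, y) into k frames holding equal counts of label 0 and label 1.

    Assumed scheme: shuffle the majority class, cut it into k chunks, and
    pair each chunk with an equal-sized sample of the minority class.
    Assumes the data is large enough that every chunk is non-empty.
    """
    rng = random.Random(seed)
    defective = [i for i, label in enumerate(y) if label == 1]
    clean = [i for i, label in enumerate(y) if label == 0]
    minority, majority = sorted((defective, clean), key=len)
    rng.shuffle(majority)
    frames = []
    for j in range(k):
        chunk = majority[j::k]
        n = min(len(chunk), len(minority))
        picked = chunk[:n] + rng.sample(minority, n)
        frames.append(([X[i] for i in picked], [y[i] for i in picked]))
    return frames


class CentroidClassifier:
    """Stand-in base learner (NOT a Random Forest): nearest class centroid."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, t in zip(X, y) if t == label]
            self.centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
        return self

    def predict_one(self, x):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(x, centroid))
        return min(self.centroids, key=lambda label: sq_dist(self.centroids[label]))


def train_ensemble(X, y, k=7, seed=0, make_learner=CentroidClassifier):
    """Train one base learner per balanced frame."""
    return [make_learner().fit(fx, fy) for fx, fy in balanced_frames(X, y, k, seed)]


def predict_majority(models, x):
    """Maximum voting: return the label predicted by most base learners."""
    votes = Counter(model.predict_one(x) for model in models)
    return votes.most_common(1)[0][0]


# Synthetic, imbalanced "source project": 70 clean and 10 defective instances,
# each with two made-up metric values (illustrative data only).
data_rng = random.Random(42)
X_src = [[data_rng.gauss(10, 2), data_rng.gauss(1, 0.5)] for _ in range(70)]
X_src += [[data_rng.gauss(30, 2), data_rng.gauss(5, 0.5)] for _ in range(10)]
y_src = [0] * 70 + [1] * 10

frames = balanced_frames(X_src, y_src)
models = train_ensemble(X_src, y_src)
```

With seven voters and two classes the vote cannot tie. Swapping the stand-in for a real Random Forest only requires passing a different `make_learner` whose objects provide `fit` and `predict_one`.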
