Class Imbalance Learning to Heterogeneous Cross-Software Projects Defect Prediction

Rohit Vashisht, Syed Afzal Murtaza Rizvi

Source Title: International Journal of Software Innovation (IJSI) 10(1)

DOI: 10.4018/IJSI.292021

Abstract

Heterogeneous Cross-Project Defect Prediction (HCPDP) attempts to forecast defects in a software application that has insufficient historical defect data. From a Class Imbalance Problem (CIP) perspective, however, one needs a clear view of the data distribution in the training dataset; otherwise the trained model yields biased classification results. Class Imbalance Learning (CIL) is the process of achieving a balanced ratio between the two classes in an imbalanced dataset. A range of effective solutions exists for managing CIP, including resampling techniques such as Over-Sampling (OS) and Under-Sampling (US). The proposed work employs the Synthetic Minority Oversampling TEchnique (SMOTE) and Random Under-Sampling (RUS) to handle CIP. In addition, the paper proposes a novel four-phase HCPDP model and compares the performance of the basic HCPDP model with CIP left unhandled against the same model after handling CIP with SMOTE and RUS, using three prediction pairs. Results show that training performance improves substantially with SMOTE, whereas RUS produces varying results relative to HCPDP across all three prediction pairs.

Introduction

The prime objective of any software development model is to ensure that the final product or service meets the level of quality required by the end user, a goal known as Software Quality Assurance (SQA). Any deviation between the actual and expected results under a preset environmental configuration can be described as a defect with respect to end-user specifications. Testing is the most important stage of the Software Development Life Cycle (SDLC), as it consumes a large proportion of the project's total cost. This stage should therefore receive attention first in every software development cycle. Applying Software Defect Prediction (SDP) at the right time addresses this problem.

The traditional approach to Defect Prediction (DP) identifies "Within-Project" defects by splitting the available defect dataset into two subsets. The DP model is trained on one subset (the labeled instances), and the other subset is used to test the model, that is, to label the unlabeled instances of the target application's dataset as either defective or non-defective (Ambros et al. (2012)). Cross Project Defect Prediction (CPDP) is a research field in which a software project lacking sufficient local defect data uses data from other projects to build an effective and efficient defect predictor. Cross-project data needs to be prepared before it is applied locally in order to support CPDP (Han et al. (2011)). Homogeneous CPDP gathers common software metrics (features) from both the parent application, whose defect data is used to train the DP model, and the target application, for which the DP model is designed (He et al. (2014)). In HCPDP, by contrast, the datasets of a prediction pair are not required to share common metrics. Matched metrics can be found by measuring the correlation coefficient over all possible combinations of software features between the two applications. To predict defects across heterogeneous projects, the feature pairs whose values vary in a consistent manner are taken as common features between the two datasets. In this article, the authors attempt to forecast defects in a software application whose features are entirely heterogeneous from the feature set of the source application and which also lacks the defect data needed to construct an effective DP model. Figure 1 shows the distinction between Homogeneous CPDP and Heterogeneous CPDP.

Figure 1. Classification of Cross Software Projects Defect Prediction
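
The metric-matching idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the preview does not say how instances from the two projects are aligned, which correlation coefficient is used, or what cutoff applies. The sketch assumes that each metric column has already been sampled to a common length, uses the Pearson coefficient, and takes a hypothetical threshold of 0.5. The function names and the metric names in the usage note are illustrative only.

```python
import math
from itertools import product


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # a constant metric carries no co-variation signal
    return cov / math.sqrt(var_a * var_b)


def match_metrics(source, target, threshold=0.5):
    """Score every (source metric, target metric) combination.

    `source` and `target` map metric names to equal-length value lists.
    Returns (source_name, target_name, r) for pairs with r >= threshold,
    strongest correlation first. The threshold is an assumed value.
    """
    pairs = []
    for (s_name, s_vals), (t_name, t_vals) in product(source.items(), target.items()):
        r = pearson(s_vals, t_vals)
        if r >= threshold:
            pairs.append((s_name, t_name, r))
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

Calling `match_metrics({"loc": [...]}, {"size": [...]})` with two projects' metric columns returns the candidate matched pairs; a real pipeline would likely also enforce a one-to-one assignment between metrics, which this sketch leaves out.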

The proposed research work offers a novel four-phase HCPDP model to address this problem. It also focuses on the uneven ratio of instances between the two classes of a binary classification training dataset, known as the Class Imbalance Problem (CIP). To resolve this issue, the proposed work employs resampling techniques to achieve CIL on an imbalanced training dataset: SMOTE serves as the OS technique and RUS as the US technique for tackling the skewed distribution of instances in the training dataset. The motivation of the article is to investigate and fix CIP in imbalanced datasets in order to boost the accuracy of an SDP model. The paper addresses the following research questions.

RQ1. How does the performance of the proposed HCPDP model with CIP compare with its performance after handling CIP using SMOTE and RUS?
RQ2. Which of the two resampling techniques (SMOTE and RUS) gives the better outcome?
RQ3. How do the defect prediction results compare between the two categories, Within-Project Defect Prediction (WPDP) and HCPDP?
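
The two resampling techniques named in the introduction can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the configuration used in the paper: labels are binary, the minority (defective) class is labeled 1, both functions balance the classes to a 1:1 ratio, and the neighbour count `k` and the seeds are arbitrary choices. The SMOTE function follows the standard idea of interpolating between a minority instance and one of its k nearest minority neighbours.

```python
import math
import random


def random_under_sample(X, y, minority=1, seed=0):
    """RUS: randomly drop majority rows until both classes have equal size."""
    rng = random.Random(seed)
    min_idx = [i for i, label in enumerate(y) if label == minority]
    maj_idx = [i for i, label in enumerate(y) if label != minority]
    kept = rng.sample(maj_idx, min(len(min_idx), len(maj_idx)))
    idx = sorted(min_idx + kept)
    return [X[i] for i in idx], [y[i] for i in idx]


def smote(X, y, minority=1, k=3, seed=0):
    """SMOTE: add synthetic minority rows until both classes have equal size.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    Assumes at least two minority rows are present.
    """
    rng = random.Random(seed)
    minority_rows = [x for x, label in zip(X, y) if label == minority]
    n_needed = (len(y) - len(minority_rows)) - len(minority_rows)
    X_new, y_new = list(X), list(y)
    for _ in range(max(n_needed, 0)):
        base = rng.choice(minority_rows)
        neighbours = sorted(
            (r for r in minority_rows if r is not base),
            key=lambda r: math.dist(base, r),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        X_new.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
        y_new.append(minority)
    return X_new, y_new
```

Under-sampling discards majority instances and so shrinks the training set, while SMOTE keeps every original instance and enlarges the set with interpolated ones. Resampling of this kind is normally applied to the training data only, so that the test data keeps its original class distribution.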
