Software Quality Modeling with Limited Apriori Defect Data

Naeem Seliya (University of Michigan, USA) and Taghi M. Khoshgoftaar (Florida Atlantic University, USA)
Copyright: © 2007 | Pages: 15
DOI: 10.4018/978-1-59904-252-7.ch001

Abstract

In machine learning, limited data for supervised learning is a challenging problem with practical applications. We address a similar problem in the context of software quality modeling. Knowledge-based software engineering includes the use of quantitative software quality estimation models. Such models are trained using apriori software quality knowledge in the form of software metrics and defect data from previously developed software projects. However, various practical issues limit the availability of defect data for all modules in the training data. We present two solutions to the problem of software quality modeling when only a limited number of training modules have known defect data. The proposed solutions are a semisupervised clustering scheme with expert input and a semisupervised classification approach based on the expectation-maximization algorithm. Software measurement datasets obtained from multiple NASA software projects are used in our empirical investigation. The software quality knowledge learned during the semisupervised learning processes provided good generalization performance on multiple test datasets. In addition, both solutions provided better predictions than a supervised learner trained on the initial labeled dataset.
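
To make the idea of semisupervised classification with expectation-maximization concrete, the following is a minimal sketch and not the chapter's actual method. It assumes a two-class Gaussian naive Bayes model (class 0 = not fault-prone, class 1 = fault-prone), and the function names, the two software metrics, and all data values are hypothetical, made up purely for illustration. The model is first fitted on the few labeled modules, then EM alternates between estimating soft labels for the unlabeled modules and refitting on all modules.

```python
import math


def fit_weighted(rows, weights, n_classes=2, min_var=1e-3):
    """M-step: class priors and per-feature Gaussian parameters from weighted rows."""
    n_features = len(rows[0])
    total = float(len(rows))  # each row's class weights sum to 1
    params = []
    for c in range(n_classes):
        wc = [w[c] for w in weights]
        mass = sum(wc)  # assumes every class has at least one labeled module
        means = [sum(w * r[j] for w, r in zip(wc, rows)) / mass
                 for j in range(n_features)]
        variances = [max(sum(w * (r[j] - means[j]) ** 2
                             for w, r in zip(wc, rows)) / mass, min_var)
                     for j in range(n_features)]
        params.append((mass / total, means, variances))
    return params


def posterior(row, params):
    """E-step for one module: P(class | metrics), computed in log space."""
    logs = []
    for prior, means, variances in params:
        lp = math.log(prior)
        for x, m, v in zip(row, means, variances):
            lp += -0.5 * math.log(2.0 * math.pi * v) - (x - m) ** 2 / (2.0 * v)
        logs.append(lp)
    top = max(logs)
    exps = [math.exp(lp - top) for lp in logs]
    z = sum(exps)
    return [e / z for e in exps]


def semi_supervised_em(labeled, labels, unlabeled, n_iter=20):
    """Fit on labeled modules, then refine with soft labels for unlabeled ones."""
    hard = [[1.0 if c == y else 0.0 for c in range(2)] for y in labels]
    params = fit_weighted(labeled, hard)  # initial model: labeled data only
    for _ in range(n_iter):
        soft = [posterior(r, params) for r in unlabeled]         # E-step
        params = fit_weighted(labeled + unlabeled, hard + soft)  # M-step
    return params


# Hypothetical metrics per module: [lines of code, cyclomatic complexity].
labeled = [[10, 2], [12, 3], [14, 2], [50, 12], [55, 10], [60, 14]]
labels = [0, 0, 0, 1, 1, 1]
unlabeled = [[11, 2], [13, 3], [52, 11], [58, 13], [9, 3], [57, 12]]

params = semi_supervised_em(labeled, labels, unlabeled)
print(posterior([56, 12], params))  # class probabilities for a new module
```

The labeled modules keep their known classes throughout, so the unlabeled modules can only refine the class statistics and cannot overwrite the expert-provided defect data.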

Complete Chapter List

Table of Contents
Foreword
Philip S. Yu
Preface
Xingquan Zhu, Ian Davidson
Acknowledgment
Xingquan Zhu, Ian Davidson
Chapter 1
Naeem Seliya, Taghi M. Khoshgoftaar
In machine learning the problem of limited data for supervised learning is a challenging problem with practical applications. We address a similar...
Software Quality Modeling with Limited Apriori Defect Data
Chapter 2
Jason H. Moore
Human genetics is an evolving discipline that is being driven by rapid advances in technologies that make it possible to measure enormous quantities...
Genome-Wide Analysis of Epistasis Using Multifactor Dimensionality Reduction: Feature Selection and Construction in the Domain of Human Genetics
Chapter 3
Jose Ma. J. Alvir, Javier Cabrera
Mining clinical trials is becoming an important tool for extracting information that might help design better clinical trials. One important...
Mining Clinical Trial Data
Chapter 4
Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos
Multimedia objects like video clips or captioned images contain data of various modalities such as image, audio, and transcript text. Correlations...
Cross-Modal Correlation Mining Using Graph Algorithms
Chapter 5
Petra Perner
This chapter introduces image mining as a method to discover implicit, previously unknown and potentially useful information from digital image and...
Image Mining for the Construction of Semantic-Inference Rules and for the Development of Automatic Image Diagnosis Systems
Chapter 6
Jianting Zhang, Wieguo Liu, Le Gruenwald
Decision trees (DT) have been widely used for training and classification of remotely sensed image data due to their capability to generate human...
A Successive Decision Tree Approach to Mining Remotely Sensed Image Data
Chapter 7
Tilmann Bruckhaus
This chapter examines the business impact of predictive analytics. It argues that in order to understand the potential business impact of a...
The Business Impact of Predictive Analytics
Chapter 8
Anna Olecka
This chapter focuses on the challenges of modeling credit risk for the new-account acquisition process in the credit card industry. The first section...
Beyond Classification: Challenges of Data Mining for Credit Scoring
Chapter 9
Elena Irina Neaga
This chapter presents a roadmap of the bidirectional interaction and support between knowledge discovery (KD) processes and ontology engineering...
Semantics Enhancing Knowledge Discovery and Ontology Engineering Using Mining Techniques: A Crossover Review
Chapter 10
Amandeep S. Sidhu, Paul J. Kennedy, Simeon Simoff
In some real-world areas, it is important to enrich the data with external background knowledge so as to provide context and to facilitate pattern...
Knowledge Discovery in Biomedical Data Facilitated by Domain Ontologies
Chapter 11
Malcolm J. Beynon
The efficacy of data mining lies in its ability to identify relationships amongst data. This chapter investigates that constraining this efficacy is...
Effective Intelligent Data Mining Using Dempster-Shafer Theory
Chapter 12
Fedja Hadzic, Tharam S. Dillon
Real-world datasets are often accompanied by various types of anomalous or exceptional entries, which are often referred to as outliers. Detecting...
Outlier Detection Strategy Using the Self-Organizing Map
Chapter 13
Benjamin Griffiths
Predictive accuracy, as an estimation of a classifier’s future performance, has been studied for at least seventy years. With the advent of the...
Re-Sampling Based Data Mining Using Rough Set Theory
About the Authors