Software Quality Modeling with Limited Apriori Defect Data


Naeem Seliya (University of Michigan, USA) and Taghi M. Khoshgoftaar (Florida Atlantic University, USA)
Copyright: © 2007 | Pages: 15
DOI: 10.4018/978-1-59904-252-7.ch001


In machine learning, limited data for supervised learning is a challenging problem with practical applications. We address a similar problem in the context of software quality modeling. Knowledge-based software engineering includes the use of quantitative software quality estimation models. Such models are trained using apriori software quality knowledge in the form of software metrics and defect data from previously developed software projects. However, various practical issues mean that defect data are not available for every module in the training data. We present two solutions to the problem of software quality modeling when only a limited number of training modules have known defect data. The proposed solutions are a semisupervised clustering scheme with expert input and a semisupervised classification approach based on the expectation-maximization algorithm. Software measurement datasets obtained from multiple NASA software projects are used in our empirical investigation. The software quality knowledge learnt during the semisupervised learning processes provided good generalization performance on multiple test datasets. In addition, both solutions provided better predictions than a supervised learner trained on the initial labeled dataset.
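The general idea behind expectation-maximization with partially labeled modules can be sketched as follows. This is a minimal illustration only, not the authors' implementation: the abstract does not name the base classifier, so the two-class Gaussian naive Bayes model, the function names (`fit_semisupervised_em`, `posterior`, `predict`), and the toy metric values are all assumptions made for this example. Labeled modules keep their known class, while unlabeled modules receive soft class memberships that are re-estimated on each iteration.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a univariate Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def posterior(x, priors, means, variances):
    """Class membership probabilities for one module (naive Bayes assumption)."""
    logs = []
    for c in (0, 1):
        lp = math.log(priors[c])
        for j, xj in enumerate(x):
            lp += gaussian_logpdf(xj, means[c][j], variances[c][j])
        logs.append(lp)
    top = max(logs)
    exps = [math.exp(lp - top) for lp in logs]  # log-sum-exp for stability
    total = sum(exps)
    return [e / total for e in exps]


def fit_semisupervised_em(labeled, unlabeled, n_iter=20, min_var=1e-6):
    """labeled: list of (metrics, label), label 0 = not fault-prone, 1 = fault-prone.
    unlabeled: list of metric vectors without defect data.
    Assumes every class has at least one labeled module."""
    n_feat = len(labeled[0][0])
    X = [m for m, _ in labeled] + list(unlabeled)
    n_lab = len(labeled)
    # Labeled modules are clamped to their class; unlabeled modules start with
    # zero weight, so the first M-step uses the labeled data only.
    resp = [[1.0 if y == c else 0.0 for c in (0, 1)] for _, y in labeled]
    resp += [[0.0, 0.0] for _ in unlabeled]
    for _ in range(n_iter):
        # M-step: weighted estimates of priors, means, and variances.
        weights = [sum(r[c] for r in resp) for c in (0, 1)]
        priors = [w / sum(weights) for w in weights]
        means, variances = [], []
        for c in (0, 1):
            mu = [sum(r[c] * x[j] for r, x in zip(resp, X)) / weights[c]
                  for j in range(n_feat)]
            var = [max(sum(r[c] * (x[j] - mu[j]) ** 2
                           for r, x in zip(resp, X)) / weights[c], min_var)
                   for j in range(n_feat)]
            means.append(mu)
            variances.append(var)
        # E-step: update soft memberships of the unlabeled modules only.
        for i in range(n_lab, len(X)):
            resp[i] = posterior(X[i], priors, means, variances)
    return priors, means, variances


def predict(x, model):
    """Return the more probable class (0 or 1) for a module."""
    probs = posterior(x, *model)
    return 0 if probs[0] >= probs[1] else 1


# Toy data (hypothetical metric values, e.g. scaled size and complexity).
labeled = [([1.0, 2.0], 0), ([1.5, 1.8], 0), ([8.0, 9.0], 1), ([8.5, 9.5], 1)]
unlabeled = [[1.2, 2.1], [0.8, 1.9], [9.0, 8.8], [7.9, 9.2]]
model = fit_semisupervised_em(labeled, unlabeled)
```

Calling `predict([1.1, 2.0], model)` then classifies a new module with the parameters learnt from both the labeled and the unlabeled modules. Starting the unlabeled modules at zero weight is one possible design choice; it makes the first iteration equivalent to training a supervised learner on the initial labeled dataset.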

Complete Chapter List

Table of Contents
Philip S. Yu
Xingquan Zhu, Ian Davidson
Xingquan Zhu, Ian Davidson
Chapter 1
Naeem Seliya, Taghi M. Khoshgoftaar
In machine learning the problem of limited data for supervised learning is a challenging problem with practical applications. We address a similar...
Software Quality Modeling with Limited Apriori Defect Data
Chapter 2
Jason H. Moore
Human genetics is an evolving discipline that is being driven by rapid advances in technologies that make it possible to measure enormous quantities...
Genome-Wide Analysis of Epistasis Using Multifactor Dimensionality Reduction: Feature Selection and Construction in the Domain of Human Genetics
Chapter 3
Jose Ma. J. Alvir, Javier Cabrera
Mining clinical trials is becoming an important tool for extracting information that might help design better clinical trials. One important...
Mining Clinical Trial Data
Chapter 4
Jia-Yu Pan, Hyung-Jeong Yang, Christos Faloutsos
Multimedia objects like video clips or captioned images contain data of various modalities such as image, audio, and transcript text. Correlations...
Cross-Modal Correlation Mining Using Graph Algorithms
Chapter 5
Petra Perner
This chapter introduces image mining as a method to discover implicit, previously unknown and potentially useful information from digital image and...
Image Mining for the Construction of Semantic-Inference Rules and for the Development of Automatic Image Diagnosis Systems
Chapter 6
Jianting Zhang, Wieguo Liu, Le Gruenwald
Decision trees (DT) have been widely used for training and classification of remotely sensed image data due to their capability to generate human...
A Successive Decision Tree Approach to Mining Remotely Sensed Image Data
Chapter 7
Tilmann Bruckhaus
This chapter examines the business impact of predictive analytics. It argues that in order to understand the potential business impact of a...
The Business Impact of Predictive Analytics
Chapter 8
Anna Olecka
This chapter will focus on challenges in modeling credit risk for the new account acquisition process in the credit card industry. First section...
Beyond Classification: Challenges of Data Mining for Credit Scoring
Chapter 9
Elena Irina Neaga
This chapter deals with a roadmap on the bidirectional interaction and support between knowledge discovery (Kd) processes and ontology engineering...
Semantics Enhancing Knowledge Discovery and Ontology Engineering Using Mining Techniques: A Crossover Review
Chapter 10
Amandeep S. Sidhu, Paul J. Kennedy, Simeon Simoff
In some real-world areas, it is important to enrich the data with external background knowledge so as to provide context and to facilitate pattern...
Knowledge Discovery in Biomedical Data Facilitated by Domain Ontologies
Chapter 11
Malcolm J. Beynon
The efficacy of data mining lies in its ability to identify relationships amongst data. This chapter investigates that constraining this efficacy is...
Effective Intelligent Data Mining Using Dempster-Shafer Theory
Chapter 12
Fedja Hadzic, Tharam S. Dillon
Real-world datasets are often accompanied by various types of anomalous or exceptional entries which are often referred to as outliers. Detecting...
Outlier Detection Strategy Using the Self-Organizing Map
Chapter 13
Benjamin Griffiths
Predictive accuracy, as an estimation of a classifier’s future performance, has been studied for at least seventy years. With the advent of the...
Re-Sampling Based Data Mining Using Rough Set Theory
About the Authors