Software Cost Estimation using Soft Computing Approaches

K. Vinaykumar (Institute for Development and Research in Banking Technology (IDRBT), India), V. Ravi (Institute for Development and Research in Banking Technology (IDRBT), India) and Mahil Carr (Institute for Development and Research in Banking Technology (IDRBT), India)
DOI: 10.4018/978-1-60566-766-9.ch024


Software development has become an essential investment for many organizations. Software engineering practitioners have become increasingly concerned with accurately predicting the cost of software products to be developed. Accurate estimates are desired, but no model has proved successful at effectively and consistently predicting software development cost. This chapter investigates the use of soft computing approaches in predicting software development effort. Various statistical and intelligent techniques are employed to estimate software development effort. Further, based on these techniques, ensemble models are developed to forecast software development effort. Two types of ensemble models, linear (average) and nonlinear, are designed and tested on the COCOMO’81 dataset. In the experiments performed on the COCOMO’81 data, the nonlinear ensemble using a radial basis function network as arbitrator outperformed all the other ensembles as well as the constituent statistical and intelligent techniques. The authors conclude that soft computing models can accurately estimate software development effort.
Chapter Preview


Software development has become an important activity for many modern organizations (Pressman, 1997). In fact, the quality, cost, and timeliness of developed software are often crucial determinants of an organization's success. Development projects carry significant financial and strategic implications in terms of activity scheduling and cost estimation. Software cost estimation is one of the most critical tasks in managing software projects. Development costs tend to increase with project complexity, and hence accurate cost estimates are highly desired during the early stages of development (Wittig and Finnie, 1997). A major problem in software cost estimation is first obtaining an accurate size estimate of the software to be developed (Kitchenham et al., 2003). An important objective of the software engineering community is to develop useful models that constructively explain the software development life cycle and accurately estimate the cost of software development.

To develop software effectively in an increasingly competitive and complex environment, many organizations use software metrics as part of their project management process. The field concerned with managing software development projects using empirical models is referred to as software project management (Fenton and Pfleeger, 1997). Software metrics are aspects of software development (either of the software product itself, or of the development process producing that product) that can be measured. These measurements can be used as variables in models for predicting or estimating aspects of the development process or product that are of interest.

Estimating development effort and schedule can include activities such as assessing and predicting system quality, measuring system performance, estimating user satisfaction, and in fact any modeling task involving measurable attributes of interest within the software development sphere (Gray, 1999). However, the most researched area has been effort estimation, as it carries the greatest promise of benefit for project management. Such models are generally developed using a set of measures that describe the software development process, product, and resources, such as developer experience, system size, complexity, and the characteristics of the development environment. The output of the model is usually some measure of effort in person-hours (or person-months or person-years).

Many models and tools are used in software cost estimation to give management valuable information on effort and expenditure when bidding for a project (Kitchenham et al., 2003). The most commonly used methods for predicting software development effort have been based on linear least-squares regression, such as COCOMO (Fenton and Pfleeger, 1997; Pressman, 1997). As such, the models have been extremely susceptible to local variations in data points (Miyazaki et al., 1994). Additionally, the models have failed to deal with implicit nonlinearities and interactions between the characteristics of the project and effort (Gray, 1999).
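As a rough illustration of the regression-based approach mentioned above, the following sketch fits a power-law model of the form effort = a * size^b (the general shape used by basic COCOMO) by ordinary least squares on log-transformed data. The function name `fit_power_law`, the project sizes, and the coefficients used to generate the effort values are all hypothetical and are not taken from the chapter or from the COCOMO'81 dataset.

```python
import math

def fit_power_law(sizes, efforts):
    """Fit effort = a * size**b by least squares on log(size) vs. log(effort)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the straight line fitted in log-log space
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical sizes (KLOC); efforts are generated from made-up coefficients
sizes = [2.0, 8.0, 32.0, 128.0]
efforts = [3.0 * s ** 1.1 for s in sizes]

a, b = fit_power_law(sizes, efforts)
predicted_effort = a * 50.0 ** b  # estimate for a hypothetical 50 KLOC project
```

Because the fit is a single straight line in log-log space, a few atypical projects can shift both coefficients, which is one way to picture the sensitivity to local variations noted above.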

Key Terms in this Chapter

Software Cost Estimation: Software development has become an essential investment for many organizations. Software engineering practitioners have become more and more concerned about accurately predicting the cost of software products to be developed.

Ensemble Forecasting Method: The idea behind ensemble systems is to exploit each constituent model’s unique features to capture different patterns that exist in the dataset.
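A minimal sketch of the linear (average) ensemble may help here: each constituent model produces an effort prediction per project, and the ensemble output is their simple mean. The function name `linear_ensemble` and the prediction values are hypothetical. The nonlinear variant described in the abstract would instead feed these predictions into an arbitrator model, which is not shown.

```python
def linear_ensemble(predictions):
    """Average the constituent models' predictions, project by project.

    predictions: one list per model, each holding one prediction per project.
    """
    n_models = len(predictions)
    return [sum(column) / n_models for column in zip(*predictions)]

# Hypothetical person-month predictions from three models for two projects
model_outputs = [
    [100.0, 40.0],
    [120.0, 50.0],
    [110.0, 45.0],
]
ensemble = linear_ensemble(model_outputs)
```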

Dynamic Evolving Neuro-Fuzzy Inference System (DENFIS): It evolves through incremental, hybrid (supervised/unsupervised) learning and accommodates new input data, including new features, new classes, etc., through local element tuning. New fuzzy rules are created and updated during the operation of the system.

Radial Basis Function Network (RBFN): RBFN is another member of the family of feed-forward neural networks and has both unsupervised and supervised phases. In the unsupervised phase, input data are clustered and the cluster details are sent to the hidden neurons, where radial basis functions of the inputs are computed using the center and the standard deviation of each cluster.
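The hidden-layer computation described above can be sketched as follows, assuming a Gaussian radial basis function (a common choice, but an assumption here, since the definition does not name the function). The names `rbf_activation` and `hidden_layer` and the cluster values are hypothetical.

```python
import math

def rbf_activation(x, center, sigma):
    """Gaussian radial basis function of input x for one cluster."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def hidden_layer(x, clusters):
    """One activation per hidden neuron; each neuron holds (center, sigma)."""
    return [rbf_activation(x, center, sigma) for center, sigma in clusters]

# Hypothetical cluster centers and standard deviations from the clustering phase
clusters = [([0.0, 0.0], 1.0), ([3.0, 4.0], 2.0)]
activations = hidden_layer([0.0, 0.0], clusters)
```

An input lying exactly on a cluster center gives that neuron its maximum activation of 1, and the activation decays as the input moves away from the center. In the supervised phase, these activations would be combined by trained output weights.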

Multilayer Perceptron (MLP): An MLP is typically composed of several layers of computing elements called nodes. Each node receives input signals from other nodes or from external inputs, processes them locally through a transfer function, and outputs a transformed signal to other nodes or as the final result.
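To make the node computation concrete, the sketch below shows one node forming a weighted sum of its inputs and passing it through a transfer function. The logistic sigmoid is assumed as the transfer function, and the weights, biases, and function names are hypothetical.

```python
import math

def sigmoid(z):
    """Logistic transfer function (an assumed choice)."""
    return 1.0 / (1.0 + math.exp(-z))

def node_output(inputs, weights, bias):
    """Weighted sum of the incoming signals, then the transfer function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

def layer_output(inputs, weight_rows, biases):
    """Outputs of every node in a layer; these feed the next layer."""
    return [node_output(inputs, w, b) for w, b in zip(weight_rows, biases)]

# Hypothetical two-input layer with two nodes
hidden = layer_output([1.0, -1.0], [[0.5, 0.5], [1.0, 0.0]], [0.0, -1.0])
```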

Classification and Regression Trees (CART): CART solves both classification and regression problems. Decision tree algorithms induce a binary tree on a given training set, resulting in a set of ‘if-then’ rules.
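As a simplified illustration of how a regression tree yields 'if-then' rules, the sketch below finds the single best binary split on one feature by minimizing the total squared error, which is the basic step a regression tree repeats recursively. It is a one-level sketch rather than a full CART implementation, and the function names and data are hypothetical.

```python
def best_split(xs, ys):
    """Return (threshold, left_mean, right_mean) minimizing squared error."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((y - left_mean) ** 2 for y in left)
               + sum((y - right_mean) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    return best[1:]

# Hypothetical project sizes (KLOC) and efforts (person-months)
sizes = [5.0, 10.0, 50.0, 60.0]
efforts = [10.0, 14.0, 100.0, 110.0]
threshold, low, high = best_split(sizes, efforts)

def predict(size):
    # The induced rule: if size <= threshold then low, else high
    return low if size <= threshold else high
```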

Support Vector Machine (SVM): The SVM is a powerful learning algorithm based on advances in statistical learning theory. SVMs are learning systems that use a hypothesis space of linear functions in a high-dimensional space, trained with a learning algorithm from optimization theory that implements a learning bias derived from statistical learning theory.
