Effort models and effort estimates help project managers allocate resources, control costs and schedules, and improve current practices, so that projects finish on time and within budget. These issues are equally crucial, and very challenging, in Web development and maintenance, given that Web projects have short schedules and a highly fluid scope. The objective of this chapter is therefore to introduce the concepts related to Web effort estimation and effort estimation techniques. The chapter also details and compares, by means of a case study, three effort estimation techniques, chosen because they have been the most widely used for Web effort estimation to date: multivariate regression, case-based reasoning, and classification and regression trees. The case study uses data on industrial Web projects from Spanish Web companies.
Key Terms in this Chapter
Prediction at Level l: Also known as Pred(l). It measures the percentage of estimates that are within l% of the actual values.
Case-Based Reasoning: Assumes that similar problems provide similar solutions. It provides estimates by comparing the characteristics of the current project to be estimated against a library of historical information from completed projects with a known effort (case base).
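As an illustration of the comparison step described above, the following minimal sketch estimates effort from the most similar completed projects. The function name, the choice of normalised Euclidean distance, the parameter k, and the sample figures are all assumptions made for this example and are not taken from the chapter's case study.

```python
import math

def estimate_by_analogy(case_base, new_project, k=1):
    """Estimate effort as the mean effort of the k most similar past projects.

    case_base: list of (features, effort) pairs, features being numeric tuples.
    new_project: numeric tuple with the same features, effort unknown.
    """
    n = len(new_project)
    # Min-max bounds per feature, taken from the case base (assumed scaling choice).
    lows = [min(f[i] for f, _ in case_base) for i in range(n)]
    highs = [max(f[i] for f, _ in case_base) for i in range(n)]

    def scale(value, i):
        span = highs[i] - lows[i]
        return 0.0 if span == 0 else (value - lows[i]) / span

    def distance(a, b):
        # Euclidean distance on the scaled features.
        return math.sqrt(sum((scale(a[i], i) - scale(b[i], i)) ** 2 for i in range(n)))

    ranked = sorted(case_base, key=lambda case: distance(case[0], new_project))
    nearest = ranked[:k]
    return sum(effort for _, effort in nearest) / len(nearest)

# Hypothetical case base: (number of Web pages, number of new images) -> effort in person-hours.
case_base = [((10, 2), 100.0), ((20, 4), 200.0), ((50, 10), 500.0)]
closest_estimate = estimate_by_analogy(case_base, (12, 2), k=1)
averaged_estimate = estimate_by_analogy(case_base, (12, 2), k=2)
```

Using k greater than 1 averages the effort of several analogues, which is one common way to dampen the influence of a single unusual past project.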
Effort Estimation: To predict the necessary amount of labour units to accomplish a given task, based on knowledge of previous similar projects and other project characteristics that are believed to be related to effort. Project characteristics (independent variables) are the input, and effort (dependent variable) is the output we wish to predict.
Mean Magnitude of Relative Error: Also known as MMRE. It is the mean of the Magnitude of Relative Error (MRE), which measures, for a given project, the absolute difference between actual and estimated effort relative to the actual effort. The mean takes into account the numerical value of every observation in the data distribution, and is sensitive to individual predictions with large MREs.
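The two accuracy measures in this list, the mean MRE and Pred(l), can be sketched in a few lines. The function names and the sample efforts are hypothetical, and treating an MRE exactly equal to l% as "within" the level is an assumption of this sketch.

```python
def mre(actual, estimate):
    # Magnitude of relative error for one project.
    return abs(actual - estimate) / actual

def mmre(actuals, estimates):
    # Mean of the per-project MREs.
    values = [mre(a, e) for a, e in zip(actuals, estimates)]
    return sum(values) / len(values)

def pred(actuals, estimates, level=25):
    # Percentage of estimates whose MRE is at most level% (inclusive by assumption).
    hits = sum(1 for a, e in zip(actuals, estimates) if mre(a, e) <= level / 100)
    return 100 * hits / len(actuals)

# Hypothetical actual and estimated efforts (person-hours) for four projects.
actuals = [100, 200, 400, 50]
estimates = [110, 150, 400, 100]
overall_mmre = mmre(actuals, estimates)
overall_pred25 = pred(actuals, estimates, level=25)
```

In this sample the fourth project has an MRE of 1.0, which alone raises the mean considerably and shows the sensitivity to large individual errors noted above.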
Classification and Regression Trees (CART) (Breiman et al., 1984): Techniques where independent variables (predictors) are used to build binary trees in which each leaf node represents either a category to which an estimate belongs or a value for an estimate. To obtain an estimate, one traverses the tree from root to leaf, selecting at each node the branch that matches the category or value of the independent variables associated with the case to be estimated.
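The root-to-leaf traversal can be illustrated with a small hand-built regression tree. The tree below is purely hypothetical: its predictors, thresholds, and leaf efforts are invented for the example and were not learned from any dataset, and building the tree itself is outside the scope of this sketch.

```python
# Hypothetical binary regression tree. Internal nodes test one predictor against
# a threshold; leaves hold an effort value (person-hours).
TREE = {
    "feature": "web_pages", "threshold": 50,
    "left": {"effort": 40.0},            # web_pages <= 50
    "right": {                           # web_pages > 50
        "feature": "new_images", "threshold": 20,
        "left": {"effort": 120.0},       # new_images <= 20
        "right": {"effort": 300.0},      # new_images > 20
    },
}

def estimate_effort(node, project):
    # Walk from the root to a leaf, choosing a branch at each internal node.
    while "effort" not in node:
        go_left = project[node["feature"]] <= node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node["effort"]

small_site = estimate_effort(TREE, {"web_pages": 30, "new_images": 5})
large_site = estimate_effort(TREE, {"web_pages": 80, "new_images": 25})
```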
Expert-Based Effort Estimation: Represents the process of estimating effort by subjective means, and is often based on previous experience from developing or managing similar projects. This is by far the most widely used technique for Web effort estimation. Within this context, the attainment of accurate effort estimates relies on the competence and experience of individuals (e.g., project manager, developer).
Cross-Validation: Process by which an original dataset d is divided into a training set t and a validation set v. The training set is used to produce an effort estimation model (if applicable), later used to predict effort for each of the projects in v, as if these projects were new projects for which effort was unknown. Accuracy statistics are then obtained and aggregated to provide an overall measure of prediction accuracy.
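A minimal sketch of this process follows, assuming a k-fold split and the mean MRE as the aggregated accuracy statistic. The fold scheme, the function names, and the deliberately simple productivity-rate model are assumptions of the example; any model-building function with the same shape could be substituted.

```python
def cross_validate(projects, build_model, folds=3):
    """Return the mean MRE over all validation projects.

    projects: list of (size, actual_effort) pairs.
    build_model: function taking a training set and returning predict(size).
    """
    mres = []
    for i in range(folds):
        # Every folds-th project, starting at i, forms the validation set v.
        validation = projects[i::folds]
        training = [p for j, p in enumerate(projects) if j % folds != i]
        predict = build_model(training)
        for size, actual in validation:
            # Treat each validation project as if its effort were unknown.
            mres.append(abs(actual - predict(size)) / actual)
    return sum(mres) / len(mres)

def mean_productivity_model(training):
    # Hypothetical model: effort per size unit, averaged over the training set.
    rate = sum(effort for _, effort in training) / sum(size for size, _ in training)
    return lambda size: rate * size

# Hypothetical projects in which effort is exactly twice the size.
projects = [(10, 20), (20, 40), (30, 60), (40, 80), (50, 100), (60, 120)]
accuracy = cross_validate(projects, mean_productivity_model, folds=3)
```

Because the sample data follow the model exactly, the aggregated error here is zero; real project data would give a positive value.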
Algorithmic Techniques: Attempt to build models that precisely represent the relationship between effort and one or more project characteristics via the use of algorithmic models. Such models assume that application size is the main contributor to effort; thus, in any algorithmic model, the central project characteristic is usually taken to be some notion of application size (e.g., the number of lines of source code, function points, number of Web pages, number of new images). The relationship between size and effort is often expressed as an equation.
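One way such an equation can be obtained is by fitting a single-predictor regression to past projects. The sketch below assumes a power-law form, effort = a * size^b, fitted by ordinary least squares on log-transformed data. This form, the function name, and the sample data are assumptions chosen for illustration, not the model used in this chapter.

```python
import math

def fit_power_model(sizes, efforts):
    """Fit effort = a * size**b by least squares on log(size) versus log(effort)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in efforts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope and intercept of the regression line in log-log space.
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    a = math.exp(mean_y - b * mean_x)
    return a, b

# Hypothetical data generated from effort = 3 * size**1.2 (size in Web pages).
sizes = [10, 20, 40, 80]
efforts = [3 * s ** 1.2 for s in sizes]
a, b = fit_power_model(sizes, efforts)
estimated_effort = a * 30 ** b  # estimate for a new 30-page application
```

An exponent b above 1 would indicate that effort grows faster than size, while b below 1 would indicate economies of scale.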