A Two-Stage Zone Regression Method for Global Characterization of a Project Database
J. J. Dolado (University of the Basque Country, Spain), D. Rodríguez (University of Reading, UK), J. Riquelme (University of Seville, Spain), F. Ferrer-Troyano (University of Seville, Spain) and J. J. Cuadrado (University of Alcalá de Henares, Spain)
Copyright: © 2007
One of the problems found in generic project databases, where the data are collected from different organizations, is the large disparity among their instances. In this chapter, we characterize the database by selecting both attributes and instances so that project managers can have a better global view of the data they manage. To achieve this, we first use data mining algorithms to create clusters. From each cluster, instances are then selected to obtain a final subset of the database. The result of the process is a smaller database, with fewer instances and attributes than the original, that retains its predictive capability and even allows us to produce better predictions.
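
The cluster-then-select idea can be illustrated with a minimal sketch. It is not the chapter's actual procedure: the abstract names neither the clustering algorithm nor the selection criterion, so this sketch assumes plain k-means and keeps, from each cluster, the fraction of instances closest to the centroid. The project data, the function names (`kmeans`, `select_instances`) and the `keep_fraction` parameter are hypothetical, and attribute selection is not covered.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(point, centroids[c]))

def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means (an assumed choice; the source names no algorithm)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        assignment = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    # Final assignment, consistent with the final centroids.
    assignment = [nearest(p, centroids) for p in points]
    return assignment, centroids

def select_instances(points, assignment, centroids, keep_fraction=0.5):
    """From each cluster, keep the instances closest to its centroid.

    This selection rule is an assumption made for illustration only.
    """
    selected = []
    for c, centroid in enumerate(centroids):
        members = [i for i, a in enumerate(assignment) if a == c]
        members.sort(key=lambda i: squared_distance(points[i], centroid))
        n_keep = max(1, round(len(members) * keep_fraction)) if members else 0
        selected.extend(members[:n_keep])
    return sorted(selected)

# Hypothetical projects as (size, effort) pairs; real data would normally
# be normalized first so that no single attribute dominates the distance.
projects = [(10, 100), (12, 110), (11, 95), (14, 130),
            (200, 2500), (210, 2600), (190, 2400), (220, 2900)]

assignment, centroids = kmeans(projects, k=2)
selected = select_instances(projects, assignment, centroids, keep_fraction=0.5)
reduced = [projects[i] for i in selected]
print(reduced)
```

A prediction model fitted on `reduced` could then be compared with one fitted on the full `projects` list to check whether predictive capability is preserved.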