Liu, Q. and Mintram, R., 2005. Preliminary data analysis methods in software estimation. Software Quality Journal, 13 (1), pp. 91-115.
Full text not available from this repository.
Official URL: http://www.springerlink.com/content/qt73u45626uw33...
Software is often expensive to develop and can become a major cost factor in corporate information systems budgets. Given the variability of software characteristics and the continual emergence of new technologies, the accurate prediction of software development costs is a critical problem in project management. To address this issue, a large number of software cost prediction models have been proposed. Each model succeeds to some extent, but they all encounter the same problem: the inconsistency and inadequacy of the historical data sets. Often a preliminary data analysis has not been performed, and the data may contain non-dominated or confounded variables. Moreover, some of the project attributes or their values are out of date, for example the type of computer used for project development in the COCOMO 81 (Boehm, 1981) data set. This paper proposes a framework composed of a set of clearly identified steps that should be performed before a data set is used within a cost estimation model. The framework is based closely on a paradigm proposed by Maxwell (2002). Briefly, it applies a set of statistical approaches, including correlation coefficient analysis, Analysis of Variance and the Chi-Square test, to the data set in order to remove outliers and identify dominant variables. To ground the framework in a practical context, the procedure is used to analyze the ISBSG (International Software Benchmarking Standards Group) Release 8 data set, a frequently used and accessible data collection containing information on 2,008 software projects. As a consequence of this analysis, 6 explanatory variables are extracted and evaluated.
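As a rough illustration of the three statistics the abstract names, the sketch below computes a Pearson correlation coefficient, a one-way ANOVA F statistic and a Pearson chi-square statistic in plain Python. It is not the authors' procedure: the function names, the record layout and the toy project values are all assumptions made up for this example and are not drawn from ISBSG. The sketch also stops short of outlier removal and of p-values, which would need the F and chi-square distributions.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two numeric variables."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def anova_f(groups):
    """One-way ANOVA F statistic for a numeric variable split into groups."""
    values = [v for g in groups for v in g]
    grand_mean = sum(values) / len(values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def chi_square(pairs):
    """Pearson chi-square statistic for two categorical variables,
    given as a list of (category_a, category_b) observations."""
    n = len(pairs)
    row, col, cell = defaultdict(int), defaultdict(int), defaultdict(int)
    for a, b in pairs:
        row[a] += 1
        col[b] += 1
        cell[(a, b)] += 1
    total = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            total += (cell[(a, b)] - expected) ** 2 / expected
    return total


# Hypothetical toy records: (size, effort, language, platform).
# These values are invented for illustration only.
projects = [
    (100, 900, "COBOL", "MF"),
    (250, 2100, "COBOL", "MF"),
    (80, 500, "Java", "PC"),
    (300, 1800, "Java", "PC"),
    (150, 1300, "COBOL", "PC"),
    (120, 700, "Java", "MF"),
]

sizes = [p[0] for p in projects]
efforts = [p[1] for p in projects]

# Numeric vs numeric: correlation between size and effort.
print("r(size, effort) =", pearson_r(sizes, efforts))

# Categorical vs numeric: does effort differ between languages?
by_language = defaultdict(list)
for _, effort, language, _ in projects:
    by_language[language].append(effort)
print("F(effort by language) =", anova_f(list(by_language.values())))

# Categorical vs categorical: are language and platform associated?
print("chi2(language, platform) =", chi_square([(p[2], p[3]) for p in projects]))
```

In practice a statistics library would also supply the corresponding p-values; the hand-written versions here are only meant to show which kind of variable pair each test applies to.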
|Additional Information:||Clarifies the contribution of the data to the creation of prediction techniques, with a specific focus on software effort prediction. It is axiomatic that the data determine the characteristics of the derived prediction mechanism; this research, however, clarifies the techniques that should be employed to remove spurious contributions from unwhitened data.|
|Subjects:||Generalities > Computer Science and Informatics|
|Group:||School of Design, Engineering & Computing > Software Systems Research Centre|
|Deposited On:||16 Apr 2007|
|Last Modified:||07 Mar 2013 14:36|