Journal of Fundamentals of Renewable Energy and Applications
Open Access

ISSN: 2165-7866

Abstract

Use of the Multiple Imputation Strategy to Deal with Missing Data in the ISBSG Repository

Abdalla Bala and Alain Abran

Multi-organizational repositories, in particular those based on voluntary data contributions such as the repository of the International Software Benchmarking Standards Group (ISBSG), may be missing a large number of values in many of their data fields and may also contain outliers. This paper identifies a number of data quality issues in the ISBSG repository that can compromise results for users who rely on it for benchmarking or for building estimation models. We propose a number of criteria and techniques for preprocessing the data in order to improve the quality of the samples selected for detailed statistical analysis, and we present a multiple imputation (MI) strategy for dealing with datasets that have missing values.
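
To make the general idea of multiple imputation concrete, the following is a minimal sketch and not the authors' procedure. It assumes a single numeric field with missing entries, fills each gap by drawing from a normal distribution fitted to the observed values (a deliberate simplification; a full MI procedure would also account for uncertainty in the fitted parameters and would typically condition on other fields), and pools the per-imputation estimates of the mean with Rubin's rules. The function names `pool_rubin` and `multiply_impute_mean` and the sample values are hypothetical and are not taken from the ISBSG repository.

```python
import random
import statistics


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    q_bar = statistics.fmean(estimates)        # pooled point estimate
    within = statistics.fmean(variances)       # mean within-imputation variance
    between = statistics.variance(estimates)   # between-imputation variance
    total = within + (1 + 1 / m) * between     # total variance of the pooled estimate
    return q_bar, total


def multiply_impute_mean(values, m=5, seed=0):
    """Estimate the mean of `values` (None marks a missing entry) from m imputed datasets.

    Simplifying assumption: missing entries are drawn from a normal
    distribution fitted to the observed entries.
    """
    rng = random.Random(seed)
    observed = [v for v in values if v is not None]
    mu = statistics.fmean(observed)
    sigma = statistics.stdev(observed)

    estimates, variances = [], []
    for _ in range(m):
        # Build one completed dataset by filling every gap with a random draw.
        completed = [v if v is not None else rng.gauss(mu, sigma) for v in values]
        estimates.append(statistics.fmean(completed))
        # Variance of the sample mean for this completed dataset.
        variances.append(statistics.variance(completed) / len(completed))
    return pool_rubin(estimates, variances)


# Made-up illustrative values (not ISBSG data); None marks a missing value.
effort_hours = [1200.0, None, 950.0, 3100.0, None, 1800.0, 2200.0, None]
pooled_mean, pooled_var = multiply_impute_mean(effort_hours, m=5, seed=42)
print(f"pooled mean = {pooled_mean:.1f}, pooled variance = {pooled_var:.1f}")
```

The between-imputation term is what distinguishes this from single imputation: it widens the pooled variance to reflect the uncertainty introduced by the missing values themselves.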
