What software repositories should be mined for defect predictors?

Authors Rudolf Ramler
Stefan Larndorfer
Thomas Natschläger
Title What software repositories should be mined for defect predictors?
Booktitle Proceedings of the 35th Euromicro Conference on Software Engineering and Advanced Applications (SEAA 2009)
Type in proceedings
Publisher IEEE Computer Society
Department PQE
ISBN 978-0-7695-3784-9
Month August
Year 2009
Pages 181-187
SCCH ID# 917
Abstract

Information about which modules in a future version of a software system are potentially defective is a valuable aid for quality managers and testers. Defect prediction promises to indicate these defect-prone modules. Constructing effective defect prediction models in an industrial setting involves deciding from which data source the defect predictors should be derived. In this paper, we compare defect prediction results based on three different data sources of a large industrial software system to answer the question of which repositories to mine. In addition, we investigate whether a combination of different data sources improves the prediction results. The findings indicate that predictors derived from static code and design analysis provide slightly, yet still significantly, better results than predictors derived from version control, while a combination of all data sources showed no further improvement.