Comparing Static Bug Finders and Statistical Prediction

Author(s): Foyzur Rahman, Sameer Khatri, Earl T. Barr, Premkumar Devanbu
Venue: International Conference on Software Engineering
Date: May 2014

Type of Experiment: Case Study
Sample Size: 5
Class/Experience Level: Professional
Participant Selection: Five Apache projects. The projects range in size but are all Java projects.
Data Collection Method: Observation, Code Metric, Project Artifact(s)


This study compares the effectiveness of static bug finders with that of statistical defect predictors. Software engineers need efficient means of finding bugs. Static bug finders range from simple code pattern-matching techniques to carefully designed semantic abstractions of the code. They are usually slower, but they are reliable. Statistical prediction instead relies on the history of past human and technological errors: machine learning models trained on that history predict where bugs are likely to occur.

The two approaches were evaluated on five different Apache projects. The authors tracked the bugs found with FindBugs, PMD, and JLINT. At the end, they compared how many bugs each approach found and how many of those bugs overlapped. The results showed that defect prediction fared much better than PMD. PMD is a widely used tool, so this finding is relatively significant. However, defect prediction did much worse when compared with FindBugs.
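
As a rough illustration of the count-and-overlap comparison described above, the following Python sketch assumes that each approach's findings can be reduced to a set of bug identifiers. The function name compare_findings and the identifiers are hypothetical and are not taken from the study.

```python
def compare_findings(tool_a: set[str], tool_b: set[str]) -> dict[str, int]:
    """Count bugs reported only by tool A, only by tool B, and by both."""
    return {
        "only_a": len(tool_a - tool_b),  # found by A but missed by B
        "only_b": len(tool_b - tool_a),  # found by B but missed by A
        "both": len(tool_a & tool_b),    # overlap between the two
    }


# Hypothetical bug identifiers for illustration; not data from the study.
static_tool = {"BUG-1", "BUG-2", "BUG-3"}
predictor = {"BUG-2", "BUG-3", "BUG-4", "BUG-5"}

summary = compare_findings(static_tool, predictor)
print(summary)
```

A large "both" count relative to the other two would suggest the approaches flag similar code, while large "only" counts would suggest they complement each other.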