Extending static analysis by mining project-specific rules

Author(s): Boya Sun, Gang Shu, Andy Podgurski, Brian Robinson
Venue: International Conference on Software Engineering
Date: 2012

Type of Experiment: Other
Sample Size: 2
Class/Experience Level: Professional
Participant Selection: Two industrial software systems developed by product groups at ABB
Data Collection Method: Survey, Project Artifact(s)


The paper addresses the problem that commercial static program analysis tools miss many defects because they are too general for any specific project. Building an analysis tool for a single project, while effective, is time-consuming and of little use on other projects. The proposed solution is a framework that mines project-specific rules and generates custom checkers from them to detect project-specific defects.

The framework relies on "checkers" for the kinds of defects it expects. There are two types of checkers: general checkers, which can be used across multiple programming domains, and generated checkers, which are created from mined project-specific rules. The framework starts by mining patterns from the system under analysis and creating checkers from them. Once created, the checkers are fed into an existing commercial static program analysis tool called Klocwork, which runs them and produces warnings as the final output.
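The mine-then-check flow can be illustrated with a toy sketch. This is not the authors' mining algorithm and does not use any Klocwork API. It assumes, purely for illustration, that a "rule" is an ordered pair of calls ("a call to X is later followed by a call to Y") and that each function is reduced to a list of call names. All function names, thresholds, and sample data below are hypothetical.

```python
from collections import Counter


def mine_pair_rules(functions, min_support=3, min_confidence=0.8):
    """Mine toy rules of the form 'a call to X is later followed by a call to Y'.

    functions maps a function name to its ordered list of call names.
    This is a simplified stand-in for rule mining, not the paper's technique.
    """
    calls_x = Counter()     # number of functions that call X at all
    pair_count = Counter()  # number of functions where X is later followed by Y
    for calls in functions.values():
        pairs = set()
        for i, x in enumerate(calls):
            for y in calls[i + 1:]:
                if x != y:
                    pairs.add((x, y))
        calls_x.update(set(calls))
        pair_count.update(pairs)

    rules = []
    for (x, y), n in pair_count.items():
        if n >= min_support and n / calls_x[x] >= min_confidence:
            rules.append((x, y))
    return sorted(rules)


def make_checker(rule):
    """Turn one mined rule into a checker that returns a warning or None."""
    x, y = rule

    def check(name, calls):
        # Only the first occurrence of x is examined in this sketch.
        if x in calls and y not in calls[calls.index(x) + 1:]:
            return f"{name}: '{x}' is not followed by '{y}'"
        return None

    return check


def run_checkers(functions, checkers):
    """Apply every checker to every function and collect the warnings."""
    found = []
    for name, calls in functions.items():
        for check in checkers:
            warning = check(name, calls)
            if warning is not None:
                found.append(warning)
    return found


# Hypothetical call sequences, invented for this example.
example_functions = {
    "read_a": ["lock", "read", "unlock"],
    "read_b": ["lock", "read", "unlock"],
    "write_a": ["lock", "write", "unlock"],
    "write_b": ["lock", "write", "unlock"],
    "buggy": ["lock", "write"],
}

rules = mine_pair_rules(example_functions)
found = run_checkers(example_functions, [make_checker(r) for r in rules])
print(rules)
print(found)
```

In this sketch the pair ("lock", "unlock") is the only one that meets both thresholds, so it becomes the only generated checker, and the function that violates it is the only one flagged. In the framework described above, the generated checkers are run by Klocwork rather than by a hand-written loop.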

An experiment was conducted on two large software projects (over 2 million SLOC each) to test the accuracy and usefulness of the framework. The framework was applied to both projects, and the authors then manually assessed the usefulness of the generated checkers and analyzed the warnings produced by Klocwork. Running Klocwork over the entire projects was infeasible, so they used random subsets of each project.

The experiment was a success overall: on both projects, over 60% of the generated checkers captured useful patterns, and the warnings produced by Klocwork had a precision of over 80%. The authors believe their approach is practical and generalizable, and they hope to extend it to address more types of programming rules in the future.
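For reference, the precision of a set of warnings is the fraction of reviewed warnings that turn out to be real problems. A minimal sketch of that calculation follows; the counts are made up for illustration and are not figures from the paper.

```python
def precision(true_positives, false_positives):
    """Fraction of reviewed warnings that were confirmed as real issues."""
    total = true_positives + false_positives
    if total == 0:
        raise ValueError("no warnings were reviewed")
    return true_positives / total


# Hypothetical review outcome, NOT data from the paper.
reviewed = {"confirmed": 41, "false_alarm": 9}
p = precision(reviewed["confirmed"], reviewed["false_alarm"])
print(f"precision = {p:.2f}")
```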