A Learning Algorithm for Change Impact Prediction

Author(s): Vincenzo Musco, Antonin Carette, Martin Monperrus, and Philippe Preux
Venue: International Conference on Software Engineering
Date: May 2016

Sample Size: 6000


This paper investigates the accuracy of change impact analysis, the task of predicting which parts of a software application are affected by a code change. The experiment evaluates a specific learning algorithm, LCIP, on Java programs. LCIP builds a weighted call graph from the given training data, with a node for each method of the program and an edge to each method it influences. The algorithm weights each edge by the probability of influence along that edge, which depends on the branches and loops in the method's code.
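The weighted call graph idea can be illustrated with a minimal sketch. This is not the LCIP algorithm itself: the method names, the edge weights, the threshold parameter, and the rule of following only edges at or above the threshold are all assumptions made for illustration.

```python
from collections import deque

# Hypothetical weighted call graph: graph[m] maps each node influenced by m
# to an assumed probability that a change in m propagates along that edge.
graph = {
    "Parser.parse": {"Lexer.next": 0.9, "Ast.build": 0.4},
    "Lexer.next": {"LexerTest.testNext": 0.8},
    "Ast.build": {"AstTest.testBuild": 0.1},
}


def predict_impact(graph, changed, threshold=0.5):
    """Return the nodes reachable from `changed` through edges whose
    weight is at least `threshold` (an assumed propagation rule)."""
    impacted = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target, weight in graph.get(node, {}).items():
            if weight >= threshold and target not in impacted:
                impacted.add(target)
                queue.append(target)
    return impacted


# Only high-probability edges are followed.
print(predict_impact(graph, "Parser.parse"))
# A threshold of 0.0 ignores the weights and behaves like a plain
# transitive closure over an unweighted call graph.
print(predict_impact(graph, "Parser.parse", threshold=0.0))
```

With the default threshold the sketch predicts only the nodes reached through the 0.9 and 0.8 edges, while a threshold of 0.0 returns every reachable node.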

This study tests LCIP on two well-tested Java programs that together total more than 120,000 lines of code. 6,000 different changes are applied to the code to test impact prediction. LCIP's results are compared with two control algorithms: a standard closure algorithm that uses an unweighted call graph, and a basic learning strategy. Accuracy is evaluated by precision (the share of predicted tests that were really influenced), recall (the share of impacted tests that are retrieved), and the F-score, which combines the other two values.
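The three metrics can be sketched as follows. The test names are hypothetical, and the F-score is assumed here to be the usual harmonic mean of precision and recall (F1); the summary above only says that it combines the two.

```python
def evaluate(predicted, actual):
    """Compute precision, recall, and F1 for a set of predicted
    impacted tests against the tests that were actually impacted."""
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: four tests predicted, three actually impacted,
# two in common.
precision, recall, f_score = evaluate(
    predicted={"t1", "t2", "t3", "t4"},
    actual={"t1", "t2", "t5"},
)
print(f"precision={precision:.2f} recall={recall:.2f} f={f_score:.2f}")
```

In this example two of the four predicted tests are correct (precision 0.50) and two of the three impacted tests are found (recall about 0.67), giving an F-score of about 0.57.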

In conclusion, the LCIP algorithm had the highest precision across all 6,000 mutations run, with an average increase of 22% over the standard closure algorithm and 3% over the basic learning strategy. The recall values were all very similar, with a 1% average improvement over the basic strategy. This led to a 2% average F-score improvement for the LCIP algorithm. The study demonstrates that applying learning algorithms substantially increases the accuracy of change impact analysis, and that weighted call graphs improve that accuracy slightly further.