An empirical investigation into the nature of test smells

Author(s): Michele Tufano, Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrea De Lucia, Denys Poshyvanyk
Venue: Automated Software Engineering (ASE)
Date: 06 October 2016

Type of Experiment: Case Study
Class/Experience Level: Graduate Student
Participant Selection: Invited from Apache and Eclipse ecosystems
Data Collection Method: Survey


Nineteen developers completed a survey designed to determine whether developers are aware of test smells, whether they can identify them, how severe they consider each smell, whether they would fix it, and whether a test smell indicates a code smell in the production code under test.

The study covered only five test smells: assertions with no failure explanation, a test that exercises more than one method, a setUp method that is too generic for the test case, a test that depends on a mystery file as an external resource instead of being self-contained, and a test that uses the toString method to check equality. The survey was sent to 298 developers, of whom only 19 (about 6%) responded. It was designed to be completed in under 30 minutes and was structured as follows: the developer was first shown a test file and asked whether they could see any problems with the code. If they answered yes, they were asked what the problems were and why they had been introduced. They were then asked whether they believed the test should be refactored and, if they answered "yes", whether they would do the refactoring themselves.
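
To make the five smells concrete, the sketch below shows what each might look like in a small test class. It is an illustration only and is not taken from the study: the survey material was presumably Java (the summary mentions setUp and toString), while this analogue uses Python's unittest, with str() standing in for toString. The Account class, the FIXTURE_PATH file, and the run_smelly_suite helper are hypothetical names invented for the example. In the test-smell literature these smells are commonly called Assertion Roulette, Eager Test, General Fixture, Mystery Guest, and Sensitive Equality.

```python
import os
import tempfile
import unittest


class Account:
    """Hypothetical production class, used only for illustration."""

    def __init__(self, owner, balance=0):
        self.owner = owner
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount

    def __str__(self):
        return f"Account(owner={self.owner}, balance={self.balance})"


# Hypothetical file that is assumed to "just exist" on some machine.
FIXTURE_PATH = os.path.join(tempfile.gettempdir(), "mystery_guest_balance.txt")


class SmellyAccountTest(unittest.TestCase):
    def setUp(self):
        # Too-generic setUp: every test receives all three objects,
        # although each test below uses only one of them.
        self.alice = Account("alice", 100)
        self.bob = Account("bob", 50)
        self.empty = Account("nobody")

    def test_deposit_and_withdraw(self):
        # Tests more than one method: deposit and withdraw in one test.
        # No failure explanation: neither assertion carries a message,
        # so a failure does not say which expectation was violated.
        self.alice.deposit(20)
        self.assertEqual(self.alice.balance, 120)
        self.alice.withdraw(70)
        self.assertEqual(self.alice.balance, 50)

    def test_string_form(self):
        # Equality checked through the string rendering: this breaks as
        # soon as the formatting in __str__ changes.
        self.assertEqual(str(self.bob), "Account(owner=bob, balance=50)")

    def test_balance_from_file(self):
        # Mystery file: the test depends on an external resource that it
        # does not create, so it is not self-contained.
        if not os.path.exists(FIXTURE_PATH):
            self.skipTest("external fixture file is missing")
        with open(FIXTURE_PATH) as handle:
            self.assertEqual(int(handle.read().strip()), self.alice.balance)


def run_smelly_suite():
    """Run the smelly tests and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SmellyAccountTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Note that all of these tests can pass (or be skipped) while still exhibiting the smells, which may be one reason they are hard to spot: the problems concern maintainability and diagnosability rather than correctness.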

The results showed that developers were not very good at identifying test smells: only 17 of the 95 test smells (about 18%) were identified, and only 5 of the 19 participants were able to identify any. In the answers that correctly identified a test smell, 91% of the developers believed that the smell was not a significant issue and did not need refactoring, as refactoring would not be beneficial. Developers also believed that they could not refactor the test themselves because they did not have a solution to the problem. The most important result of the study was that developers had a hard time answering the question: "how was this code smell introduced?". Overall, the results indicate that developers are generally unable to correctly identify a test smell and, when they can, they cannot identify how it was introduced. Because they cannot identify the root cause of the test smell, they are unable to fix it.