Development of Auxiliary Functions: Should You Be Agile? An Empirical Assessment of Pair Programming and Test-First Programming

Author(s): Otávio Augusto Lazzarini Lemos, Fabiano Cutigi Ferrari, Fábio Fagundes Silveira, and Alessandro Garcia
Venue: 34th International Conference on Software Engineering
Date: 2012

Type of Experiment: Controlled Experiment
Sample Size: 92
Class/Experience Level: Undergraduate Student, Professional
Participant Selection: responded to survey
Data Collection Method: Code Metric


Auxiliary functions, such as Apple's leap year detection, make up a small fraction of a code base. However, problems in these functions may lead to larger issues in the system as a whole. Lemos and colleagues investigated whether agile practices, such as pair programming and test-first programming, produce more correct implementations than their non-agile counterparts.

Lemos and colleagues conducted a set of experiments with Computer Science undergraduate students and professionals. Students and professionals were invited to take part in an experiment that compared test-first versus test-last programming. The experiment was set up so that each subject used both methods, one for one problem and the other for the other problem. To ensure that the methodologies were tested independently, the problems were taken from different subject domains (Strings, Arrays, Integers). Correctness was measured by a set of predefined unit tests. The students also took part in a similar experiment that compared solo versus pair programming.
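As a rough illustration of what "correct according to predefined unit tests" can mean for an auxiliary function, here is a minimal Python sketch built on the leap year example mentioned earlier. The function names and the test cases are hypothetical and are not taken from the study's materials. An implementation counts as correct only if it passes every predefined case.

```python
def is_leap_year(year: int) -> bool:
    """Gregorian rule: divisible by 4, except century years not divisible by 400."""
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)


def naive_leap_year(year: int) -> bool:
    """A plausible but buggy submission: it forgets the century rule."""
    return year % 4 == 0


# Hypothetical predefined cases (assumed for illustration, not the study's suite).
PREDEFINED_CASES = [(2000, True), (1900, False), (2012, True), (2011, False)]


def passes_all(impl) -> bool:
    """Return True only if the implementation passes every predefined case."""
    return all(impl(year) == expected for year, expected in PREDEFINED_CASES)
```

Under this all-or-nothing scoring, `naive_leap_year` is counted as incorrect even though it fails only the 1900 case. That is the kind of small defect in an auxiliary function that the summary describes as able to cause larger problems.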

For the pair programming experiment, 6.5% of the individuals passed all of the unit tests, compared to 18% of the pairs. This difference was found to be statistically significant. There was no significant difference between the two methods in the test-first experiment. The number of professionals was too small for statistical analysis; however, their results appeared similar to the students'.

Pair programming does seem to improve quality, at the cost of more time. However, there doesn't seem to be a large difference between test-last and test-first, even though writing tests first is the big draw of TDD. Since professionals also took part in the test-first experiment, the results seem to hold regardless of experience.