Preliminary Analysis of the Effects of Pair Programming and Test-Driven Development on the External Code Quality

Author(s): L. Madeyski
Venue: Software Engineering: Evolution and Emerging Technologies
Date: 2005


Test-Driven Development (TDD) and Pair Programming (PP) have attracted attention across the software engineering community as practices of eXtreme Programming (XP). Most existing studies are anecdotal and favorable toward XP practices, and few have addressed external code quality. Madeyski set out to investigate how TDD and classical test-last development, each combined with either pair programming or solo programming, affect external code quality.

The study took place in 2004 at Wroclaw University of Technology in Poland. All subjects were graduate-level students: 188 took part, mostly in the second or third year of their master's program in Computer Science, with a few in their fourth or fifth year. They were divided into four groups: pairs using the classical approach (CP), solo programmers using the classical approach (CS), pairs using TDD (TP), and solo programmers using TDD (TS). The breakdown was 28 CS, 28 TS, 31 CP, and 35 TP; the total of 188 is consistent with the CP and TP figures counting pairs rather than individuals (28 + 28 + 2 × 31 + 2 × 35 = 188). Java was the programming language, and most students had a background in C/C++ programming and object-oriented design. Eclipse was the primary IDE. The students built a finance-accounting system over eight laboratory sessions of 90 minutes each. External code quality was measured by the number of black-box test cases passed, and reliability as the fraction of black-box test cases passed out of the total number of test cases. Data were collected automatically by built-in infrastructure and through pre-test and post-test questionnaires that evaluated students' experience and preferences.
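The reliability measure described above is a simple ratio of passed to total black-box test cases. The following is a minimal sketch in Python; the function name `reliability` and the sample counts are hypothetical illustrations, not taken from the study or its tooling.

```python
def reliability(passed: int, total: int) -> float:
    """Return the fraction of black-box test cases that passed.

    Hypothetical helper: the name and signature are illustrative only.
    """
    if total <= 0:
        raise ValueError("total must be a positive number of test cases")
    if not 0 <= passed <= total:
        raise ValueError("passed must be between 0 and total")
    return passed / total


# Made-up counts for illustration: 30 of 40 black-box tests pass.
example = reliability(30, 40)
```

Under this definition, the raw number of passed tests is the external code quality measure, and dividing it by the total normalizes it to a value between 0 and 1.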

External code quality was lower when TDD was used instead of classical test-last development, both for solo programmers (p = 0.028) and for pairs (p = 0.013); both differences are statistically significant. There was no significant difference between solo and pair programming under either the classical test-last approach (p = 0.538) or TDD (p = 0.945).
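The reading of these p-values can be sketched as a comparison against a significance level. The threshold of 0.05 used here is an assumption (the conventional level; this summary does not state which level the study used), and the dictionary keys are illustrative labels.

```python
# Assumption: conventional significance level, not stated in this summary.
ALPHA = 0.05

# p-values as reported above; the keys are illustrative labels.
p_values = {
    "TDD vs test-last (solo)": 0.028,
    "TDD vs test-last (pairs)": 0.013,
    "solo vs pairs (test-last)": 0.538,
    "solo vs pairs (TDD)": 0.945,
}

# A comparison counts as significant when its p-value falls below ALPHA.
significant = {name: p < ALPHA for name, p in p_values.items()}
```

With this threshold, the two TDD versus test-last comparisons are flagged as significant and the two solo versus pair comparisons are not, matching the reported findings.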

According to Madeyski’s experiment, pair programming and solo programming do not differ significantly in external code quality. For both, however, TDD produced lower external code quality than the classical test-last approach.