An empirical assessment of using stereotypes to improve reading techniques in software inspections

Author(s): Miroslaw Staron, Ludwik Kuzniarz, and Christian Thurn
Venue: International Conference on Software Engineering, Proceedings of the third workshop on Software quality, SESSION: Quality tools and techniques II
Date: 2005


This paper studies the effect of using stereotypes on program comprehension coupled with
the use of various reading techniques for software inspection. The experiment was done at
the Blekinge Institute of Technology in Sweden. Eleven final-year graduate software engineering
students were chosen by the authors to participate in the study.

One week prior to the experiment, the students were given a background questionnaire to fill
out. The experiment itself took three hours. The students were given three sets of UML
diagrams to study (one set at a time) and a questionnaire to fill out on each. They were
first given the non-stereotyped class diagrams, then the non-stereotyped collaboration
diagrams, and finally the stereotyped class and collaboration diagrams together. In an effort
to minimize the learning effect, the non-stereotyped diagrams included errors while the
stereotyped ones did not.

Each student was given a specific reading technique to use when reading the diagrams, along with
instructions on how to use that technique. The three techniques were checklist-based reading
(following a checklist), perspective-based reading (reading the UML diagrams from a particular
perspective), and ad-hoc reading (the students were told what types of errors to look out for
but were left on their own as to how to find them).

The results were compiled from the questionnaires that the students filled out. On average,
the authors found a 76% increase in efficiency (I assume that they mean how quickly the
students finished) when using the stereotyped UML diagrams. However, there was only a 17%
boost in effectiveness (I assume that they mean accuracy in answering the questions) when
using the stereotyped UML diagrams.

They found that with all of the reading techniques, there was a boost in both efficiency and
effectiveness when using stereotypes. In terms of efficiency, the checklist and ad-hoc
methods saw an 89% boost while the perspective method saw only a 53% boost.
The boost in effectiveness, however, was much smaller. Checklists had the greatest boost at
36%, but the perspective method had only a 10% boost and ad-hoc a mere 3%.
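
To make the comparison easier to see, the per-technique figures quoted above can be tabulated
in a few lines of Python. This is only a sketch over the percentages reported in this summary;
the names REPORTED_BOOSTS, rank_by, and efficiency_minus_effectiveness are my own and do not
come from the paper.

```python
# Percentage boosts when using stereotyped diagrams, as quoted in this
# summary. Each value is (efficiency boost, effectiveness boost).
REPORTED_BOOSTS = {
    "checklist": (89, 36),
    "perspective": (53, 10),
    "ad-hoc": (89, 3),
}

EFFICIENCY, EFFECTIVENESS = 0, 1


def rank_by(metric_index):
    """Return technique names ordered from largest to smallest boost."""
    return sorted(
        REPORTED_BOOSTS,
        key=lambda name: REPORTED_BOOSTS[name][metric_index],
        reverse=True,
    )


def efficiency_minus_effectiveness():
    """Return, per technique, how far the efficiency boost exceeds the effectiveness boost."""
    return {
        name: efficiency - effectiveness
        for name, (efficiency, effectiveness) in REPORTED_BOOSTS.items()
    }


if __name__ == "__main__":
    print("By effectiveness:", rank_by(EFFECTIVENESS))
    for name, gap in efficiency_minus_effectiveness().items():
        print(f"{name}: efficiency boost exceeds effectiveness boost by {gap} points")
```

Laid out this way, ad-hoc reading shows the widest gap between the two measures (86 percentage
points), which is worth keeping in mind when reading the conclusions below.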

Personally, I think that they had too many independent variables in this experiment. Were
they examining the effects of stereotypes or of the various reading techniques? Perhaps there
is useful data in mixing the two in the same experiment, but the sample size is so small here
that tackling both issues at once seems foolish to me. They do discuss such validity issues
but don't seem to think that they're a big problem.
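
A quick back-of-the-envelope check shows why the sample size worries me. The summary above does
not say how the 11 students were divided among the three reading techniques, so the sketch
below simply assumes the most even split possible; group_sizes is a hypothetical helper of my
own, not something from the paper.

```python
def group_sizes(participants, groups):
    """Split participants across groups as evenly as possible.

    Assumption: an even split. The actual assignment used in the study
    is not stated in this summary.
    """
    base, extra = divmod(participants, groups)
    # The first `extra` groups each take one additional participant.
    return [base + 1 if i < extra else base for i in range(groups)]


if __name__ == "__main__":
    # 11 students, 3 reading techniques
    print(group_sizes(11, 3))  # [4, 4, 3]
```

Even in this best case, each per-technique percentage would rest on only three or four students.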

All in all, this study suggests that stereotypes are of some benefit and that a checklist-based
reading of UML diagrams is perhaps best, but I think that this experiment needs at minimum a
lot more participants before it can be considered at all significant. It would also be a lot
more convincing if they did a better job of limiting the independent variables.