Supporting Swift Reaction: Automatically Uncovering Performance Problems by Systematic Experiments

Author(s): Alexander Wert, Jens Happe, Lucia Happe
Venue: Proceedings of the 2013 International Conference on Software Engineering
Date: 2013

Type of Experiment: Other
Sample Size: 2
Data Collection Method: Observation, Code Metric, Project Artifact(s)


This paper proposes a fast, automated way of diagnosing performance problems in real-world applications. Currently, their technology is designed to detect performance problems only in Java-based three-tier enterprise applications. They argue that existing methods of identifying problems by analyzing software architecture, load tests, and/or runtime data only solve part of the problem, and often only after it is too late. Their technology can be run on a project in its current state, identifying performance problems and their root causes on the spot.

Their technology works by first analyzing the application and comparing its behavior to common performance issues, known as performance anti-patterns. Since performance issues share common symptoms, they use a hierarchy of performance problems, their symptoms, and root causes to simplify the diagnostic process. The hierarchy starts from very general problems (or symptoms) and, at each further level, refines the problems down to root causes. For each of these performance problems they have defined a set of detection strategies used to heuristically decide whether a problem is indeed present before refining the search. The detection strategies describe how the system is tested and consist of three elements: workload variation, gathering metrics, and analysis of measurement data. The results of each detection strategy are automatically analyzed to determine whether a performance problem exists.
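The hierarchical refinement described above can be sketched as a simple tree walk: run a node's detection strategy, and only descend into more specific problems if the general symptom is present. This is an illustrative sketch, not the authors' implementation; the names (`ProblemNode`, `diagnose`) and the "One Lane Bridge" example anti-pattern are assumptions chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class ProblemNode:
    """One node in the performance-problem hierarchy (hypothetical model)."""
    name: str
    # Detection strategy: vary workload, gather metrics, analyze -> present?
    strategy: Callable[[], bool]
    children: List["ProblemNode"] = field(default_factory=list)

def diagnose(node: ProblemNode, found: List[str]) -> None:
    """Run the node's detection strategy; refine into more specific
    child problems only when the general symptom is detected."""
    if node.strategy():
        if node.children:
            for child in node.children:
                diagnose(child, found)
        else:
            found.append(node.name)  # leaf node = candidate root cause

# Toy hierarchy: a general symptom refined toward one root cause.
tree = ProblemNode(
    "High response times under load",
    strategy=lambda: True,
    children=[
        ProblemNode("Database bottleneck", strategy=lambda: True,
                    children=[ProblemNode("One Lane Bridge",
                                          strategy=lambda: True)]),
        ProblemNode("CPU-bound computation", strategy=lambda: False),
    ],
)

root_causes: List[str] = []
diagnose(tree, root_causes)
print(root_causes)  # ['One Lane Bridge']
```

In the real system each `strategy` would be a systematic experiment (a workload variation plus measurement and analysis) rather than a stub; the stubs here only show how detection results steer the search down the hierarchy.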

They tested their technology against the TPC-W benchmark, an official benchmark that measures the performance of web servers and databases by emulating an online bookstore. Their technology identified four points where performance suffered, one of which was in the benchmark application itself. They were then able to fix the problems and increase performance from 1800 requests/second to 3500 requests/second, nearly doubling the throughput. Despite these successful runs, their technology is still immature and suffers from a few limitations: the system under test must already possess a usage profile describing how users interact with it; a performance problem may be detected without its root cause being identified; and the technology is designed specifically for three-tier enterprise applications and thus cannot be directly generalized.