Release Planning of Mobile Apps Based on User Reviews

Author(s): Lorenzo Villarroel, Gabriele Bavota, Barbara Russo, Rocco Oliveto, Massimiliano Di Penta
Venue: Software Engineering (ICSE), 2016 IEEE/ACM 38th International Conference on
Date: 2016

Type of Experiment: Survey/Multi-Case Study
Sample Size: 3
Class/Experience Level: Graduate Student
Data Collection Method: Observation, Survey

Quality
4

Mobile apps compete in a highly competitive marketplace that includes a platform for user reviews. The authors of this study wanted to develop a method to organize and prioritize the user reviews of an app to make planning a sprint easier. To do this, they first preprocess the reviews by removing negated text ("good. no crashes" becomes "good."), removing stop-words (like "the"), and unifying synonyms. They analyzed this data to determine which reviews reported bugs or requested features. They also built an n-gram model from the reviews without the preprocessing to help with the clustering. Then they ran a clustering algorithm to group similar reviews into clusters, reducing the number of reviews that a programmer would need to read. This did fairly well, with most of the errors being false negatives, where a review was relevant but not marked as such.
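The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the word lists and the `preprocess` function are hypothetical, and dropping everything after a negation word within a sentence is only one simple way to read the "good. no crashes" example.

```python
import re

# Hypothetical word lists, chosen only for illustration.
STOP_WORDS = {"the", "a", "an", "is", "it", "and", "to"}
SYNONYMS = {"freezes": "crash", "crashes": "crash", "crashing": "crash"}
NEGATIONS = {"no", "not", "never"}


def preprocess(review: str) -> list[str]:
    """Return the tokens of a review after the three cleaning steps."""
    tokens = []
    # Handle each sentence separately so a negation only affects its own sentence.
    for sentence in re.split(r"[.!?]+", review.lower()):
        kept = []
        for word in re.findall(r"[a-z']+", sentence):
            if word in NEGATIONS:
                break  # assumption: drop the negation and the text it negates
            kept.append(word)
        # Remove stop-words, then map synonyms onto one shared term.
        tokens.extend(SYNONYMS.get(w, w) for w in kept if w not in STOP_WORDS)
    return tokens


print(preprocess("Good. No crashes"))
print(preprocess("The app freezes a lot"))
```

The resulting token lists are what a classifier or clustering step would then consume; the actual term lists and negation handling in the paper may differ.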
They also attempted to rate the reviews in order to prioritize them, looking at how large the clusters were and how many devices they affected, and giving higher priority to lower-rated reviews. However, the prioritization seemed less effective.
They found that this technique separated bug and feature reviews from uninformative ones better than the "state-of-the-art" technique AR-MINER. They used Area Under the Curve (AUC) to compare the effectiveness of the two techniques, getting 0.86 for bug and 0.81 for feature with their technique, which was better than AR-MINER's 0.51 for informative (bug or feature) on the same set. Lastly, they surveyed 3 Italian mobile project managers: 2 found it highly useful, and 1 said they listen to the customer, not the users. So it seems like a fairly useful tool for choosing which reviews to actually read, but it should not be relied on completely.
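For readers unfamiliar with the metric, Area Under the Curve can be read as the probability that a classifier scores a randomly chosen relevant review above a randomly chosen irrelevant one, so 0.5 is roughly chance and 1.0 is a perfect ranking. The sketch below computes it by direct pairwise comparison; the `auc` function and the labels and scores are made up for illustration and are not data from the paper.

```python
def auc(labels: list[int], scores: list[float]) -> float:
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Made-up example: 1 = bug report, 0 = not a bug report.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.3, 0.1]
print(auc(labels, scores))
```

Read this way, a value near 0.5, such as the one reported for AR-MINER, means the ranking is little better than random, while values above 0.8 indicate that relevant reviews are usually ranked ahead of irrelevant ones.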

0