Link analysis algorithms for static concept location: an empirical assessment

Scanniello, Giuseppe; Marcus, A.; Pascale, D.

doi:10.1007/s10664-014-9327-7

During software evolution, one of the most important comprehension activities is concept location in source code, as it identifies the places in the code where changes are to be made in response to a modification request. Change requests (such as, bug fixing or new feature requests) are usually formulated in natural language, while the source code also includes large amounts of text. In consequence, many of the existing concept location techniques are based on text search or text retrieval. Such approaches reformulate concept location as a document retrieval problem. We refine and improve such solutions by leveraging dependencies between source code elements. Dependency information is used by a link analysis algorithm to rank the document space and to improve concept location based on text retrieval. We implemented our solution to concept location using the PageRank algorithm, used in web document retrieval applications. The results of an empirical evaluation indicate that the new approach leads to better retrieval performance than baseline approaches that use text retrieval and clustering. In addition, we present the results of a controlled experiment and of a differentiated replication to assess whether the new technique supports users in identifying the places in the code where changes are to be made. The results of these experiments revealed that the users exploiting our technique were significantly better supported in the identification of the code to be changed in response to a bug fixing request compared to the users who did not use this technique.