

Concern localization refers to the process of locating code units that match a particular textual description. It takes as input textual documents such as bug reports and feature requests and outputs a list of candidate code units that are relevant to them.

Many information retrieval (IR) based concern localization techniques have been proposed in the literature. These techniques typically represent code units and textual descriptions as a bag of tokens at one level of abstraction, e.g., each token is a word, or each token is a topic. In this work, we propose a multi-abstraction concern localization technique named MULAB. MULAB represents a code unit and a textual description at multiple abstraction levels. The similarity of a textual description and a code unit is then computed by considering all these abstraction levels.

We combine a vector space model (VSM) and multiple topic models to compute the similarity and apply a genetic algorithm to infer semi-optimal topic model configurations.
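
As a rough illustration of scoring at multiple abstraction levels, the sketch below compares a textual description with code units at the word level (a plain term-frequency VSM) and at several topic levels, then adds the scores. It is a minimal sketch under stated assumptions, not the technique's actual implementation: the function names, the toy topic distributions, and the unweighted sum are all hypothetical, and the topic distributions are assumed to come from topic models trained beforehand.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(k, 0.0) for k, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_vector(text):
    """Word-level representation: raw term frequencies (no IDF, for brevity)."""
    return dict(Counter(text.lower().split()))


def multi_abstraction_score(query, unit, query_topics, unit_topics):
    """Add the word-level score and one topic-level score per topic model.

    query_topics and unit_topics are lists with one dict per topic model,
    each mapping topic id -> probability. The plain sum is an assumption
    made for this sketch.
    """
    score = cosine(word_vector(query), word_vector(unit))
    for q_dist, u_dist in zip(query_topics, unit_topics):
        score += cosine(q_dist, u_dist)
    return score


# Toy data (hypothetical): two code units and two pre-trained topic models.
query = "null pointer when saving file"
query_topics = [{0: 0.9, 1: 0.1}, {0: 0.2, 1: 0.8}]
units = {
    "FileSaver": ("save file write disk null check",
                  [{0: 0.8, 1: 0.2}, {0: 0.3, 1: 0.7}]),
    "Renderer": ("draw pixel screen color",
                 [{0: 0.1, 1: 0.9}, {0: 0.9, 1: 0.1}]),
}

# Rank code units by descending combined score.
ranked = sorted(
    units,
    key=lambda name: multi_abstraction_score(
        query, units[name][0], query_topics, units[name][1]),
    reverse=True,
)
```

In this toy setup, which topic models to include is fixed by hand; choosing that configuration is the kind of search problem a genetic algorithm could be applied to.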
