TY - JOUR
T1 - Improving custom-tailored variability mining using outlier and cluster detection
AU - Wille, David
AU - Babur, Önder
AU - Cleophas, Loek
AU - Seidl, Christoph
AU - van den Brand, Mark
AU - Schaefer, Ina
PY - 2018/10/1
Y1 - 2018/10/1
N2 - To satisfy demand for customized software solutions, companies commonly use so-called clone-and-own approaches to reuse functionality by copying existing realization artifacts and modifying them to create new product variants. Lacking clear documentation about the variability relations (i.e., the common and varying parts), the resulting variants have to be developed, maintained and evolved in isolation. In previous work, we introduced a semi-automatic mining algorithm allowing custom-tailored identification of distinct variability relations for block-based model variants (e.g., MATLAB/Simulink models or statecharts) using user-adjustable metrics. However, variants completely unrelated with other variants (i.e., outliers) can negatively influence the usefulness of the generated variability relations for developers maintaining the variants (e.g., erroneous relations might be identified). In addition, splitting the compared models into smaller sets (i.e., clusters) can be sensible to provide developers separate view points on different variable system features. In further previous work, we proposed statistical clustering capable of identifying such outliers and clusters. The contribution of this paper is twofold. First, we present guidelines and a generic implementation that both ease adaptation of our variability mining algorithm for new languages. Second, we integrate our clustering approach as a preprocessing step to the mining. This allows users to remove outliers prior to executing variability mining on suggested clusters. Using models from two industrial case studies, we show feasibility of the approach and discuss how our clustering can support our variability mining in identifying sensible variability information.
AB - To satisfy demand for customized software solutions, companies commonly use so-called clone-and-own approaches to reuse functionality by copying existing realization artifacts and modifying them to create new product variants. Lacking clear documentation about the variability relations (i.e., the common and varying parts), the resulting variants have to be developed, maintained and evolved in isolation. In previous work, we introduced a semi-automatic mining algorithm allowing custom-tailored identification of distinct variability relations for block-based model variants (e.g., MATLAB/Simulink models or statecharts) using user-adjustable metrics. However, variants completely unrelated with other variants (i.e., outliers) can negatively influence the usefulness of the generated variability relations for developers maintaining the variants (e.g., erroneous relations might be identified). In addition, splitting the compared models into smaller sets (i.e., clusters) can be sensible to provide developers separate view points on different variable system features. In further previous work, we proposed statistical clustering capable of identifying such outliers and clusters. The contribution of this paper is twofold. First, we present guidelines and a generic implementation that both ease adaptation of our variability mining algorithm for new languages. Second, we integrate our clustering approach as a preprocessing step to the mining. This allows users to remove outliers prior to executing variability mining on suggested clusters. Using models from two industrial case studies, we show feasibility of the approach and discuss how our clustering can support our variability mining in identifying sensible variability information.
KW - Block-based language
KW - Clone-and-own
KW - Conceptual framework
KW - Outlier and cluster detection
KW - Variability mining
UR - http://www.scopus.com/inward/record.url?scp=85046631676&partnerID=8YFLogxK
U2 - 10.1016/j.scico.2018.04.002
DO - 10.1016/j.scico.2018.04.002
M3 - Article
AN - SCOPUS:85046631676
SN - 0167-6423
VL - 163
SP - 62
EP - 84
JO - Science of Computer Programming
JF - Science of Computer Programming
ER -