TY - JOUR
T1 - Predicting the emergence of community smells using socio-technical metrics - A machine-learning approach
AU - Palomba, Fabio
AU - Tamburri, Damian Andrew
PY - 2021
Y1 - 2021
N2 - Community smells represent sub-optimal conditions appearing within software development communities (e.g., non-communicating sub-teams, deviant contributors, etc.) that may lead to the emergence of social debt and increase the overall project’s cost. Previous work has studied these smells under different perspectives, investigating their nature, diffuseness, and impact on technical aspects of source code. Furthermore, it has been shown that some socio-technical metrics like, for instance, the well-known socio-technical congruence, can potentially be employed to foresee their appearance. Yet, there is still a lack of knowledge of the actual predictive power of such socio-technical metrics. In this paper, we aim at tackling this problem by empirically investigating (i) the potential value of socio-technical metrics as predictors of community smells and (ii) what is the performance of within- and cross-project community smell prediction models based on socio-technical metrics. To this aim, we exploit a dataset composed of 60 open-source projects and consider four community smells such as Organizational Silo, Black Cloud, Lone Wolf, and Bottleneck. The key results of our work report that a within-project solution can reach F-Measure and AUC-ROC of 77% and 78%, respectively, while cross-project models still require improvements, being however able to reach an F-Measure of 62% and overcome a random baseline. Among the metrics investigated, socio-technical congruence, communicability, and turnover-related metrics are the most powerful predictors of the emergence of community smells.
AB - Community smells represent sub-optimal conditions appearing within software development communities (e.g., non-communicating sub-teams, deviant contributors, etc.) that may lead to the emergence of social debt and increase the overall project’s cost. Previous work has studied these smells under different perspectives, investigating their nature, diffuseness, and impact on technical aspects of source code. Furthermore, it has been shown that some socio-technical metrics like, for instance, the well-known socio-technical congruence, can potentially be employed to foresee their appearance. Yet, there is still a lack of knowledge of the actual predictive power of such socio-technical metrics. In this paper, we aim at tackling this problem by empirically investigating (i) the potential value of socio-technical metrics as predictors of community smells and (ii) what is the performance of within- and cross-project community smell prediction models based on socio-technical metrics. To this aim, we exploit a dataset composed of 60 open-source projects and consider four community smells such as Organizational Silo, Black Cloud, Lone Wolf, and Bottleneck. The key results of our work report that a within-project solution can reach F-Measure and AUC-ROC of 77% and 78%, respectively, while cross-project models still require improvements, being however able to reach an F-Measure of 62% and overcome a random baseline. Among the metrics investigated, socio-technical congruence, communicability, and turnover-related metrics are the most powerful predictors of the emergence of community smells.
KW - Community smells
KW - Empirical software engineering
KW - Social debt
UR - http://www.scopus.com/inward/record.url?scp=85093986587&partnerID=8YFLogxK
U2 - 10.1016/J.JSS.2020.110847
DO - 10.1016/J.JSS.2020.110847
M3 - Article
SN - 0164-1212
VL - 171
JO - Journal of Systems and Software
JF - Journal of Systems and Software
M1 - 110847
ER -