TY - JOUR
T1 - GAP: forecasting commit activity in git projects
AU - Decan, Alexandre
AU - Constantinou, Eleni
AU - Mens, Tom
AU - Rocha, Henrique
PY - 2020/7/1
Y1 - 2020/7/1
AB - Abandonment of active developers poses a significant risk for many open source software projects. This risk can be reduced by forecasting the future activity of contributors involved in such projects. Focusing on the commit activity of individuals involved in git repositories, this paper proposes a practicable probabilistic forecasting model based on the statistical technique of survival analysis. The model is empirically validated on a wide variety of projects accounting for 7,528 git repositories and 5,947 active contributors. We found that a model based on the last 20 observed days of commit activity per contributor provides the best concordance. We also found that the predictions provided by the model are generally close to actual observations, with slight underestimations for low probability predictions and slight overestimations for higher probability predictions. This model is implemented as part of an open source tool, called GAP, that predicts future commit activity.
KW - Git
KW - Commit activity
KW - Developer abandonment
KW - Distributed software development
KW - Prediction model
UR - https://github.com/alexandredecan/gap
UR - http://dx.doi.org/10.5281/zenodo.3666048
U2 - 10.1016/j.jss.2020.110573
DO - 10.1016/j.jss.2020.110573
M3 - Article
SN - 0164-1212
VL - 165
JO - Journal of Systems and Software
JF - Journal of Systems and Software
M1 - 110573
ER -