How bugs are born: a model to identify how bugs are introduced in software components

Gema Rodríguez-Pérez, Gregorio Robles (Corresponding author), Alexander Serebrenik, Andy Zaidman, Daniel M. Germán, Jesus M. Gonzalez-Barahona

Research output: Contribution to journal, Article, Academic, peer-review

50 Citations (Scopus)
297 Downloads (Pure)

Abstract

When identifying the origin of software bugs, many studies assume that "a bug was introduced by the lines of code that were modified to fix it". However, this assumption does not always hold, and at least in some cases these modified lines are not responsible for introducing the bug, for example when the bug was caused by a change in an external API. The lack of empirical evidence makes it impossible to assess how important these cases are and, therefore, to what extent the assumption is valid.

To advance in this direction and better understand how bugs "are born", we propose a model for defining criteria to identify the first snapshot of an evolving software system that exhibits a bug. This model, based on the perfect test idea, decides whether a bug is observed after a change to the software. Furthermore, we studied the model's criteria by carefully analyzing how 116 bugs were introduced in two different open source software projects. The manual analysis helped classify the root cause of those bugs and produced manually curated datasets with bug-introducing changes and with bugs that were not introduced by any change in the source code. Finally, we used these datasets to evaluate the performance of four existing SZZ-based algorithms for detecting bug-introducing changes. We found that SZZ-based algorithms are not very accurate, especially when multiple commits are found; the F-Score varies from 0.44 to 0.77, while the percentage of true positives does not exceed 63%.
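
The two ideas in this paragraph can be illustrated with a minimal sketch; it is not the authors' implementation. It assumes the "perfect test" is available as a hypothetical callable `perfect_test(snapshot)` that returns True when the bug is not observed, and that snapshots are given in chronological order. The names `first_failing_snapshot` and `f_score` are likewise illustrative and not taken from the paper.

```python
from typing import Callable, Optional, Sequence, Set, TypeVar

S = TypeVar("S")


def first_failing_snapshot(
    snapshots: Sequence[S],
    perfect_test: Callable[[S], bool],
) -> Optional[S]:
    """Return the earliest snapshot in which the bug is observed.

    `snapshots` is assumed to be in chronological order.
    `perfect_test` is a hypothetical oracle: True means the bug is
    not observed in that snapshot, False means it is.
    Returns None when the test passes on every snapshot.
    """
    for snapshot in snapshots:
        if not perfect_test(snapshot):
            return snapshot
    return None


def f_score(predicted: Set[str], actual: Set[str]) -> float:
    """Harmonic mean of precision and recall for two sets of commit ids.

    `predicted` holds the commits an algorithm flags as bug-introducing;
    `actual` holds the commits in a manually curated dataset.
    """
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical history of four commits; the bug is observable from c3 on.
    history = ["c1", "c2", "c3", "c4"]
    print(first_failing_snapshot(history, lambda s: s in {"c1", "c2"}))
    # An algorithm flags c2 and c3, while only c3 is in the curated set.
    print(round(f_score({"c2", "c3"}, {"c3"}), 2))
```

The scan is linear rather than a binary search because nothing in the sketch assumes that a bug, once observable, stays observable in every later snapshot.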

Our results show empirical evidence that the prevalent assumption, "a bug was introduced by the lines of code that were modified to fix it", is just one case of how bugs are introduced in a software system. Finding what introduced a bug is not trivial: bugs can be introduced by the developers and be in the code, or be created irrespective of the code. Thus, further research towards a better understanding of the origin of bugs in software projects could help to improve the design of integration tests and to design other procedures that make software development more robust.

Original language: English
Pages (from-to): 1294-1340
Number of pages: 47
Journal: Empirical Software Engineering
Volume: 25
Issue number: 2
Early online date: 4 Feb 2020
Publication status: Published - 1 Mar 2020

Funding

We want to express our gratitude to Bitergia for the support they have provided when questions have arisen using their tools. We also acknowledge the support of several authors by the Government of Spain through projects TIN2014-59400-R and "BugBirth" RTI2018-101963-B-I00. The first author has been supported by the 4TU federation (The Netherlands) through the project "Social aspects of software quality". Other funding came from the Netherlands Organisation for Scientific Research (NWO) through the "TestRoots" project and the EU Horizon 2020 ICT-10-2016-RIA "STAMP" project (No. 731529).

Funders (funder number):
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
European Union's Horizon 2020 - Research and Innovation Framework Programme (731529)

    Keywords

    • Bug origins
    • Bug-introducing changes
    • Extrinsic bugs
    • First-failing change
    • Intrinsic bugs
    • SZZ algorithm
