Sentiment of Technical Debt Security Questions on Stack Overflow: A Replication Study

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-reviewed

Abstract

Technical debt (TD) refers to the accumulation of negative consequences resulting from sub-optimal solutions during software development. A recent paper by Edbert et al. studied the difference between security-related TD questions and security-related non-TD questions on Stack Overflow (SO). One of the characteristics under investigation is the sentiment expressed in these two categories, as sentiment provides insight into developers' attitudes and emotions toward security-related TD. To this end, Edbert et al. used a general-purpose, off-the-shelf sentiment analysis tool. However, previous research has shown that general-purpose, off-the-shelf sentiment tools are potentially unreliable when applied to software engineering (SE) texts. Therefore, we replicate the study by Edbert et al. using state-of-the-art sentiment analysis tools purpose-built for and fine-tuned on SE data, to understand whether and how tool choice influences the obtained results. We consider both shallow (Senti4SD) and deep-learning (BERT4SentiSE) tools. To further understand the differences between shallow and deep-learning sentiment analysis tools, we perform a qualitative analysis of the underlying reasons for tool disagreement. We identify five categories of disagreement: misunderstanding context, courtesy phrases, subjective sentiment, brevity, and divergent examples. Our results are relevant to academics, reiterating the importance of careful selection of the tools used to perform sentiment analysis. Furthermore, the results are relevant to users and developers of sentiment analysis tools, as they inform tool selection depending on the application domain and provide insight into optimization of the pre-processing steps. Finally, our study shows that retraining sentiment analysis tools with identical data fails to resolve fundamental inconsistencies in how certain types of language, such as courtesy phrases, are classified.
Original language: English
Title: 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024
Publisher: Institute of Electrical and Electronics Engineers
Pages: 821-829
Number of pages: 9
ISBN (electronic): 979-8-3503-3066-3
Status: Published - 16 Jul 2024
Event: International Conference on Software Analysis, Evolution, and Reengineering, SANER 2024 - Rovaniemi, Finland
Duration: 12 Mar 2024 to 15 Mar 2024

Conference

Conference: International Conference on Software Analysis, Evolution, and Reengineering, SANER 2024
Short title: SANER 2024
Country/Region: Finland
City: Rovaniemi
Period: 12/03/24 to 15/03/24
