Abstract
Technical debt (TD) refers to the accumulation of negative consequences resulting from sub-optimal solutions during software development. A recent paper by Edbert et al. studied the difference between security-related TD questions and security-related non-TD questions on Stack Overflow (SO). One of the characteristics under investigation is the sentiment expressed in these two categories, as sentiment provides insight into developers' attitudes and emotions toward security-related TD. To this end, Edbert et al. used a general-purpose, off-the-shelf sentiment analysis tool. However, previous research has shown that general-purpose, off-the-shelf sentiment tools are potentially unreliable when applied to software engineering (SE) texts. Therefore, we replicate the study by Edbert et al. using state-of-the-art sentiment analysis tools purpose-built and fine-tuned on SE data, to understand whether and how tool choice influences the obtained results. We consider both shallow (Senti4SD) and deep learning (BERT4SentiSE) tools. To further understand the differences between shallow and deep-learning sentiment analysis tools, we perform a qualitative analysis of the underlying reasons for tool disagreement. We identify five categories of disagreements: misunderstanding context, courtesy phrases, subjective sentiment, brevity, and divergent examples. Our results are relevant to academics, reiterating the importance of careful selection of tools used to perform sentiment analysis. Furthermore, the results are relevant to users and developers of sentiment analysis tools, as they inform tool selection dependent on the application domain and provide insight into optimization of the pre-processing steps. Finally, our study shows that retraining sentiment analysis tools with identical data fails to resolve fundamental inconsistencies in how certain types of language, such as courtesy phrases, are classified.
Original language | English |
---|---|
Title of host publication | 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2024 |
Publisher | Institute of Electrical and Electronics Engineers |
Pages | 821-829 |
Number of pages | 9 |
ISBN (Electronic) | 979-8-3503-3066-3 |
Publication status | Published - 16 Jul 2024 |
Event | International Conference on Software Analysis, Evolution, and Reengineering, SANER 2024 - Rovaniemi, Finland; Duration: 12 Mar 2024 → 15 Mar 2024 |
Conference
Conference | International Conference on Software Analysis, Evolution, and Reengineering, SANER 2024 |
---|---|
Abbreviated title | SANER 2024 |
Country/Territory | Finland |
City | Rovaniemi |
Period | 12/03/24 → 15/03/24 |
Keywords
- Replication study
- Security
- Sentiment Analysis
- Stack Overflow
- Technical debt