TY - GEN
T1 - What Data Scientists (Care To) Recall
AU - Saeed, Samar
AU - Sheikholeslami, Shahrzad
AU - Krüger, Jacob
AU - Hebig, Regina
PY - 2023/12/2
Y1 - 2023/12/2
N2 - To maintain and evolve a software system, developers need to gain new or recover lost knowledge about that system. Thus, program comprehension is a crucial activity in software development and maintenance processes. We know from previous work that developers prioritize what information they want to remember about a system based on the perceived importance of that information. However, AI-based software systems as a special case are not developed by software developers alone, but also by data scientists who deal with other concepts and have a different educational background than most developers. In this paper, we study what information data scientists (aim to) recall about their systems. For this purpose, we replicated our previous work by interviewing 11 data scientists, investigating the knowledge they consider important to remember, and whether they can remember parts of their systems correctly. Our results suggest that data scientists consider knowledge about the AI-project settings to be the most important to remember and that they perform best when remembering knowledge they consider important. Contrary to software developers, data scientists’ self-assessments increase when reflecting on their systems. Our findings indicate similarities and differences between developers and data scientists that are important for managing the processes surrounding a system.
KW - Program comprehension
KW - Human memory
KW - Remembering
KW - Data scientists
KW - Maintenance
UR - http://www.scopus.com/inward/record.url?scp=85199165052&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-49266-2_15
DO - 10.1007/978-3-031-49266-2_15
M3 - Conference contribution
SN - 978-3-031-49265-5
T3 - Lecture Notes in Computer Science (LNCS)
SP - 208
EP - 224
BT - Product-Focused Software Process Improvement
A2 - Kadgien, Regine
A2 - Jedlitschka, Andreas
A2 - Janes, Andrea
A2 - Lenarduzzi, Valentina
A2 - Li, Xiaozhou
PB - Springer
T2 - International Conference on Product-Focused Software Process Improvement (PROFES)
Y2 - 10 December 2023 through 13 December 2023
ER -