TY - JOUR
T1 - Early prediction of writing quality using keystroke logging
AU - Conijn, Rianne
AU - Cook, Christine
AU - van Zaanen, Menno
AU - Van Waes, Luuk
N1 - Publisher Copyright: © 2021, The Author(s).
PY - 2022/12
Y1 - 2022/12
AB - Feedback is important to improve writing quality; however, to provide timely and personalized feedback is a time-intensive task. Currently, most literature focuses on providing (human or machine) support on product characteristics, especially after a draft is submitted. However, this does not assist students who struggle during the writing process. Therefore, in this study, we investigate the use of keystroke analysis to predict writing quality throughout the writing process. Keystroke data were analyzed from 126 English as a second language learners performing a timed academic summarization task. Writing quality was measured using participants’ final grade. Based on previous literature, 54 keystroke features were extracted. Correlational analyses were conducted to identify the relationship between keystroke features and writing quality. Next, machine learning models (regression and classification) were used to predict final grade and classify students who might need support at several points during the writing process. The results show that, in contrast to previous work, the relationship between writing quality and keystroke data was rather limited. None of the regression models outperformed the baseline, and the classification models were only slightly better than the majority class baseline (highest AUC = 0.57). In addition, the relationship between keystroke features and writing quality changed throughout the course of the writing process. To conclude, the relationship between keystroke data and writing quality might be less clear than previously posited.
KW - Academic writing
KW - Early prediction
KW - Keystroke logging
KW - Writing processes
KW - Writing quality
UR - http://www.scopus.com/inward/record.url?scp=85115054948&partnerID=8YFLogxK
U2 - 10.1007/s40593-021-00268-w
DO - 10.1007/s40593-021-00268-w
M3 - Article
AN - SCOPUS:85115054948
SN - 1560-4292
VL - 32
SP - 835
EP - 866
JO - International Journal of Artificial Intelligence in Education
JF - International Journal of Artificial Intelligence in Education
IS - 4
ER -