Every conference imposing a limit on the length of submissions must deal with the problem of page limit cheating: authors tweaking the parameters of the game such that they can squeeze more content into their paper. We claim that this problem is endemic, although we lack the data to formally prove this. Instead, this paper provides a far from exhaustive summary of ways to cheat the page limit, a case study involving the papers accepted for the Research and Applied Data Science tracks at the European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD) 2019, and a discussion of ways for program chairs to tackle this problem. Of the 130 accepted papers in these two ECMLPKDD 2019 tracks, 68 satisfied the page limit; 62 (47.7%) turned out to spill over the page limit, by up to as much as 50%. To misappropriate a phrase from Darrell Huff's “How to Lie with Statistics,” we intend for this paper not to be a manual for swindlers; instead, nefarious paper authors already know these tricks, and honest program chairs must learn them in self-defense. This article is categorized under: Commercial, Legal, and Ethical Issues > Fairness in Data Mining.
|Tijdschrift||Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery|
|Nummer van het tijdschrift||3|
|Status||Gepubliceerd - 1 mei 2020|