Two common problems in instance-based learning for text categorization are the high dimensionality of the feature space and the question of which instances to store for use during generalization. Both can be addressed with reduction methods. This paper compares three techniques for reducing the feature space and one algorithm for reducing storage requirements. These techniques are combined with the k-NN (k-Nearest Neighbors) classifier, one of the top-performing methods for text classification. We describe the benefits of this combination and present results on the Reuters-21578 dataset.
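To make the combination of feature-space reduction and k-NN concrete, the following minimal sketch may help. It rests on several assumptions: the abstract does not name the three reduction techniques, so document-frequency thresholding is used here purely as a stand-in; the six-document corpus is invented for illustration and is not taken from Reuters-21578; and all function names (`select_features`, `vectorize`, `knn_predict`) are hypothetical. Instance (storage) reduction is not shown.

```python
import math
from collections import Counter


def tokenize(text):
    # Whitespace tokenization; a real system would also stem and drop stop words.
    return text.lower().split()


def select_features(docs, min_df=2):
    # Assumed stand-in for feature-space reduction: keep only terms that
    # occur in at least `min_df` training documents.
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term for term, count in df.items() if count >= min_df}


def vectorize(text, vocab):
    # Term-frequency vector restricted to the reduced vocabulary.
    return Counter(t for t in tokenize(text) if t in vocab)


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_predict(query, train_vecs, train_labels, k=3):
    # Rank stored instances by cosine similarity and take a majority vote
    # among the k most similar ones.
    sims = sorted(
        ((cosine(query, vec), label) for vec, label in zip(train_vecs, train_labels)),
        key=lambda pair: pair[0],
        reverse=True,
    )
    votes = Counter(label for _, label in sims[:k])
    return votes.most_common(1)[0][0]


# Invented toy corpus (not Reuters-21578 data).
train_docs = [
    "wheat grain harvest exports rise",
    "grain exports wheat prices fall",
    "wheat harvest grain shipment",
    "oil crude prices barrel rise",
    "crude oil barrel output cut",
    "oil output crude exports",
]
train_labels = ["grain", "grain", "grain", "crude", "crude", "crude"]

vocab = select_features(train_docs, min_df=2)
train_vecs = [vectorize(d, vocab) for d in train_docs]
query_vec = vectorize("wheat grain shipment delayed", vocab)
prediction = knn_predict(query_vec, train_vecs, train_labels, k=3)
print(len(vocab), prediction)
```

In this toy setting the threshold drops the terms that appear in only one document, so similarity is computed over a smaller vocabulary while the nearest neighbours of the query remain the grain-related documents.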