In recent decades, feature selection has become an indispensable step in data mining with the emergence of "big data" in various fields. However, despite the substantial body of work on feature selection, the stability of feature selection algorithms has received comparatively little attention in the literature. The stability of a feature selection algorithm is the robustness of the feature preferences it produces to differences in training sets drawn from the same generating distribution. Robust feature selection results are important because domain experts rely on them to interpret the results and make proper decisions. In this paper, we illustrate the issue of instability in feature selection and propose an "inner ensemble" sequential selection method to improve the stability of sequential feature selection. We compare our proposed method with the standard sequential forward selection (SFS) and an "outer ensemble" SFS on three artificial datasets and seven real-world datasets. We evaluate these methods in three respects: sensitivity, stability, and prediction performance. The results show that our proposed method outperforms both the standard SFS and the outer ensemble SFS.
|Status||In preparation - 5 Mar 2020|