Incremental learning strategies with random forest classifiers

P. Shrestha, P.H.N. de With

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, Academic, peer-review

Abstract

Random forests (RF) are among the best-performing multi-class classifiers and are popular in many machine learning applications. They are known for high computational efficiency during training and testing, while delivering highly accurate results. However, RF is conventionally trained in an off-line mode, which requires the entire training set to be available beforehand. This imposes practical limitations, such as having to compile the training data in advance and disregarding any later changes in the data distribution, even when the data is sequential. In this paper, we investigate the incremental learning behavior of the RF algorithm. We generate an initial RF from a limited amount of training data and update the RF incrementally as new data arrives. We have developed three incremental learning strategies for RF that differ in the criterion used to select the trees for an update: all update, random update and performance-based update. We have tested our methods on different publicly available multi-class static streaming data sets. The results show that the performance-based update of RF achieves a classification accuracy comparable to that of an off-line RF, at a significantly lower computational cost.
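
The three tree-selection criteria named in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual selection rules, so everything here is an assumption made for illustration: the function name select_trees_for_update, the fraction parameter, and in particular the choice to have the performance-based strategy pick the trees with the lowest accuracy on the newly arrived batch. How a selected tree is then updated is not covered.

```python
import random
from typing import List, Optional, Sequence


def select_trees_for_update(
    n_trees: int,
    strategy: str,
    batch_accuracy: Sequence[float] = (),
    fraction: float = 0.5,
    rng: Optional[random.Random] = None,
) -> List[int]:
    """Return the indices of the trees to update for one incoming batch.

    Hypothetical helper, not the authors' implementation.
    strategy is one of "all", "random" or "performance".
    batch_accuracy[i] is the accuracy of tree i on the new batch
    (only used by the "performance" strategy).
    """
    if strategy == "all":
        # All update: every tree in the forest is updated.
        return list(range(n_trees))

    # Number of trees to update; at least one (assumed rule).
    k = max(1, int(fraction * n_trees))

    if strategy == "random":
        # Random update: a uniformly drawn subset of the trees.
        rng = rng or random.Random()
        return sorted(rng.sample(range(n_trees), k))

    if strategy == "performance":
        # Performance-based update (assumed rule): update the k trees
        # that perform worst on the new batch.
        if len(batch_accuracy) != n_trees:
            raise ValueError("need one accuracy value per tree")
        ranked = sorted(range(n_trees), key=lambda i: batch_accuracy[i])
        return sorted(ranked[:k])

    raise ValueError(f"unknown strategy: {strategy!r}")
```

Under these assumptions, a forest of four trees with batch accuracies [0.9, 0.4, 0.7, 0.5] and fraction 0.5 would have trees 1 and 3 selected by the performance-based strategy, while the all-update strategy would select all four.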
Original language: English
Title of host publication: Proceedings of the 32nd WIC Symposium on Information Theory in the Benelux, 10-11 May 2011, Brussels, Belgium
Place of Publication: Delft
Publisher: Werkgemeenschap voor Informatie- en Communicatietheorie (WIC)
Pages: 1-6
Publication status: Published - 2011
