A Probabilistic Framework for Adapting to Changing and Recurring Concepts in Data Streams.

Ben Halstead, Yun Sing Koh, Patricia Riddle, Mykola Pechenizkiy, Albert Bifet

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-reviewed

1 Citation (Scopus)


The distribution of streaming data often changes over time as conditions change, a phenomenon known as concept drift. Only a subset of previous experience, collected in similar conditions, is relevant to learning an accurate classifier for current data. Learning from irrelevant experience describing a different concept can degrade performance. A system learning from streaming data must identify which recent experience is irrelevant when conditions change and which past experience is relevant when concepts recur, e.g., when weather events or financial patterns repeat. Existing streaming approaches either do not allow the relevance of experience to change over time and thus cannot handle concept drift, or only consider the recency of experience and thus cannot handle recurring concepts, or only sparsely evaluate relevance and thus fail when concept drift is missed. To enable learning in changing conditions, we propose SELeCT, a probabilistic method for continuously evaluating the relevance of past experience. SELeCT maintains a distinct internal state for each concept, representing relevant experience with a unique classifier. We propose a Bayesian algorithm for estimating state relevance, combining the likelihood of drawing recent observations from a given state with a transition pattern prior based on the system's current state. The current state is continuously maintained using a Hoeffding-bound-based algorithm which, unlike existing methods, guarantees that every observation is classified using the state estimated as the most relevant, while also maintaining temporal stability. We find SELeCT is able to choose experience relevant to ground truth concepts with recall and precision above 0.9, significantly outperforming existing methods and close to a theoretical optimum, leading to significantly higher accuracy and enabling new opportunities for learning in complex changing conditions.
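
As a rough illustration of the two ideas named in the abstract, the Python sketch below combines per-state likelihoods with a transition prior via Bayes' rule, and switches the active state only when the gap in estimated relevance exceeds a Hoeffding bound. It is not the SELeCT implementation: the function names, the dictionary-based state representation, the `delta` parameter, and the switching rule itself are assumptions made for illustration.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: with probability 1 - delta, the mean of n samples of a
    variable with the given range lies within this distance of the true mean."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def state_posteriors(likelihoods, transition_prior):
    """Bayes' rule over states: posterior is proportional to likelihood * prior.

    likelihoods:      hypothetical dict, state -> P(recent observations | state)
    transition_prior: hypothetical dict, state -> P(state | current state)
    """
    unnormalised = {
        s: likelihoods[s] * transition_prior.get(s, 0.0) for s in likelihoods
    }
    total = sum(unnormalised.values())
    if total == 0.0:
        # No state explains the data under the prior: fall back to uniform.
        return {s: 1.0 / len(likelihoods) for s in likelihoods}
    return {s: p / total for s, p in unnormalised.items()}


def choose_state(current, posteriors, n, delta=0.05):
    """Assumed switching rule: move to the most relevant state only when it
    beats the current state by more than the Hoeffding bound for n observations.
    Posteriors lie in [0, 1], so the value range is taken to be 1."""
    best = max(posteriors, key=posteriors.get)
    epsilon = hoeffding_bound(1.0, delta, n)
    gap = posteriors[best] - posteriors.get(current, 0.0)
    if best != current and gap > epsilon:
        return best
    return current


# Usage: state "B" explains recent observations better than the current state "A".
posteriors = state_posteriors({"A": 0.2, "B": 0.8}, {"A": 0.5, "B": 0.5})
print(choose_state("A", posteriors, n=100))  # enough evidence: switches to B
print(choose_state("A", posteriors, n=2))    # too little evidence: stays on A
```

Requiring the gap to exceed the bound is one simple way to trade responsiveness against temporal stability: with few observations the bound is wide and the active state is kept, while with more observations a smaller gap is enough to switch.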

Original language: English
Title: Proceedings - 2022 IEEE 9th International Conference on Data Science and Advanced Analytics, DSAA 2022
Editors: Joshua Zhexue Huang, Yi Pan, Barbara Hammer, Muhammad Khurram Khan, Xing Xie, Laizhong Cui, Yulin He
Number of pages: 10
ISBN (electronic): 9781665473309
Status: Published - 2022

Bibliographical note

DBLP License: DBLP's bibliographic metadata records provided through http://dblp.org/ are distributed under a Creative Commons CC0 1.0 Universal Public Domain Dedication. Although the bibliographic metadata records are provided consistent with CC0 1.0 Dedication, the content described by the metadata records is not. Content may be subject to copyright, rights of privacy, rights of publicity and other restrictions.


  • Best Research Paper Award of IEEE DSAA 2022

    Halstead, B. (Recipient), Koh, Y. S. (Recipient), Riddle, P. (Recipient), Pechenizkiy, M. (Recipient) & Bifet, A. (Recipient), 2022

    Prize: Other; Work, activity or publication related prizes (lifetime, best paper, poster etc.); Scientific
