BoostEMM : Transparent boosting using exceptional model mining

S.B. van der Zon, O. Zeev Ben Mordehay, T.S. Vrijdag, W. van Ipenburg, J. Veldsink, W. Duivesteijn, M. Pechenizkiy

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; Academic; peer-reviewed


Abstract

Boosting is an iterative ensemble-learning paradigm. In every iteration, a weak predictor learns a classification task, taking into account the performance achieved in previous iterations. This is done by assigning weights to individual records of the dataset, which are increased if the record is misclassified by the previous weak predictor. Hence, subsequent predictors learn to focus on problematic records in the dataset. Boosting ensembles such as AdaBoost have been shown to be effective at fighting both high variance and high bias, even in challenging situations such as class imbalance. However, some aspects of AdaBoost might limit its deployment in the real world. On the one hand, focusing on problematic records can lead to overfitting in the presence of random noise. On the other hand, learning a boosting ensemble that assigns higher weights to hard-to-classify people might raise serious questions in the age of responsible and transparent data analytics: if a bank must tell a customer that they are denied a loan because the underlying algorithm made a decision specifically focusing on that customer since they are hard to classify, this could be legally dubious. To address both concerns at once, we introduce BoostEMM: a variant of AdaBoost where, in every iteration of the procedure, rather than boosting problematic records, we boost problematic subgroups as found through Exceptional Model Mining. Requiring boosted records to be part of a coherent group should prevent overfitting, and explicit definitions of the subgroups of people being boosted enhance the transparency of the algorithm.
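The record-level reweighting described in the abstract can be illustrated with a minimal sketch. The first function is the standard discrete AdaBoost weight update for labels in {-1, +1}. The second function, `subgroup_reweight`, is a hypothetical illustration of boosting a whole subgroup instead of individual misclassified records; its name, its boolean `in_subgroup` mask, and its update rule are assumptions made for this example and are not taken from the paper, which should be consulted for the actual BoostEMM update.

```python
import numpy as np

def adaboost_reweight(weights, y_true, y_pred):
    """One standard discrete AdaBoost weight update (labels in {-1, +1})."""
    miss = y_true != y_pred
    # Weighted error of the current weak predictor, clipped to avoid log(0).
    err = np.clip(np.sum(weights[miss]) / np.sum(weights), 1e-10, 1 - 1e-10)
    alpha = 0.5 * np.log((1 - err) / err)
    # Misclassified records are scaled up, correctly classified ones down.
    new_w = weights * np.exp(np.where(miss, alpha, -alpha))
    return new_w / new_w.sum(), alpha

def subgroup_reweight(weights, in_subgroup, alpha):
    """Hypothetical group-level variant (an assumption, not the paper's rule):
    every record covered by the subgroup is boosted, misclassified or not."""
    new_w = weights * np.exp(np.where(in_subgroup, alpha, -alpha))
    return new_w / new_w.sum()

# Toy data: five records, the second one is misclassified.
y_true = np.array([1, 1, -1, -1, 1])
y_pred = np.array([1, -1, -1, -1, 1])
w0 = np.full(5, 0.2)

w1, alpha = adaboost_reweight(w0, y_true, y_pred)

# Assumed subgroup description that happens to cover records 1 and 2.
in_subgroup = np.array([False, True, True, False, False])
w_sub = subgroup_reweight(w0, in_subgroup, alpha)
```

In the record-level update, the single misclassified record ends up carrying half of the total weight. In the group-level sketch, both covered records receive the same increased weight, so the boost can be explained by the subgroup description and not by one individual's classification difficulty.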

Original language: English
Title of host publication: Proceedings of the Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), 18 September 2017, Skopje, Macedonia
Editors: I. Bordino, G. Caldarelli, F. Fumarola, F. Gullo, T. Squartini
Pages: 5-16
Number of pages: 12
Status: Published - 2017
Event: Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), Skopje, Macedonia, September 18, 2017 - Skopje, Macedonia
Duration: 18 Sep 2017 → …
Conference number: 2nd
http://ceur-ws.org/Vol-1941/

Publication series

Name: CEUR Workshop Proceedings
Publisher: CEUR-WS.org
Volume: 1941
ISSN (print): 1613-0073

Conference

Conference: Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), Skopje, Macedonia, September 18, 2017
Abbreviated title: MIDAS 2017
Country: Macedonia
City: Skopje
Period: 18/09/17 → …

Cite this

van der Zon, S. B., Zeev Ben Mordehay, O., Vrijdag, T. S., van Ipenburg, W., Veldsink, J., Duivesteijn, W., & Pechenizkiy, M. (2017). BoostEMM : Transparent boosting using exceptional model mining. In I. Bordino, G. Caldarelli, F. Fumarola, F. Gullo, & T. Squartini (Eds.), Proceedings of the Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), 18 September 2017, Skopje, Macedonia (pp. 5-16). (CEUR Workshop Proceedings; Vol. 1941).