Abstract
Boosting is an iterative ensemble-learning paradigm. In every iteration, a weak predictor learns a classification task, taking into account the performance achieved in previous iterations. This is done by assigning weights to individual records of the dataset, which are increased if the record is misclassified by the previous weak predictor. Hence, subsequent predictors learn to focus on problematic records in the dataset. Boosting ensembles such as AdaBoost have been shown to be effective at fighting both high variance and high bias, even in challenging situations such as class imbalance. However, some aspects of AdaBoost might limit its deployment in the real world. On the one hand, focusing on problematic records can lead to overfitting in the presence of random noise. On the other hand, learning a boosting ensemble that assigns higher weights to hard-to-classify people might raise serious questions in the age of responsible and transparent data analytics; if a bank must tell a customer that they are denied a loan because the underlying algorithm made a decision specifically focusing on that customer since they are hard to classify, this could be legally dubious. To kill these two birds with one stone, we introduce BoostEMM: a variant of AdaBoost where, in every iteration of the procedure, rather than boosting problematic records, we boost problematic subgroups as found through Exceptional Model Mining. Boosted records being part of a coherent group should prevent overfitting, and explicit definitions of the subgroups of people being boosted enhance the transparency of the algorithm.
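The reweighting idea described in the abstract can be illustrated with a minimal sketch. The first function follows the textbook discrete AdaBoost update; the second, `boost_subgroup`, is a hypothetical illustration of boosting a whole subgroup instead of individual records. Its name, its `factor` parameter, and the uniform scaling rule are assumptions made here for illustration and are not taken from the BoostEMM paper, which defines its own procedure.

```python
import math


def adaboost_reweight(weights, misclassified):
    """One textbook discrete-AdaBoost reweighting step.

    weights: record weights, non-negative and summing to 1.
    misclassified: bools, True where the current weak predictor erred.
    Returns (new_weights, alpha); alpha is the predictor's vote weight.
    """
    # Weighted error of the current weak predictor.
    err = sum(w for w, m in zip(weights, misclassified) if m)
    if not 0.0 < err < 0.5:
        raise ValueError("weighted error must lie strictly between 0 and 0.5")
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Misclassified records are scaled up, correctly classified ones down.
    raw = [w * math.exp(alpha if m else -alpha)
           for w, m in zip(weights, misclassified)]
    total = sum(raw)
    return [r / total for r in raw], alpha


def boost_subgroup(weights, in_subgroup, factor=2.0):
    """Hypothetical group-level boost (an assumption, not the paper's rule).

    Every record covered by the subgroup description is scaled by `factor`,
    whether or not that individual record was misclassified.
    """
    raw = [w * factor if g else w for w, g in zip(weights, in_subgroup)]
    total = sum(raw)
    return [r / total for r in raw]


if __name__ == "__main__":
    uniform = [0.25, 0.25, 0.25, 0.25]
    print(adaboost_reweight(uniform, [True, False, False, False]))
    print(boost_subgroup(uniform, [True, True, False, False]))
```

In the record-level update, a single misclassified record ends up carrying half of the total weight, which is the behaviour the abstract flags as risky under noise. In the group-level sketch, the extra weight is shared by every record matching an explicit subgroup description.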
Original language | English |
---|---|
Title of host publication | Proceedings of the Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), 18 September 2017, Skopje, Macedonia |
Editors | I. Bordino, G. Caldarelli, F. Fumarola, F. Gullo, T. Squartini |
Pages | 5-16 |
Number of pages | 12 |
Publication status | Published - 2017 |
Event | Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), Skopje, Macedonia, September 18, 2017. Location: Skopje, Macedonia, The Former Yugoslav Republic of. Duration: 18 Sep 2017 → … Conference number: 2nd. URL: http://ceur-ws.org/Vol-1941/ |
Publication series
Name | CEUR Workshop Proceedings |
---|---|
Publisher | CEUR-WS.org |
Volume | 1941 |
ISSN (Print) | 1613-0073 |
Conference
Conference | Second Workshop on MIning DAta for financial applicationS (MIDAS 2017), Skopje, Macedonia, September 18, 2017 |
---|---|
Abbreviated title | MIDAS 2017 |
Country | Macedonia, The Former Yugoslav Republic of |
City | Skopje |
Period | 18/09/17 → … |
Keywords
- Boosting
- Class imbalance
- Exceptional Model Mining
- Model transparency
- Responsible analytics