Learning optimal classification trees using a binary linear program formulation

Sicco E. Verwer, Y. Zhang

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

Abstract

We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.
LanguageEnglish
Title of host publication33rd AAAI Conference on Artificial Intelligence
PublisherAAAI
Number of pages8
StatePublished - 2019
Event33rd AAAI Conference on Artificial Intelligence - Hawaii, Honolulu, United States
Duration: 27 Jan 20191 Feb 2019
Conference number: 33
https://aaai.org/Conferences/AAAI-19/

Conference

Conference33rd AAAI Conference on Artificial Intelligence
Abbreviated titleAAAI-19
CountryUnited States
CityHonolulu
Period27/01/191/02/19
Internet address

Fingerprint

Linear programming

Keywords

  • Machine learning
  • decision tree
  • mathematical optimization

Cite this

Verwer, S. E., & Zhang, Y. (2019). Learning optimal classification trees using a binary linear program formulation. In 33rd AAAI Conference on Artificial Intelligence AAAI.
Verwer, Sicco E. ; Zhang, Y./ Learning optimal classification trees using a binary linear program formulation. 33rd AAAI Conference on Artificial Intelligence. AAAI, 2019.
@inproceedings{366ce919deb0461bb7ab1393e06d731d,
title = "Learning optimal classification trees using a binary linear program formulation",
abstract = "We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.",
keywords = "Machine learning, decision tree, mathematical optimization",
author = "Verwer, {Sicco E.} and Y. Zhang",
year = "2019",
language = "English",
booktitle = "33rd AAAI Conference on Artificial Intelligence",
publisher = "AAAI",

}

Verwer, SE & Zhang, Y 2019, Learning optimal classification trees using a binary linear program formulation. in 33rd AAAI Conference on Artificial Intelligence. AAAI, 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States, 27/01/19.

Learning optimal classification trees using a binary linear program formulation. / Verwer, Sicco E.; Zhang, Y.

33rd AAAI Conference on Artificial Intelligence. AAAI, 2019.

Research output: Chapter in Book/Report/Conference proceedingConference contributionAcademicpeer-review

TY - GEN

T1 - Learning optimal classification trees using a binary linear program formulation

AU - Verwer,Sicco E.

AU - Zhang,Y.

PY - 2019

Y1 - 2019

N2 - We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.

AB - We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.

KW - Machine learning

KW - decision tree

KW - mathematical optimization

M3 - Conference contribution

BT - 33rd AAAI Conference on Artificial Intelligence

PB - AAAI

ER -

Verwer SE, Zhang Y. Learning optimal classification trees using a binary linear program formulation. In 33rd AAAI Conference on Artificial Intelligence. AAAI. 2019.