Learning optimal classification trees using a binary linear program formulation

Sicco E. Verwer, Y. Zhang

Research output: Contribution to conferencePaperAcademic

Abstract

We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.

Conference

Conference33rd AAAI Conference on Artificial Intelligence
Abbreviated titleAAAI-19
CountryUnited States
CityHonolulu
Period27/01/191/02/19
Internet address

Fingerprint

Linear programming

Keywords

  • Machine learning
  • decision tree
  • mathematical optimization

Cite this

Verwer, S. E., & Zhang, Y. (2019). Learning optimal classification trees using a binary linear program formulation. Paper presented at 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States.
Verwer, Sicco E. ; Zhang, Y./ Learning optimal classification trees using a binary linear program formulation. Paper presented at 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States.8 p.
@conference{366ce919deb0461bb7ab1393e06d731d,
title = "Learning optimal classification trees using a binary linear program formulation",
abstract = "We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.",
keywords = "Machine learning, decision tree, mathematical optimization",
author = "Verwer, {Sicco E.} and Y. Zhang",
year = "2019",
language = "English",
note = "33rd AAAI Conference on Artificial Intelligence, AAAI-19 ; Conference date: 27-01-2019 Through 01-02-2019",
url = "https://aaai.org/Conferences/AAAI-19/",

}

Verwer, SE & Zhang, Y 2019, 'Learning optimal classification trees using a binary linear program formulation' Paper presented at 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States, 27/01/19 - 1/02/19, .

Learning optimal classification trees using a binary linear program formulation. / Verwer, Sicco E.; Zhang, Y.

2019. Paper presented at 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States.

Research output: Contribution to conferencePaperAcademic

TY - CONF

T1 - Learning optimal classification trees using a binary linear program formulation

AU - Verwer,Sicco E.

AU - Zhang,Y.

PY - 2019

Y1 - 2019

N2 - We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.

AB - We provide a new formulation for the problem of learning the optimal classification tree of a given depth as a binary linear program. A limitation of previously proposed mathematical Optimization formulations is that they create constraints and variables for every row in the training data. As a result, the running time of the existing Integer Linear programming (ILP) formulations increases dramatically with the size of data. In our new binary formulation, we aim to circumvent this problem by making the formulation size largely independent from the training data size. We show experimentally that our formulation achieves better performance than existing formulations on both small and large problem instances within shorter running time.

KW - Machine learning

KW - decision tree

KW - mathematical optimization

M3 - Paper

ER -

Verwer SE, Zhang Y. Learning optimal classification trees using a binary linear program formulation. 2019. Paper presented at 33rd AAAI Conference on Artificial Intelligence, Honolulu, United States.