Decision trees are among the most popular classi- fication models in machine learning. Traditionally, they are learned using greedy algorithms. However, such algorithms have their disadvantages: it is dif- ficult to limit the size of the decision trees while maintaining a good classification accuracy, and it is hard to impose additional constraints on the models that are learned. For these reasons, there has been a recent interest in exact and flexible algorithms for learning decision trees. In this paper, we intro- duce a new approach to learn decision trees using constraint programming. Compared to earlier ap- proaches, we show that our approach obtains better performance, while still being sufficiently flexible to allow for the inclusion of constraints. Our ap- proach builds on three key building blocks: (1) the use of AND/OR search, (2) the use of caching, (3) the use of the CoverSize global constraint proposed recently for the problem of itemset mining. This al- lows our constraint programming approach to deal in a much more efficient way with the decomposi- tions in the learning problem.
Verhaeghe, H., Nijssen, S., Pesant, G., Quimper, C.-G., & Schaus, P. (2020). Learning Optimal Decision Trees using Constraint Programming (Extended Abstract). Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence, IJCAI2020, 2020(1), 4765-4769. https://hdl.handle.net/2078.5/269946 (Original work published 2020)