TY - GEN
T1 - Soft Learning Probabilistic Circuits
AU - Ghandi, Soroush
AU - Quost, Benjamin
AU - de Campos, Cassio P.
PY - 2024
Y1 - 2024
AB - Probabilistic Circuits (PCs) are prominent tractable probabilistic models, allowing for a wide range of exact inferences. This paper focuses on the main algorithm for training PCs, LearnSPN, a gold standard due to its efficiency, performance, and ease of use, in particular for tabular data. We show that LearnSPN is a greedy likelihood maximizer under mild assumptions. While inferences in PCs may use the entire circuit structure for processing queries, LearnSPN applies a hard method for learning them, propagating at each sum node a data point through one and only one of the children/edges as in a hard clustering process. We propose a new learning procedure named SoftLearn, which induces a PC using a soft clustering process. We investigate the effect of this learning-inference compatibility in PCs. Our experiments show that SoftLearn outperforms LearnSPN in many situations, yielding better likelihoods and arguably better samples. We also analyze comparable tractable models to highlight the differences between soft/hard learning and model querying.
M3 - Conference contribution
T3 - Proceedings of Machine Learning Research
SP - 273
EP - 294
BT - 12th International Conference on Probabilistic Graphical Models, 11-13 September 2024, De Lindenberg, Nijmegen, the Netherlands
A2 - Kwisthout, Johan
A2 - Renooij, Silja
PB - PMLR
T2 - 12th International Conference on Probabilistic Graphical Models
Y2 - 11 September 2024 through 13 September 2024
ER -