Abstract
Learning disentangled representations has been suggested to help with generalisation in AI models. The benefit seems most apparent for combinatorial generalisation, the ability to combine familiar factors into new, unseen combinations. Disentangling such factors should provide a clear route to generalising to novel combinations, but recent empirical studies suggest that this does not really happen in practice. Disentanglement methods typically assume i.i.d. training and test data, whereas combinatorial generalisation requires generalising to factor combinations that can be considered out-of-distribution (OOD). There is therefore a misalignment between the distribution of the observed data and the structure induced by the underlying factors.
A promising direction to address this misalignment is symmetry-based disentanglement, defined as disentangling the symmetry transformations that induce a group structure underlying the data. Such a structure is independent of the (observed) distribution of the data and thus provides a sensible language for modelling OOD factor combinations as well. We investigate the combinatorial generalisation capabilities of a symmetry-based disentanglement model (LSBD-VAE) compared to traditional VAE-based disentanglement models. We observe that both types of models struggle with generalisation in more challenging settings, and that symmetry-based disentanglement shows no obvious improvement over traditional disentanglement. However, we also observe that even if LSBD-VAE assigns low likelihood to OOD combinations, the encoder may still generalise well by learning a meaningful mapping that reflects the underlying group structure.
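To make the notion of combinatorial generalisation concrete, the following minimal sketch builds a train/test split over factor combinations in which every individual factor value is seen during training, but some combinations are held out as OOD. It is an illustration only, not the splitting procedure used in the paper: the function name `combinatorial_split`, the two factors with four values each, and the held-out rule are all assumptions chosen for the example.

```python
from itertools import product


def combinatorial_split(factor_sizes, is_held_out):
    """Split all factor combinations into train and held-out (OOD) lists.

    factor_sizes: number of discrete values per factor (hypothetical setup).
    is_held_out:  predicate on a combination tuple; True sends it to the test set.
    """
    train, test = [], []
    for combo in product(*(range(n) for n in factor_sizes)):
        (test if is_held_out(combo) else train).append(combo)
    return train, test


# Assumed toy setting: two factors with 4 values each. Hold out every
# combination in which both factors take a value from their upper half.
factor_sizes = (4, 4)
train, test = combinatorial_split(
    factor_sizes, lambda c: c[0] >= 2 and c[1] >= 2
)

# Each factor value on its own is still familiar from training;
# only the held-out combinations are new.
for axis, n in enumerate(factor_sizes):
    assert {c[axis] for c in train} == set(range(n))
```

A model that generalises combinatorially would handle the held-out combinations in `test` despite never seeing them jointly, which is exactly the case where the i.i.d. assumption between training and test data no longer holds.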
Original language | English |
---|---|
Title of host publication | Advances in Intelligent Data Analysis XXI - 21st International Symposium on Intelligent Data Analysis, IDA 2023, Proceedings |
Editors | Bruno Crémilleux, Sibylle Hess, Siegfried Nijssen |
Pages | 433-445 |
Number of pages | 13 |
Publication status | Published - 12 Apr 2023 |
Publication series
Name | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
---|---|
Volume | 13876 LNCS |
ISSN (Print) | 0302-9743 |
ISSN (Electronic) | 1611-3349 |