Abstract
This paper introduces an Automated Machine Learning (AutoML) framework for efficiently synthesizing end-to-end multimodal machine learning pipelines. Integrating pre-trained transformer models reduces reliance on computationally demanding Neural Architecture Search. These models unify diverse data modalities into high-dimensional embeddings, which streamlines pipeline development. A Bayesian Optimization strategy informed by meta-learning warm-starts the pipeline synthesis and improves computational efficiency. The methodology shows potential for creating advanced, custom multimodal pipelines under limited computational resources. Extensive testing across 23 varied multimodal datasets indicates the promise and utility of the framework in diverse scenarios. The results contribute to ongoing efforts in the AutoML field and suggest new possibilities for efficiently handling complex multimodal data. This research is a step towards more efficient and versatile tools for multimodal machine learning pipeline development, acknowledging the collaborative and ever-evolving nature of this field.
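The warm-starting idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it shows only a generic nearest-neighbour seeding step, in which configurations that worked on similar past datasets are proposed first, and it omits the Bayesian Optimization loop itself. The meta-features (fraction of text columns, fraction of image columns, scaled dataset size), the `PRIOR_RUNS` table, and the configuration keys are all hypothetical names chosen for illustration.

```python
import math

# Hypothetical meta-knowledge base (an assumption, not data from the paper).
# Each entry pairs a past dataset's meta-features, pre-scaled to [0, 1], with
# the pipeline configuration assumed to have performed best on that dataset.
# Meta-features: (text column fraction, image column fraction, scaled size).
PRIOR_RUNS = [
    {"meta": (0.8, 0.0, 0.4), "config": {"fusion": "concat", "learner": "gbm"}},
    {"meta": (0.1, 0.9, 0.6), "config": {"fusion": "concat", "learner": "mlp"}},
    {"meta": (0.5, 0.5, 0.9), "config": {"fusion": "mean", "learner": "linear"}},
]


def distance(a, b):
    """Euclidean distance between two meta-feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def warm_start_configs(new_meta, prior_runs, k=2):
    """Return the configurations of the k past datasets closest to new_meta.

    A model-based search would evaluate these configurations first,
    before proposing new candidates from its surrogate model.
    """
    ranked = sorted(prior_runs, key=lambda run: distance(new_meta, run["meta"]))
    return [run["config"] for run in ranked[:k]]


if __name__ == "__main__":
    # A new, mostly textual dataset of moderate size (hypothetical values).
    for config in warm_start_configs((0.7, 0.1, 0.5), PRIOR_RUNS, k=2):
        print(config)
```

Seeding the search this way spends the first evaluations on configurations that are likely to be reasonable, which is the usual motivation for meta-learned warm starts when the compute budget is small.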
Original language | English |
---|---|
Pages (from-to) | 7011-7053 |
Number of pages | 43 |
Journal | Machine Learning |
Volume | 113 |
Issue number | 9 |
DOIs | |
Publication status | Published - Sept 2024 |
Bibliographical note
Publisher Copyright: © The Author(s) 2024.
Keywords
- Automated machine learning (AutoML)
- Bayesian optimization (BO)
- Multimodal data
- Pre-trained transformer models