A Distance-Based Framework for the Characterization of Metabolic Heterogeneity in Large Sets of Genome-Scale Metabolic Models

Andrea Cabbia (Corresponding author), Peter A.J. Hilbers, Natal A.W. van Riel

Research output: Contribution to journal › Article › Academic › peer-review

1 Citation (Scopus)


Gene expression and protein abundance data of cells or tissues belonging to healthy and diseased individuals can be integrated and mapped onto genome-scale metabolic networks to produce patient-derived models. As the number of available and newly developed genome-scale metabolic models increases, new methods are needed to objectively analyze large sets of models and to identify the determinants of metabolic heterogeneity. We developed a distance-based workflow that combines consensus machine learning and metabolic modeling techniques and used it to apply pattern recognition algorithms to collections of genome-scale metabolic models, both microbial and human. Model composition, network topology and flux distribution provide complementary aspects of metabolic heterogeneity in patient-specific genome-scale models of skeletal muscle. Using consensus clustering analysis we identified the metabolic processes involved in the individual responses to resistance training in older adults.
High-throughput techniques enable the analysis of complex biological systems at multiple levels, including genome, transcriptome, proteome, and metabolome. Integration of multi-omics data is often focused on dimensionality reduction and feature selection for classification tasks. Genome-scale metabolic models are extensive maps of the network of biochemical reactions taking place in a particular cell, tissue or organism. Each reaction is associated with the respective enzyme and gene, enabling the mapping of transcriptomics and proteomics data and providing a structure for the system-level interpretation of multi-omics datasets. The result of this process is a personalized model that gives a snapshot of the metabolic status of an individual. Analyzing these complex models, for example, to detect differences between individuals, is cumbersome. We applied consensus clustering to a set of data-driven models to monitor the progression of a lifestyle intervention in a cohort of older adults.
Genome-scale metabolic models are maps of the metabolic network that function as structures for the integration of molecular data, such as transcriptomics and proteomics. We developed a method for the analysis of large sets of data-driven models, using different distance metrics to quantify model similarity. Consensus analysis is then used to reach a single metabolic distance. The method was applied to model the individual variability in the responses to resistance training in a cohort of older adults.
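The idea of computing several distance matrices over a set of models and merging them into one can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' implementation: the abstract does not name the metrics or the consensus rule, so this example uses a Jaccard distance on reaction sets as a stand-in for model composition, a Euclidean distance on flux vectors as a stand-in for flux distribution, and a plain average of max-rescaled matrices as the consensus step. The reaction identifiers and flux values are toy data.

```python
from itertools import combinations
from math import sqrt


def jaccard_distance(a, b):
    """Return 1 - |A & B| / |A | B| for two sets of reaction identifiers."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def euclidean_distance(x, y):
    """Euclidean distance between two equal-length flux vectors."""
    return sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))


def pairwise_matrix(items, metric):
    """Build a symmetric distance matrix with a zero diagonal."""
    n = len(items)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = metric(items[i], items[j])
    return d


def rescale(matrix):
    """Divide by the largest entry so different metrics share a [0, 1] range."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        return [row[:] for row in matrix]
    return [[v / largest for v in row] for row in matrix]


def consensus_distance(matrices):
    """Assumed consensus rule: element-wise mean of the rescaled matrices."""
    scaled = [rescale(m) for m in matrices]
    n = len(scaled[0])
    return [
        [sum(m[i][j] for m in scaled) / len(scaled) for j in range(n)]
        for i in range(n)
    ]


# Toy data: three hypothetical models, each described by its reaction set
# and by a flux vector over the same three (hypothetical) reactions.
reactions = [
    {"R1", "R2", "R3"},
    {"R1", "R2", "R4"},
    {"R5", "R6"},
]
fluxes = [
    [1.0, 0.0, 2.0],
    [1.0, 0.5, 2.0],
    [4.0, 3.0, 0.0],
]

composition = pairwise_matrix(reactions, jaccard_distance)
flux = pairwise_matrix(fluxes, euclidean_distance)
consensus = consensus_distance([composition, flux])

for row in consensus:
    print([round(v, 3) for v in row])
```

The resulting matrix could then be passed to any clustering routine that accepts precomputed distances. Rescaling before averaging is one simple way to keep a metric with a large numeric range from dominating the others; other consensus schemes are equally plausible.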

Original language: English
Article number: 100080
Number of pages: 17
Issue number: 6
Publication status: Published - 11 Sep 2020


  • Metabolism
  • Genome-scale metabolic model
  • Heterogeneity
  • Machine Learning
  • Distance
  • DSML 2: Proof-of-Concept: Data science output has been formulated, implemented, and tested for one domain/problem

