Data for paper 'Shape matters: Inferring the motility of confluent cells from static images' (Soft Matter, 2025)

Dataset

Description

This dataset accompanies the research article “Shape matters: Inferring the motility of confluent cells from static images” (Soft Matter, 2025). The data were generated using simulations based on the Cellular Potts model, a computational framework widely used to study collective cell behavior and tissue morphogenesis.

Feature data

The simulations model a confluent layer of cells containing subpopulations that differ in motility. Simulations were run for six conditions (see the labels in the table below) and with different numbers of high-motility (Na) and low-motility cells.




 
Condition   High-motility cells   Low-motility cells
A           kappa = 1500          kappa = 0
B           kappa = 1500          kappa = 150
C           kappa = 750           kappa = 150
D           kappa = 500           kappa = 150
E           kappa = 375           kappa = 150
F           kappa = 300           kappa = 150
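For scripting, the conditions in the table above can be encoded as a small lookup table. The following is a minimal sketch: the dictionary layout, the name `CONDITIONS`, and the helper `kappa_ratio` are illustrative assumptions and are not part of the dataset. The ratio is only a convenient way to compare how strongly the two subpopulations differ in each condition.

```python
# kappa values per condition, copied from the table above.
# The dictionary layout and the helper below are illustrative only.
CONDITIONS = {
    "A": {"high": 1500, "low": 0},
    "B": {"high": 1500, "low": 150},
    "C": {"high": 750, "low": 150},
    "D": {"high": 500, "low": 150},
    "E": {"high": 375, "low": 150},
    "F": {"high": 300, "low": 150},
}


def kappa_ratio(condition):
    """Return kappa_high / kappa_low, or infinity when kappa_low is 0."""
    kappa = CONDITIONS[condition]
    if kappa["low"] == 0:
        return float("inf")
    return kappa["high"] / kappa["low"]


for name in sorted(CONDITIONS):
    print(name, kappa_ratio(name))
```

Condition A is handled separately because its low-motility cells have kappa = 0, which would otherwise cause a division by zero.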




 

For each individual cell in the simulations, we extracted an extensive list of features (see Table 1 in the manuscript for the definitions). The complete dataset of extracted features is available in the folders of this repository.

File Naming Convention:

Files are named using the format:

            A/M/ML_data_M_.pkl

Where:



indicates the number of high-motility cells (Na)

 denotes the index of the independent simulation replicate
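The feature files can be read with Python's standard `pickle` module. The sketch below makes two assumptions that this description does not confirm: that the files sit somewhere below a per-condition folder, and that any third-party library needed to rebuild the stored objects (for example pandas, if the files hold data frames) is installed. The function names are hypothetical. Only unpickle files obtained from a trusted source, since unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def find_feature_files(condition_dir):
    """Return all .pkl files below one condition folder, sorted by path."""
    return sorted(Path(condition_dir).rglob("*.pkl"))


def load_feature_file(path):
    """Unpickle one feature file and return whatever object it contains."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def summarize(condition_dir):
    """List each feature file together with the type of the object inside."""
    return [
        (str(path), type(load_feature_file(path)).__name__)
        for path in find_feature_files(condition_dir)
    ]
```

Calling `summarize("A")` from the directory where the data was unpacked would list the files found for condition A together with the type of object stored in each one, which is a quick way to see how the features are organized before any analysis.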


 

Machine Learning results 

The features serve as input to a machine-learning model, which generates a classification report. The model uses either the complete set of features (All) or a subset of them (e.g. Local_and_Shape). The results of these model calculations are stored in the ML folder.

File Naming Convention:

Files are named using the format:

            ML/A//on__trained

Where:



indicates which subset of the data has been used

indicates what training set has been used (e.g. itself = same training/testing, 1 = trained on simulations with 1 high-motility cell)


For each condition, 20 independent classification reports are provided.
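As a reminder of what a classification report typically contains, the sketch below computes per-class precision, recall, F1 score, and support from true and predicted labels. It assumes the stored reports follow this common layout, which this description does not confirm. The function name `per_class_report` and the toy labels are invented for illustration and are not taken from the dataset.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Compute precision, recall, F1 and support for every label."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted this label, but it was wrong
            fn[truth] += 1  # missed the true label
    report = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return report


# Toy labels, invented for illustration: 1 = high motility, 0 = low motility.
y_true = [1, 1, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 1]
report = per_class_report(y_true, y_pred)
for label, scores in report.items():
    print(label, scores)
```

Because high-motility cells can be a small minority of the population, the per-class scores are more informative than overall accuracy: a model that labels every cell as low-motility would score well on accuracy while having zero recall for the high-motility class.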

 
Date made available: 19 Jun 2025
Publisher: Zenodo
