TY - JOUR
T1 - The shape of things to come: Topological Data Analysis and biology, from molecules to organisms
AU - Amézquita, Erik
AU - Quigley, Michelle
AU - Ophelders, Tim A.E.
AU - Munch, Elizabeth
AU - Chitwood, Daniel
PY - 2020/7/1
Y1 - 2020/7/1
AB - Shape is data and data is shape. Biologists are accustomed to thinking about how the shape of biomolecules, cells, tissues, and organisms arises from the effects of genetics, development, and the environment. Less often do we consider that data itself has shape and structure, or that it is possible to measure the shape of data and analyze it. Here, we review applications of topological data analysis (TDA) to biology in a way accessible to biologists and applied mathematicians alike. TDA uses principles from algebraic topology to comprehensively measure shape in data sets. Using a function that relates the similarity of data points to each other, we can monitor the evolution of topological features—connected components, loops, and voids. This evolution, a topological signature, concisely summarizes large, complex data sets. We first provide a TDA primer for biologists before exploring the use of TDA across biological sub‐disciplines, spanning structural biology, molecular biology, evolution, and development. We end by comparing and contrasting different TDA approaches and the potential for their use in biology. The vision of TDA, that data are shape and shape is data, will be relevant as biology transitions into a data‐driven era where the meaningful interpretation of large data sets is a limiting factor.
KW - biology
KW - data science
KW - mathematical biology
KW - persistent homology
KW - shape
KW - topological data analysis
UR - http://www.scopus.com/inward/record.url?scp=85083309912&partnerID=8YFLogxK
U2 - 10.1002/dvdy.175
DO - 10.1002/dvdy.175
M3 - Article
C2 - 32246730
SN - 1058-8388
VL - 249
SP - 816
EP - 833
JO - Developmental Dynamics
JF - Developmental Dynamics
IS - 7
ER -