TY - JOUR
T1 - Wasserstein metric for improved quantum machine learning with adjacency matrix representations
AU - Caylak, Onur
AU - von Lilienfeld, O. Anatole
AU - Baumeier, Björn
PY - 2020/9
Y1 - 2020/9
N2 - We study the Wasserstein metric to measure distances between molecules represented by the atom-index-dependent adjacency 'Coulomb' matrix, used in kernel-ridge-regression-based supervised learning. Resulting machine learning models of quantum properties, a.k.a. quantum machine learning models, exhibit improved training efficiency and result in smoother predictions of energies related to molecular distortions. We first illustrate smoothness for the continuous extraction of an atom from some organic molecule. Learning curves, quantifying the decay of the atomization energy's prediction error as a function of training set size, have been obtained for tens of thousands of organic molecules drawn from the QM9 data set. In comparison to conventionally used metrics (L1 and L2 norm), our numerical results indicate systematic improvement in terms of learning curve offset for random as well as sorted (by norms of row) atom indexing in Coulomb matrices. Our findings suggest that this metric corresponds to a favorable similarity measure which introduces index-invariance in any kernel-based model relying on adjacency matrix representations.
AB - We study the Wasserstein metric to measure distances between molecules represented by the atom-index-dependent adjacency 'Coulomb' matrix, used in kernel-ridge-regression-based supervised learning. Resulting machine learning models of quantum properties, a.k.a. quantum machine learning models, exhibit improved training efficiency and result in smoother predictions of energies related to molecular distortions. We first illustrate smoothness for the continuous extraction of an atom from some organic molecule. Learning curves, quantifying the decay of the atomization energy's prediction error as a function of training set size, have been obtained for tens of thousands of organic molecules drawn from the QM9 data set. In comparison to conventionally used metrics (L1 and L2 norm), our numerical results indicate systematic improvement in terms of learning curve offset for random as well as sorted (by norms of row) atom indexing in Coulomb matrices. Our findings suggest that this metric corresponds to a favorable similarity measure which introduces index-invariance in any kernel-based model relying on adjacency matrix representations.
KW - Adjacency Matrix Representation
KW - Atomization Energies
KW - Kernel Ridge Regression
KW - Quantum Machine Learning
KW - Wasserstein metric
UR - http://www.scopus.com/inward/record.url?scp=85097812039&partnerID=8YFLogxK
U2 - 10.1088/2632-2153/aba048
DO - 10.1088/2632-2153/aba048
M3 - Article
SN - 2632-2153
VL - 1
JO - Machine Learning: Science and Technology
JF - Machine Learning: Science and Technology
IS - 3
M1 - 03LT01
ER -