Bidirectional Molecule Generation with Recurrent Neural Networks

Francesca Grisoni, Michael Moret, Robin Lingwood, Gisbert Schneider

Research output: Contribution to journal > Article > Academic > peer-review

Abstract

Recurrent neural networks (RNNs) are able to generate de novo molecular designs using simplified molecular input line entry system (SMILES) string representations of the chemical structure. RNN-based structure generation is usually performed unidirectionally, by growing SMILES strings from left to right. However, there is no natural start or end of a small molecule, and SMILES strings are intrinsically nonunivocal representations of molecular graphs. These properties motivate bidirectional structure generation. Here, bidirectional generative RNNs for SMILES-based molecule design are introduced. To this end, two established bidirectional methods were implemented, and a new method for SMILES string generation and data augmentation is introduced: bidirectional molecule design by alternate learning (BIMODAL). These three bidirectional strategies were compared to the unidirectional forward RNN approach for SMILES string generation in terms of the (i) novelty, (ii) scaffold diversity, and (iii) chemical-biological relevance of the computer-generated molecules. The results support bidirectional strategies for SMILES-based molecular de novo design, with BIMODAL showing superior results to the unidirectional forward RNN for most of the criteria in the tested conditions. The code of the methods and the pretrained models can be found at https://github.com/ETHmodlab/BIMODAL.
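The contrast between left-to-right growth and bidirectional growth can be illustrated with a toy sketch. This is not the BIMODAL implementation from the repository above: `toy_sampler`, the `ALPHABET` token set, and both `grow_*` functions are hypothetical names introduced here, and the sampler is a uniform random stand-in for a trained RNN, so the output strings are generally not valid SMILES. The sketch only shows the control flow of appending tokens on one side versus alternating between both sides of a growing string.

```python
import random
from collections import deque

# Assumption: a tiny illustrative token set, not the vocabulary used in the paper.
ALPHABET = ["C", "N", "O", "c", "1", "(", ")", "="]
END = "E"  # hypothetical end-of-side token


def toy_sampler(context, rng):
    """Hypothetical stand-in for a trained RNN: ignores context, samples uniformly."""
    return rng.choice(ALPHABET + [END])


def grow_forward(max_len, rng):
    """Unidirectional growth: append tokens on the right until END or max_len."""
    tokens = []
    while len(tokens) < max_len:
        tok = toy_sampler(tokens, rng)
        if tok == END:
            break
        tokens.append(tok)
    return "".join(tokens)


def grow_alternating(max_len, rng):
    """Bidirectional growth: alternate between extending the right and left ends."""
    tokens = deque()
    side_open = {"left": True, "right": True}
    side = "right"
    while len(tokens) < max_len and any(side_open.values()):
        if side_open[side]:
            tok = toy_sampler(list(tokens), rng)
            if tok == END:
                side_open[side] = False  # this end is finished
            elif side == "right":
                tokens.append(tok)
            else:
                tokens.appendleft(tok)
        # Switch sides after every step, whether or not a token was added.
        side = "left" if side == "right" else "right"
    return "".join(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    print("forward:    ", grow_forward(20, rng))
    print("alternating:", grow_alternating(20, rng))
```

In a real model the sampler would condition on the tokens generated so far, in both reading directions for the bidirectional case; the uniform sampler here is only a placeholder to keep the sketch self-contained.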

Original language: English
Pages (from-to): 1175-1183
Number of pages: 9
Journal: Journal of Chemical Information and Modeling
Volume: 60
Issue number: 3
Publication status: Published - 23 Mar 2020
Externally published: Yes
