Supporting Information
For the article "CHEMOGRAPHY: SEARCHING FOR HIDDEN TREASURES"

Yuliana Zabolotna, Arkadii Lin, Dragos Horvath, Gilles Marcou, Dmitriy M. Volochnyuk, Alexandre Varnek

Corresponding Author
Prof. A. Varnek, E-mail: varnek@unistra.fr.

Abstract

The days when medicinal chemistry was limited to a few series of compounds of therapeutic interest are long gone. Nowadays, no one can hope to gain a complete overview of the more than a billion existing or feasible compounds, among which potential “blockbuster drugs” are well hidden and yet only a few mouse clicks away. To reach these “hidden treasures”, we adapted Generative Topographic Mapping to enable efficient navigation through chemical space, from a global overview to structural pattern detection, covering, for the first time, the complete ZINC library of purchasable compounds, relative to 1.6 million biologically relevant ChEMBL molecules. About 40 000 hierarchical maps of the chemical space were constructed. Structural motifs inherent to only one library were identified. Roughly 20 000 off-market ChEMBL compound families represent incentives to enrich commercial catalogs. Conversely, 125 000 ZINC-specific compound classes, absent from structure-activity databases, are novel paths to explore in medicinal chemistry.


We are pleased to share the complete list of the above-mentioned ZINC- and ChEMBL-specific chemotypes. The link will be provided after registration below.

Please note that the ChEMBL and ZINC datasets were each split into four subsets (Fragment-Like, Lead-Like, Drug-Like and PPI-Like), and the matching subsets were then compared pairwise. Thus, Fragment-Like ChEMBL-specific chemotypes are absent from Fragment-Like ZINC but may still be present in the unfiltered version of ZINC.
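
The pairwise comparison described above can be illustrated with a minimal sketch. It uses toy data only: the chemotype labels, the subset keys and the helper `specific_chemotypes` are hypothetical and are not taken from the article or its data files, and the union of two toy subsets merely stands in for the unfiltered ZINC library.

```python
# Toy chemotype labels per subset; all names and data here are hypothetical.
chembl = {
    "fragment": {"A", "B"},
    "lead": {"C"},
}
zinc = {
    "fragment": {"A"},
    "lead": {"B", "C"},
}


def specific_chemotypes(source, reference):
    """For each subset, keep chemotypes of `source` absent from the same subset of `reference`."""
    return {name: source[name] - reference.get(name, set()) for name in source}


chembl_specific = specific_chemotypes(chembl, zinc)

# Stand-in for the unfiltered library: the union of all toy ZINC subsets.
zinc_all = set().union(*zinc.values())

# "B" is ChEMBL-specific within the fragment subset,
# yet it still occurs elsewhere in the toy ZINC data.
print(chembl_specific["fragment"], "B" in zinc_all)
```

In this toy case, chemotype "B" counts as Fragment-Like ChEMBL-specific even though it appears in another ZINC subset, which is the situation the note warns about.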


Name *
Email *
Organization
Position
City / Region