Less may be more: an informed reflection on molecular descriptors for drug design and discovery

WRAP-less-may-more-reflection-molecular-descriptors-drug-Sosso-2019.pdf - Accepted Version - Requires a PDF viewer.


Abstract

The phenomenal advances of machine learning in the context of drug design and discovery have led to the development of a plethora of molecular descriptors. In fact, many of these "standard" descriptors are now readily available via open source, easy-to-use computational tools. As a result, it is not uncommon to take advantage of large numbers - up to thousands in some cases - of these descriptors to predict the functional properties of drug-like molecules. This "strength in numbers" approach usually provides excellent flexibility - and thus, good numerical accuracy - to the machine learning framework of choice; however, it suffers from a lack of transparency, in that it becomes very challenging to pinpoint the - usually few - descriptors that play a key role in determining the functional properties of a given molecule. In this work, we show that just a handful of well-tailored molecular descriptors may often be capable of predicting the functional properties of drug-like molecules with an accuracy comparable to that obtained by using hundreds of standard descriptors. In particular, we apply feature selection and genetic algorithms to in-house descriptors we have developed building on junction trees and symmetry functions, respectively. We find that information from as few as 10-20 molecular fragments is often enough to predict even complex biomedical activities with decent accuracy. In addition, we demonstrate that the use of small sets of optimised symmetry functions may pave the way towards the prediction of the physical properties of drugs in their solid phases - a pivotal challenge for the pharmaceutical industry. Thus, this work brings strong arguments in support of using small numbers of selected descriptors to discover the structure-function relation of drug-like molecules - as opposed to blindly leveraging the flexibility of the thousands of molecular descriptors currently available.
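
The idea of reducing a large descriptor pool to a handful of informative ones can be illustrated with a minimal sketch. This is not the authors' method: the abstract names feature selection and genetic algorithms applied to junction-tree and symmetry-function descriptors, whereas the code below swaps in a much simpler univariate correlation filter. The data are synthetic (random numbers standing in for descriptors), and every name in it (`pearson`, `select_top_k`, the choice of 200 descriptors and `k=10`) is an assumption made for illustration only.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no information
    return cov / (var_x * var_y) ** 0.5


def select_top_k(X, y, k):
    """Return the sorted indices of the k columns of X whose absolute
    correlation with the target y is largest."""
    n_features = len(X[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in X]
        scores.append((abs(pearson(column, y)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:k])


# Synthetic stand-in for a descriptor matrix: 300 "molecules" described by
# 200 random "descriptors", of which only three (hypothetical indices)
# actually determine the target property.
rng = random.Random(0)
n_samples, n_features = 300, 200
informative = [3, 57, 120]
X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
y = [
    2.0 * row[3] - 1.5 * row[57] + 1.5 * row[120] + rng.gauss(0, 0.1)
    for row in X
]

# Keep only 10 of the 200 descriptors.
selected = select_top_k(X, y, k=10)
print("selected descriptor indices:", selected)
print("informative descriptors recovered:", set(informative) <= set(selected))
```

A filter of this kind scores each descriptor in isolation, so it cannot capture interactions between descriptors. Wrapper approaches such as genetic algorithms address that limitation by evaluating whole subsets at once.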

Item Type: Journal Article
Subjects: Q Science > Q Science (General)
R Medicine > RS Pharmacy and materia medica
Divisions: Faculty of Science, Engineering and Medicine > Science > Chemistry
Library of Congress Subject Headings (LCSH): Drugs -- Design -- Molecular aspects -- Research, Machine learning -- Research
Journal or Publication Title: Molecular Systems Design & Engineering
Publisher: Royal Society of Chemistry (RSC)
ISSN: 2058-9689
Official Date: 1 January 2020
Dates:
1 January 2020: Published
8 November 2019: Available
7 November 2019: Accepted
Volume: 5
Number: 1
Page Range: pp. 317-329
DOI: 10.1039/C9ME00109C
Status: Peer Reviewed
Publication Status: Published
Access rights to Published version: Restricted or Subscription Access
Date of first compliant deposit: 12 November 2019
Date of first compliant Open Access: 8 November 2020
RIOXX Funder/Project Grant:
Project/Grant ID | RIOXX Funder Name | Funder ID
EP/L015374/1 | [EPSRC] Engineering and Physical Sciences Research Council |
EP/L015374/1 | University of Warwick |
URI: https://wrap.warwick.ac.uk/129507/
