
Insights into performance evaluation of compound–protein interaction prediction methods
Yaseen, Adiba, Amin, Imran, Akhter, Naeem, Ben-Hur, Asa and Minhas, Fayyaz ul Amir Afsar (2022) Insights into performance evaluation of compound–protein interaction prediction methods. Bioinformatics, 38 (Supplement 2). ii75-ii81. doi:10.1093/bioinformatics/btac496 ISSN 1460-2059.
PDF: WRAP-Insights-performance-evaluation-com-pound-protein-interaction-prediction-methods-22.pdf - Accepted Version. Embargoed item: access restricted to Repository staff only until 18 September 2023. Contact the author directly, specifying your needs.
Official URL: https://doi.org/10.1093/bioinformatics/btac496
Abstract
Motivation: Machine-learning-based prediction of compound–protein interactions (CPIs) is important for drug design, screening and repurposing. Despite numerous recent publications with increasing methodological sophistication claiming consistent improvements in predictive accuracy, we have observed a number of fundamental issues in experiment design that produce overoptimistic estimates of model performance.

Results: We systematically analyze the impact of several factors affecting the generalization performance of CPI predictors that are overlooked in existing work: (i) similarity between training and test examples in cross-validation; (ii) synthesizing negative examples in the absence of experimentally verified negative examples; and (iii) alignment of the evaluation protocol and performance metrics with real-world use of CPI predictors in screening large compound libraries. Using both state-of-the-art approaches by other researchers and a simple kernel-based baseline, we have found that effective assessment of the generalization performance of CPI predictors requires careful control over similarity between training and test examples. We show that, under stringent performance assessment protocols, a simple kernel-based approach can exceed the predictive performance of existing state-of-the-art methods. We also show that random pairing for generating synthetic negative examples for training and performance evaluation results in models with better generalization in comparison to more sophisticated strategies used in existing studies. Our analyses indicate that using the proposed experiment design strategies can offer significant improvements for CPI prediction, leading to effective target compound screening for drug repurposing and discovery of putative chemical ligands of SARS-CoV-2-Spike and Human-ACE2 proteins.

Availability and implementation: Code and supplementary material are available at https://github.com/adibayaseen/HKRCPI.

Supplementary information: Supplementary data are available at Bioinformatics online.
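The random-pairing strategy for synthetic negatives mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation (their code is in the repository linked above); the function name `random_pair_negatives`, its parameters, and the toy identifiers are assumptions made for illustration. The sketch simply draws compound and protein identifiers at random from the known positives and keeps pairs that are not already labeled as interacting. Such pairs are presumed negatives, not experimentally verified ones.

```python
import random


def random_pair_negatives(positives, n_negatives, seed=0):
    """Sample presumed-negative (compound, protein) pairs by random pairing.

    Hypothetical helper for illustration only: `positives` is an iterable of
    (compound_id, protein_id) pairs known to interact.
    """
    rng = random.Random(seed)
    positive_set = set(positives)
    # Sort so the result is reproducible for a given seed.
    compounds = sorted({c for c, _ in positive_set})
    proteins = sorted({p for _, p in positive_set})

    # Guard against asking for more unlabeled pairs than exist.
    max_possible = len(compounds) * len(proteins) - len(positive_set)
    if n_negatives > max_possible:
        raise ValueError("not enough unlabeled pairs to sample from")

    negatives = set()
    while len(negatives) < n_negatives:
        pair = (rng.choice(compounds), rng.choice(proteins))
        if pair not in positive_set:  # never relabel a known positive
            negatives.add(pair)
    return sorted(negatives)


# Toy data: identifiers are placeholders, not real compounds or proteins.
known = [("c1", "p1"), ("c2", "p2"), ("c3", "p3")]
sampled = random_pair_negatives(known, n_negatives=4)
```

In a real evaluation the negatives would be generated separately for training and test folds, so that the split also respects the similarity controls the abstract calls for.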
| Item Type: | Journal Article |
|---|---|
| Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software; R Medicine > RS Pharmacy and materia medica; T Technology > TA Engineering (General). Civil engineering (General) |
| Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science |
| SWORD Depositor: | Library Publications Router |
| Library of Congress Subject Headings (LCSH): | Protein-protein interactions, Inorganic compounds, Deep learning (Machine learning), Drugs -- Design |
| Journal or Publication Title: | Bioinformatics |
| Publisher: | Oxford University Press (OUP) |
| ISSN: | 1460-2059 |
| Official Date: | 18 September 2022 |
| Volume: | 38 |
| Number: | Supplement 2 |
| Page Range: | ii75-ii81 |
| DOI: | 10.1093/bioinformatics/btac496 |
| Status: | Peer Reviewed |
| Publication Status: | Published |
| Reuse Statement (publisher, data, author rights): | This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Adiba Yaseen, Imran Amin, Naeem Akhter, Asa Ben-Hur, Fayyaz Minhas, Insights into performance evaluation of compound–protein interaction prediction methods, Bioinformatics, Volume 38, Issue Supplement_2, September 2022, Pages ii75–ii81 is available online at: https://doi.org/10.1093/bioinformatics/btac496 |
| Access rights to Published version: | Open Access (Creative Commons) |
| Description: | Free access |
| Date of first compliant deposit: | 27 October 2022 |