University of Warwick
Publications service & WRAP

Insights into performance evaluation of compound–protein interaction prediction methods

Yaseen, Adiba, Amin, Imran, Akhter, Naeem, Ben-Hur, Asa and Minhas, Fayyaz ul Amir Afsar (2022) Insights into performance evaluation of compound–protein interaction prediction methods. Bioinformatics, 38 (Supplement 2). ii75-ii81. doi:10.1093/bioinformatics/btac496 ISSN 1460-2059.

PDF: WRAP-Insights-performance-evaluation-com-pound-protein-interaction-prediction-methods-22.pdf - Accepted Version (716Kb)
Embargoed item. Restricted access to Repository staff only until 18 September 2023. Contact author directly, specifying your specific needs.

Official URL: https://doi.org/10.1093/bioinformatics/btac496

Abstract

Motivation: Machine-learning-based prediction of compound–protein interactions (CPIs) is important for drug design, screening and repurposing. Despite numerous recent publications with increasing methodological sophistication claiming consistent improvements in predictive accuracy, we have observed a number of fundamental issues in experiment design that produce overoptimistic estimates of model performance.

Results: We systematically analyze the impact of several factors affecting generalization performance of CPI predictors that are overlooked in existing work: (i) similarity between training and test examples in cross-validation; (ii) synthesizing negative examples in the absence of experimentally verified negative examples; and (iii) alignment of evaluation protocol and performance metrics with real-world use of CPI predictors in screening large compound libraries. Using both state-of-the-art approaches by other researchers and a simple kernel-based baseline, we have found that effective assessment of generalization performance of CPI predictors requires careful control over similarity between training and test examples. We show that, under stringent performance assessment protocols, a simple kernel-based approach can exceed the predictive performance of existing state-of-the-art methods. We also show that random pairing for generating synthetic negative examples for training and performance evaluation results in models with better generalization in comparison to more sophisticated strategies used in existing studies. Our analyses indicate that using the proposed experiment design strategies can offer significant improvements for CPI prediction, leading to effective target compound screening for drug repurposing and discovery of putative chemical ligands of SARS-CoV-2-Spike and Human-ACE2 proteins.

Availability and implementation: Code and supplementary material are available at https://github.com/adibayaseen/HKRCPI.

Supplementary information: Supplementary data are available at Bioinformatics online.
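
Two of the experiment-design points in the abstract, random pairing to synthesize negative examples and control over similarity between training and test examples, can be illustrated with a minimal Python sketch. This is not the authors' implementation (their code is at the GitHub link above). The function names, the toy identifiers and the idea of a precomputed protein-to-cluster mapping are all assumptions made here for illustration only.

```python
import random
from typing import Dict, List, Set, Tuple

Pair = Tuple[str, str]  # (compound_id, protein_id); hypothetical identifiers


def random_negative_pairs(positives: List[Pair], n_negatives: int,
                          seed: int = 0) -> List[Pair]:
    """Sample compound-protein pairs that are not among the known positives.

    Assumption: any unlabeled pair is treated as a synthetic negative.
    """
    rng = random.Random(seed)
    known: Set[Pair] = set(positives)
    compounds = sorted({c for c, _ in positives})
    proteins = sorted({p for _, p in positives})
    max_possible = len(compounds) * len(proteins) - len(known)
    if n_negatives > max_possible:
        raise ValueError("not enough unlabeled pairs to sample from")
    negatives: Set[Pair] = set()
    while len(negatives) < n_negatives:
        pair = (rng.choice(compounds), rng.choice(proteins))
        if pair not in known:
            negatives.add(pair)
    return sorted(negatives)


def group_holdout_split(pairs: List[Pair], protein_cluster: Dict[str, str],
                        test_clusters: Set[str]) -> Tuple[List[Pair], List[Pair]]:
    """Split pairs so that no protein cluster appears on both sides.

    Assumption: protein_cluster maps each protein to a similarity cluster
    computed elsewhere (for example by sequence identity).
    """
    train = [p for p in pairs if protein_cluster[p[1]] not in test_clusters]
    test = [p for p in pairs if protein_cluster[p[1]] in test_clusters]
    return train, test


if __name__ == "__main__":
    positives = [("c1", "p1"), ("c2", "p2"), ("c3", "p3")]
    negatives = random_negative_pairs(positives, n_negatives=3, seed=1)
    clusters = {"p1": "A", "p2": "A", "p3": "B"}
    train, test = group_holdout_split(positives + negatives, clusters, {"B"})
    print(len(train), len(test))
```

Holding out whole clusters, rather than individual pairs, is one simple way to keep near-duplicate proteins from appearing in both training and test data. The same grouping idea could be applied to compounds.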

Item Type: Journal Article
Subjects: Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software
R Medicine > RS Pharmacy and materia medica
T Technology > TA Engineering (General). Civil engineering (General)
Divisions: Faculty of Science, Engineering and Medicine > Science > Computer Science
SWORD Depositor: Library Publications Router
Library of Congress Subject Headings (LCSH): Protein-protein interactions, Inorganic compounds, Deep learning (Machine learning), Drugs -- Design
Journal or Publication Title: Bioinformatics
Publisher: Oxford University Press (OUP)
ISSN: 1460-2059
Official Date: 18 September 2022
Dates:
Date               Event
18 September 2022  Published
Volume: 38
Number: Supplement 2
Page Range: ii75-ii81
DOI: 10.1093/bioinformatics/btac496
Status: Peer Reviewed
Publication Status: Published
Reuse Statement (publisher, data, author rights): This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Adiba Yaseen, Imran Amin, Naeem Akhter, Asa Ben-Hur, Fayyaz Minhas, Insights into performance evaluation of compound–protein interaction prediction methods, Bioinformatics, Volume 38, Issue Supplement_2, September 2022, Pages ii75–ii81 is available online at: https://doi.org/10.1093/bioinformatics/btac496
Access rights to Published version: Open Access (Creative Commons)
Description: Free access

Date of first compliant deposit: 27 October 2022
RIOXX Funder/Project Grant:
Project/Grant ID   RIOXX Funder Name                       Funder ID
NRPU 6085          Higher Education Commission, Pakistan   http://dx.doi.org/10.13039/501100004681
PathLAKE           University of Warwick                   http://dx.doi.org/10.13039/501100000741
Related URLs:
  • https://academic.oup.com/journals/pages/...
Open Access Version:
  • Publisher
