
Issues in performance evaluation for host–pathogen protein interaction prediction
Abbasi, Wajid Arshad and Minhas, Fayyaz ul Amir Afsar (2016) Issues in performance evaluation for host–pathogen protein interaction prediction. Journal of Bioinformatics and Computational Biology, 14 (3). 1650011. doi:10.1142/S0219720016500116 ISSN 0219-7200.
PDF: WRAP-issues-performance-evaluation-host-pathogen-protein-prediction-Minhas-2016.pdf (Submitted Version, 1291Kb)
Official URL: http://dx.doi.org/10.1142/S0219720016500116
Abstract
The study of interactions between host and pathogen proteins is important for understanding the underlying mechanisms of infectious diseases and for developing novel therapeutic solutions. Wet-lab techniques for detecting protein–protein interactions (PPIs) can benefit from computational predictions. Machine learning is one of the computational approaches that can assist biologists by predicting promising PPIs. A number of machine learning based methods for predicting host–pathogen interactions (HPI) have been proposed in the literature. The techniques used for assessing the accuracy of such predictors are of critical importance in this domain. In this paper, we question the effectiveness of K-fold cross-validation for estimating the generalization ability of HPI prediction for proteins with no known interactions. K-fold cross-validation does not model this scenario, and we demonstrate a sizable difference between its performance and the performance of an alternative evaluation scheme called leave one pathogen protein out (LOPO) cross-validation. LOPO is more effective in modeling the real world use of HPI predictors, specifically for cases in which no information about the interacting partners of a pathogen protein is available during training. We also point out that currently used metrics such as areas under the precision-recall or receiver operating characteristic curves are not intuitive to biologists and propose simpler and more directly interpretable metrics for this purpose.
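The contrast the abstract draws between the two evaluation schemes can be illustrated with a minimal sketch. It is not the authors' implementation: the toy pairs, the `(host, pathogen, label)` tuple layout, and the function names `lopo_splits`, `kfold_splits`, and `shared_pathogens` are all assumptions made for illustration, and the K-fold variant assigns examples to folds round-robin without shuffling so that the result is deterministic. The sketch only checks whether any pathogen protein in a test split also appears in the matching training split.

```python
from collections import defaultdict


def lopo_splits(pairs):
    """Leave one pathogen protein out: each split holds out every pair
    that involves a single pathogen protein (hypothetical helper)."""
    by_pathogen = defaultdict(list)
    for i, (_host, pathogen, _label) in enumerate(pairs):
        by_pathogen[pathogen].append(i)
    for pathogen in sorted(by_pathogen):
        test_idx = by_pathogen[pathogen]
        held_out = set(test_idx)
        train_idx = [i for i in range(len(pairs)) if i not in held_out]
        yield pathogen, train_idx, test_idx


def kfold_splits(n, k):
    """Plain K-fold over example indices, ignoring which pathogen protein
    each example belongs to. Folds are assigned round-robin (no shuffle)."""
    folds = [list(range(f, n, k)) for f in range(k)]
    for f in range(k):
        test_idx = folds[f]
        train_idx = [i for g in range(k) if g != f for i in folds[g]]
        yield train_idx, test_idx


def shared_pathogens(pairs, train_idx, test_idx):
    """Pathogen proteins that occur in both the training and test split."""
    train_p = {pairs[i][1] for i in train_idx}
    test_p = {pairs[i][1] for i in test_idx}
    return train_p & test_p


# Toy, made-up data: (host protein, pathogen protein, interaction label).
pairs = [
    ("H1", "P1", 1), ("H2", "P1", 0), ("H3", "P1", 1),
    ("H1", "P2", 0), ("H2", "P2", 1), ("H3", "P2", 0),
    ("H1", "P3", 1), ("H2", "P3", 0), ("H3", "P3", 0),
]

lopo = list(lopo_splits(pairs))
lopo_leaks = [shared_pathogens(pairs, tr, te) for _, tr, te in lopo]
kfold_leaks = [shared_pathogens(pairs, tr, te)
               for tr, te in kfold_splits(len(pairs), k=3)]

print("LOPO overlap per split:  ", lopo_leaks)
print("K-fold overlap per split:", kfold_leaks)
```

On this toy data every LOPO test split contains a pathogen protein the model has never seen during training, whereas every K-fold test split shares all of its pathogen proteins with the training split. That overlap is the reason K-fold does not model prediction for proteins with no known interactions.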
Item Type: | Journal Article |
---|---|
Subjects: | Q Science > QA Mathematics > QA76 Electronic computers. Computer science. Computer software; Q Science > QL Zoology; Q Science > QP Physiology |
Divisions: | Faculty of Science, Engineering and Medicine > Science > Computer Science |
Library of Congress Subject Headings (LCSH): | Protein-protein interactions, Protein-protein interactions -- Data processing, Host-parasite relationships, Host-parasite relationships -- Data processing, Machine learning |
Journal or Publication Title: | Journal of Bioinformatics and Computational Biology |
Publisher: | World Scientific Publishing |
ISSN: | 0219-7200 |
Official Date: | June 2016 |
Volume: | 14 |
Number: | 3 |
Article Number: | 1650011 |
DOI: | 10.1142/S0219720016500116 |
Status: | Peer Reviewed |
Publication Status: | Published |
Reuse Statement (publisher, data, author rights): | Electronic version of an article published as Journal of Bioinformatics and Computational Biology, 14 (3). 1650011. doi:10.1142/S0219720016500116 © copyright World Scientific Publishing Company https://www.worldscientific.com/worldscinet/jbcb |
Access rights to Published version: | Restricted or Subscription Access |