
NeuRiPP : neural network identification of RiPP precursor peptides
de los Santos, Emmanuel L. C. (2019) NeuRiPP : neural network identification of RiPP precursor peptides. Scientific Reports, 9 (1). 13406. doi:10.1038/s41598-019-49764-z ISSN 2045-2322.
Full text (PDF): WRAP-NeuRiPP-neural-network-identification-RiPP-precursor-peptides-delosSantos-2019.pdf - Published Version. Available under License Creative Commons Attribution 4.0.
Official URL: http://dx.doi.org/10.1038/s41598-019-49764-z
Abstract
Significant progress has been made in the past few years on the computational identification of biosynthetic gene clusters (BGCs) that encode ribosomally synthesized and post-translationally modified peptides (RiPPs). This is done by identifying both RiPP tailoring enzymes (RTEs) and RiPP precursor peptides (PPs). However, identification of PPs, particularly for novel RiPP classes, remains challenging. To address this, machine learning has been used to accurately identify PP sequences. Current machine learning tools have limitations, since they are specific to the RiPP class they are trained for and are context-dependent, requiring information about the surrounding genetic environment of the putative PP sequences. NeuRiPP overcomes these limitations. It does this by leveraging the rich data set of high-confidence putative PP sequences from existing programs, along with experimentally verified PPs from RiPP databases. NeuRiPP uses neural network architectures that are suitable for peptide classification with weights trained on PP datasets. It is able to identify known PP sequences, and sequences that are likely PPs. When tested on existing RiPP BGC datasets, NeuRiPP was able to identify PP sequences in significantly more putative RiPP clusters than current tools while maintaining the same HMM hit accuracy. Finally, NeuRiPP was able to successfully identify PP sequences from novel RiPP classes that were recently characterized experimentally, highlighting its utility in complementing existing bioinformatics tools.
| Item Type: | Journal Article |
|---|---|
| Subjects: | Q Science > QD Chemistry; Q Science > QP Physiology |
| Divisions: | Faculty of Science, Engineering and Medicine > Science > Life Sciences (2010- ) |
| Library of Congress Subject Headings (LCSH): | Peptides, Information storage and retrieval systems -- Nucleotide sequence |
| Journal or Publication Title: | Scientific Reports |
| Publisher: | Nature Publishing Group |
| ISSN: | 2045-2322 |
| Official Date: | 16 September 2019 |
| Volume: | 9 |
| Number: | 1 |
| Article Number: | 13406 |
| DOI: | 10.1038/s41598-019-49764-z |
| Status: | Peer Reviewed |
| Publication Status: | Published |
| Access rights to Published version: | Open Access (Creative Commons) |
| Date of first compliant deposit: | 18 September 2019 |
| Date of first compliant Open Access: | 20 September 2019 |