PepNet: A Fully Convolutional Neural Network for De novo Peptide Sequencing

ResearchSquare 2022  ·  Kaiyuan Liu, Yuzhen Ye, Haixu Tang

De novo peptide sequencing, which does not rely on a comprehensive target sequence database, provides a way to identify novel peptides from tandem mass (MS/MS) spectra. However, current de novo sequencing algorithms suffer from low accuracy and coverage, which hinders their application in proteomics. In this paper, we present PepNet, a fully convolutional neural network (CNN) for high-accuracy de novo peptide sequencing. It takes an MS/MS spectrum (represented as a high-dimensional vector) as input and outputs the optimal peptide sequence along with its confidence score. Our model was trained on a total of 30 million high-energy collisional dissociation (HCD) MS/MS spectra from multiple human peptide spectral libraries. The evaluation results show that PepNet significantly outperformed the current best-performing de novo sequencing algorithms (e.g., PointNovo and DeepNovo) in both peptide-level and positional-level accuracy. In addition, PepNet can sequence a large fraction of the spectra that were not identified by database search engines, and thus could be used as a complement to database search engines for peptide identification in proteomics.
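The abstract says only that the input spectrum is "represented as a high dimensional vector"; it does not describe the encoding. As a rough illustration of one common way to do this, the sketch below bins a peak list of (m/z, intensity) pairs into a fixed-length vector. The function name `spectrum_to_vector`, the 0.1 m/z bin width, the 2000 m/z upper limit, the keep-the-strongest-peak rule, and the base-peak normalization are all assumptions chosen for the example and are not taken from PepNet.

```python
def spectrum_to_vector(peaks, max_mz=2000.0, bin_width=0.1):
    """Bin (m/z, intensity) peaks into a fixed-length intensity vector.

    Hypothetical helper: bin width, m/z range, and normalization are
    illustrative assumptions, not PepNet's documented preprocessing.
    """
    n_bins = int(round(max_mz / bin_width))
    vector = [0.0] * n_bins

    for mz, intensity in peaks:
        if mz < 0 or intensity <= 0:
            continue  # ignore invalid or empty peaks
        idx = int(mz / bin_width)
        if idx >= n_bins:
            continue  # drop peaks beyond the assumed m/z range
        # When several peaks share a bin, keep the strongest one.
        vector[idx] = max(vector[idx], intensity)

    # Scale so the most intense bin (the base peak) equals 1.0.
    top = max(vector)
    if top > 0:
        vector = [v / top for v in vector]
    return vector


# Toy peak list: two peaks share a bin, one lies outside the range.
example_peaks = [(175.119, 50.0), (175.15, 20.0), (500.27, 100.0), (2500.0, 10.0)]
example_vector = spectrum_to_vector(example_peaks)
```

With these assumed settings every spectrum maps to a vector of 20,000 entries, most of them zero, which is the kind of fixed-size input a convolutional network can consume regardless of how many peaks the original spectrum contained.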
