Abstract
Prediction of protein cleavage sites is an important step in drug design. Recent research has demonstrated that conditional random fields (CRFs) are capable of predicting the cleavage site locations of signal peptides, and that their performance is comparable to that of SignalP, a state-of-the-art predictor based on hidden Markov models and neural networks. This paper investigates the degree of complementarity between CRF-based predictors and SignalP and proposes exploiting that complementarity to fuse the two predictors. It was found that about 40% of the sequences incorrectly predicted by SignalP can be correctly predicted by the CRF, and that about 30% of the sequences incorrectly predicted by the CRF can be correctly predicted by SignalP. This suggests that the two predictors complement each other. The paper also shows that the performance of the CRF can be further improved by constructing the state features from spatially dispersed amino acids in the training sequences.
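The complementarity figures quoted in the abstract (the share of one predictor's errors that the other predictor gets right) can be illustrated with a minimal Python sketch. The function name `rescue_rate` and the toy cleavage-site positions are hypothetical and chosen only for illustration; they are not taken from the paper, and the sketch does not reproduce the paper's fusion method.

```python
from typing import Sequence


def rescue_rate(truth: Sequence[int],
                pred_a: Sequence[int],
                pred_b: Sequence[int]) -> float:
    """Fraction of sequences that predictor A gets wrong but predictor B gets right.

    Each element is the cleavage-site position for one sequence.
    """
    if not (len(truth) == len(pred_a) == len(pred_b)):
        raise ValueError("all inputs must have the same length")

    # Indices of sequences where predictor A misses the true cleavage site.
    a_wrong = [i for i, (t, a) in enumerate(zip(truth, pred_a)) if a != t]
    if not a_wrong:
        return 0.0

    # Of those, count the sequences where predictor B hits the true site.
    rescued = sum(1 for i in a_wrong if pred_b[i] == truth[i])
    return rescued / len(a_wrong)


# Hypothetical toy data: true and predicted cleavage-site positions.
truth   = [22, 18, 25, 30, 19, 21, 24, 27]
signalp = [22, 17, 25, 28, 19, 20, 24, 26]  # illustrative only
crf     = [22, 18, 24, 28, 19, 21, 23, 26]  # illustrative only

signalp_errors_fixed_by_crf = rescue_rate(truth, signalp, crf)
crf_errors_fixed_by_signalp = rescue_rate(truth, crf, signalp)
print(signalp_errors_fixed_by_crf, crf_errors_fixed_by_signalp)
```

A non-zero rate in both directions is what the abstract describes as the two predictors complementing each other, which is the motivation for fusing them.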
Original language | English (US) |
---|---|
Pages | 716-721 |
Number of pages | 6 |
State | Published - 2009 |
Event | Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference, APSIPA ASC 2009 - Sapporo, Japan; Duration: Oct 4 2009 → Oct 7 2009 |
Other
Other | Asia-Pacific Signal and Information Processing Association 2009 Annual Summit and Conference, APSIPA ASC 2009 |
---|---|
Country/Territory | Japan |
City | Sapporo |
Period | 10/4/09 → 10/7/09 |
All Science Journal Classification (ASJC) codes
- Computer Networks and Communications
- Information Systems
- Electrical and Electronic Engineering
- Communication
Keywords
- Cleavage sites
- Conditional random fields
- Discriminative models
- Protein sequences
- Signal peptides