Truncation of protein sequences for fast profile alignment with application to subcellular localization

Man Wai Mak, Wei Wang, Sun Yuan Kung

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

We have recently found that the computation time of homology-based subcellular localization can be substantially reduced by aligning profiles only up to the cleavage site positions of signal peptides, mitochondrial targeting peptides, and chloroplast transit peptides [1]. While that method can reduce the profile alignment time by as much as a factor of 20, it cannot reduce the time spent creating the profiles. In this paper, we propose a new approach that reduces both the profile creation time and the profile alignment time. Instead of cutting the profiles, the new approach shortens the sequences by cutting them at the cleavage site locations. The shortened sequences are then presented to PSI-BLAST to compute the profiles. Experimental results and analysis of profile-alignment score matrices suggest that both profile creation time and profile alignment time can be reduced without sacrificing subcellular localization accuracy. Once a pairwise profile-alignment score matrix has been obtained, a one-vs-rest SVM classifier can be trained. To further reduce the training and recognition time of the classifier, we propose a perturbation discriminant analysis (PDA) technique. We found that PDA requires less training time than the conventional SVM.
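The truncation step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that cleavage sites are given as 1-based positions, that the N-terminal part up to and including that position is the part kept, and that a sequence with no usable site is left at full length. The function names (`truncate_at_cleavage_site`, `truncate_all`, `alignment_cells`) are hypothetical, and the cell count is only a rough proxy for pairwise alignment cost, assuming a dynamic-programming alignment whose work grows with the product of the two lengths.

```python
from typing import Dict


def truncate_at_cleavage_site(sequence: str, cleavage_site: int) -> str:
    """Keep residues 1..cleavage_site (1-based, inclusive); assumed convention."""
    # Assumption: a missing or out-of-range site means the sequence is not cut.
    if cleavage_site <= 0 or cleavage_site >= len(sequence):
        return sequence
    return sequence[:cleavage_site]


def truncate_all(sequences: Dict[str, str],
                 cleavage_sites: Dict[str, int]) -> Dict[str, str]:
    """Truncate every sequence; names without a site are kept whole."""
    return {
        name: truncate_at_cleavage_site(seq, cleavage_sites.get(name, 0))
        for name, seq in sequences.items()
    }


def alignment_cells(len_a: int, len_b: int) -> int:
    """Rough cost proxy: number of cells in a pairwise alignment table."""
    return len_a * len_b


if __name__ == "__main__":
    # Toy sequences and cleavage positions, invented for illustration only.
    seqs = {"protA": "MKTAYIAKQRQISFVK", "protB": "MLSRAVCGTSRQ"}
    sites = {"protA": 6, "protB": 4}
    short = truncate_all(seqs, sites)
    full_cost = alignment_cells(len(seqs["protA"]), len(seqs["protB"]))
    short_cost = alignment_cells(len(short["protA"]), len(short["protB"]))
    print(short, full_cost, short_cost)
```

In the workflow the abstract describes, the shortened sequences, rather than the full-length ones, would then be passed to PSI-BLAST, so the saving applies to profile creation as well as to alignment.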

Original language: English (US)
Title of host publication: Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
Pages: 115-120
Number of pages: 6
State: Published - 2010
Event: 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010 - Hong Kong, China
Duration: Dec 18 2010 → Dec 21 2010

Publication series

Name: Proceedings - 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010

Other

Other: 2010 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2010
Country/Territory: China
City: Hong Kong
Period: 12/18/10 → 12/21/10

All Science Journal Classification (ASJC) codes

  • Biomedical Engineering
  • Health Informatics

Keywords

  • Cleavage sites prediction
  • Kernel discriminant analysis
  • Profiles alignment
  • Protein sequences
  • SVM
  • Subcellular localization
