Transductive Learning for Multi-Label Protein Subchloroplast Localization Prediction

Shibiao Wan, Man Wai Mak, Sun Yuan Kung

Research output: Contribution to journal › Article › peer-review

37 Scopus citations

Abstract

Predicting the localization of chloroplast proteins at the sub-subcellular level is an essential yet challenging step toward elucidating their functions. Most existing subchloroplast localization predictors are limited to single-location proteins and ignore multi-location chloroplast proteins. While recent studies have produced some multi-location chloroplast predictors, they usually perform poorly. This paper proposes an ensemble transductive learning method to tackle this multi-label classification problem. Specifically, given a protein in a dataset, its composition-based sequence information and profile-based evolutionary information are extracted. Each of these two kinds of features is compared with those of the other proteins in the dataset. The comparisons yield two similarity vectors, which are combined by weighting to form an ensemble feature vector. A transductive learning model based on the least squares and nearest neighbor algorithms is proposed to process the ensemble features. We refer to the resulting predictor as EnTrans-Chlo. Experimental results on a stringent benchmark dataset and a novel dataset demonstrate that EnTrans-Chlo significantly outperforms state-of-the-art predictors and, in particular, gains more than 4 percent (absolute) improvement in overall actual accuracy. For readers' convenience, EnTrans-Chlo is freely available online at http://bioinfo.eie.polyu.edu.hk/EnTransChloServer/.
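
The abstract gives no formulas, so the following is only a minimal sketch of the general idea it describes: two similarity matrices are combined by a weight, and a least-squares model maps the combined features to multiple location labels. It is not the EnTrans-Chlo implementation. The function names, the `weight`, `ridge`, and `threshold` parameters, the ridge regularization, and the rule that every protein receives at least one location are all assumptions made for illustration, and the nearest neighbor component mentioned in the abstract is not modeled here.

```python
import numpy as np


def ensemble_similarity(sim_seq, sim_profile, weight=0.5):
    """Weighted combination of two similarity matrices (one row per protein).

    `weight` is a hypothetical mixing parameter in [0, 1].
    """
    sim_seq = np.asarray(sim_seq, dtype=float)
    sim_profile = np.asarray(sim_profile, dtype=float)
    if sim_seq.shape != sim_profile.shape:
        raise ValueError("similarity matrices must have the same shape")
    return weight * sim_seq + (1.0 - weight) * sim_profile


def fit_least_squares(features, labels, ridge=1e-3):
    """Ridge-regularized least squares: W = (X^T X + ridge * I)^{-1} X^T Y.

    `labels` is a binary matrix with one column per subchloroplast location.
    """
    X = np.asarray(features, dtype=float)
    Y = np.asarray(labels, dtype=float)
    gram = X.T @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(gram, X.T @ Y)


def predict_multilabel(features, weights, threshold=0.5):
    """Threshold the least-squares scores to obtain a multi-label prediction."""
    scores = np.asarray(features, dtype=float) @ weights
    predicted = scores >= threshold
    # Assumption: every protein is assigned at least its top-scoring location.
    predicted[np.arange(scores.shape[0]), scores.argmax(axis=1)] = True
    return predicted.astype(int)
```

A toy run would build the two similarity matrices for all proteins in the dataset, call `ensemble_similarity`, fit on the rows with known labels, and call `predict_multilabel` on the remaining rows. Because the similarity features are computed against the whole dataset, unlabeled proteins influence the feature space, which is the sense in which such a setup is transductive.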

Original language: English (US)
Article number: 7401011
Pages (from-to): 212-224
Number of pages: 13
Journal: IEEE/ACM Transactions on Computational Biology and Bioinformatics
Volume: 14
Issue number: 1
State: Published - Jan 1 2017

All Science Journal Classification (ASJC) codes

  • Applied Mathematics
  • Genetics
  • Biotechnology

Keywords

  • Protein subchloroplast localization prediction
  • ensemble transductive learning
  • multi-label classification
  • profile alignment
