mPLR-Loc: An adaptive decision multi-label classifier based on penalized logistic regression for protein subcellular localization prediction

Shibiao Wan, Man Wai Mak, Sun Yuan Kung

Research output: Contribution to journal › Article › peer-review

49 Scopus citations

Abstract

Proteins located in appropriate cellular compartments are of paramount importance to exert their biological functions. Prediction of protein subcellular localization by computational methods is required in the post-genomic era. Recent studies have been focusing on predicting not only single-location proteins but also multi-location proteins. However, most of the existing predictors are far from effective for tackling the challenges of multi-label proteins. This article proposes an efficient multi-label predictor, namely mPLR-Loc, based on penalized logistic regression and adaptive decisions for predicting both single- and multi-location proteins. Specifically, for each query protein, mPLR-Loc exploits the information from the Gene Ontology (GO) database by using its accession number (AC) or the ACs of its homologs obtained via BLAST. The frequencies of GO occurrences are used to construct feature vectors, which are then classified by an adaptive decision-based multi-label penalized logistic regression classifier. Experimental results based on two recent stringent benchmark datasets (virus and plant) show that mPLR-Loc remarkably outperforms existing state-of-the-art multi-label predictors. In addition to being able to rapidly and accurately predict subcellular localization of single- and multi-label proteins, mPLR-Loc can also provide probabilistic confidence scores for the prediction decisions. For readers' convenience, the mPLR-Loc server is available online (http://bioinfo.eie.polyu.edu.hk/mPLRLocServer).
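The pipeline outlined in the abstract (GO-term frequency vectors, one penalized logistic regression per location, then an adaptive decision) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy GO terms, labels, and training proteins are hypothetical, the L2 penalty weight `rho` and the gradient-descent trainer are stand-ins, and the decision rule (keep every label whose score is at least `theta` times the top score) is only one plausible reading of "adaptive decision", since the abstract does not spell the rule out. It is not the authors' implementation.

```python
import math
from collections import Counter

# Hypothetical toy data: three GO terms and three location labels.
VOCAB = ["GO:0005634", "GO:0005737", "GO:0005886"]
LABELS = ["nucleus", "cytoplasm", "plasma membrane"]


def go_frequency_vector(go_terms, vocabulary):
    """Count how often each vocabulary GO term occurs for one protein."""
    counts = Counter(go_terms)
    return [float(counts[term]) for term in vocabulary]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_penalized_lr(X, y, rho=0.01, step=0.5, epochs=500):
    """Fit one binary logistic regression with an L2 penalty (weight rho)
    on the weights, using plain batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [rho * wj for wj in w]  # gradient of (rho / 2) * ||w||^2
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - step * gj for wj, gj in zip(w, grad_w)]
        b -= step * grad_b
    return w, b


def predict_scores(models, x):
    """Return one probability-like confidence score per location."""
    return [sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)
            for w, b in models]


def adaptive_decision(scores, theta=0.5):
    """Assumed rule: keep each label scoring at least theta * top score.
    For 0 <= theta <= 1 the top-scoring label is always kept."""
    top = max(scores)
    return [m for m, s in enumerate(scores) if s >= theta * top]


# Hypothetical training proteins: (GO terms found, set of label indices).
train = [
    (["GO:0005634", "GO:0005634"], {0}),
    (["GO:0005737"], {1}),
    (["GO:0005634", "GO:0005737"], {0, 1}),
    (["GO:0005886"], {2}),
]
X = [go_frequency_vector(terms, VOCAB) for terms, _ in train]

# One-vs-rest: train a separate penalized classifier for each location.
models = [
    train_penalized_lr(X, [1.0 if m in labs else 0.0 for _, labs in train])
    for m in range(len(LABELS))
]

query = go_frequency_vector(["GO:0005886"], VOCAB)
scores = predict_scores(models, query)
predicted = adaptive_decision(scores, theta=0.5)
print([(LABELS[m], round(scores[m], 3)) for m in predicted])
```

Because the threshold scales with the top score, a query always receives at least one location, and additional locations are reported only when their scores are comparable to the best one. In practice the GO terms would come from looking up the protein's accession number or those of its BLAST homologs, which this sketch does not attempt.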

Original language: English (US)
Pages (from-to): 14-27
Number of pages: 14
Journal: Analytical Biochemistry
Volume: 473
State: Published - Mar 15 2015

All Science Journal Classification (ASJC) codes

  • Molecular Biology
  • Biophysics
  • Biochemistry
  • Cell Biology

Keywords

  • Adaptive decision
  • Logistic regression
  • Multi-label classification
  • Multi-location proteins
  • Protein subcellular localization
