TY - JOUR
T1 - Ensemble Linear Neighborhood Propagation for Predicting Subchloroplast Localization of Multi-Location Proteins
AU - Wan, Shibiao
AU - Mak, Man Wai
AU - Kung, Sun Yuan
N1 - Publisher Copyright:
© 2016 American Chemical Society.
PY - 2016/12/2
Y1 - 2016/12/2
N2 - In the postgenomic era, the number of unreviewed protein sequences is remarkably larger and grows tremendously faster than that of reviewed ones. However, existing methods for protein subchloroplast localization often ignore the information from these unlabeled proteins. This paper proposes a multi-label predictor based on ensemble linear neighborhood propagation (LNP), namely, LNP-Chlo, which leverages hybrid sequence-based feature information from both labeled and unlabeled proteins for predicting localization of both single- and multi-label chloroplast proteins. Experimental results on a stringent benchmark dataset and a novel independent dataset suggest that LNP-Chlo performs at least 6% (absolute) better than state-of-the-art predictors. This paper also demonstrates that ensemble LNP significantly outperforms LNP based on individual features. For readers' convenience, the online Web server LNP-Chlo is freely available at http://bioinfo.eie.polyu.edu.hk/LNPChloServer/.
KW - linear neighborhood propagation
KW - multi-label classification
KW - protein subchloroplast localization
KW - split amino-acid composition
KW - transductive learning
UR - http://www.scopus.com/inward/record.url?scp=85000360223&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85000360223&partnerID=8YFLogxK
U2 - 10.1021/acs.jproteome.6b00686
DO - 10.1021/acs.jproteome.6b00686
M3 - Article
C2 - 27766879
AN - SCOPUS:85000360223
SN - 1535-3893
VL - 15
SP - 4755
EP - 4762
JO - Journal of Proteome Research
JF - Journal of Proteome Research
IS - 12
ER -