Bacteria have a highly organized internal architecture at the cellular level. Identifying the subcellular localization of bacterial proteins is vital for inferring their functions and designing antibacterial drugs. Recent decades have witnessed remarkable progress in predicting bacterial protein subcellular localization with computational approaches. However, existing computational approaches have two disadvantages: (1) their prediction results are hard to interpret; and (2) they ignore multi-location bacterial proteins. To tackle these problems, this paper proposes an interpretable multi-label predictor, namely Gram-LocEN, for predicting the subcellular localization of both single- and multi-location proteins of Gram-positive or Gram-negative bacteria. By using a multi-label elastic-net (EN) classifier, Gram-LocEN is capable of selecting location-specific essential features that play key roles in determining the subcellular localization. With these essential features, Gram-LocEN can determine not only where a bacterial protein resides but also why it resides there. Experimental results on two stringent benchmark datasets suggest that Gram-LocEN significantly outperforms existing state-of-the-art multi-label predictors for both Gram-positive and Gram-negative bacteria. For readers' convenience, the Gram-LocEN web-server is available at http://bioinfo.eie.polyu.edu.hk/Gram-LocEN/.
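To make the idea of location-specific feature selection concrete, the following is a minimal sketch and not the Gram-LocEN implementation. It assumes one elastic-net-penalized least-squares model per location, fitted by coordinate descent, with the nonzero weights read off as that location's "essential" features. The toy data, the feature names (`feat_a` to `feat_d`), the location labels, the penalty settings, and the 0.5 decision threshold with an arg-max fallback are all hypothetical choices made for illustration.

```python
# Toy sketch: one elastic-net regressor per location (pure Python, no dependencies).
# All data, names, and hyperparameters here are hypothetical, not from the paper.

def soft_threshold(z, t):
    """Soft-thresholding operator; it produces exact zeros for the L1 penalty."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=200):
    """Coordinate descent (no intercept) for the objective
    (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1 + 0.5*alpha*(1-l1_ratio)*||w||^2.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(d)]
    for _ in range(n_iter):
        for j in range(d):
            if col_sq[j] == 0.0:
                continue  # an all-zero feature can never get a weight
            # Correlation of feature j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + X[i][j] * w[j]) for i in range(n)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (
                col_sq[j] + alpha * (1.0 - l1_ratio)
            )
            delta = new_w - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                w[j] = new_w
    return w


def predict_locations(models, x, threshold=0.5):
    """Return every location whose score exceeds the threshold.
    If none does, fall back to the single best-scoring location (an assumption).
    """
    scores = {loc: sum(wj * xj for wj, xj in zip(w, x)) for loc, w in models.items()}
    hits = [loc for loc, s in scores.items() if s > threshold]
    return hits or [max(scores, key=scores.get)]


# Hypothetical binary features for eight toy proteins.
FEATURES = ["feat_a", "feat_b", "feat_c", "feat_d"]
X = [
    [1, 0, 1, 0], [1, 0, 0, 0], [1, 0, 0, 1],   # location 1 only
    [0, 1, 1, 0], [0, 1, 0, 0], [0, 1, 0, 1],   # location 2 only
    [1, 1, 0, 0], [1, 1, 1, 1],                 # both locations
]
# One 0/1 target vector per location; the last two proteins are multi-location.
Y = {
    "cytoplasm": [1, 1, 1, 0, 0, 0, 1, 1],
    "cell membrane": [0, 0, 0, 1, 1, 1, 1, 1],
}

models = {loc: fit_elastic_net(X, y) for loc, y in Y.items()}
essential = {
    loc: [FEATURES[j] for j, wj in enumerate(w) if wj != 0.0]
    for loc, w in models.items()
}

print(essential)
print(predict_locations(models, [1, 1, 0, 1]))
```

In this toy setting, the L1 part of the penalty drives the weights of uninformative features to exactly zero, so each location keeps only the feature that explains it, while the per-location thresholding lets a single protein be assigned to more than one location. A real predictor would work with far larger feature spaces and would tune the penalty by cross-validation.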
All Science Journal Classification (ASJC) codes
- Analytical Chemistry
- Process Chemistry and Technology
- Computer Science Applications
Keywords
- Bacterial protein subcellular localization
- Interpretable predictor
- Multi-location proteins