TY - GEN
T1 - Protein Family Classification using Sparse Markov Transducers
AU - Eskin, Eleazar
AU - Grundy, William Noble
AU - Singer, Yoram
N1 - Publisher Copyright: Copyright © 2000, American Association for Artificial Intelligence (www.aaai.org). All rights reserved.
PY - 2000
Y1 - 2000
N2 - In this paper we present a method for classifying proteins into families using sparse Markov transducers (SMTs). Sparse Markov transducers, similar to probabilistic suffix trees, estimate a probability distribution conditioned on an input sequence. SMTs generalize probabilistic suffix trees by allowing for wildcards in the conditioning sequences. Because substitutions of amino acids are common in protein families, incorporating wildcards into the model significantly improves classification performance. We present two models for building protein family classifiers using SMTs. We also present efficient data structures to improve the memory usage of the models. We evaluate SMTs by building protein family classifiers using the Pfam database and compare our results to previously published results.
UR - http://www.scopus.com/inward/record.url?scp=0034564289&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0034564289&partnerID=8YFLogxK
M3 - Conference contribution
C2 - 10977074
AN - SCOPUS:0034564289
T3 - Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, ISMB 2000
SP - 134
EP - 145
BT - Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology, ISMB 2000
PB - AAAI Press
T2 - 8th International Conference on Intelligent Systems for Molecular Biology, ISMB 2000
Y2 - 19 August 2000 through 23 August 2000
ER -