TY - GEN
T1 - On the learnability and usage of acyclic probabilistic finite automata
AU - Ron, Dana
AU - Singer, Yoram
AU - Tishby, Naftali
N1 - Publisher Copyright:
© 1995 ACM.
PY - 1995/7/5
Y1 - 1995/7/5
N2 - We propose and analyze a distribution learning algorithm for a subclass of Acyclic Probabilistic Finite Automata (APFA). This subclass is characterized by a certain distinguishability property of the automata's states. Though hardness results are known for learning distributions generated by general APFAs, we prove that our algorithm can indeed efficiently learn distributions generated by the subclass of APFAs we consider. In particular, we show that the KL-divergence between the distribution generated by the target source and the distribution generated by our hypothesis can be made small with high confidence in polynomial time. We present two applications of our algorithm. In the first, we show how to model cursively written letters. The resulting models are part of a complete cursive handwriting recognition system. In the second application we demonstrate how APFAs can be used to build multiple-pronunciation models for spoken words. We evaluate the APFA based pronunciation models on labeled speech data. The good performance (in terms of the log-likelihood obtained on test data) achieved by the APFAs and the incredibly small amount of time needed for learning suggests that the learning algorithm of AP-FAs might be a powerful alternative to commonly used probabilistic models.
AB - We propose and analyze a distribution learning algorithm for a subclass of Acyclic Probabilistic Finite Automata (APFA). This subclass is characterized by a certain distinguishability property of the automata's states. Though hardness results are known for learning distributions generated by general APFAs, we prove that our algorithm can indeed efficiently learn distributions generated by the subclass of APFAs we consider. In particular, we show that the KL-divergence between the distribution generated by the target source and the distribution generated by our hypothesis can be made small with high confidence in polynomial time. We present two applications of our algorithm. In the first, we show how to model cursively written letters. The resulting models are part of a complete cursive handwriting recognition system. In the second application we demonstrate how APFAs can be used to build multiple-pronunciation models for spoken words. We evaluate the APFA based pronunciation models on labeled speech data. The good performance (in terms of the log-likelihood obtained on test data) achieved by the APFAs and the incredibly small amount of time needed for learning suggests that the learning algorithm of AP-FAs might be a powerful alternative to commonly used probabilistic models.
UR - http://www.scopus.com/inward/record.url?scp=84954410847&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84954410847&partnerID=8YFLogxK
U2 - 10.1145/225298.225302
DO - 10.1145/225298.225302
M3 - Conference contribution
AN - SCOPUS:84954410847
T3 - Proceedings of the 8th Annual Conference on Computational Learning Theory, COLT 1995
SP - 31
EP - 40
BT - Proceedings of the 8th Annual Conference on Computational Learning Theory, COLT 1995
PB - Association for Computing Machinery, Inc
T2 - 8th Annual Conference on Computational Learning Theory, COLT 1995
Y2 - 5 July 1995 through 8 July 1995
ER -