TY - GEN
T1 - Fast and adaptive online training of feature-rich translation models
AU - Green, Spence
AU - Wang, Sida
AU - Cer, Daniel
AU - Manning, Christopher D.
PY - 2013
Y1 - 2013
N2 - We present a fast and scalable online method for tuning statistical machine translation models with large feature sets. The standard tuning algorithm, MERT, only scales to tens of features. Recent discriminative algorithms that accommodate sparse features have produced smaller than expected translation quality gains in large systems. Our method, which is based on stochastic gradient descent with an adaptive learning rate, scales to millions of features and tuning sets with tens of thousands of sentences, while still converging after only a few epochs. Large-scale experiments on Arabic-English and Chinese-English show that our method produces significant translation quality gains by exploiting sparse features. Equally important is our analysis, which suggests techniques for mitigating overfitting and domain mismatch, and applies to other recent discriminative methods for machine translation.
AB - We present a fast and scalable online method for tuning statistical machine translation models with large feature sets. The standard tuning algorithm, MERT, only scales to tens of features. Recent discriminative algorithms that accommodate sparse features have produced smaller than expected translation quality gains in large systems. Our method, which is based on stochastic gradient descent with an adaptive learning rate, scales to millions of features and tuning sets with tens of thousands of sentences, while still converging after only a few epochs. Large-scale experiments on Arabic-English and Chinese-English show that our method produces significant translation quality gains by exploiting sparse features. Equally important is our analysis, which suggests techniques for mitigating overfitting and domain mismatch, and applies to other recent discriminative methods for machine translation.
UR - http://www.scopus.com/inward/record.url?scp=84907368273&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84907368273&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84907368273
SN - 9781937284503
T3 - ACL 2013 - 51st Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference
SP - 311
EP - 321
BT - Long Papers
PB - Association for Computational Linguistics (ACL)
T2 - 51st Annual Meeting of the Association for Computational Linguistics, ACL 2013
Y2 - 4 August 2013 through 9 August 2013
ER -