A fully Bayesian approach to unsupervised part-of-speech tagging

Sharon Goldwater, Thomas L. Griffiths

Research output: Chapter in Book/Report/Conference proceedingConference contribution

237 Scopus citations

Abstract

Unsupervised learning of linguistic structure is a difficult problem. A common approach is to define a generative model and maximize the probability of the hidden structure given the observed data. Typically, this is done using maximum-likelihood estimation (MLE) of the model parameters. We show using part-of-speech tagging that a fully Bayesian approach can greatly improve performance. Rather than estimating a single set of parameters, the Bayesian approach integrates over all possible parameter values. This difference ensures that the learned structure will have high probability over a range of possible parameters, and permits the use of priors favoring the sparse distributions that are typical of natural language. Our model has the structure of a standard trigram HMM, yet its accuracy is closer to that of a state-of-the-art discriminative model (Smith and Eisner, 2005), up to 14 percentage points better than MLE. We find improvements both when training from data alone, and using a tagging dictionary.

Original languageEnglish (US)
Title of host publicationACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics
Pages744-751
Number of pages8
StatePublished - 2007
Externally publishedYes
Event45th Annual Meeting of the Association for Computational Linguistics, ACL 2007 - Prague, Czech Republic
Duration: Jun 23 2007Jun 30 2007

Publication series

NameACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics

Other

Other45th Annual Meeting of the Association for Computational Linguistics, ACL 2007
Country/TerritoryCzech Republic
CityPrague
Period6/23/076/30/07

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language

Fingerprint

Dive into the research topics of 'A fully Bayesian approach to unsupervised part-of-speech tagging'. Together they form a unique fingerprint.

Cite this