Data noising as smoothing in neural network language models

Ziang Xie, Sida I. Wang, Jiwei Li, Daniel Lévy, Aiming Nie, Dan Jurafsky, Andrew Y. Ng

Research output: Contribution to conferencePaper

Abstract

Data noising is an effective technique for regularizing neural network models. While noising is widely adopted in application domains such as vision and speech, commonly used noising primitives have not been developed for discrete sequence-level settings such as language modeling. In this paper, we derive a connection between input noising in neural network language models and smoothing in ngram models. Using this connection, we draw upon ideas from smoothing to develop effective noising schemes. We demonstrate performance gains when applying the proposed schemes to language modeling and machine translation. Finally, we provide empirical analysis validating the relationship between noising and smoothing.

Original languageEnglish (US)
StatePublished - Jan 1 2019
Event5th International Conference on Learning Representations, ICLR 2017 - Toulon, France
Duration: Apr 24 2017Apr 26 2017

Conference

Conference5th International Conference on Learning Representations, ICLR 2017
CountryFrance
CityToulon
Period4/24/174/26/17

All Science Journal Classification (ASJC) codes

  • Education
  • Computer Science Applications
  • Linguistics and Language
  • Language and Linguistics

Fingerprint Dive into the research topics of 'Data noising as smoothing in neural network language models'. Together they form a unique fingerprint.

  • Cite this

    Xie, Z., Wang, S. I., Li, J., Lévy, D., Nie, A., Jurafsky, D., & Ng, A. Y. (2019). Data noising as smoothing in neural network language models. Paper presented at 5th International Conference on Learning Representations, ICLR 2017, Toulon, France.