Idiomatic Expression Paraphrasing without Strong Supervision

Jianing Zhou, Ziheng Zeng, Hongyu Gong, Suma Bhat

Research output: Chapter in Book/Report/Conference proceedingConference contribution

4 Scopus citations

Abstract

Idiomatic expressions (IEs) play an essential role in natural language. In this paper, we study the task of idiomatic sentence paraphrasing (ISP), which aims to paraphrase a sentence with an IE by replacing the IE with its literal paraphrase. The lack of large-scale corpora with idiomatic-literal parallel sentences is a primary challenge for this task, for which we consider two separate solutions. First, we propose an unsupervised approach to ISP, which leverages an IE's contextual information and definition and does not require a parallel sentence training set. Second, we propose a weakly supervised approach using back-translation to jointly perform paraphrasing and generation of sentences with IEs to enlarge the small-scale parallel sentence training dataset. Other significant derivatives of the study include a model that replaces a literal phrase in a sentence with an IE to generate an idiomatic expression and a large scale parallel dataset with idiomatic/ literal sentence pairs. The effectiveness of the proposed solutions compared to competitive baselines is seen in the relative gains of over 5.16 points in BLEU, over 8.75 points in METEOR, and over 19.57 points in SARI when the generated sentences are empirically validated on a parallel dataset using automatic and manual evaluations. We demonstrate the practical utility of ISP as a preprocessing step in En-De machine translation.

Original languageEnglish (US)
Title of host publicationAAAI-22 Technical Tracks 10
PublisherAssociation for the Advancement of Artificial Intelligence
Pages11774-11782
Number of pages9
ISBN (Electronic)1577358767, 9781577358763
StatePublished - Jun 30 2022
Externally publishedYes
Event36th AAAI Conference on Artificial Intelligence, AAAI 2022 - Virtual, Online
Duration: Feb 22 2022Mar 1 2022

Publication series

NameProceedings of the 36th AAAI Conference on Artificial Intelligence, AAAI 2022
Volume36

Conference

Conference36th AAAI Conference on Artificial Intelligence, AAAI 2022
CityVirtual, Online
Period2/22/223/1/22

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence

Fingerprint

Dive into the research topics of 'Idiomatic Expression Paraphrasing without Strong Supervision'. Together they form a unique fingerprint.

Cite this