TY - GEN
T1 - GRUEN for Evaluating Linguistic Quality of Generated Text
AU - Zhu, Wanzheng
AU - Bhat, Suma
N1 - Publisher Copyright:
© 2020 Association for Computational Linguistics
PY - 2020
Y1 - 2020
N2 - Automatic evaluation metrics are indispensable for evaluating generated text. To date, these metrics have focused almost exclusively on the content selection aspect of the system output, ignoring the linguistic quality aspect altogether. We bridge this gap by proposing GRUEN for evaluating Grammaticality, non-Redundancy, focUs, structure and coherENce of generated text. GRUEN utilizes a BERT-based model and a class of syntactic, semantic, and contextual features to examine the system output. Unlike most existing evaluation metrics which require human references as an input, GRUEN is reference-less and requires only the system output. Besides, it has the advantage of being unsupervised, deterministic, and adaptable to various tasks. Experiments on seven datasets over four language generation tasks show that the proposed metric correlates highly with human judgments.
UR - http://www.scopus.com/inward/record.url?scp=85115824201&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85115824201&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85115824201
T3 - Findings of the Association for Computational Linguistics: EMNLP 2020
SP - 94
EP - 108
BT - Findings of the Association for Computational Linguistics: EMNLP 2020
PB - Association for Computational Linguistics (ACL)
T2 - Findings of the Association for Computational Linguistics, ACL 2020: EMNLP 2020
Y2 - 16 November 2020 through 20 November 2020
ER -