Context-aware automatic text simplification of health materials in low-resource domains

Tarek Sakakini, Jong Yoon Lee, Aditya Duri, Renato F.L. Azevedo, Kuangxiao Gu, Suma Bhat, Dan Morrow, Mark Hasegawa-Johnson, Thomas Huang, Victor Sadauskas, James Graumlich, Saqib Walayat, Ann Willemsen-Dunlap, Donald Halpin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

Healthcare systems have increased patients' exposure to their own health materials to enhance patients' health levels, but this has been impeded by patients' lack of understanding of their health material. We address potential barriers to their comprehension by developing a context-aware text simplification system for health material. Given the scarcity of annotated parallel corpora in healthcare domains, we design our system to be independent of a parallel corpus, complementing the availability of data-driven neural methods when such corpora are available. Our system compensates for the lack of direct supervision using a biomedical lexical database: Unified Medical Language System (UMLS). Compared to a competitive prior approach that uses a tool for identifying biomedical concepts and a consumer-directed vocabulary list, we empirically show the enhanced accuracy of our system due to improved handling of ambiguous terms. We also show the enhanced accuracy of our system over directly-supervised neural methods in this low-resource setting. Finally, we show the direct impact of our system on laypeople's comprehension of health material via a human subjects' study (n = 160).

Original language: English (US)
Title of host publication: EMNLP 2020 - 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 115-126
Number of pages: 12
ISBN (Electronic): 9781952148811
State: Published - 2020
Externally published: Yes
Event: 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, co-located with EMNLP 2020 - Virtual, Online
Duration: Nov 20 2020 → …

Publication series

Name: EMNLP 2020 - 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, Proceedings of the Workshop

Conference

Conference: 11th International Workshop on Health Text Mining and Information Analysis, LOUHI 2020, co-located with EMNLP 2020
City: Virtual, Online
Period: 11/20/20 → …

All Science Journal Classification (ASJC) codes

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics
