Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models

Mengzhou Xia, Mikel Artetxe, Jingfei Du, Danqi Chen, Ves Stoyanov

Research output: Chapter in Book/Report/Conference proceedingConference contribution

7 Scopus citations

Abstract

Pre-trained masked language models successfully perform few-shot learning by formulating downstream tasks as text infilling. However, as a strong alternative in full-shot settings, discriminative pre-trained models like ELECTRA do not fit into the paradigm. In this work, we adapt prompt-based few-shot learning to ELECTRA and show that it outperforms masked language models in a wide range of tasks. ELECTRA is pre-trained to distinguish if a token is generated or original. We naturally extend that to prompt-based few-shot learning by training to score the originality of the target options without introducing new parameters. Our method can be easily adapted to tasks involving multi-token predictions without extra computation overhead. Analysis shows that ELECTRA learns distributions that align better with downstream tasks.

Original languageEnglish (US)
Title of host publicationProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
EditorsYoav Goldberg, Zornitsa Kozareva, Yue Zhang
PublisherAssociation for Computational Linguistics (ACL)
Pages11351-11361
Number of pages11
ISBN (Electronic)9781959429401
DOIs
StatePublished - 2022
Event2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 - Hybrid, Abu Dhabi, United Arab Emirates
Duration: Dec 7 2022Dec 11 2022

Publication series

NameProceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Conference

Conference2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022
Country/TerritoryUnited Arab Emirates
CityHybrid, Abu Dhabi
Period12/7/2212/11/22

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Computer Science Applications
  • Information Systems

Fingerprint

Dive into the research topics of 'Prompting ELECTRA: Few-Shot Learning with Discriminative Pre-Trained Models'. Together they form a unique fingerprint.

Cite this