InstructEval: Systematic Evaluation of Instruction Selection Methods

Anirudh Ajith, Chris Pan, Mengzhou Xia, Ameet Deshpande, Karthik Narasimhan

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

In-context learning (ICL) performs tasks by prompting a large language model (LLM) using an instruction and a small set of annotated examples called demonstrations. Recent work has shown that precise details of the inputs used in the ICL prompt significantly impact performance, which has incentivized instruction selection algorithms. The effect of instruction-choice however is severely underexplored, with existing analyses restricted to shallow subsets of models and tasks, limiting the generalizability of their insights. We develop InstructEval, an ICL evaluation suite to conduct a thorough assessment of these techniques. The suite includes 13 open-sourced LLMs of varying scales from four model families, and covers nine tasks across three categories. Using the suite, we evaluate the relative performance of seven popular instruction selection methods over five metrics relevant to ICL. Our experiments reveal that using curated manually-written instructions or simple instructions without any task-specific descriptions often elicits superior ICL performance overall than that of automatic instruction-induction methods, pointing to a lack of generalizability among the latter. We release our evaluation suite for benchmarking instruction selection approaches and enabling more generalizable methods in this space.

Original languageEnglish (US)
Title of host publicationFindings of the Association for Computational Linguistics
Subtitle of host publicationNAACL 2024 - Findings
EditorsKevin Duh, Helena Gomez, Steven Bethard
PublisherAssociation for Computational Linguistics (ACL)
Pages4336-4350
Number of pages15
ISBN (Electronic)9798891761193
StatePublished - 2024
Externally publishedYes
Event2024 Findings of the Association for Computational Linguistics: NAACL 2024 - Mexico City, Mexico
Duration: Jun 16 2024Jun 21 2024

Publication series

NameFindings of the Association for Computational Linguistics: NAACL 2024 - Findings

Conference

Conference2024 Findings of the Association for Computational Linguistics: NAACL 2024
Country/TerritoryMexico
CityMexico City
Period6/16/246/21/24

All Science Journal Classification (ASJC) codes

  • Computational Theory and Mathematics
  • Software

Fingerprint

Dive into the research topics of 'InstructEval: Systematic Evaluation of Instruction Selection Methods'. Together they form a unique fingerprint.

Cite this