Hindsight Merging: Diverse Data Generation with Language Models

Research output: Contribution to journal › Conference article › peer-review

Abstract

Pre-training a language model equips it with a broad understanding of the world, while fine-tuning refines it into a helpful assistant. However, fine-tuning does not exclusively enhance task-specific behaviors; it also suppresses some of the beneficial variability from pre-training. This reduction in diversity is partly due to the optimization process, which theoretically decreases model entropy in exchange for task performance. To counteract this, we introduce hindsight merging, a technique that combines a fine-tuned model with a previous training checkpoint using linear interpolation to restore entropy and improve performance. Hindsight-merged models retain strong instruction-following capabilities and alignment while displaying more of the diversity present in the base model. Additionally, this results in improved inference scaling, achieving a consistent 20-50% increase in pass@10 relative to the instruction-tuned model across a coding benchmark and a series of models. Our findings suggest that hindsight merging is an effective strategy for producing diverse generations that follow instructions.
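
The linear interpolation described in the abstract can be illustrated with a minimal sketch. The abstract does not state the interpolation weight, which earlier checkpoint is used, or how the weights are stored, so everything here is an assumption for illustration: the function name `hindsight_merge`, the weight `alpha`, and the representation of a model as a dictionary mapping parameter names to flat lists of floats are all hypothetical.

```python
from typing import Dict, List

# Hypothetical stand-in for a model's weights: parameter name -> flat list of values.
StateDict = Dict[str, List[float]]


def hindsight_merge(finetuned: StateDict, checkpoint: StateDict, alpha: float) -> StateDict:
    """Linearly interpolate two sets of weights, parameter by parameter.

    merged = alpha * finetuned + (1 - alpha) * checkpoint

    alpha = 1.0 returns the fine-tuned weights unchanged; alpha = 0.0 returns
    the earlier checkpoint. The value of alpha is an assumption of this sketch,
    not a value taken from the paper.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if finetuned.keys() != checkpoint.keys():
        raise ValueError("both models must have the same parameter names")

    merged: StateDict = {}
    for name, ft_values in finetuned.items():
        ck_values = checkpoint[name]
        if len(ft_values) != len(ck_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter tensors.
        merged[name] = [alpha * ft + (1.0 - alpha) * ck
                        for ft, ck in zip(ft_values, ck_values)]
    return merged


if __name__ == "__main__":
    finetuned = {"w": [1.0, 2.0]}
    checkpoint = {"w": [3.0, 4.0]}
    print(hindsight_merge(finetuned, checkpoint, alpha=0.5))
```

With real models the same convex combination would be applied to every tensor in the two checkpoints, which requires that both share an identical architecture and parameter naming.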

Original language: English (US)
Pages (from-to): 4349-4360
Number of pages: 12
Journal: Proceedings of Machine Learning Research
Volume: 286
State: Published - 2025
Event: 41st Conference on Uncertainty in Artificial Intelligence, UAI 2025 - Rio de Janeiro, Brazil
Duration: Jul 21 2025 → Jul 25 2025

All Science Journal Classification (ASJC) codes

  • Software
  • Control and Systems Engineering
  • Statistics and Probability
  • Artificial Intelligence
