LoRA Training in the NTK Regime has No Spurious Local Minima

Uijeong Jang, Jason D. Lee, Ernest K. Ryu

Research output: Contribution to journal › Conference article › peer-review


Abstract

Low-rank adaptation (LoRA) has become the standard approach for parameter-efficient fine-tuning of large language models (LLMs), but our theoretical understanding of LoRA has been limited. In this work, we theoretically analyze LoRA fine-tuning in the neural tangent kernel (NTK) regime with N data points, showing: (i) full fine-tuning (without LoRA) admits a low-rank solution of rank r ≲ √N; (ii) using LoRA with rank r ≳ √N eliminates spurious local minima, allowing (stochastic) gradient descent to find the low-rank solutions; (iii) the low-rank solution found using LoRA generalizes well.
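To make the rank scaling in the abstract concrete, the following is a minimal, illustrative sketch (not the authors' code) of a LoRA-style linear layer in PyTorch, where the adapter rank is chosen on the order of √N as the analysis suggests. The layer sizes, the value of N, and the initialization scale are hypothetical placeholders.

```python
import math
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen pretrained linear layer plus a trainable low-rank update W0 + B @ A."""
    def __init__(self, base: nn.Linear, rank: int):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # freeze the pretrained weights; only A, B are trained
        out_features, in_features = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)  # rank x in
        self.B = nn.Parameter(torch.zeros(out_features, rank))        # out x rank, zero init so the update starts at 0

    def forward(self, x):
        # Base output plus the low-rank correction x A^T B^T
        return self.base(x) + x @ self.A.T @ self.B.T

# Hypothetical fine-tuning set size; pick LoRA rank r on the order of sqrt(N),
# the scaling regime (r ≳ √N) under which the paper shows no spurious local minima.
N = 10_000
rank = math.ceil(math.sqrt(N))

layer = LoRALinear(nn.Linear(768, 768), rank)
x = torch.randn(4, 768)
print(layer(x).shape)  # torch.Size([4, 768])
```

The key point the sketch mirrors is that only the low-rank factors A and B are trainable, and their shared dimension (the rank) is set to scale like √N rather than the full width of the layer.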

Original language: English (US)
Pages (from-to): 21306-21328
Number of pages: 23
Journal: Proceedings of Machine Learning Research
Volume: 235
State: Published - 2024
Externally published: Yes
Event: 41st International Conference on Machine Learning, ICML 2024 - Vienna, Austria
Duration: Jul 21, 2024 - Jul 27, 2024

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Software
  • Control and Systems Engineering
  • Statistics and Probability
