MAXIMUM LIKELIHOOD ESTIMATION IS ALL YOU NEED FOR WELL-SPECIFIED COVARIATE SHIFT

Jiawei Ge, Shange Tang, Jianqing Fan, Cong Ma, Chi Jin

Research output: Contribution to conference › Paper › peer-review

Abstract

A key challenge of modern machine learning systems is to achieve Out-of-Distribution (OOD) generalization, that is, generalizing to target data whose distribution differs from that of source data. Despite its importance, the fundamental question of “what are the most effective algorithms for OOD generalization” remains open even under the standard setting of covariate shift. This paper addresses this question by proving that, surprisingly, classical Maximum Likelihood Estimation (MLE) using only source data (without any modification) achieves minimax optimality for covariate shift under the well-specified setting. That is, no algorithm performs better than MLE in this setting (up to a constant factor), justifying the claim that MLE is all you need. Our result holds for a very rich class of parametric models and does not require any boundedness condition on the density ratio. We illustrate the wide applicability of our framework by instantiating it in three concrete examples: linear regression, logistic regression, and phase retrieval. This paper further complements the study by proving that, under the misspecified setting, MLE is no longer the optimal choice, whereas the Maximum Weighted Likelihood Estimator (MWLE) emerges as minimax optimal in certain scenarios.
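To make the setting concrete, the following is a minimal illustrative sketch, not the authors' code or experiments. It assumes a well-specified Gaussian linear regression model, source covariates drawn from N(0, I), and target covariates drawn from N(mu, I); under Gaussian noise the MLE coincides with ordinary least squares, and the MWLE becomes least squares weighted by the density ratio. The function names, the dimension, the sample size, and the shift vector are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_least_squares(X, y, weights=None):
    """Fit linear regression by (weighted) least squares.

    With Gaussian noise, unweighted least squares is the MLE; passing
    density-ratio weights gives the weighted likelihood estimator (MWLE).
    """
    if weights is None:
        weights = np.ones(len(y))
    sw = np.sqrt(weights)
    # Scaling each row by sqrt(weight) reduces weighted least squares
    # to an ordinary least-squares problem.
    beta_hat, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return beta_hat


def target_excess_risk(beta_hat, beta_star, mu_target):
    """Excess squared-error risk on the target, where x ~ N(mu_target, I).

    E[(x^T beta_hat - x^T beta_star)^2] = diff^T E[x x^T] diff,
    and E[x x^T] = I + mu mu^T for this target distribution.
    """
    second_moment = np.eye(len(mu_target)) + np.outer(mu_target, mu_target)
    diff = beta_hat - beta_star
    return float(diff @ second_moment @ diff)


def run_trial(n=2000, seed=0):
    """Draw source data only, then score MLE and MWLE on the target."""
    rng = np.random.default_rng(seed)
    beta_star = np.array([1.0, -2.0, 0.5])   # assumed true parameter
    mu_target = np.array([1.0, 0.0, 0.0])    # assumed covariate shift

    X = rng.standard_normal((n, 3))          # source covariates ~ N(0, I)
    y = X @ beta_star + rng.standard_normal(n)

    # Density ratio p_target(x) / p_source(x) for N(mu, I) versus N(0, I).
    w = np.exp(X @ mu_target - 0.5 * mu_target @ mu_target)

    mle = fit_least_squares(X, y)
    mwle = fit_least_squares(X, y, weights=w)
    return (
        target_excess_risk(mle, beta_star, mu_target),
        target_excess_risk(mwle, beta_star, mu_target),
    )


if __name__ == "__main__":
    mle_risk, mwle_risk = run_trial()
    print(f"target excess risk, MLE:  {mle_risk:.5f}")
    print(f"target excess risk, MWLE: {mwle_risk:.5f}")
```

Note that the density ratio in this sketch is unbounded in x, which matches the abstract's remark that no boundedness condition is required. A single simulated trial illustrates the setup only and is not evidence for the minimax result.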

Original language: English (US)
State: Published - 2024
Event: 12th International Conference on Learning Representations, ICLR 2024 - Hybrid, Vienna, Austria
Duration: May 7 2024 to May 11 2024

Conference

Conference: 12th International Conference on Learning Representations, ICLR 2024
Country/Territory: Austria
City: Hybrid, Vienna
Period: 5/7/24 to 5/11/24

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Computer Science Applications
  • Education
  • Linguistics and Language
