Homogeneity Pursuit

Zheng Tracy Ke, Jianqing Fan, Yichao Wu

Research output: Contribution to journal › Article › peer-review

83 Scopus citations

Abstract

This article explores the homogeneity of coefficients in high-dimensional regression, which extends the sparsity concept and is more general and suitable for many applications. Homogeneity arises when regression coefficients corresponding to neighboring geographical regions or a similar cluster of covariates are expected to be approximately the same. Sparsity corresponds to a special case of homogeneity with a large cluster of known atom zero. In this article, we propose a new method called clustering algorithm in regression via data-driven segmentation (CARDS) to explore homogeneity. New mathematics are provided on the gain that can be achieved by exploring homogeneity. Statistical properties of two versions of CARDS are analyzed. In particular, the asymptotic normality of our proposed CARDS estimator is established, which reveals better estimation accuracy for homogeneous parameters than that without homogeneity exploration. When our methods are combined with sparsity exploration, further efficiency can be achieved beyond the exploration of sparsity alone. This provides additional insights into the power of exploring low-dimensional structures in high-dimensional regression: homogeneity and sparsity. Our results also shed light on the properties of the fused Lasso. The newly developed method is further illustrated by simulation studies and applications to real data. Supplementary materials for this article are available online.
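
As a rough illustration of the homogeneity idea described in the abstract, the sketch below fits a regression in which coefficients within a group are constrained to be equal. It assumes the grouping is already known, which is exactly what CARDS is designed to avoid, so this is an oracle-style toy and not the CARDS procedure. The function names, the grouping, and the simulated data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def membership_matrix(groups, n_groups):
    """Return the p x K 0/1 matrix M with M[j, k] = 1 when coefficient j is in group k."""
    M = np.zeros((len(groups), n_groups))
    M[np.arange(len(groups)), groups] = 1.0
    return M


def oracle_homogeneous_fit(X, y, groups, n_groups):
    """Least squares with coefficients forced to be equal within each known group.

    Assumption: the grouping is given in advance. This is NOT the CARDS
    method, which estimates the grouping from the data.
    """
    M = membership_matrix(groups, n_groups)
    Z = X @ M  # column k is the sum of the covariates assigned to group k
    alpha, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return M @ alpha  # expand the K group values back to a length-p vector


# Hypothetical toy data: 6 coefficients in 3 groups; the last group has the
# common value zero, which mirrors sparsity as a special case of homogeneity.
rng = np.random.default_rng(0)
n, p = 60, 6
groups = np.array([0, 0, 0, 1, 1, 2])
alpha_true = np.array([2.0, -1.0, 0.0])
beta_true = alpha_true[groups]

X = rng.normal(size=(n, p))
y_noiseless = X @ beta_true  # no noise, so the constrained fit should be exact

beta_hat = oracle_homogeneous_fit(X, y_noiseless, groups, n_groups=3)
```

Under the equality constraint the model has only as many free parameters as there are groups (three here instead of six), which is one informal way to see where an efficiency gain from exploiting homogeneity could come from.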

Original language: English (US)
Pages (from-to): 175-194
Number of pages: 20
Journal: Journal of the American Statistical Association
Volume: 110
Issue number: 509
State: Published - Jan 2 2015
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Keywords

  • Clustering
  • Sparsity
