Principal component analysis on non-gaussian dependent data

Fang Han, Han Liu

Research output: Contribution to conference › Paper › peer-review

9 Scopus citations

Abstract

In this paper, we analyze the performance of a semiparametric principal component analysis method named Copula Component Analysis (COCA) (Han & Liu, 2012) when the data are dependent. The semiparametric model assumes that, after unspecified marginally monotone transformations, the distributions are multivariate Gaussian. We study the scenario where the observations are drawn from non-i.i.d. processes (m-dependent or, more generally, φ-mixing). We show that COCA tolerates weak dependence. In particular, we provide generalization bounds on the convergence of COCA for both support recovery and parameter estimation with dependent data. We provide explicit sufficient conditions on the degree of dependence under which the parametric rate is maintained. To our knowledge, this is the first work analyzing the theoretical performance of PCA for dependent data in high-dimensional settings. Our results strictly generalize the analysis in Han & Liu (2012), and the techniques we use are of independent interest for analyzing a variety of other multivariate statistical methods.
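The model described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the estimator, so this assumes one common rank-based approach for Gaussian copula models: estimate each latent correlation from Kendall's tau via the identity Σ_jk = sin(π/2 · τ_jk), then take the leading eigenvector. The function names, the 1-dependent moving-average simulation, and the plain (non-sparse) power iteration are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def kendall_tau(x, y):
    """Kendall's tau for two equal-length sequences (O(n^2), assumes no ties)."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    return 2.0 * s / (n * (n - 1))


def latent_correlation(columns):
    """Rank-based estimate of the latent Gaussian correlation matrix."""
    d = len(columns)
    R = [[1.0] * d for _ in range(d)]
    for j in range(d):
        for k in range(j + 1, d):
            tau = kendall_tau(columns[j], columns[k])
            R[j][k] = R[k][j] = math.sin(math.pi / 2 * tau)
    return R


def leading_eigenvector(R, iters=500):
    """Power iteration for the dominant eigenvector (no sparsity constraint)."""
    d = len(R)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v


def simulate(n, rho, seed=0):
    """Hypothetical data: 1-dependent latent Gaussian rows, then monotone maps."""
    rng = random.Random(seed)

    def innovation():
        a, b, c = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        return (a, rho * a + math.sqrt(1 - rho * rho) * b, c)

    e = [innovation() for _ in range(n + 1)]
    # Averaging consecutive innovations makes row t depend on row t + 1 only,
    # while keeping the same cross-coordinate correlation matrix.
    z = [[(e[t][j] + e[t + 1][j]) / math.sqrt(2) for j in range(3)]
         for t in range(n)]
    # Strictly increasing marginal transformations hide the Gaussianity.
    return [[math.exp(r[0]) for r in z],
            [r[1] ** 3 for r in z],
            [r[2] for r in z]]


cols = simulate(300, 0.8)
R = latent_correlation(cols)
print([round(c, 3) for c in leading_eigenvector(R)])
```

Because Kendall's tau depends only on ranks, the estimate is unchanged by the monotone marginal transformations, which is why the latent correlation is recoverable without knowing them. Power iteration is used here only to keep the sketch free of dependencies; it ignores the sparsity that support recovery refers to.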

Original language: English (US)
Pages: 240-248
Number of pages: 9
State: Published - 2013
Event: 30th International Conference on Machine Learning, ICML 2013 - Atlanta, GA, United States
Duration: Jun 16 2013 to Jun 21 2013

All Science Journal Classification (ASJC) codes

  • Human-Computer Interaction
  • Sociology and Political Science
