High dimensional semiparametric latent graphical model for mixed data

Jianqing Fan, Han Liu, Yang Ning, Hui Zou

Research output: Contribution to journal, Article, peer-review

81 Scopus citations

Abstract

We propose a semiparametric latent Gaussian copula model for modelling mixed multivariate data, which contain a combination of both continuous and binary variables. The model assumes that the observed binary variables are obtained by dichotomizing latent variables that satisfy the Gaussian copula distribution. The goal is to infer the conditional independence relationship between the latent random variables, based on the observed mixed data. Our work has two main contributions: we propose a unified rank-based approach to estimate the correlation matrix of latent variables; we establish the concentration inequality of the proposed rank-based estimator. Consequently, our methods achieve the same rates of convergence for precision matrix estimation and graph recovery, as if the latent variables were observed. The methods proposed are numerically assessed through extensive simulation studies, and real data analysis.
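The rank-based idea in the abstract can be illustrated for the simplest case, a pair of continuous variables. Under a Gaussian copula, Kendall's tau and the latent correlation are linked by the standard identity r = sin(π·τ/2), and tau is unchanged by monotone transformations of the margins. The sketch below is a minimal pure-Python illustration under that assumption only; the function names `kendall_tau` and `latent_correlation` are hypothetical, and the binary and mixed binary-continuous pairs treated in the paper need different bridge functions that are not reproduced here.

```python
import math


def _sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            # +1 for a concordant pair, -1 for a discordant pair, 0 for ties
            total += _sign(x[i] - x[j]) * _sign(y[i] - y[j])
    return 2.0 * total / (n * (n - 1))


def latent_correlation(x, y):
    """Continuous-continuous case only: map tau to r = sin(pi * tau / 2)."""
    return math.sin(math.pi / 2.0 * kendall_tau(x, y))


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0]
    y = [2.0, 1.0, 4.0, 3.0, 5.0]
    print(kendall_tau(x, y))
    print(latent_correlation(x, y))
    # A monotone transform of a margin leaves the estimate unchanged.
    print(latent_correlation([math.exp(v) for v in x], y))
```

Because the estimate depends on the data only through ranks, the unknown marginal transformations never need to be estimated, which is what makes the approach semiparametric.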

Original language: English (US)
Pages (from-to): 405-421
Number of pages: 17
Journal: Journal of the Royal Statistical Society. Series B: Statistical Methodology
Volume: 79
Issue number: 2
State: Published - Mar 1 2017

All Science Journal Classification (ASJC) codes

  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Keywords

  • Discrete data
  • Gaussian copula
  • Latent variable
  • Mixed data
  • Non-paranormal
  • Rank-based statistic
