Features of Big Data and sparsest solution in high confidence set

Research output: Chapter in Book/Report/Conference proceeding (Chapter)

7 Scopus citations

Abstract

This chapter summarizes some of the unique features of Big Data analysis. These features are shared neither by low-dimensional data nor by small samples. Big Data pose new computational challenges and hold great promise for understanding population heterogeneity, as in personalized medicine or services. High dimensionality introduces spurious correlations, incidental endogeneity, noise accumulation, and measurement error. These features are distinctive, and statistical procedures should be designed with them in mind. To illustrate, a method called the sparsest solution in a high-confidence set is introduced, which is generally applicable to high-dimensional statistical inference. This method, whose properties are briefly examined, is natural: the information about the parameters contained in the data is summarized by high-confidence sets, and taking the sparsest solution is a way to deal with the noise accumulation issue.
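
The spurious-correlation effect mentioned in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not taken from the chapter: the sample size, the dimensions, the seed, and the helper name `abs_correlations` are all illustrative assumptions. It draws a response and predictors that are independent by construction, then tracks the largest absolute sample correlation as more predictors are considered.

```python
import numpy as np


def abs_correlations(X, y):
    """Absolute sample correlation between each column of X and y."""
    Xc = X - X.mean(axis=0)          # center each predictor
    yc = y - y.mean()                # center the response
    denom = np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc)
    return np.abs(Xc.T @ yc) / denom


# Illustrative settings (assumptions, not values from the chapter).
rng = np.random.default_rng(0)
n, p_max = 50, 5000

# y and X are generated independently, so every true correlation is zero.
y = rng.standard_normal(n)
X = rng.standard_normal((n, p_max))

# running_max[k] is the largest |correlation| among the first k + 1 predictors.
running_max = np.maximum.accumulate(abs_correlations(X, y))

for p in (10, 100, 1000, 5000):
    print(f"p = {p:5d}   max |corr| = {running_max[p - 1]:.3f}")
```

Because the predictor sets are nested, the reported maximum cannot decrease as p grows. With n held fixed, it typically moves well away from zero even though no predictor is related to the response.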

Original language: English (US)
Title of host publication: Past, Present, and Future of Statistical Science
Publisher: CRC Press
Pages: 507-523
Number of pages: 17
ISBN (Electronic): 9781482204988
ISBN (Print): 9781482204964
State: Published - Jan 1 2014
Externally published: Yes

All Science Journal Classification (ASJC) codes

  • General Mathematics
