### Abstract

We consider the linearly transformed spiked model, where the observations Y_{i} are noisy linear transforms of unobserved signals of interest X_{i}: Y_{i} = A_{i}X_{i} + ε_{i}, for i = 1, ..., n. The transform matrices A_{i} are also observed. We model the unobserved signals (or regression coefficients) X_{i} as vectors lying on an unknown low-dimensional space. Given only Y_{i} and A_{i}, how should we predict or recover their values? The naive approach of performing regression for each observation separately is inaccurate due to the large noise level. Instead, we develop optimal methods for predicting X_{i} by “borrowing strength” across the different samples. Our linear empirical Bayes methods scale to large datasets and rely on weak moment assumptions. We show that this model has wide-ranging applications in signal processing, deconvolution, cryo-electron microscopy, and missing data with noise. For missing data, we show in simulations that our methods are more robust to noise and to unequal sampling than well-known matrix completion methods.
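To make the contrast between per-observation regression and borrowing strength concrete, the following is a minimal simulation sketch, not the paper's procedure. It assumes a rank-one signal X_{i} = sqrt(ell) z_{i} u, diagonal 0/1 masks A_{i} (the missing-data case), and white noise of variance sigma^2. The predictor shown is the oracle best linear predictor Σ A_{i}^T (A_{i} Σ A_{i}^T + sigma^2 I)^{-1} Y_{i} with Σ = ell u u^T treated as known; the empirical Bayes methods described in the abstract instead estimate the required quantities from the data. All function names and parameter values (`simulate`, `oracle_blp`, `ell`, `keep_prob`, and so on) are hypothetical and chosen only for illustration.

```python
import math
import random


def simulate(n=2000, p=30, ell=4.0, sigma=1.0, keep_prob=0.5, seed=0):
    """Draw (X_i, A_i, Y_i) from a rank-one instance of Y_i = A_i X_i + eps_i.

    A_i is a diagonal 0/1 mask, stored as a list of its diagonal entries.
    """
    rng = random.Random(seed)
    u = [rng.gauss(0, 1) for _ in range(p)]
    norm = math.sqrt(sum(v * v for v in u))
    u = [v / norm for v in u]  # unit vector spanning the signal space
    samples = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = [math.sqrt(ell) * z * v for v in u]  # X_i = sqrt(ell) * z_i * u
        mask = [1.0 if rng.random() < keep_prob else 0.0 for _ in range(p)]
        y = [m * xv + sigma * rng.gauss(0, 1) for m, xv in zip(mask, x)]
        samples.append((x, mask, y))
    return u, samples


def naive_estimate(mask, y):
    # Per-sample least squares: for a 0/1 diagonal A_i, the minimum-norm
    # solution A_i^+ Y_i keeps the observed entries and zeros the rest.
    return [m * yv for m, yv in zip(mask, y)]


def oracle_blp(mask, y, u, ell, sigma):
    # Sigma A^T (A Sigma A^T + sigma^2 I)^{-1} Y with Sigma = ell * u u^T.
    # By the Sherman-Morrison formula this reduces to a scalar multiple of u.
    w = [m * v for m, v in zip(mask, u)]  # w = A_i u
    w_dot_y = sum(a * b for a, b in zip(w, y))
    w_sq = sum(a * a for a in w)
    coef = ell * w_dot_y / (sigma ** 2 + ell * w_sq)
    return [coef * v for v in u]


def mean_sq_error(estimates, truths):
    # Average squared Euclidean error per sample.
    total = 0.0
    for est, x in zip(estimates, truths):
        total += sum((a - b) ** 2 for a, b in zip(est, x))
    return total / len(truths)


def compare(ell=4.0, sigma=1.0, **kwargs):
    """Return (naive MSE, oracle BLP MSE) on one simulated dataset."""
    u, samples = simulate(ell=ell, sigma=sigma, **kwargs)
    truths = [x for x, _, _ in samples]
    naive = [naive_estimate(mask, y) for _, mask, y in samples]
    blp = [oracle_blp(mask, y, u, ell, sigma) for _, mask, y in samples]
    return mean_sq_error(naive, truths), mean_sq_error(blp, truths)


if __name__ == "__main__":
    mse_naive, mse_blp = compare()
    print(f"naive per-sample regression MSE: {mse_naive:.3f}")
    print(f"oracle best linear predictor MSE: {mse_blp:.3f}")
```

In this setup the naive estimate passes the noise on every observed coordinate straight through, whereas the linear predictor projects onto the one-dimensional signal space and shrinks. That is the mechanism behind the abstract's claim that separate regressions are inaccurate at large noise levels.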

Original language | English (US) |
---|---|
Pages (from-to) | 491-513 |
Number of pages | 23 |
Journal | Annals of Statistics |
Volume | 48 |
Issue number | 1 |
DOIs | https://doi.org/10.1214/18-AOS1819 |
State | Published - Jan 1 2020 |

### All Science Journal Classification (ASJC) codes

- Statistics and Probability
- Statistics, Probability and Uncertainty

### Keywords

- High dimensional
- Matrix completion
- Missing data
- Principal component analysis
- Random matrix theory
- Shrinkage
- Spiked model

## Cite this

Optimal prediction in the linearly transformed spiked model. *Annals of Statistics*, *48*(1), 491-513. https://doi.org/10.1214/18-AOS1819