Abstract
In this work, we revisit the problem of estimating the mean and covariance of an unknown d-dimensional Gaussian distribution in the presence of an ε-fraction of adversarial outliers. The work of Diakonikolas et al. (2016) gave a polynomial-time algorithm for this task with optimal Õ(ε) error using n = poly(d, 1/ε) samples. On the other hand, Kothari and Steurer (2017) introduced a general framework for robust moment estimation via a canonical sum-of-squares relaxation that succeeds for the more general class of certifiably subgaussian and certifiably hypercontractive (Bakshi and Kothari, 2020) distributions. When specialized to Gaussians, this algorithm obtains the same Õ(ε) error guarantee as Diakonikolas et al. (2016) but incurs a super-polynomial sample complexity (n = d^{O(log(1/ε))}) and running time (n^{O(log(1/ε))}). This cost appears inherent to their analysis, as it relies only on sum-of-squares certificates of upper bounds on directional moments, while the analysis in Diakonikolas et al. (2016) relies on lower bounds on directional moments inferred from algebraic relationships between moments of Gaussian distributions. We give a new, simple analysis of the same canonical sum-of-squares relaxation used in Kothari and Steurer (2017) and Bakshi and Kothari (2020) and show that, for Gaussian distributions, their algorithm achieves the same error, sample complexity, and running time guarantees as the specialized algorithm in Diakonikolas et al. (2016). Our key innovation is a new argument that allows using moment lower bounds without having sum-of-squares certificates for them. We believe that our proof technique will be useful in designing new robust estimation algorithms.
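To make the problem setting concrete, the following is a minimal sketch of the contamination model only. It is not the sum-of-squares algorithm analyzed in the paper. It draws points from a standard Gaussian, replaces an ε-fraction with outliers, and compares the sample mean against a coordinate-wise median, a simple baseline swapped in for illustration that does not attain the Õ(ε) guarantee discussed above. The function names, the fixed outlier location `shift`, and all parameter values are assumptions chosen for this example.

```python
import math
import random
import statistics


def contaminated_sample(n, d, eps, shift, seed=0):
    """Draw n points from N(0, I_d), then replace an eps-fraction with outliers."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    n_bad = int(eps * n)
    for i in range(n_bad):
        # A crude, hypothetical adversary: every corrupted point sits at
        # (shift, ..., shift). A real adversary may choose points adaptively.
        points[i] = [shift] * d
    return points


def l2_norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in v))


def sample_mean(points):
    """Coordinate-wise average of the points."""
    d = len(points[0])
    return [statistics.fmean(p[j] for p in points) for j in range(d)]


def coordinatewise_median(points):
    """Median taken separately in each coordinate (illustrative baseline)."""
    d = len(points[0])
    return [statistics.median(p[j] for p in points) for j in range(d)]


if __name__ == "__main__":
    pts = contaminated_sample(n=2000, d=10, eps=0.1, shift=100.0)
    # The true mean is the zero vector, so the norm of each estimate is its error.
    print("sample mean error:", l2_norm(sample_mean(pts)))
    print("coordinate-wise median error:", l2_norm(coordinatewise_median(pts)))
```

In this sketch the outliers drag the sample mean far from the true mean, while the coordinate-wise median stays much closer. Its error still grows with the dimension, which is the gap that the robust estimators discussed in the abstract are designed to close.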
Original language | English (US) |
---|---|
Pages (from-to) | 638-667 |
Number of pages | 30 |
Journal | Proceedings of Machine Learning Research |
Volume | 167 |
State | Published - 2022 |
Externally published | Yes |
Event | 33rd International Conference on Algorithmic Learning Theory, ALT 2022 - Virtual, Online, France |
Duration | Mar 29 2022 → Apr 1 2022 |
All Science Journal Classification (ASJC) codes
- Artificial Intelligence
- Software
- Control and Systems Engineering
- Statistics and Probability
Keywords
- mean estimation
- Robust estimation
- sum-of-squares