### Abstract

We develop efficient algorithms for estimating low-degree moments of unknown distributions in the presence of adversarial outliers and design a new family of convex relaxations for k-means clustering based on sum-of-squares method. As an immediate corollary, for any > 0, we obtain an efficient algorithm for learning the means of a mixture of k arbitrary Poincaré distributions in ^{d} in time d^{O}(1/γ) so long as the means have separation Ω(k^{γ}). This in particular yields an algorithm for learning Gaussian mixtures with separation Ω(k^{γ}), thus partially resolving an open problem of Regev and Vijayaraghavan (2017). The guarantees of our robust estimation algorithms improve in many cases significantly over the best previous ones, obtained in the recent works. We also show that the guarantees of our algorithms match information-theoretic lower-bounds for the class of distributions we consider. These improved guarantees allow us to give improved algorithms for independent component analysis and learning mixtures of Gaussians in the presence of outliers. We also show a sharp upper bound on the sum-of-squares norms for moment tensors of any distribution that satisfies the Poincaré inequality. The Poincaré inequality is a central inequality in probability theory, and a large class of distributions satisfy it including Gaussians, product distributions, strongly log-concave distributions, and any sum or uniformly continuous transformation of such distributions. As a consequence, this yields that all of the above algorithmic improvements hold for distributions satisfying the Poincaré inequality.

Original language | English (US) |
---|---|

Title of host publication | STOC 2018 - Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing |

Editors | Monika Henzinger, David Kempe, Ilias Diakonikolas |

Publisher | Association for Computing Machinery |

Pages | 182-189 |

Number of pages | 8 |

ISBN (Electronic) | 9781450355599 |

DOIs | |

State | Published - Jun 20 2018 |

Event | 50th Annual ACM Symposium on Theory of Computing, STOC 2018 - Los Angeles, United States Duration: Jun 25 2018 → Jun 29 2018 |

### Publication series

Name | Proceedings of the Annual ACM Symposium on Theory of Computing |
---|---|

ISSN (Print) | 0737-8017 |

### Other

Other | 50th Annual ACM Symposium on Theory of Computing, STOC 2018 |
---|---|

Country | United States |

City | Los Angeles |

Period | 6/25/18 → 6/29/18 |

### All Science Journal Classification (ASJC) codes

- Software

### Keywords

- Clustering
- Moments
- Robust learning
- Sum-of-squares

## Fingerprint Dive into the research topics of 'Robust moment estimation and improved clustering via sum of squares'. Together they form a unique fingerprint.

## Cite this

*STOC 2018 - Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing*(pp. 182-189). (Proceedings of the Annual ACM Symposium on Theory of Computing). Association for Computing Machinery. https://doi.org/10.1145/3188745.3188970