TY - GEN
T1 - List-decodable covariance estimation
AU - Ivkov, Misha
AU - Kothari, Pravesh K.
N1 - Publisher Copyright:
© 2022 Owner/Author.
PY - 2022/9/6
Y1 - 2022/9/6
N2 - We give the first polynomial time algorithm for list-decodable covariance estimation. For any α > 0, our algorithm takes as input a sample Y ⊆ ℝ^d of size n ≥ d^poly(1/α) obtained by adversarially corrupting (1−α)n points in an i.i.d. sample X of size n from the Gaussian distribution with unknown mean μ* and covariance Σ*. In n^poly(1/α) time, it outputs a constant-size list of k = k(α) = (1/α)^poly(1/α) candidate parameters that, with high probability, contains a (μ̂, Σ̂) such that the total variation distance TV(N(μ*, Σ*), N(μ̂, Σ̂)) < 1 − O_α(1). This is the statistically strongest notion of distance and implies multiplicative spectral and relative Frobenius distance approximation with dimension-independent error. Our algorithm works more generally for any distribution D that possesses low-degree sum-of-squares certificates of two natural analytic properties: 1) anti-concentration of one-dimensional marginals and 2) hypercontractivity of degree-2 polynomials. Prior to our work, the only known results for estimating covariance in the list-decodable setting were for the special cases of list-decodable linear regression and subspace recovery [Karmalkar-Klivans-Kothari 2019, Bakshi-Kothari 2020, Raghavendra-Yau 2019, 2020]. Even for these special cases, the known error guarantees are weak; in particular, the algorithms need super-polynomial time for any sub-constant (in dimension d) target error in natural norms. Our result, as a corollary, yields the first polynomial time exact algorithms for list-decodable linear regression and subspace recovery that, in particular, obtain 2^−poly(d) error in time polynomial in the underlying dimension.
KW - list-decodable learning
KW - robust statistics
KW - sum-of-squares
UR - http://www.scopus.com/inward/record.url?scp=85132781886&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132781886&partnerID=8YFLogxK
U2 - 10.1145/3519935.3520006
DO - 10.1145/3519935.3520006
M3 - Conference contribution
AN - SCOPUS:85132781886
T3 - Proceedings of the Annual ACM Symposium on Theory of Computing
SP - 1276
EP - 1283
BT - STOC 2022 - Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing
A2 - Leonardi, Stefano
A2 - Gupta, Anupam
PB - Association for Computing Machinery
T2 - 54th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2022
Y2 - 20 June 2022 through 24 June 2022
ER -