TY - JOUR

T1 - Computing a nonnegative matrix factorization-provably

AU - Arora, Sanjeev

AU - Ge, Rong

AU - Kannan, Ravi

AU - Moitra, Ankur

N1 - Funding Information:
The work of the first and second authors was supported by the NSF grants CCF-0832797 and CCF-1117309. The research of this author was supported in part by NSF grant DMS-0835373 and by an NSF Computing and Innovation Fellowship.
Publisher Copyright:
© 2016 the authors.

PY - 2016

Y1 - 2016

N2 - In the nonnegative matrix factorization (NMF) problem we are given an n × m nonnegative matrix M and an integer r > 0. Our goal is to express M as AW, where A and W are nonnegative matrices of size n × r and r × m, respectively. In some applications, it makes sense to ask instead for the product AW to approximate M, i.e., (approximately) minimize ||M - AW||_F, where ||·||_F denotes the Frobenius norm; we refer to this as approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where A and W are computed using a variety of local search heuristics. Vavasis recently proved that this problem is NP-complete. (Without the restriction that A and W be nonnegative, both the exact and approximate problems can be solved optimally via the singular value decomposition.) We initiate a study of when this problem is solvable in polynomial time. Our results are the following: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant r. Indeed NMF is most interesting in applications precisely when r is small. 2. We complement this with a hardness result, that if exact NMF can be solved in time (nm)^{o(r)}, 3-SAT has a subexponential-time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in n, m, and r under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work.

AB - In the nonnegative matrix factorization (NMF) problem we are given an n × m nonnegative matrix M and an integer r > 0. Our goal is to express M as AW, where A and W are nonnegative matrices of size n × r and r × m, respectively. In some applications, it makes sense to ask instead for the product AW to approximate M, i.e., (approximately) minimize ||M - AW||_F, where ||·||_F denotes the Frobenius norm; we refer to this as approximate NMF. This problem has a rich history spanning quantum mechanics, probability theory, data analysis, polyhedral combinatorics, communication complexity, demography, chemometrics, etc. In the past decade NMF has become enormously popular in machine learning, where A and W are computed using a variety of local search heuristics. Vavasis recently proved that this problem is NP-complete. (Without the restriction that A and W be nonnegative, both the exact and approximate problems can be solved optimally via the singular value decomposition.) We initiate a study of when this problem is solvable in polynomial time. Our results are the following: 1. We give a polynomial-time algorithm for exact and approximate NMF for every constant r. Indeed NMF is most interesting in applications precisely when r is small. 2. We complement this with a hardness result, that if exact NMF can be solved in time (nm)^{o(r)}, 3-SAT has a subexponential-time algorithm. This rules out substantial improvements to the above algorithm. 3. We give an algorithm that runs in time polynomial in n, m, and r under the separability condition identified by Donoho and Stodden in 2003. The algorithm may be practical since it is simple and noise tolerant (under benign assumptions). Separability is believed to hold in many practical settings. To the best of our knowledge, this last result is the first example of a polynomial-time algorithm that provably works under a non-trivial condition on the input and we believe that this will be an interesting and important direction for future work.

KW - Nonnegative matrix factorization

KW - Provable algorithms

KW - Separability

UR - http://www.scopus.com/inward/record.url?scp=84990975264&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84990975264&partnerID=8YFLogxK

U2 - 10.1137/130913869

DO - 10.1137/130913869

M3 - Article

AN - SCOPUS:84990975264

VL - 45

SP - 1582

EP - 1611

JO - SIAM Journal on Computing

JF - SIAM Journal on Computing

SN - 0097-5397

IS - 4

ER -