TY - GEN

T1 - Provable learning of noisy-or networks

AU - Arora, Sanjeev

AU - Ge, Rong

AU - Ma, Tengyu

AU - Risteski, Andrej

N1 - Publisher Copyright:
© 2017 ACM.
Copyright:
Copyright 2018 Elsevier B.V., All rights reserved.

PY - 2017/6/19

Y1 - 2017/6/19

AB - Many machine learning applications use latent variable models to explain structure in data, whereby visible variables (i.e., coordinates of the given datapoint) are explained as a probabilistic function of some hidden variables. Finding parameters with the maximum likelihood is NP-hard even in very simple settings. In recent years, provably efficient algorithms were nevertheless developed for models with linear structures: topic models, mixture models, hidden Markov models, etc. These algorithms use matrix or tensor decomposition, and make some reasonable assumptions about the parameters of the underlying model. But matrix or tensor decomposition seems of little use when the latent variable model has nonlinearities. The current paper shows how to make progress: tensor decomposition is applied for learning the single-layer noisy-or network, which is a textbook example of a Bayes net and is used, for example, in the classic QMR-DT software for diagnosing which disease(s) a patient may have by observing the symptoms he/she exhibits. The technical novelty here, which should be useful in other settings in the future, is an analysis of tensor decomposition in the presence of systematic error (i.e., where the noise/error is correlated with the signal, and doesn't decrease as the number of samples goes to infinity). This requires rethinking all steps of tensor decomposition methods from the ground up. For simplicity, our analysis is stated assuming that the network parameters were chosen from a probability distribution, but the method seems more generally applicable.

KW - Davis-Kahan

KW - Log-linear

KW - Noisy-or

KW - Tensor decomposition

KW - Unsupervised learning

UR - http://www.scopus.com/inward/record.url?scp=85024381855&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85024381855&partnerID=8YFLogxK

U2 - 10.1145/3055399.3055482

DO - 10.1145/3055399.3055482

M3 - Conference contribution

AN - SCOPUS:85024381855

T3 - Proceedings of the Annual ACM Symposium on Theory of Computing

SP - 1057

EP - 1066

BT - STOC 2017 - Proceedings of the 49th Annual ACM SIGACT Symposium on Theory of Computing

A2 - McKenzie, Pierre

A2 - King, Valerie

A2 - Hatami, Hamed

PB - Association for Computing Machinery

T2 - 49th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2017

Y2 - 19 June 2017 through 23 June 2017

ER -