TY - JOUR
T1 - Estimating the Coefficients of a Mixture of Two Linear Regressions by Expectation Maximization
AU - Klusowski, Jason M.
AU - Yang, Dana
AU - Brinda, W. D.
N1 - Funding Information:
Manuscript received September 13, 2017; revised October 15, 2018; accepted December 22, 2018. Date of publication January 9, 2019; date of current version May 20, 2019. This work was supported by the Graduate Student Fellowship from Yale University. J. M. Klusowski is with the Department of Statistics, Rutgers University–New Brunswick, Piscataway, NJ 08854-8019 USA (e-mail: jason.klusowski@rutgers.edu). D. Yang and W. D. Brinda are with the Department of Statistics and Data Science, Yale University, New Haven, CT 06511 USA (e-mail: xiaoqian.yang@yale.edu; william.brinda@yale.edu). Communicated by R. Balan, Associate Editor for Detection and Estimation. Digital Object Identifier 10.1109/TIT.2019.2891628
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2019/6
Y1 - 2019/6
AB - We give convergence guarantees for estimating the coefficients of a symmetric mixture of two linear regressions by expectation maximization (EM). In particular, we show that the empirical EM iterates converge to the target parameter vector at the parametric rate, provided the algorithm is initialized in an unbounded cone. Specifically, if the initial guess has a sufficiently large cosine angle with the target parameter vector, a sample-splitting version of the EM algorithm converges to the true coefficient vector with high probability. Interestingly, our analysis borrows from tools used in the problem of estimating the centers of a symmetric mixture of two Gaussians by EM. We also show that the population EM operator for mixtures of two regressions is anti-contractive from the target parameter vector if the cosine angle between the input vector and the target parameter vector is too small, thereby establishing the necessity of our conic condition. Finally, we give empirical evidence supporting this theoretical observation, which suggests that the sample-based EM algorithm may not converge to the target vector when the initial guess is drawn from this region. Our simulation study also suggests that the EM algorithm performs well even under model misspecification (i.e., when the covariate and error distributions violate the model assumptions).
KW - Mixture models
KW - clustering algorithms
KW - expectation-maximization algorithm
KW - iterative algorithms
KW - regression analysis
UR - http://www.scopus.com/inward/record.url?scp=85065988283&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85065988283&partnerID=8YFLogxK
DO - 10.1109/TIT.2019.2891628
M3 - Article
AN - SCOPUS:85065988283
SN - 0018-9448
VL - 65
SP - 3515
EP - 3524
JO - IEEE Transactions on Information Theory
JF - IEEE Transactions on Information Theory
IS - 6
M1 - 8606170
ER -