Abstract
The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and provides generally a fertile ground to study the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a., detection). The main results discussed are the phase transitions for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest of achieving the limits, in particular two-round algorithms via graph-splitting, semi-definite programming, linearized belief propagation, classical and nonbacktracking spectral methods. A few open problems are also discussed.
Original language | English (US) |
---|---|
Pages (from-to) | 1-86 |
Number of pages | 86 |
Journal | Journal of Machine Learning Research |
Volume | 18 |
DOIs | |
State | Published - Apr 1 2018 |
All Science Journal Classification (ASJC) codes
- Software
- Artificial Intelligence
- Control and Systems Engineering
- Statistics and Probability
Keywords
- Clustering
- Community detection
- Computational gaps
- Network data analysis
- Random graphs
- Spectral algorithms
- Stochastic block models
- Unsupervised learning