Abstract
We develop an information-theoretic view of the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph G from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit 'single-letter' characterization of the per-vertex mutual information between the vertex labels and the graph, when the mean vertex degree diverges. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities and, in particular, reveals a phase transition at the critical point for community detection. Below the critical point, the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime, there exists a procedure that estimates the vertex labels better than random guessing.
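The generative procedure described in the abstract can be illustrated with a minimal sketch. It assumes two groups with labels drawn independently and uniformly from {-1, +1}, an edge probability `p` for same-label pairs and `q` for different-label pairs, and an undirected graph without self-loops. The function name `sample_two_group_sbm` and its parameters are hypothetical and chosen only for illustration; this is not the authors' code.

```python
import numpy as np


def sample_two_group_sbm(n, p, q, seed=None):
    """Sample labels and an adjacency matrix from a symmetric two-group SBM.

    Assumptions (not taken from the source): labels are i.i.d. uniform on
    {-1, +1}; same-label pairs connect with probability p, different-label
    pairs with probability q; the graph is undirected with no self-loops.
    """
    rng = np.random.default_rng(seed)

    # Step 1: assign each vertex a label at random.
    labels = rng.choice([-1, 1], size=n)

    # Step 2: the edge probability depends only on the endpoint labels.
    same_group = labels[:, None] == labels[None, :]
    edge_probs = np.where(same_group, p, q)

    # Draw each unordered pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < edge_probs, k=1)
    adjacency = (upper | upper.T).astype(np.int8)
    return labels, adjacency


if __name__ == "__main__":
    labels, adjacency = sample_two_group_sbm(n=200, p=0.10, q=0.02, seed=0)
    print("mean degree:", adjacency.sum(axis=1).mean())
```

Drawing only the strict upper triangle and mirroring it keeps each vertex pair to a single independent coin flip, which matches the model's conditional independence of edges given the labels.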
Original language | English (US) |
---|---|
Pages (from-to) | 125-170 |
Number of pages | 46 |
Journal | Information and Inference |
Volume | 6 |
Issue number | 2 |
State | Published - 2017 |
All Science Journal Classification (ASJC) codes
- Analysis
- Statistics and Probability
- Numerical Analysis
- Computational Theory and Mathematics
- Applied Mathematics
Keywords
- Approximate message passing
- Community detection
- Mutual information
- Stochastic block model