TY - JOUR

T1 - Asymptotic mutual information for the balanced binary stochastic block model

AU - Deshpande, Yash

AU - Abbe, Emmanuel

AU - Montanari, Andrea

N1 - Funding Information:
Y.D. and A.M. were partially supported by NSF grants CCF-1319979 and DMS-1106627, and the AFOSR grant FA9550-13-1-0036. Part of this work was done while the authors were visiting Simons Institute for the Theory of Computing, UC Berkeley.
Publisher Copyright:
© The authors 2016.

PY - 2017

Y1 - 2017

N2 - We develop an information-theoretic view of the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph G from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit 'single-letter' characterization of the per-vertex mutual information between the vertex labels and the graph, when the mean vertex degree diverges. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities, and, in particular, reveals a phase transition at the critical point for community detection. Below the critical point the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime, there exists a procedure that estimates the vertex labels better than random guessing.

AB - We develop an information-theoretic view of the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph G from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit 'single-letter' characterization of the per-vertex mutual information between the vertex labels and the graph, when the mean vertex degree diverges. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities, and, in particular, reveals a phase transition at the critical point for community detection. Below the critical point the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime, there exists a procedure that estimates the vertex labels better than random guessing.

KW - Approximate message passing

KW - Community detection

KW - Mutual information

KW - Stochastic block model

UR - http://www.scopus.com/inward/record.url?scp=85048590456&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85048590456&partnerID=8YFLogxK

U2 - 10.1093/imaiai/iaw017

DO - 10.1093/imaiai/iaw017

M3 - Article

AN - SCOPUS:85048590456

SN - 2049-8772

VL - 6

SP - 125

EP - 170

JO - Information and Inference

JF - Information and Inference

IS - 2

ER -