## Abstract

We develop an information-theoretic viewof the stochastic block model, a popular statistical model for the large-scale structure of complex networks. A graph G from such a model is generated by first assigning vertex labels at random from a finite alphabet, and then connecting vertices with edge probabilities depending on the labels of the endpoints. In the case of the symmetric two-group model, we establish an explicit 'single-letter' characterization of the per-vertex mutual information between the vertex labels and the graph, when the mean vertex degree diverges. The explicit expression of the mutual information is intimately related to estimation-theoretic quantities, and -in particular-reveals a phase transition at the critical point for community detection. Below the critical point the per-vertex mutual information is asymptotically the same as if edges were independent. Correspondingly, no algorithm can estimate the partition better than random guessing. Conversely, above the threshold, the per-vertex mutual information is strictly smaller than the independent-edges upper bound. In this regime, there exists a procedure that estimates the vertex labels better than random guessing.

Original language | English (US) |
---|---|

Pages (from-to) | 125-170 |

Number of pages | 46 |

Journal | Information and Inference |

Volume | 6 |

Issue number | 2 |

DOIs | |

State | Published - Jan 1 2017 |

## All Science Journal Classification (ASJC) codes

- Computational Theory and Mathematics
- Analysis
- Applied Mathematics
- Statistics and Probability
- Numerical Analysis

## Keywords

- Approximate message passing
- Community detection
- Mutual information
- Stochastic block model