Reconstructing Graph Diffusion History from a Single Snapshot

Ruizhong Qiu, Dingsu Wang, Lei Ying, H. Vincent Poor, Yifang Zhang, Hanghang Tong

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Diffusion on graphs is ubiquitous with numerous high-impact applications, ranging from the study of residential segregation in socioeconomics and activation cascading in neuroscience, to the modeling of disease contagion in epidemiology and malware spreading in cybersecurity. In these applications, complete diffusion histories play an essential role in identifying dynamical patterns, reflecting on precautionary actions, and forecasting intervention effects. Despite their importance, complete diffusion histories are rarely available and are highly challenging to reconstruct due to ill-posedness, explosive search space, and scarcity of training data. To date, few methods exist for diffusion history reconstruction. They are exclusively based on the maximum likelihood estimation (MLE) formulation and require knowledge of the true diffusion parameters. In this paper, we study an even harder problem, namely reconstructing Diffusion history from A single SnapsHot (DASH), where we seek to reconstruct the history from only the final snapshot without knowing the true diffusion parameters. We start with theoretical analyses that reveal a fundamental limitation of the MLE formulation. We prove: (a) estimation error of diffusion parameters is unavoidable due to NP-hardness of diffusion parameter estimation, and (b) the MLE formulation is sensitive to estimation error of diffusion parameters. To overcome the inherent limitation of the MLE formulation, we propose a novel barycenter formulation: finding the barycenter of the posterior distribution of histories, which is provably stable against the estimation error of diffusion parameters.
We further develop an effective solver named DIffusion hiTting Times with Optimal proposal (DITTO) by reducing the problem to estimating posterior expected hitting times via the Metropolis-Hastings Markov chain Monte Carlo method (M-H MCMC) and employing an unsupervised graph neural network to learn an optimal proposal to accelerate the convergence of M-H MCMC. We conduct extensive experiments to demonstrate the efficacy of the proposed method. Our code is available at https://github.com/q-rz/KDD23-DITTO. The appendix can be found at https://arxiv.org/abs/2306.00488.
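The idea of estimating posterior expected hitting times with Metropolis-Hastings can be illustrated on a toy problem. The sketch below is not the DITTO solver: it assumes a discrete-time SI model, a snapshot in which every node is infected at time T, an independent source prior `rho`, a known infection rate `beta`, and a plain symmetric single-node proposal in place of the learned GNN proposal described in the abstract. All function and parameter names are hypothetical.

```python
import math
import random


def log_likelihood(h, adj, T, beta, rho):
    """Log-probability of a hitting-time vector h under a toy discrete-time SI model.

    h[v] == 0 marks v as an initial source; h[v] == t > 0 means v became
    infected at step t. Every node is assumed infected by time T.
    """
    n = len(h)
    n_src = sum(1 for t in h if t == 0)
    # Assumed prior: each node is a source independently with probability rho.
    logp = n_src * math.log(rho) + (n - n_src) * math.log(1.0 - rho)
    for v in range(n):
        for t in range(1, min(h[v], T) + 1):
            # Neighbours already infected before step t.
            k = sum(1 for u in adj[v] if h[u] < t)
            p_inf = 1.0 - (1.0 - beta) ** k
            if t == h[v]:
                if p_inf == 0.0:
                    return -math.inf  # infection without any infected neighbour
                logp += math.log(p_inf)
            else:
                logp += math.log(1.0 - p_inf)  # v stayed susceptible at step t
    return logp


def mh_expected_hitting_times(adj, T, beta, rho, n_steps=20000, burn_in=2000, seed=0):
    """Estimate posterior expected hitting times with Metropolis-Hastings."""
    rng = random.Random(seed)
    n = len(adj)
    h = [0] * n  # all-sources history: always has positive probability
    logp = log_likelihood(h, adj, T, beta, rho)
    totals = [0.0] * n
    kept = 0
    for step in range(n_steps):
        # Symmetric proposal: redraw one node's hitting time uniformly from 0..T.
        v = rng.randrange(n)
        proposal = list(h)
        proposal[v] = rng.randint(0, T)
        logp_new = log_likelihood(proposal, adj, T, beta, rho)
        # Accept with probability min(1, p_new / p_old).
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            h, logp = proposal, logp_new
        if step >= burn_in:
            for i in range(n):
                totals[i] += h[i]
            kept += 1
    return [s / kept for s in totals]


if __name__ == "__main__":
    path_graph = [[1], [0, 2], [1]]  # adjacency lists of the path 0 - 1 - 2
    print(mh_expected_hitting_times(path_graph, T=3, beta=0.5, rho=0.2))
```

A uniform single-node proposal mixes acceptably on a three-node graph but would scale poorly; the abstract's point is that a learned proposal accelerates convergence on realistic graphs.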

Original language: English (US)
Title of host publication: KDD 2023 - Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining
Publisher: Association for Computing Machinery
Pages: 1978-1988
Number of pages: 11
ISBN (Electronic): 9798400701030
State: Published - Aug 6 2023
Event: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023 - Long Beach, United States
Duration: Aug 6 2023 to Aug 10 2023

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

Conference

Conference: 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, KDD 2023
Country/Territory: United States
City: Long Beach
Period: 8/6/23 to 8/10/23

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems

Keywords

  • graph diffusion
  • graph neural network (GNN)
  • history reconstruction
  • Markov chain Monte Carlo (MCMC)
