Statistical inference framework for source detection of contagion processes on arbitrary network structures


In this paper we introduce a statistical inference framework for estimating the contagion source from a partially observed contagion spreading process on an arbitrary network structure. The framework is based on maximum likelihood estimation from a partial epidemic realization and involves large-scale simulation of contagion spreading processes from the set of potential source locations. We present a number of different likelihood estimators that are used to determine the conditional probabilities associated with observing a partial epidemic realization given particular source location candidates. This statistical inference framework is also applicable to arbitrary compartmental contagion spreading processes on networks. We compare the estimation accuracy of these approaches in a number of computational experiments performed with the SIR (susceptible-infected-recovered), SI (susceptible-infected) and ISS (ignorant-spreading-stifler) contagion spreading models on synthetic and real-world complex networks.


💡 Research Summary

This paper introduces a comprehensive statistical inference framework for locating the origin of a contagion process on arbitrary network structures when only a partial observation of the spread is available. The authors formulate the source detection problem as a maximum‑likelihood estimation (MLE) task: given a set of candidate source nodes S and an observed epidemic realization ~r* (the set of infected and recovered nodes up to a known observation time T), they seek the node θ ∈ S that maximizes the conditional probability P(~R=~r* | Θ=θ). Assuming a uniform prior over candidates, the problem reduces to estimating these likelihoods efficiently.
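
The reduction to a likelihood comparison follows from Bayes' rule. The short derivation below is a sketch rather than the paper's own presentation; it writes the observed realization ~r* as \vec{r}_* and assumes the candidate set S is finite, so that the uniform prior is P(Θ=θ) = 1/|S|.

```latex
\begin{align}
P(\Theta=\theta \mid \vec{R}=\vec{r}_*)
  &= \frac{P(\vec{R}=\vec{r}_* \mid \Theta=\theta)\, P(\Theta=\theta)}
          {\sum_{\theta' \in S} P(\vec{R}=\vec{r}_* \mid \Theta=\theta')\, P(\Theta=\theta')} \\
  &= \frac{P(\vec{R}=\vec{r}_* \mid \Theta=\theta)}
          {\sum_{\theta' \in S} P(\vec{R}=\vec{r}_* \mid \Theta=\theta')}
  \qquad \text{since } P(\Theta=\theta) = \tfrac{1}{|S|}, \\
\hat{\theta}
  &= \arg\max_{\theta \in S} P(\Theta=\theta \mid \vec{R}=\vec{r}_*)
   = \arg\max_{\theta \in S} P(\vec{R}=\vec{r}_* \mid \Theta=\theta).
\end{align}
```

The denominator is the same for every candidate, so ranking candidates by posterior probability and ranking them by likelihood give the same ordering.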

The core of the framework consists of two computational stages. First, for each candidate source θ, the authors run n independent stochastic simulations of the chosen contagion model (SIR, SI, or ISS) on the static underlying graph G, using the known infection probability p, the known recovery probability q, and the same observation horizon T. Each simulation yields a synthetic epidemic realization R_{θ,i}. Second, they compare the observed realization ~r* with each simulated realization using a similarity measure ϕ(~r*, R_{θ,i}). Two similarity functions are defined: (1) XNOR (ϕ_XNOR), the proportion of nodes whose infection status (infected or not) matches between the two realizations; and (2) Jaccard (ϕ_J), the size of the intersection of the infected node sets divided by the size of their union. Both measures are implemented with fast bitwise operations (XOR, NOT, AND) and Brian Kernighan's bit-counting (popcount) routine, enabling scalable computation on large graphs.
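
The two stages can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: realizations are encoded as integer bitmasks (bit u is set if node u was ever infected), the SIR dynamics are assumed to be discrete-time with synchronous updates, and the names `simulate_sir`, `similarity_scores`, `xnor_similarity`, and `jaccard_similarity` are hypothetical. The union in the Jaccard measure is computed here with a bitwise OR for brevity.

```python
import random


def popcount_kernighan(x: int) -> int:
    """Count set bits by repeatedly clearing the lowest set bit."""
    count = 0
    while x:
        x &= x - 1  # clears the lowest set bit
        count += 1
    return count


def xnor_similarity(a: int, b: int, n_nodes: int) -> float:
    """Fraction of nodes whose infected/not-infected status agrees."""
    mask = (1 << n_nodes) - 1      # keep only the n_nodes low bits
    agree = ~(a ^ b) & mask        # XNOR restricted to valid node bits
    return popcount_kernighan(agree) / n_nodes


def jaccard_similarity(a: int, b: int) -> float:
    """|A intersect B| / |A union B| for two infected-node bitmasks."""
    union = popcount_kernighan(a | b)
    if union == 0:
        return 1.0  # assumed convention: two empty sets are identical
    return popcount_kernighan(a & b) / union


def simulate_sir(adj, source, p, q, T, rng):
    """Assumed discrete-time SIR: in each of T steps, every infected node
    infects each susceptible neighbour with probability p, then every node
    that was infected at the start of the step recovers with probability q.
    Returns a bitmask of nodes that are infected or recovered at time T."""
    infected, recovered = {source}, set()
    for _ in range(T):
        newly_infected = set()
        for u in sorted(infected):
            for v in adj[u]:
                if v not in infected and v not in recovered and rng.random() < p:
                    newly_infected.add(v)
        newly_recovered = {u for u in sorted(infected) if rng.random() < q}
        infected = (infected | newly_infected) - newly_recovered
        recovered |= newly_recovered
    mask = 0
    for u in infected | recovered:
        mask |= 1 << u
    return mask


def similarity_scores(adj, candidate, observed, p, q, T, n_sims, n_nodes, rng):
    """Run n_sims simulations from one candidate source and return a list of
    (XNOR, Jaccard) similarity pairs against the observed bitmask."""
    scores = []
    for _ in range(n_sims):
        sim = simulate_sir(adj, candidate, p, q, T, rng)
        scores.append((xnor_similarity(observed, sim, n_nodes),
                       jaccard_similarity(observed, sim)))
    return scores
```

Calling `similarity_scores` once per candidate in S yields the per-candidate score samples that the likelihood estimators are built from. Nodes are assumed to be labelled 0 to n_nodes - 1, and `adj` is assumed to map each node to a list of its neighbours.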

From the distribution of similarity scores for a given candidate, three likelihood estimators are derived:

  1. AU CDF (Area Under the Cumulative Distribution Function) – The empirical cumulative distribution F̂(x) = (1/n)∑ 1_{
