Uncovering the Temporal Dynamics of Diffusion Networks

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Time plays an essential role in the diffusion of information, influence and disease over networks. In many cases we only observe when a node copies information, makes a decision or becomes infected – but the connectivity, transmission rates between nodes and transmission sources are unknown. Inferring the underlying dynamics is of outstanding interest since it enables forecasting, influencing and retarding infections, broadly construed. To this end, we model diffusion processes as discrete networks of continuous temporal processes occurring at different rates. Given cascade data – observed infection times of nodes – we infer the edges of the global diffusion network and estimate the transmission rates of each edge that best explain the observed data. The optimization problem is convex. The model naturally (without heuristics) imposes sparse solutions and requires no parameter tuning. The problem decouples into a collection of independent smaller problems, thus scaling easily to networks on the order of hundreds of thousands of nodes. Experiments on real and synthetic data show that our algorithm both recovers the edges of diffusion networks and accurately estimates their transmission rates from cascade data.


💡 Research Summary

The paper addresses the fundamental problem of inferring both the hidden structure of a diffusion network and the transmission rates on its edges using only observed infection (or adoption) timestamps, commonly referred to as cascades. The authors begin by assuming a static directed graph where each directed edge \(j \rightarrow i\) is associated with a non-negative transmission rate \(\alpha_{j,i}\). They model the conditional transmission likelihood \(f(t_i \mid t_j, \alpha_{j,i})\) with three well-known parametric families: exponential, power-law, and Rayleigh. For each family they provide the probability density, survival function, and hazard function, noting that these functions are log-concave in the parameters.
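As a minimal sketch of the three families, the Python below writes each model through its log-survival and hazard functions and recovers the density as their product. The function names, the argument `dt` (the time difference \(t_i - t_j\)), and the power-law minimum lag `delta` are assumptions made for illustration; the parameterizations are the standard textbook ones and may differ in detail from the paper's.

```python
import math

# dt is the elapsed time t_i - t_j (assumed > 0); alpha is the edge rate.

def exp_log_survival(dt, alpha):
    # Exponential: S = exp(-alpha * dt)
    return -alpha * dt

def exp_hazard(dt, alpha):
    # Exponential: constant hazard
    return alpha

def pow_log_survival(dt, alpha, delta=1.0):
    # Power law with assumed minimum lag delta: S = (dt / delta) ** (-alpha)
    return -alpha * math.log(dt / delta) if dt > delta else 0.0

def pow_hazard(dt, alpha, delta=1.0):
    # Power law: hazard decays as 1 / dt once dt exceeds delta
    return alpha / dt if dt > delta else 0.0

def ray_log_survival(dt, alpha):
    # Rayleigh: S = exp(-alpha * dt^2 / 2)
    return -0.5 * alpha * dt * dt

def ray_hazard(dt, alpha):
    # Rayleigh: hazard grows linearly in dt
    return alpha * dt

def density(log_survival, hazard, dt, alpha):
    # For any lifetime model, density = survival * hazard
    return math.exp(log_survival(dt, alpha)) * hazard(dt, alpha)
```

In all three cases the log-survival and the hazard are linear in `alpha`, which is what makes the later optimization tractable.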

A cascade is represented as an \(N\)-dimensional vector of infection times, with \(\infty\) indicating nodes that remain uninfected within the observation window. The temporal ordering of infections induces a directed acyclic graph (DAG) for each cascade, which simplifies the likelihood computation. Assuming conditional independence of infections given their parents, the likelihood of a cascade factorizes over nodes. For an infected node \(i\), the likelihood that a particular earlier node \(j\) is the first parent is the product of the transmission density \(f(t_i \mid t_j, \alpha_{j,i})\) and the probabilities that \(i\) survived transmission from every other earlier-infected node up to time \(t_i\). For nodes that never become infected, a survival term up to the observation horizon \(T\) is included.
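Written out with the symbols above, the per-node term described here can be sketched as follows. The survival function \(S\), the index \(k\), and the uninfected-node index \(m\) are notation introduced for this sketch; the paper's own equations may arrange the terms differently.

\[
f\bigl(t_i \mid \{t_j : t_j < t_i\}\bigr) \;=\; \sum_{j:\, t_j < t_i} f(t_i \mid t_j, \alpha_{j,i}) \prod_{k \neq j,\; t_k < t_i} S(t_i \mid t_k, \alpha_{k,i})
\]

For a node \(m\) that is still uninfected at the horizon \(T\), the corresponding factor is the product of survival terms against every infected node:

\[
\prod_{j:\, t_j \le T} S(T \mid t_j, \alpha_{j,m})
\]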

Summing over all possible first parents yields a compact expression for the cascade likelihood (Equation 5 in the paper). Incorporating the survival terms for uninfected nodes leads to the final likelihood (Equation 7), which depends only on the transmission rates and the pairwise time differences \(\Delta t = t_i - t_j\). The log-likelihood for a set of cascades is then a sum of three components, \(\Psi_1\), \(\Psi_2\), and \(\Psi_3\), each built from log-survival functions or from the logarithm of a sum of hazard functions.
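To make the decomposition concrete, here is a minimal sketch of the log-likelihood of a single cascade under the exponential model, where the log-survival is \(-\alpha \, \Delta t\) and the hazard is \(\alpha\). It relies on the identity density = survival × hazard, which turns the sum over first parents into a product of survival terms times a sum of hazards. The function name, the dictionary representation of the rates, and the treatment of the earliest node as a given source are assumptions of this sketch, not details taken from the paper.

```python
import math

def cascade_log_likelihood(times, alpha, T):
    """Log-likelihood of one cascade under an exponential transmission model.

    times: list of infection times, math.inf for nodes uninfected by T.
    alpha: dict mapping (j, i) -> rate of edge j -> i; missing pairs mean 0.
    T:     observation horizon.
    """
    n = len(times)
    total = 0.0
    for i in range(n):
        if times[i] <= T:
            # Infected node: candidate parents are nodes infected strictly earlier.
            parents = [j for j in range(n) if times[j] < times[i]]
            if not parents:
                continue  # earliest node(s): treated as the cascade source
            # Sum of log-survival terms against every earlier node.
            log_surv = sum(-alpha.get((j, i), 0.0) * (times[i] - times[j])
                           for j in parents)
            # Logarithm of the summed hazards (exponential hazard is just alpha).
            hazard_sum = sum(alpha.get((j, i), 0.0) for j in parents)
            if hazard_sum == 0.0:
                return -math.inf  # infection cannot be explained by any edge
            total += log_surv + math.log(hazard_sum)
        else:
            # Uninfected node: survives every infected node up to the horizon T.
            total += sum(-alpha.get((j, i), 0.0) * (T - times[j])
                         for j in range(n) if times[j] <= T)
    return total
```

The log-likelihood of a set of cascades is the sum of this quantity over the cascades. Note that every term for node `i` touches only its incoming rates `alpha[(j, i)]`.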

The core contribution is the formulation of the network inference task as a convex optimization problem.
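The statement of the problem is cut off in this summary. One natural way to write it, consistent with the abstract and the likelihood described above, is as a constrained maximum-likelihood program. The cascade set \(C\), the cascade vector \(\mathbf{t}^c\), and the rate matrix \(A = \{\alpha_{j,i}\}\) are notation assumed here, and this is a reconstruction rather than the paper's verbatim statement.

\[
\begin{aligned}
\underset{A}{\text{minimize}} \quad & -\sum_{c \in C} \log f(\mathbf{t}^c; A) \\
\text{subject to} \quad & \alpha_{j,i} \ge 0, \quad i, j = 1, \dots, N,\; i \neq j
\end{aligned}
\]

Because the log-likelihood terms for node \(i\) involve only its incoming rates \(\alpha_{j,i}\), a program of this form splits into \(N\) independent subproblems, one per node, which matches the decoupling mentioned in the abstract.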

