IRIE: Scalable and Robust Influence Maximization in Social Networks


Influence maximization is the problem of selecting the top $k$ seed nodes in a social network to maximize their influence coverage under certain influence diffusion models. In this paper, we propose a novel algorithm, IRIE, that integrates new message-passing-based influence ranking (IR) and influence estimation (IE) methods for influence maximization in both the independent cascade (IC) model and its extension IC-N, which incorporates negative opinion propagation. Through extensive experiments, we demonstrate that IRIE matches the influence coverage of other algorithms while scaling much better than all of them. Moreover, IRIE is more robust and stable than other algorithms in both running time and memory usage across networks of various densities and cascade sizes. It runs up to two orders of magnitude faster than other state-of-the-art algorithms such as PMIA on large networks with tens of millions of nodes and edges, while using only a fraction of the memory required by PMIA.


💡 Research Summary

The paper addresses the classic influence maximization problem: given a directed social network and a diffusion model, select k seed nodes that maximize the expected number of activated nodes. While the greedy algorithm with Monte‑Carlo simulations guarantees a (1‑1/e) approximation, it is computationally prohibitive for large graphs. Existing heuristics such as PMIA improve scalability but suffer from high memory overhead and sensitivity to graph density and cascade size.
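To make the cost of the greedy baseline concrete, the following is a minimal sketch of how expected spread under the IC model can be estimated by Monte-Carlo simulation. The function names `simulate_ic` and `estimate_spread`, and the adjacency format (a dict mapping each node to a list of `(neighbor, probability)` pairs), are assumptions made for illustration and do not come from the paper.

```python
import random


def simulate_ic(graph, seeds, rng):
    """Run one independent cascade and return the number of activated nodes.

    graph: dict mapping node -> list of (neighbor, probability) pairs
           (hypothetical format chosen for this sketch).
    """
    active = set(seeds)
    frontier = list(active)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # A newly active node gets one chance to activate
                # each still-inactive out-neighbor.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_spread(graph, seeds, runs=1000, seed=0):
    """Average the cascade size over many simulated runs."""
    rng = random.Random(seed)
    total = sum(simulate_ic(graph, seeds, rng) for _ in range(runs))
    return total / runs
```

A greedy selection would call an estimator like this for every candidate node in each of the k rounds, which is why the simulation-based approach becomes prohibitive on large graphs.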

IRIE (Influence Ranking + Influence Estimation) proposes a novel two‑stage framework that dramatically reduces both time and space while preserving influence quality. In the Influence Ranking (IR) stage, the authors formulate a system of linear equations that capture the expected spread of each node. For a tree, the exact influence σ(u) can be expressed as σ(u)=1+∑_{v∈N_out(u)}P_uv·m(v,u), where m(v,u) denotes the expected spread from v excluding the edge (v,u). For general graphs the same equations are used as an approximation; they are solved by an iterative message‑passing algorithm reminiscent of belief propagation. A damping factor α (typically 0.7–0.9) ensures convergence within a few (5–10) iterations, yielding a global ranking ˜σ(u).
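As a rough illustration of the ranking stage, the sketch below iterates a damped fixed-point update of the form r(u) = 1 + α·∑ P_uv·r(v) over the out-neighbors of u. It collapses the per-edge messages m(v,u) into a single per-node score and applies α as a multiplier on the sum; both are simplifying assumptions of this sketch, as are the name `influence_rank` and the adjacency format. It should not be read as the authors' exact formulation.

```python
def influence_rank(graph, alpha=0.7, iterations=10):
    """Iteratively approximate an influence score for every node.

    graph: dict mapping node -> list of (neighbor, probability) pairs
           (hypothetical format chosen for this sketch).
    alpha: damping factor applied to the neighbor sum (assumed form).
    """
    # Collect every node, including those that only appear as targets.
    nodes = set(graph)
    for nbrs in graph.values():
        for v, _ in nbrs:
            nodes.add(v)

    rank = {u: 1.0 for u in nodes}
    for _ in range(iterations):
        # Synchronous update: each new score uses the previous round's scores.
        rank = {
            u: 1.0 + alpha * sum(p * rank[v] for v, p in graph.get(u, []))
            for u in nodes
        }
    return rank
```

On a tree with α = 1, this update reproduces the exact expected spread. For the chain a → b → c with probability 0.5 on each edge, the score of a converges to 1 + 0.5·(1 + 0.5·1) = 1.75.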

Because a pure ranking ignores overlap among multiple seeds, the second stage, Influence Estimation (IE), updates the ranking after each seed is chosen. When a node u is added to the seed set S, the algorithm quickly computes the marginal contribution of u to every other node v using the previously computed messages: Δ(v)=∑_{u∈S}P_uv·˜m(u,v). This Δ(v) approximates the true marginal influence σ(v|S) without any Monte‑Carlo runs. The IR equations are then re‑evaluated with the Δ values subtracted, producing a refreshed ranking that reflects the already covered region.

The overall IRIE procedure repeats: (1) run IR to obtain a ranking, (2) pick the top node as a new seed, (3) run IE to adjust influence estimates, and (4) repeat until k seeds are selected. The computational cost per iteration is O(|E|·T) for the message passing (T≈5–10) plus O(|E|) for IE, resulting in near‑linear scalability. Memory usage is also linear in the number of edges because only the messages need to be stored, unlike PMIA which stores a local influence tree for every node.
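The rank, pick, and discount loop described above might be sketched as follows. The estimation step here is a deliberately crude stand-in: `one_hop_activation` only counts direct activation by a seed, and the discount is applied multiplicatively as (1 − ap(u)). Both choices, along with all function names, are assumptions of this sketch and not the estimator used in the paper.

```python
def one_hop_activation(graph, seeds):
    """Hypothetical IE stand-in: probability that a node is activated
    directly by at least one seed (multi-hop paths are ignored)."""
    miss = {}  # probability that no seed activates the node
    for s in seeds:
        for v, p in graph.get(s, []):
            miss[v] = miss.get(v, 1.0) * (1.0 - p)
    ap = {v: 1.0 - m for v, m in miss.items()}
    for s in seeds:
        ap[s] = 1.0  # seeds are active by definition
    return ap


def discounted_rank(graph, nodes, ap, alpha, iterations):
    """Damped ranking with each score scaled by (1 - ap), an assumed form."""
    rank = {u: 1.0 for u in nodes}
    for _ in range(iterations):
        rank = {
            u: (1.0 - ap.get(u, 0.0))
            * (1.0 + alpha * sum(p * rank[v] for v, p in graph.get(u, [])))
            for u in nodes
        }
    return rank


def select_seeds(graph, k, alpha=0.7, iterations=10):
    """Repeat: estimate coverage, re-rank, pick the top unselected node."""
    nodes = set(graph) | {v for nbrs in graph.values() for v, _ in nbrs}
    seeds = []
    for _ in range(min(k, len(nodes))):
        ap = one_hop_activation(graph, seeds)
        rank = discounted_rank(graph, nodes, ap, alpha, iterations)
        # Ties are broken by node name so the result is deterministic.
        best = max(
            (u for u in nodes if u not in seeds),
            key=lambda u: (rank[u], str(u)),
        )
        seeds.append(best)
    return seeds
```

The discount is what keeps the second seed from landing inside the region the first seed already covers: a node that a seed activates with probability 1 receives a score of 0 in the next round, even if its undiscounted score was high.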

Extensive experiments were conducted on synthetic graphs (Erdős‑Rényi and Barabási‑Albert) and five real‑world networks ranging from 29 K to 69 M edges (e.g., Facebook, Twitter, LiveJournal, Orkut, DBLP). IRIE was compared against the greedy Monte‑Carlo baseline, PMIA, Simulated Annealing (SA), PageRank, and simple degree heuristics. Results show:

  • Influence spread: IRIE matches or slightly exceeds the greedy baseline and is on par with PMIA, while outperforming SA and PageRank by a large margin.
  • Running time: IRIE is 10–100× faster than PMIA and 100–1000× faster than the greedy method. Speedup is consistent across different densities and clustering coefficients, demonstrating robustness.
  • Memory: IRIE uses only a fraction (≈5–15 %) of the memory required by PMIA, enabling execution on the largest tested graphs without out‑of‑memory failures.
  • Extension to IC‑N (negative opinion) model: By adapting the same message equations, IRIE achieves comparable positive activation counts to the MIA‑N heuristic while being at least five times faster.

In summary, IRIE provides a scalable, memory‑efficient, and robust solution to influence maximization for both the classic IC model and its negative‑opinion extension. The authors suggest future work on applying the framework to the Linear Threshold model, handling dynamic networks, and integrating real marketing data for practical deployment.

