Beyond Worst-Case (In)approximability of Nonsubmodular Influence Maximization
We consider the problem of maximizing the spread of influence in a social network by choosing a fixed number of initial seeds, formally referred to as the influence maximization problem. It admits a $(1-1/e)$-factor approximation algorithm if the influence function is submodular. Otherwise, in the worst case, the problem is NP-hard to approximate to within a factor of $N^{1-\varepsilon}$. This paper studies whether this worst-case hardness result can be circumvented by making assumptions about either the underlying network topology or the cascade model. All of our assumptions are motivated by many real life social network cascades. First, we present strong inapproximability results for a very restricted class of networks called the (stochastic) hierarchical blockmodel, a special case of the well-studied (stochastic) blockmodel in which relationships between blocks admit a tree structure. We also provide a dynamic-program based polynomial time algorithm which optimally computes a directed variant of the influence maximization problem on hierarchical blockmodel networks. Our algorithm indicates that the inapproximability result is due to the bidirectionality of influence between agent-blocks. Second, we present strong inapproximability results for a class of influence functions that are “almost” submodular, called 2-quasi-submodular. Our inapproximability results hold even for any 2-quasi-submodular $f$ fixed in advance. This result also indicates that the “threshold” between submodularity and nonsubmodularity is sharp, regarding the approximability of influence maximization.
💡 Research Summary
The paper investigates the approximability of the Influence Maximization (InfMax) problem when the underlying diffusion process is non-submodular. Submodular cascades admit a (1-1/e)-approximation via a simple greedy algorithm. For general influence functions, however, the worst-case hardness result states that no polynomial-time algorithm can achieve an approximation factor better than N^{1-ε} for any constant ε > 0 unless P=NP, where N is the number of vertices. The authors ask whether this worst-case bound can be avoided by imposing realistic assumptions on either the network topology or the cascade model, both motivated by empirical observations of real-world social networks.
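The greedy algorithm mentioned above can be sketched as follows. This is a minimal illustration, not the paper's code: the names `greedy_seeds` and `spread` are hypothetical, and `spread` is assumed to be any callable that returns the (expected) number of influenced vertices for a seed set. The (1-1/e) guarantee applies only when `spread` is monotone and submodular.

```python
from typing import Callable, FrozenSet, Hashable, Iterable, List, Tuple


def greedy_seeds(
    candidates: Iterable[Hashable],
    k: int,
    spread: Callable[[FrozenSet[Hashable]], float],
) -> Tuple[List[Hashable], float]:
    """Pick up to k seeds, each time adding the vertex with the largest spread.

    `spread` is an assumed oracle for the influence function; in practice it
    would typically be estimated by simulating the cascade many times.
    """
    remaining = list(candidates)
    chosen: List[Hashable] = []
    value = spread(frozenset())
    for _ in range(min(k, len(remaining))):
        best_v, best_val = None, None
        for v in remaining:
            val = spread(frozenset(chosen) | {v})
            # Strict comparison: ties go to the earliest candidate.
            if best_val is None or val > best_val:
                best_v, best_val = v, val
        chosen.append(best_v)
        remaining.remove(best_v)
        value = best_val
    return chosen, value
```

With a coverage-style oracle (a standard submodular example), each round simply picks the vertex that reaches the most new vertices. For a non-submodular `spread`, the same loop still runs but carries no approximation guarantee, which is the regime the paper studies.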
Network‑topology restrictions.
The authors focus on hierarchical block models (HBM) and their stochastic counterpart. In a (stochastic) block model, vertices are partitioned into ℓ blocks and edge probabilities depend only on the blocks of the endpoints. The hierarchical variant further restricts the ℓ×ℓ inter-block probability matrix to have a tree-like structure, reflecting the natural hierarchy of communities (countries → provinces → cities, etc.). Even under this severe restriction, where the block hierarchy is a tree and all vertices have unit thresholds, the authors prove that InfMax remains NP-hard to approximate within a factor of N^{1-ε} for any constant ε > 0. The hardness therefore does not rely on unstructured or pathological topologies; it persists even in highly structured, realistic networks.
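To make the tree-structured probability matrix concrete, here is a small sampling sketch. It rests on an assumption not spelled out in the summary: each node of the hierarchy tree carries an edge probability, and two vertices are connected with the probability stored at the least common ancestor of their blocks. The names `parent`, `prob`, `block_of`, and `sample_hbm` are illustrative.

```python
import random
from itertools import combinations


def ancestors(node, parent):
    """Return the path from `node` up to the root, inclusive."""
    path = [node]
    while parent[path[-1]] is not None:
        path.append(parent[path[-1]])
    return path


def lca(a, b, parent):
    """Least common ancestor of two tree nodes given parent pointers."""
    seen = set(ancestors(a, parent))
    for node in ancestors(b, parent):
        if node in seen:
            return node
    raise ValueError("nodes are not in the same tree")


def sample_hbm(parent, prob, block_of, rng):
    """Sample an undirected graph from the assumed hierarchical blockmodel.

    parent:   tree node -> parent node (None for the root)
    prob:     tree node -> edge probability used when that node is the LCA
    block_of: vertex -> leaf block containing it
    rng:      a random.Random instance
    """
    edges = set()
    for u, v in combinations(sorted(block_of), 2):
        p = prob[lca(block_of[u], block_of[v], parent)]
        if rng.random() < p:
            edges.add((u, v))
    return edges
```

Under this encoding, vertices in the same leaf block use that block's probability, and vertices in different blocks use the probability of the smallest community containing both, so the whole ℓ×ℓ matrix is determined by one number per tree node.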
A positive result is also presented: if the influence between blocks is one-way (i.e., edges only go from a parent block to its children), the problem becomes tractable. The authors design a dynamic-programming algorithm that processes the hierarchy bottom-up, computes the optimal seed allocation for each subtree, and combines these allocations in polynomial time. The algorithm solves a directed variant of InfMax on hierarchical block models exactly, indicating that bidirectional influence between blocks is the source of the intractability.
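The bottom-up "solve each subtree, then combine" pattern can be illustrated with a budget-splitting dynamic program. This is only a sketch of the general pattern, not the authors' algorithm: it assumes, as a simplification, that each leaf block has a known payoff table and that the payoffs of sibling subtrees simply add up. The names `combine`, `solve`, `children`, and `leaf_value` are hypothetical.

```python
def combine(left, right, k):
    """Max-plus convolution: best[j] = max over i of left[i] + right[j - i]."""
    best = [float("-inf")] * (k + 1)
    for i, lv in enumerate(left[: k + 1]):
        for j, rv in enumerate(right[: k + 1 - i]):
            best[i + j] = max(best[i + j], lv + rv)
    return best


def solve(node, children, leaf_value, k):
    """Return a table t where t[j] is the best payoff with at most j seeds.

    children:   tree node -> list of child nodes (leaves may be absent)
    leaf_value: leaf -> non-decreasing list, entry j = payoff of j seeds there
    Assumption (for illustration only): sibling subtrees contribute additively.
    """
    kids = children.get(node, [])
    if not kids:
        table = list(leaf_value[node][: k + 1])
        # Pad so that table[j] still means "at most j seeds" past the block size.
        table += [table[-1]] * (k + 1 - len(table))
        return table
    table = [0.0] * (k + 1)  # no children merged yet: payoff 0 for any budget
    for child in kids:
        table = combine(table, solve(child, children, leaf_value, k), k)
    return table
```

Each merge costs O(k²) time, so the whole tree is processed in time polynomial in the number of tree nodes and the seed budget k. In the actual directed model, a parent's seeds can also affect its children, so the per-subtree state would need to carry more than a single payoff table.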
Cascade‑model restrictions.
Empirical studies have observed that the marginal influence of the second neighbor often exceeds that of the first, after which marginal gains decline. To capture this phenomenon, the paper introduces 2‑quasi‑submodular influence functions. Formally, a function f is 2‑quasi‑submodular if the marginal gain of adding a second infected neighbor is larger than that of adding the first, while subsequent marginal gains are non‑increasing. This class includes the classic 2‑threshold cascade and many realistic “peak‑at‑two” models.
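The definition above can be checked mechanically for a local influence function given as a table. In this sketch (names are illustrative), `f[i]` is assumed to be the probability that a vertex becomes infected when i of its neighbors are infected. The check encodes only the two conditions stated in this summary; any further conditions in the paper's formal definition are not reproduced here.

```python
def marginals(f):
    """Marginal gains f[i+1] - f[i] of a tabulated local influence function."""
    return [f[i + 1] - f[i] for i in range(len(f) - 1)]


def is_submodular_local(f):
    """True if every marginal gain is at most the previous one."""
    m = marginals(f)
    return all(m[i + 1] <= m[i] for i in range(len(m) - 1))


def is_two_quasi_submodular(f):
    """True if the second marginal exceeds the first and later ones never rise."""
    m = marginals(f)
    if len(m) < 2:
        return False
    second_beats_first = m[1] > m[0]
    tail_non_increasing = all(m[i + 1] <= m[i] for i in range(1, len(m) - 1))
    return second_beats_first and tail_non_increasing
```

For example, the 2-threshold rule `[0, 0, 1, 1, 1]` has marginals `[0, 1, 0, 0]` and passes the check, whereas an independent-cascade-style table such as `[0, 0.5, 0.75, 0.875]` has strictly decreasing marginals and is submodular instead.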
The authors prove a sharp threshold result: for any fixed 2-quasi-submodular function f, InfMax is NP-hard to approximate within a factor of N^{τ} for some constant τ>0 that depends only on f. Consequently, even this mild deviation from submodularity (the second-neighbor effect) destroys the (1-1/e) guarantee and yields polynomial inapproximability, as in the unrestricted case. Moreover, the hardness persists when only a sublinear number N^{γ} (0<γ<1) of vertices use a 2-quasi-submodular local function while the remaining vertices are submodular (or even identical). This generalizes recent results that required a constant-size “bad” set of vertices, showing that even a vanishingly small but super-constant set suffices to retain the hardness.
Relation to prior work.
Earlier literature established the N^{1‑ε} hardness for arbitrary influence functions and gave positive results for submodular cascades. Some works explored “almost submodular” functions, but typically constructed specific functions within reductions. In contrast, this paper’s inapproximability holds for all 2‑quasi‑submodular functions, irrespective of how close they are to submodular. The dynamic‑programming algorithm for the directed hierarchical model also extends prior empirical work on hierarchical decomposition, providing a theoretical justification for why such heuristics work well on real networks.
Implications and open questions.
The results suggest two key takeaways for practitioners and theorists: (1) imposing realistic community hierarchies does not alleviate the worst-case hardness of non-submodular influence maximization unless influence is effectively unidirectional; (2) the boundary between tractable (submodular) and intractable (2-quasi-submodular) influence functions is extremely sharp: once the marginal gain of the second neighbor exceeds that of the first, no constant-factor approximation is possible unless P=NP. An open problem left by the authors is whether simultaneously restricting both the network (e.g., hierarchical, low-treewidth) and the cascade (e.g., bounded deviation from submodularity) can lead to polynomial-time algorithms with provable guarantees. The paper thus delineates the limits of current approximation techniques and points toward combined structural and functional assumptions as the next direction to explore.