Scalable Influence Maximization for Multiple Products in Continuous-Time Diffusion Networks

Continuous-Time Influence Maximization of Multiple Items

Nan Du (dunan@google.com)
Google Research, 1600 Amphitheatre Pkwy, Mountain View, CA 94043

Yingyu Liang (yingyul@cs.princeton.edu)
Department of Computer Science, Princeton University, Princeton, NJ 08540

Maria-Florina Balcan (ninamf@cs.cmu.edu)
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213

Manuel Gomez-Rodriguez (manuelgr@mpi-sws.org)
MPI for Software Systems, Kaiserslautern, Germany 67663

Hongyuan Zha (zha@cc.gatech.edu)
Le Song (lsong@cc.gatech.edu)
College of Computing, Georgia Institute of Technology, Atlanta, GA 30332

Abstract

A typical viral marketing model identifies influential users in a social network to maximize a single product adoption assuming unlimited user attention, campaign budgets, and time. In reality, multiple products need campaigns, users have limited attention, convincing users incurs costs, and advertisers have limited budgets and expect the adoptions to be maximized soon. Facing these user, monetary, and timing constraints, we formulate the problem as a submodular maximization task in a continuous-time diffusion model under the intersection of one matroid and multiple knapsack constraints. We propose a randomized algorithm estimating the user influence¹ in a network ($|\mathcal{V}|$ nodes, $|\mathcal{E}|$ edges) to an accuracy of $\epsilon$ with $n = O(1/\epsilon^2)$ randomizations and $\tilde{O}(n|\mathcal{E}| + n|\mathcal{V}|)$ computations. By exploiting the influence estimation algorithm as a subroutine, we develop an adaptive threshold greedy algorithm achieving an approximation factor $k_a/(2+2k)$ of the optimal when $k_a$ out of the $k$ knapsack constraints are active. Extensive experiments on networks of millions of nodes demonstrate that the proposed algorithms achieve the state of the art in terms of effectiveness and scalability.
Keywords: Influence Maximization, Influence Estimation, Continuous-time Diffusion Model, Matroid, Knapsack

¹ Partial results in the paper on influence estimation have been published in a conference paper: Nan Du, Le Song, Manuel Gomez-Rodriguez, and Hongyuan Zha. Scalable influence estimation in continuous-time diffusion networks. In Advances in Neural Information Processing Systems 26, 2013.

1. Introduction

Online social networks play an important role in the promotion of new products, the spread of news, the success of political campaigns, and the diffusion of technological innovations. In these contexts, the influence maximization problem (or viral marketing problem) typically has the following flavor: identify a set of influential users in a social network, who, when convinced to adopt a product, shall influence other users in the network and trigger a large cascade of adoptions. This problem has been studied extensively in the literature from both the modeling and the algorithmic aspects (Richardson and Domingos, 2002; Kempe et al., 2003; Leskovec et al., 2007; Chen et al., 2009, 2010a,b, 2011, 2012; Ienco et al., 2010; Goyal et al., 2011a,b; Gomez-Rodriguez and Schölkopf, 2012), where it has been typically assumed that the host (e.g., the owner of an online social platform) faces a single product, endless user attention, unlimited budgets and unbounded time. However, in reality, the host often encounters a much more constrained scenario:

• Multiple-Item Constraints: multiple products can spread simultaneously among the same set of social entities. These products may have different characteristics, such as their revenues and speed of spread.
• Timing Constraints: the advertisers expect the influence to occur within a certain time window, and different products may have different timing requirements.
• User Constraints: users of the social network, each of whom can be a potential source, would like to be exposed to only a small number of ads. Furthermore, users may be grouped by their geographical locations, and advertisers may have a target population they want to reach.
• Product Constraints: seeking initial adopters entails a cost to the advertiser, who needs to pay the host and often has a limited amount of money.

For example, Facebook (i.e., the host) needs to allocate ads for various products with different characteristics, e.g., clothes, books, or cosmetics. While some products, such as clothes, aim at influencing within a short time window, others, such as books, may allow for longer periods. Moreover, Facebook limits the number of ads in each user's sidebar (typically it shows fewer than five) and, as a consequence, it cannot assign all ads to a few highly influential users. Finally, each advertiser has a limited budget to pay for ads on Facebook and thus each ad can only be displayed to some subset of users. In our work, we take these practical and important requirements into consideration in the influence maximization problem.

We account for the multi-product and timing constraints by applying product-specific continuous-time diffusion models. Here, we opt for continuous-time diffusion models instead of discrete-time models, which have been mostly used in previous work (Kempe et al., 2003; Chen et al., 2009, 2010a,b, 2011, 2012; Borgs et al., 2012). This is because artificially discretizing the time axis into bins introduces additional errors. One can adjust the additional tuning parameters, like the bin size, to balance the tradeoff between the error and the computational cost, but the parameters are not easy to choose optimally.
Extensive experimental comparisons on both synthetic and real-world data have shown that discrete-time models provide less accurate influence estimation than their continuous-time counterparts (Gomez-Rodriguez et al., 2011; Gomez-Rodriguez and Schölkopf, 2012; Gomez-Rodriguez et al., 2013; Du et al., 2013a,b).

However, maximizing influence based on continuous-time diffusion models also entails additional challenges. First, evaluating the objective function of the influence maximization problem (i.e., the influence estimation problem) in this setting is a difficult graphical model inference problem, i.e., computing the marginal density of continuous variables in loopy graphical models. The exact answer can be computed only for very special cases. For example, Gomez-Rodriguez and Schölkopf (2012) have shown that the problem can be solved exactly when the transmission functions are exponential densities, by using continuous-time Markov process theory. However, the computational complexity of such an approach, in general, scales exponentially with the size and density of the network. Moreover, extending the approach to deal with arbitrary transmission functions would require additional nontrivial approximations which would increase the computational complexity even more. Second, it is unclear how to scale up influence estimation and maximization algorithms based on continuous-time diffusion models to millions of nodes. Especially in the maximization case, the influence estimation procedure needs to be called many times for different subsets of selected nodes. Thus, our first goal is to design a scalable algorithm which can perform influence estimation in the regime of networks with millions of nodes.

We account for the user and product constraints by restricting the feasible domain over which the maximization is performed.
We first show that the overall influence function of multiple products is a submodular function and then observe that the user and product constraints correspond to constraints over the ground set of this submodular function. To the best of our knowledge, previous work has not considered both user and product constraints simultaneously over general, unknown, and distinct diffusion networks with non-uniform costs. In particular, Datta et al. (2010) first tried to model both the product and user constraints, but only with uniform costs and an infinite time window, which essentially reduces to a special case of our formulations. Similarly, Lu et al. (2013) considered the allocation problem of multiple products which may compete within an infinite time window. Besides, they all assume that multiple products spread within the same network. In contrast, our formulations generally allow products to have different diffusion networks, which can be unknown in practice. Soma et al. (2014) studied the influence maximization problem for one product subject to one knapsack constraint over a known bipartite graph between marketing channels and potential customers; Ienco et al. (2010) and Sun et al. (2011) considered user constraints but disregarded product constraints during the initial assignment; and Narayanam and Nanavati (2012) studied the cross-sell phenomenon (the selling of the first product raises the chance of selling the second) and included monetary constraints for all the products. However, no user constraints were considered, and the cost of each user was still uniform for each product. Thus, our second goal is to design an efficient submodular maximization algorithm which can take into account both user and product constraints simultaneously.
Overall, this article includes the following major contributions:

• Unlike prior work that considers an a priori described simplistic discrete-time diffusion model, we first learn the diffusion networks from data by using continuous-time diffusion models. This allows us to address the timing constraints in a principled way.
• We provide a novel formulation of the influence estimation problem in the continuous-time diffusion model from the perspective of probabilistic graphical models, which allows heterogeneous diffusion dynamics over the edges.
• We propose an efficient randomized algorithm for continuous-time influence estimation, which can scale up to millions of nodes and estimate the influence of each node to an accuracy of $\epsilon$ using $n = O(1/\epsilon^2)$ randomizations.
• We formulate the influence maximization problem with the aforementioned constraints as a submodular maximization under the intersection of matroid constraints and knapsack constraints. The submodular function we use is based on the actual diffusion model learned from the data for the time window constraint. This novel formulation provides us a firm theoretical foundation for designing greedy algorithms with theoretical guarantees.
• We develop an efficient adaptive-threshold greedy algorithm which is linear in the number of products and proportional to $\tilde{O}(|\mathcal{V}| + |\mathcal{E}^*|)$, where $|\mathcal{V}|$ is the number of nodes (users) and $|\mathcal{E}^*|$ is the number of edges in the largest diffusion network. We then prove that this algorithm is guaranteed to find a solution with an overall influence of at least $\frac{k_a}{2+2k}$ of the optimal value, when $k_a$ out of the $k$ knapsack constraints are active. This improves over the best known approximation factor achieved by polynomial time algorithms in the combinatorial optimization literature.
Moreover, whenever advertising each product to each user entails the same cost, the constraints reduce to an intersection of matroids, and we obtain an approximation factor of 1/3, which is optimal for such optimization.
• We evaluate our algorithms over large synthetic and real-world datasets and show that our proposed methods significantly improve over the previous state of the art in terms of both the accuracy of the estimated influence and the quality of the selected nodes in maximizing the influence over independently held-out real test data.

In the remainder of the paper, we will first tackle the influence estimation problem in Section 2. We then formulate different realistic constraints for the influence maximization in Section 3 and present the adaptive-thresholding greedy algorithm with its theoretical analysis in Section 4; we investigate the performance of the proposed algorithms on both synthetic and real-world datasets in Section 5; and finally we conclude in Section 6.

2. Influence Estimation

We start by revisiting the continuous-time diffusion model by Gomez-Rodriguez et al. (2011) and then explicitly formulate the influence estimation problem from the perspective of probabilistic graphical models. Because the efficient inference of the influence value for each node is highly non-trivial, we further develop a scalable influence estimation algorithm which is able to handle networks of millions of nodes. The influence estimation procedure will be a key building block for our later influence maximization algorithm.

2.1 Continuous-Time Diffusion Networks

The continuous-time diffusion model associates each edge with a transmission function, that is, a density over the transmission time along the edge, in contrast to previous discrete-time models which associate each edge with a fixed infection probability (Kempe et al., 2003).
Moreover, it also differs from discrete-time models in the sense that events in a cascade are not generated iteratively in rounds, but event timings are sampled directly from the transmission function in the continuous-time model.

Continuous-Time Independent Cascade Model. Given a directed contact network, $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, we use the independent cascade model for modeling a diffusion process (Kempe et al., 2003; Gomez-Rodriguez et al., 2011). The process begins with a set of infected source nodes, $\mathcal{A}$, initially adopting a certain contagion (idea, meme or product) at time zero. The contagion is transmitted from the sources along their out-going edges to their direct neighbors. Each transmission through an edge entails a random waiting time, $\tau$, drawn from an independent pairwise waiting-time distribution (one per edge). Then, the infected neighbors transmit the contagion to their respective neighbors, and the process continues. We assume that an infected node remains infected for the entire diffusion process. Thus, if a node $i$ is infected by multiple neighbors, only the neighbor that first infects node $i$ will be the true parent. As a result, although the contact network can be an arbitrary directed network, each diffusion process induces a Directed Acyclic Graph (DAG).

Heterogeneous Transmission Functions. Formally, the pairwise transmission function $f_{ji}(t_i \mid t_j)$ for a directed edge $j \to i$ is the conditional density of node $i$ getting infected at time $t_i$ given that node $j$ was infected at time $t_j$. We assume it is shift invariant: $f_{ji}(t_i \mid t_j) = f_{ji}(\tau_{ji})$, where $\tau_{ji} := t_i - t_j$, and causal: $f_{ji}(\tau_{ji}) = 0$ if $\tau_{ji} < 0$.
Both parametric transmission functions, such as the exponential and Rayleigh function (Gomez-Rodriguez et al., 2011), and nonparametric functions (Du et al., 2012) can be used and estimated from cascade data.

Shortest-Path Property. The independent cascade model has a useful property we will use later: given a sample of transmission times of all edges, the time $t_i$ taken to infect a node $i$ is the length of the shortest path in $\mathcal{G}$ from the sources to node $i$, where the edge weights correspond to the associated transmission times.

2.2 Probabilistic Graphical Model for Continuous-Time Diffusion Networks

The continuous-time independent cascade model is essentially a directed graphical model for a set of dependent random variables, that is, the infection times $t_i$ of the nodes, where the conditional independence structure is supported on the contact network $\mathcal{G}$. Although the original contact graph $\mathcal{G}$ can contain directed loops, each diffusion process (or a cascade) induces a directed acyclic graph (DAG). For those cascades consistent with a particular DAG, we can model the joint density of $t_i$ using a directed graphical model:

$$ p\left(\{t_i\}_{i \in \mathcal{V}}\right) = \prod_{i \in \mathcal{V}} p\left(t_i \mid \{t_j\}_{j \in \pi_i}\right), \tag{1} $$

where each $\pi_i$ denotes the collection of parents of node $i$ in the induced DAG, and each term $p(t_i \mid \{t_j\}_{j \in \pi_i})$ corresponds to a conditional density of $t_i$ given the infection times of the parents of node $i$. This is true because given the infection times of node $i$'s parents, $t_i$ is independent of other infection times, satisfying the local Markov property of a directed graphical model. We note that the independent cascade model only specifies explicitly the pairwise transmission function of each directed edge, but does not directly define the conditional density $p(t_i \mid \{t_j\}_{j \in \pi_i})$.
However, these conditional densities can be derived from the pairwise transmission functions based on the Independent-Infection property:

$$ p\left(t_i \mid \{t_j\}_{j \in \pi_i}\right) = \sum_{j \in \pi_i} f_{ji}(t_i \mid t_j) \prod_{l \in \pi_i,\, l \neq j} S(t_i \mid t_l), \tag{2} $$

which is the sum of the likelihoods that node $i$ is infected by each parent node $j$. More precisely, each term in the summation can be interpreted as the likelihood $f_{ji}(t_i \mid t_j)$ of node $i$ being infected at $t_i$ by node $j$ multiplied by the probability $S(t_i \mid t_l)$ that it has survived from the infection of each other parent node $l \neq j$ until time $t_i$.

Perhaps surprisingly, the factorization in Equation (1) is the same factorization that can be used for an arbitrary induced DAG consistent with the contact network $\mathcal{G}$. In this case, we only need to replace the definition of $\pi_i$ (the parents of node $i$ in the DAG) with the set of neighbors of node $i$ with an edge pointing to node $i$ in $\mathcal{G}$. This is not immediately obvious from Equation (1), since the contact network $\mathcal{G}$ can contain directed loops, which seems to be in conflict with the conditional independence semantics of directed graphical models. The reason why it is possible to do so is as follows: any fixed set of infection times, $t_1, \ldots, t_d$, induces an ordering of the infection times. If $t_i \leq t_j$ for an edge $j \to i$ in $\mathcal{G}$, then $f_{ji}(t_i \mid t_j) = 0$ by causality, and the corresponding term in Equation (2) is zeroed out, making the conditional density consistent with the semantics of directed graphical models.

Instead of directly modeling the infection times $t_i$, we can focus on the set of mutually independent random transmission times $\tau_{ji} = t_i - t_j$.
Interestingly, by switching from a node-centric view to an edge-centric view, we obtain a fully factorized joint density of the set of transmission times

$$ p\left(\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}\right) = \prod_{(j,i) \in \mathcal{E}} f_{ji}(\tau_{ji}). \tag{3} $$

Based on the Shortest-Path property of the independent cascade model, each variable $t_i$ can be viewed as a transformation from the collection of variables $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$. More specifically, let $\mathcal{Q}_i$ be the collection of directed paths in $\mathcal{G}$ from the source nodes to node $i$, where each path $q \in \mathcal{Q}_i$ contains a sequence of directed edges $(j, l)$. Assuming all source nodes are infected at time zero, we obtain variable $t_i$ via

$$ t_i = g_i\left(\{\tau_{ji}\}_{(j,i) \in \mathcal{E}} \mid \mathcal{A}\right) = \min_{q \in \mathcal{Q}_i} \sum_{(j,l) \in q} \tau_{jl}, \tag{4} $$

where the transformation $g_i(\cdot \mid \mathcal{A})$ is the value of the shortest-path minimization. As a special case, we can now compute the probability of node $i$ being infected before $T$ using a set of independent variables:

$$ \Pr\{t_i \leq T \mid \mathcal{A}\} = \Pr\left\{ g_i\left(\{\tau_{ji}\}_{(j,i) \in \mathcal{E}} \mid \mathcal{A}\right) \leq T \right\}. \tag{5} $$

The significance of the relation is that it allows us to transform a problem involving a sequence of dependent variables $\{t_i\}_{i \in \mathcal{V}}$ to one with independent variables $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$. Furthermore, the two perspectives are connected via the shortest path algorithm in a weighted directed graph, a standard well-studied operation in graph analysis.

2.3 Influence Estimation Problem in Continuous-Time Diffusion Networks

Intuitively, given a time window, the wider the spread of infection, the more influential the set of sources. We adopt the definition of influence as the expected number of infected nodes given a set of source nodes and a time window, as in previous work (Gomez-Rodriguez and Schölkopf, 2012). More formally, consider a set of source nodes $\mathcal{A} \subseteq \mathcal{V}$, $|\mathcal{A}| \leq C$, which get infected at time zero.
Then, given a time window $T$, a node $i$ is infected within the time window if $t_i \leq T$. The expected number of infected nodes (or the influence) given the set of transmission functions $\{f_{ji}\}_{(j,i) \in \mathcal{E}}$ can be computed as

$$ \sigma(\mathcal{A}, T) = \mathbb{E}\left[ \sum_{i \in \mathcal{V}} \mathbb{I}\{t_i \leq T \mid \mathcal{A}\} \right] = \sum_{i \in \mathcal{V}} \Pr\{t_i \leq T \mid \mathcal{A}\}, \tag{6} $$

where $\mathbb{I}\{\cdot\}$ is the indicator function and the expectation is taken over the set of dependent variables $\{t_i\}_{i \in \mathcal{V}}$. By construction, $\sigma(\mathcal{A}, T)$ is a non-negative, monotonic nondecreasing submodular function in the set of source nodes, as shown by Gomez-Rodriguez and Schölkopf (2012).

Essentially, the influence estimation problem in Equation (6) is an inference problem for graphical models, where the probability of event $t_i \leq T$ given sources in $\mathcal{A}$ can be obtained by summing out the possible configurations of the other variables $\{t_j\}_{j \neq i}$. That is,

$$ \Pr\{t_i \leq T \mid \mathcal{A}\} = \int_0^\infty \cdots \int_{t_i = 0}^{T} \cdots \int_0^\infty \left( \prod_{j \in \mathcal{V}} p\left(t_j \mid \{t_l\}_{l \in \pi_j}\right) \right) \left( \prod_{j \in \mathcal{V}} dt_j \right), \tag{7} $$

which is, in general, a very challenging problem. First, the corresponding directed graphical models can contain nodes with high in-degree and high out-degree. For example, in Twitter, a user can follow dozens of other users, and another user can have hundreds of "followers". The tree-width corresponding to this directed graphical model can be very high, and we need to perform integration for functions involving many continuous variables. Second, the integral in general cannot be evaluated analytically for heterogeneous transmission functions, which means that we need to resort to numerical integration by discretizing the domain $[0, \infty)$. If we use $N$ levels of discretization for each variable, we would need to enumerate $O(N^{|\pi_i|})$ entries, exponential in the number of parents.

Only in very special cases can one derive a closed-form equation for computing $\Pr\{t_i \leq T \mid \mathcal{A}\}$.
For instance, Gomez-Rodriguez and Schölkopf (2012) proposed an approach for exponential transmission functions, where the special properties of the exponential density are used to map the problem into a continuous-time Markov process problem, and the computation can be carried out via a matrix exponential. However, without further heuristic approximation, the computational complexity of the algorithm is exponential in the size and density of the network. The intrinsic complexity of the problem entails the utilization of approximation algorithms, such as mean field algorithms or message passing algorithms. We will design an efficient randomized (or sampling) algorithm in the next section.

2.4 Efficient Influence Estimation in Continuous-Time Diffusion Networks

Our first key observation is that we can transform the influence estimation problem in Equation (6) into a problem with independent variables. With the relation in Equation (5), we can derive the influence function as

$$ \sigma(\mathcal{A}, T) = \sum_{i \in \mathcal{V}} \Pr\left\{ g_i\left(\{\tau_{ji}\}_{(j,i) \in \mathcal{E}} \mid \mathcal{A}\right) \leq T \right\} = \mathbb{E}\left[ \sum_{i \in \mathcal{V}} \mathbb{I}\left\{ g_i\left(\{\tau_{ji}\}_{(j,i) \in \mathcal{E}} \mid \mathcal{A}\right) \leq T \right\} \right], \tag{8} $$

where the expectation is with respect to the set of independent variables $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$. This equivalent formulation suggests a naive sampling (NS) algorithm for approximating $\sigma(\mathcal{A}, T)$: draw $n$ samples of $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$, run a shortest path algorithm for each sample, and finally average the results (see Appendix A for more details). However, this naive sampling approach has a computational complexity of $O(nC|\mathcal{V}||\mathcal{E}| + nC|\mathcal{V}|^2 \log |\mathcal{V}|)$ due to the repeated calling of the shortest path algorithm. This is quadratic in the network size, and hence not scalable to millions of nodes.

Our second key observation is that for each sample $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$, we are only interested in the neighborhood size of the source nodes, i.e., the summation $\sum_{i \in \mathcal{V}} \mathbb{I}\{\cdot\}$ in Equation (8), rather than in the individual shortest paths. Fortunately, the neighborhood size estimation problem has been studied in the theoretical computer science literature. Here, we adapt a very efficient randomized algorithm by Cohen (1997) to our influence estimation problem. This randomized algorithm has a computational complexity of $O(|\mathcal{E}| \log |\mathcal{V}| + |\mathcal{V}| \log^2 |\mathcal{V}|)$ and it estimates the neighborhood sizes for all possible single source node locations. Since it needs to run once for each sample of $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$, we obtain an overall influence estimation algorithm with $O(n|\mathcal{E}| \log |\mathcal{V}| + n|\mathcal{V}| \log^2 |\mathcal{V}|)$ computation, nearly linear in network size. Next we will revisit Cohen's algorithm for neighborhood estimation.

2.4.1 Randomized Algorithm for Single-Source Neighborhood-Size Estimation

Given a fixed set of edge transmission times $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$ and a source node $s$, infected at time zero, the neighborhood $\mathcal{N}(s, T)$ of a source node $s$ given a time window $T$ is the set of nodes within distance $T$ from $s$, i.e.,

$$ \mathcal{N}(s, T) = \left\{ i \;\middle|\; g_i\left(\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}\right) \leq T,\ i \in \mathcal{V} \right\}. \tag{9} $$

Instead of estimating $\mathcal{N}(s, T)$ directly, the algorithm will assign an exponentially distributed random label $r_i$ to each network node $i$. Then, it makes use of the fact that the minimum of a set of exponential random variables $\{r_i\}_{i \in \mathcal{N}(s,T)}$ is still an exponential random variable, but with its parameter being equal to the total number of variables. That is, if each $r_i$ has the unit-rate density $\exp(-r_i)$, then the smallest label within distance $T$ from source $s$, $r_* := \min_{i \in \mathcal{N}(s,T)} r_i$, is exponentially distributed with rate $|\mathcal{N}(s, T)|$, i.e., its density is $|\mathcal{N}(s,T)| \exp\{-|\mathcal{N}(s, T)|\, r_*\}$. Suppose we randomize over the labeling $m$ times and obtain $m$ such least labels, $\{r_*^u\}_{u=1}^m$. Then the neighborhood size can be estimated as

$$ |\mathcal{N}(s, T)| \approx \frac{m - 1}{\sum_{u=1}^{m} r_*^u}, \tag{10} $$

which is shown by Cohen (1997) to be an unbiased estimator of $|\mathcal{N}(s, T)|$. This is an elegant relation since it allows us to transform the counting problem in (9) to a problem of finding the minimum random label $r_*$. The key question is whether we can compute the least label $r_*$ efficiently, given random labels $\{r_i\}_{i \in \mathcal{V}}$ and any source node $s$.

Cohen (1997) designed a modified Dijkstra's algorithm (Algorithm 3) to construct a data structure $\mathbf{r}_*(s)$, called the least label list, for each node $s$ to support such a query. Essentially, the algorithm starts with the node $i$ with the smallest label $r_i$, and then it traverses in breadth-first search fashion along the reverse direction of the graph edges to find all reachable nodes. For each reachable node $s$, the distance $d_*$ between $i$ and $s$, and $r_i$ are added to the end of $\mathbf{r}_*(s)$. Then the algorithm moves to the node $i'$ with the second smallest label $r_{i'}$, and similarly finds all reachable nodes. For each reachable node $s$, the algorithm will compare the current distance $d_*$ between $i'$ and $s$ with the last recorded distance in $\mathbf{r}_*(s)$. If the current distance is smaller, then the current $d_*$ and $r_{i'}$ are added to the end of $\mathbf{r}_*(s)$. Then the algorithm moves to the node with the third smallest label, and so on. The algorithm is summarized in Algorithm 3 in Appendix B.

Algorithm 3 returns a list $\mathbf{r}_*(s)$ per node $s \in \mathcal{V}$, which contains information about the distance to the smallest reachable labels from $s$. In particular, each list contains pairs of distance and random labels, $(d, r)$, and these pairs are ordered as

$$ \infty > d_{(1)} > d_{(2)} > \ldots > d_{(|\mathbf{r}_*(s)|)} = 0 \tag{11} $$
$$ r_{(1)} < r_{(2)} < \ldots < r_{(|\mathbf{r}_*(s)|)}, \tag{12} $$

where $\{\cdot\}_{(l)}$ denotes the $l$-th element in the list (see Appendix B for an example).
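The construction described above can be sketched in a few lines. This is a minimal illustration, not the paper's Algorithm 3 verbatim: the function name `least_label_lists`, the edge dictionary layout, and the early-pruning step (a node whose recorded distance is already no larger is not expanded further) are assumptions made for the sketch.

```python
import heapq

def least_label_lists(num_nodes, edges, labels):
    """Build one least-label list per node (illustrative sketch).

    edges:  dict mapping a directed edge (j, i) to its transmission time tau_ji
    labels: labels[i] is the random label r_i of node i
    Returns lists[s] = [(d, r), ...] with distances decreasing and labels
    increasing, matching the ordering in Equations (11) and (12).
    """
    # Reverse adjacency: from node i we walk back to every j with an edge j -> i.
    reverse_adj = [[] for _ in range(num_nodes)]
    for (j, i), tau in edges.items():
        reverse_adj[i].append((j, tau))

    lists = [[] for _ in range(num_nodes)]
    # Visit nodes in increasing order of their random label.
    for i in sorted(range(num_nodes), key=lambda v: labels[v]):
        dist = {i: 0.0}
        heap = [(0.0, i)]
        while heap:
            d, s = heapq.heappop(heap)
            if d > dist.get(s, float("inf")):
                continue  # stale heap entry
            if lists[s] and lists[s][-1][0] <= d:
                continue  # a smaller label is already at least as close to s
            lists[s].append((d, labels[i]))
            for u, tau in reverse_adj[s]:
                candidate = d + tau
                if candidate < dist.get(u, float("inf")):
                    dist[u] = candidate
                    heapq.heappush(heap, (candidate, u))
    return lists

# Path graph 0 -> 1 -> 2 with transmission times 1.0 and 2.0.
example = least_label_lists(3, {(0, 1): 1.0, (1, 2): 2.0}, [0.5, 0.2, 0.9])
```

In the example, node 0 can reach node 1 (label 0.2) at distance 1.0, so its list records that pair before its own label at distance 0.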
If we want to query the smallest reachable random label $r_*$ for a given source $s$ and a time $T$, we only need to perform a binary search on the list for node $s$:

$$ r_* = r_{(l)}, \quad \text{where } d_{(l-1)} > T \geq d_{(l)}. \tag{13} $$

Finally, to estimate $|\mathcal{N}(s, T)|$, we generate $m$ i.i.d. collections of random labels, run Algorithm 3 on each collection, and obtain $m$ values $\{r_*^u\}_{u=1}^m$, which we use in Equation (10) to estimate $|\mathcal{N}(s, T)|$.

The computational complexity of Algorithm 3 is $O(|\mathcal{E}| \log |\mathcal{V}| + |\mathcal{V}| \log^2 |\mathcal{V}|)$, with the expected size of each $\mathbf{r}_*(s)$ being $O(\log |\mathcal{V}|)$. Then the expected time for querying $r_*$ is $O(\log \log |\mathcal{V}|)$ using binary search. Since we need to generate $m$ sets of random labels and run Algorithm 3 $m$ times, the overall computational complexity for estimating the single-source neighborhood size for all $s \in \mathcal{V}$ is $O(m|\mathcal{E}| \log |\mathcal{V}| + m|\mathcal{V}| \log^2 |\mathcal{V}| + m|\mathcal{V}| \log \log |\mathcal{V}|)$. For large-scale networks, and when $m \ll \min\{|\mathcal{V}|, |\mathcal{E}|\}$, this randomized algorithm can be much more efficient than approaches based on directly calculating the shortest paths.

2.4.2 Constructing Estimation for Multiple-Source Neighborhood Size

When we have a set of sources, $\mathcal{A}$, its neighborhood is the union of the neighborhoods of its constituent sources

$$ \mathcal{N}(\mathcal{A}, T) = \bigcup_{i \in \mathcal{A}} \mathcal{N}(i, T). \tag{14} $$

This is true because each source independently infects its downstream nodes. Furthermore, to calculate the least label $r_*$ corresponding to $\mathcal{N}(\mathcal{A}, T)$, we can simply reuse the least label list $\mathbf{r}_*(i)$ of each individual source $i \in \mathcal{A}$. More formally,

$$ r_* = \min_{i \in \mathcal{A}} \min_{j \in \mathcal{N}(i, T)} r_j, \tag{15} $$

where the inner minimization can be carried out by querying $\mathbf{r}_*(i)$. Similarly, after we obtain $m$ samples of $r_*$, we can estimate $|\mathcal{N}(\mathcal{A}, T)|$ using Equation (10).
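The query in Equation (13), the multi-source minimization in Equation (15), and the estimator in Equation (10) can be sketched as follows. The function names and the list layout (pairs `(d, r)` sorted by decreasing distance, ending with distance 0) are assumptions made for illustration; the lists in the example are written by hand rather than produced by Algorithm 3.

```python
def query_least_label(label_list, T):
    """Equation (13): label of the first entry whose distance is <= T.

    label_list holds (distance, label) pairs with strictly decreasing
    distances; the last entry has distance 0, so a match exists for T >= 0.
    """
    lo, hi = 0, len(label_list)
    while lo < hi:  # binary search for the first index with distance <= T
        mid = (lo + hi) // 2
        if label_list[mid][0] <= T:
            hi = mid
        else:
            lo = mid + 1
    return label_list[lo][1]

def least_label_for_set(lists, sources, T):
    """Equation (15): minimum over the per-source least labels."""
    return min(query_least_label(lists[s], T) for s in sources)

def estimate_neighborhood_size(least_labels):
    """Equation (10): (m - 1) divided by the sum of the m least labels."""
    m = len(least_labels)
    return (m - 1) / sum(least_labels)

# Hand-written least-label lists for a three-node example.
lists = [[(1.0, 0.2), (0.0, 0.5)], [(0.0, 0.2)], [(0.0, 0.9)]]
label_near = query_least_label(lists[0], T=0.5)      # only node 0 is within 0.5
label_far = query_least_label(lists[0], T=1.0)       # node 1 is now reachable
label_set = least_label_for_set(lists, [0, 2], T=0.5)
```

Adding a source to the set only adds one more list query to the minimum, which is the reuse property that the multiple-source construction relies on.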
Importantly, very little additional work is needed when we want to calculate $r_*$ for a set of sources $\mathcal{A}$, and we can reuse work done for a single source. This is very different from a naive sampling approach where the sampling process needs to be done completely anew if we increase the source set. In contrast, using the randomized algorithm, only an additional constant-time minimization over $|\mathcal{A}|$ numbers is needed.

2.4.3 Overall Algorithm

So far, we have achieved efficient neighborhood size estimation of $|\mathcal{N}(\mathcal{A}, T)|$ with respect to a given set of transmission times $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$. Next, we will estimate the influence by averaging over multiple sets of samples for $\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}$. More specifically, the relation from (8)

$$ \sigma(\mathcal{A}, T) = \mathbb{E}_{\{\tau_{ji}\}_{(j,i) \in \mathcal{E}}}\left[ |\mathcal{N}(\mathcal{A}, T)| \right] = \mathbb{E}_{\{\tau_{ji}\}} \mathbb{E}_{\{r^1, \ldots, r^m\} \mid \{\tau_{ji}\}}\left[ \frac{m - 1}{\sum_{u=1}^{m} r_*^u} \right], \tag{16} $$

suggests the following overall algorithm:

Continuous-Time Influence Estimation (ConTinEst):
1. Sample $n$ sets of random transmission times $\{\tau_{ji}^l\}_{(j,i) \in \mathcal{E}} \sim \prod_{(j,i) \in \mathcal{E}} f_{ji}(\tau_{ji})$.
2. Given a set of $\{\tau_{ji}^l\}_{(j,i) \in \mathcal{E}}$, sample $m$ sets of random labels $\{r_i^u\}_{i \in \mathcal{V}} \sim \prod_{i \in \mathcal{V}} \exp(-r_i)$.
3. Estimate $\sigma(\mathcal{A}, T)$ by sample averages $\sigma(\mathcal{A}, T) \approx \frac{1}{n} \sum_{l=1}^{n} \left( (m - 1) / \sum_{u_l=1}^{m} r_*^{u_l} \right)$.

What is even more important is that the number of random labels, $m$, does not need to be very large. Since the estimator for $|\mathcal{N}(\mathcal{A}, T)|$ is unbiased (Cohen, 1997), essentially the outer loop of averaging over $n$ samples of random transmission times further reduces the variance of the estimator at a rate of $O(1/n)$. In practice, we can use a very small $m$ (e.g., 5 or 10) and still achieve good results, which is also confirmed by our later experiments.
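The remark about a small $m$ can be checked numerically with a short simulation. This is a hedged sketch under a simplifying assumption that is not part of the paper: the neighborhood size is held fixed at `true_size` across the $n$ outer rounds, whereas in ConTinEst it varies with the sampled transmission times. The names `estimate_size` and `averaged_estimate` are illustrative.

```python
import random

def estimate_size(least_labels):
    """Equation (10): (m - 1) divided by the sum of the m least labels."""
    m = len(least_labels)
    return (m - 1) / sum(least_labels)

def averaged_estimate(true_size, m, n, seed=0):
    """Average the size estimator over n outer rounds.

    Each round draws m independent labelings of a neighborhood with
    true_size nodes (unit-rate exponential labels) and keeps the least
    label of each labeling, mimicking the inner loop of the algorithm.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        least_labels = [
            min(rng.expovariate(1.0) for _ in range(true_size))
            for _ in range(m)
        ]
        total += estimate_size(least_labels)
    return total / n

# With only m = 5 labelings per round, averaging over n rounds still
# concentrates around the true neighborhood size of 50.
approx = averaged_estimate(true_size=50, m=5, n=2000)
```

A single round with $m = 5$ is noisy, since the estimator's variance is on the order of the squared size divided by $m - 2$; the outer average is what brings the error down.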
Compared to (Chen et al., 2009), the novel application of Cohen's algorithm arises for estimating influence for multiple sources, which drastically reduces the computation by cleverly using the least-label list from a single source. Moreover, we have the following theoretical guarantee (see Appendix C for the proof).

Theorem 1 Draw the following number of samples for the set of random transmission times

$$n \ge \frac{C\Lambda}{\epsilon^2}\log\left(\frac{2|\mathcal{V}|}{\alpha}\right) \tag{17}$$

where $\Lambda := \max_{\mathcal{A}:|\mathcal{A}|\le C} 2\sigma(\mathcal{A},T)^2/(m-2) + 2\,\mathrm{Var}(|\mathcal{N}(\mathcal{A},T)|)(m-1)/(m-2) + 2a\epsilon/3$ and $|\mathcal{N}(\mathcal{A},T)| \le |\mathcal{V}|$, and for each set of random transmission times, draw $m$ sets of random labels. Then $|\hat{\sigma}(\mathcal{A},T) - \sigma(\mathcal{A},T)| \le \epsilon$ uniformly for all $\mathcal{A}$ with $|\mathcal{A}| \le C$, with probability at least $1-\alpha$.

The theorem indicates that the minimum number of samples, $n$, needed to achieve a certain accuracy is related to the actual size of the influence $\sigma(\mathcal{A},T)$, and the variance of the neighborhood size $|\mathcal{N}(\mathcal{A},T)|$ over the random draw of samples. The number of random labels, $m$, drawn in the inner loop of the algorithm will monotonically decrease the dependency of $n$ on $\sigma(\mathcal{A},T)$. It suffices to draw a small number of random labels, as long as the value of $\sigma(\mathcal{A},T)^2/(m-2)$ matches that of $\mathrm{Var}(|\mathcal{N}(\mathcal{A},T)|)$. Another implication is that influence at a larger time window $T$ is harder to estimate, since $\sigma(\mathcal{A},T)$ will generally be larger and hence require more random samples.

3. Constraints of Practical Importance

By treating our proposed influence estimation algorithm ConTinEst as a building block, we can now tackle the influence maximization problem under various constraints of practical importance. Here, since ConTinEst can estimate the influence value of any source set with respect to any given time window $T$, the Timing Constraints can thus be naturally satisfied.
Therefore, in the following sections, we mainly focus on modeling the Multiple-Item Constraints, the User Constraints and the Product Constraints.

3.1 Multiple-Item Constraints

Multiple products can spread simultaneously across the same set of social entities over different diffusion channels. Since these products may have different characteristics, such as the revenue and the speed of spread, and thus may follow different diffusion dynamics, we will use multiple diffusion networks for different types of products. Suppose we have a set of products $\mathcal{L}$ that propagate on the same set of nodes $\mathcal{V}$. The diffusion network for product $i$ is denoted as $\mathcal{G}_i = (\mathcal{V}, \mathcal{E}_i)$. For each product $i \in \mathcal{L}$, we search for a set of source nodes $\mathcal{R}_i \subseteq \mathcal{V}$ to which we can assign the product $i$ to start its campaign. We can represent the selection of the $\mathcal{R}_i$'s using an assignment matrix $A \in \{0,1\}^{|\mathcal{L}|\times|\mathcal{V}|}$ as follows: $A_{ij} = 1$ if $j \in \mathcal{R}_i$ and $A_{ij} = 0$ otherwise. Based on this representation, we define a new ground set $\mathcal{Z} = \mathcal{L} \times \mathcal{V}$ of size $N = |\mathcal{L}| \times |\mathcal{V}|$. Each element of $\mathcal{Z}$ corresponds to the index $(i,j)$ of an entry in the assignment matrix $A$, and selecting element $z = (i,j)$ means assigning product $i$ to user $j$ (see Figure 1 for an illustration). We also denote $\mathcal{Z}_{*j} = \mathcal{L} \times \{j\}$ and $\mathcal{Z}_{i*} = \{i\} \times \mathcal{V}$ as the $j$-th column and $i$-th row of matrix $A$, respectively. Then, under the above-mentioned additional requirements, we would like to find a set of assignments $S \subseteq \mathcal{Z}$ so as to maximize the following overall influence

$$f(S) = \sum_{i\in\mathcal{L}} a_i \sigma_i(\mathcal{R}_i, T_i), \tag{18}$$

where $\sigma_i(\mathcal{R}_i, T_i)$ denotes the influence of product $i$ for a given time $T_i$, $\{a_i > 0\}$ is a set of weights reflecting the different benefits of the products and $\mathcal{R}_i = \{j \in \mathcal{V} : (i,j) \in S\}$. We now show that the overall influence function $f(S)$ in Equation (18) is submodular over the ground set $\mathcal{Z}$.
Lemma 2 Under the continuous-time independent cascade model, the overall influence $f(S)$ is a normalized monotone submodular function of $S$.

Figure 1: Illustration of the assignment matrix $A$ associated with partition matroid $\mathcal{M}_1$ and group knapsack constraints. If product $i$ is assigned to user $j$, then $A_{ij} = 1$ (colored in red). The ground set $\mathcal{Z}$ is the set of indices of the entries in $A$, and selecting an element $(i,j) \in \mathcal{Z}$ means assigning product $i$ to user $j$. The user constraint means that there are at most $u_j$ elements selected in the $j$-th column; the product constraint means that the total cost of the elements selected in the $i$-th row is at most $B_i$.

Proof By definition, $f(\emptyset) = 0$ and $f(S)$ is monotone. By Theorem 4 in Gomez-Rodriguez and Schölkopf (2012), the component influence function $\sigma_i(\mathcal{R}_i, T_i)$ for product $i$ is submodular in $\mathcal{R}_i \subseteq \mathcal{V}$. Since non-negative linear combinations of submodular functions are still submodular, $f_i(S) := a_i\sigma_i(\mathcal{R}_i, T_i)$ is also submodular in $S \subseteq \mathcal{Z} = \mathcal{L}\times\mathcal{V}$, and $f(S) = \sum_{i\in\mathcal{L}} f_i(S)$ is submodular.

3.2 User Constraints

Each social network user can be a potential source and would like to be exposed only to a small number of ads. Furthermore, users may be grouped according to their geographical locations, and advertisers may have a target population they want to reach. Here, we will incorporate these constraints using matroids, which are combinatorial structures that generalize the notion of linear independence in matrices (Schrijver, 2003; Fujishige, 2005). Formulating our constrained influence maximization task with matroids allows us to design a greedy algorithm with provable guarantees. Formally, suppose that each user $j$ can be assigned to at most $u_j$ products.
A matroid can be defined as follows:

Definition 3 A matroid is a pair, $\mathcal{M} = (\mathcal{Z}, \mathcal{I})$, defined over a finite set (the ground set) $\mathcal{Z}$ and a family of sets (the independent sets) $\mathcal{I}$, that satisfies three axioms:
1. Non-emptiness: The empty set $\emptyset \in \mathcal{I}$.
2. Heredity: If $Y \in \mathcal{I}$ and $X \subseteq Y$, then $X \in \mathcal{I}$.
3. Exchange: If $X \in \mathcal{I}$, $Y \in \mathcal{I}$ and $|Y| > |X|$, then there exists $z \in Y \setminus X$ such that $X \cup \{z\} \in \mathcal{I}$.

An important type of matroid is the partition matroid, where the ground set $\mathcal{Z}$ is partitioned into disjoint subsets $\mathcal{Z}_1, \mathcal{Z}_2, \ldots, \mathcal{Z}_t$ for some $t$ and

$$\mathcal{I} = \{S \mid S \subseteq \mathcal{Z} \text{ and } |S \cap \mathcal{Z}_i| \le u_i, \ \forall i = 1, \ldots, t\}$$

for some given parameters $u_1, \ldots, u_t$. The user constraints can then be formulated as

Partition matroid $\mathcal{M}_1$: partition the ground set $\mathcal{Z}$ into $\mathcal{Z}_{*j} = \mathcal{L}\times\{j\}$, each of which corresponds to a column of $A$. Then $\mathcal{M}_1 = (\mathcal{Z}, \mathcal{I}_1)$, where $\mathcal{I}_1 = \{S \mid S \subseteq \mathcal{Z} \text{ and } |S \cap \mathcal{Z}_{*j}| \le u_j, \ \forall j\}$.

3.3 Product Constraints

Seeking initial adopters entails a cost to the advertiser, which needs to be paid to the host, while the advertisers of each product have a limited amount of money. Here, we will incorporate these requirements using knapsack constraints, which we describe below. Formally, suppose that each product $i$ has a budget $B_i$, and assigning item $i$ to user $j$ costs $c_{ij} > 0$. Next, we introduce the following notation to describe product constraints over the ground set $\mathcal{Z}$. For an element $z = (i,j) \in \mathcal{Z}$, define its cost as $c(z) := c_{ij}$. Abusing the notation slightly, we denote the cost of a subset $S \subseteq \mathcal{Z}$ as $c(S) := \sum_{z\in S} c(z)$. Then, in a feasible solution $S \subseteq \mathcal{Z}$, the cost of assigning product $i$, $c(S \cap \mathcal{Z}_{i*})$, should not be larger than its budget $B_i$.
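The user constraint (partition matroid $\mathcal{M}_1$) and the per-product budget constraint can be checked together with a few lines of code. The sketch below assumes a candidate solution is given as a collection of (product, user) pairs and that capacities, costs, and budgets are stored in dictionaries; the function name `is_feasible` and this data layout are illustrative choices, not part of the paper.

```python
from collections import defaultdict

def is_feasible(S, user_cap, cost, budget):
    """Check |S ∩ Z_{*j}| <= u_j for every user and c(S ∩ Z_{i*}) <= B_i
    for every product.

    S        : iterable of (product, user) pairs
    user_cap : dict user -> u_j
    cost     : dict (product, user) -> c_ij
    budget   : dict product -> B_i
    """
    per_user = defaultdict(int)   # number of products assigned to each user
    spent = defaultdict(float)    # total cost charged to each product
    for i, j in S:
        per_user[j] += 1
        spent[i] += cost[(i, j)]
    users_ok = all(cnt <= user_cap[j] for j, cnt in per_user.items())
    budgets_ok = all(total <= budget[i] for i, total in spent.items())
    return users_ok and budgets_ok
```

Because every constraint touches only one column or one row of the assignment matrix, a single pass over $S$ is enough.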
Now, without loss of generality, we can assume $B_i = 1$ (by normalizing $c_{ij}$ with $B_i$), and also $c_{ij} \in (0,1]$ (by throwing away any element $(i,j)$ with $c_{ij} > 1$), and define

Group-knapsack: partition the ground set into $\mathcal{Z}_{i*} = \{i\}\times\mathcal{V}$, each of which corresponds to one row of $A$. Then a feasible solution $S \subseteq \mathcal{Z}$ satisfies $c(S \cap \mathcal{Z}_{i*}) \le 1, \ \forall i$.

Importantly, these knapsack constraints have a very specific structure: they are on different groups of a partition $\{\mathcal{Z}_{i*}\}$ of the ground set, and the submodular function $f(S) = \sum_i a_i\sigma_i(\mathcal{R}_i, T_i)$ is defined over the partition. In consequence, such structures allow us to design an efficient algorithm with improved guarantees over the known results.

3.4 Overall Problem Formulation

Based on the above discussion of various constraints in viral marketing and our design choices for tackling them, we can think of the influence maximization problem as a special case of the following constrained submodular maximization problem with $P = 1$ matroid constraints and $k = |\mathcal{L}|$ knapsack constraints,

$$\max_{S\subseteq\mathcal{Z}} f(S) \tag{19}$$
$$\text{subject to } c(S \cap \mathcal{Z}_{i*}) \le 1, \ 1 \le i \le k, \qquad S \in \bigcap_{p=1}^{P} \mathcal{I}_p,$$

where, for simplicity, we will denote all the feasible solutions $S \subseteq \mathcal{Z}$ as $\mathcal{F}$. This formulation in general includes the following cases of practical importance:

Uniform User-Cost. An important case of influence maximization, which we denote as the Uniform Cost, is that for each product $i$, all users have the same cost $c_{i*}$, i.e., $c_{ij} = c_{i*}$. Equivalently, each product $i$ can be assigned to at most $b_i := \lfloor B_i / c_{i*} \rfloor$ users. Then the product constraints are simplified to

Partition matroid $\mathcal{M}_2$: for the product constraints with uniform cost, define a matroid $\mathcal{M}_2 = (\mathcal{Z}, \mathcal{I}_2)$ where $\mathcal{I}_2 = \{S \mid S \subseteq \mathcal{Z} \text{ and } |S \cap \mathcal{Z}_{i*}| \le b_i, \ \forall i\}$.
In this case, the influence maximization problem defined by Equation (19) becomes the problem with $P = 2$ matroid constraints and no knapsack constraints ($k = 0$). In addition, if we assume only one product needs a campaign, the formulation of Equation (19) further reduces to the classic influence maximization problem with the simple cardinality constraint.

User Group Constraint. Our formulation in Equation (19) essentially allows for general matroids, which can model more sophisticated real-world constraints, and the proposed formulation, algorithms, and analysis can still hold. For instance, suppose there is a hierarchical community structure on the users, i.e., a tree $\mathcal{T}$ where leaves are the users and the internal nodes are communities consisting of all users underneath, such as customers in different countries around the world. In consequence of marketing strategies, on each community $C \in \mathcal{T}$, there are at most $u_C$ slots for assigning the products. Such constraints are readily modeled by the Laminar Matroid, which generalizes the partition matroid by allowing the set $\{\mathcal{Z}_i\}$ to be a laminar family (i.e., for any $\mathcal{Z}_i \ne \mathcal{Z}_j$, either $\mathcal{Z}_i \subseteq \mathcal{Z}_j$, or $\mathcal{Z}_j \subseteq \mathcal{Z}_i$, or $\mathcal{Z}_i \cap \mathcal{Z}_j = \emptyset$). It can be shown that the community constraints can be captured by the matroid $\mathcal{M} = (\mathcal{Z}, \mathcal{I})$ where $\mathcal{I} = \{S \subseteq \mathcal{Z} : |S \cap C| \le u_C, \ \forall C \in \mathcal{T}\}$. In the next section, we first present our algorithm, then provide the analysis for the uniform cost case and finally leverage such analysis for the general case.

4. Influence Maximization

In this section, we first develop a simple, practical and intuitive adaptive-thresholding greedy algorithm to solve the continuous-time influence maximization problem with the aforementioned constraints. Then, we provide a detailed theoretical analysis of its performance.
4.1 Overall Algorithm

There exist algorithms for submodular maximization under multiple knapsack constraints achieving a $1 - \frac{1}{e}$ approximation factor (Sviridenko, 2004). Thus, one may be tempted to convert the matroid constraint in the problem defined by Equation (19) to $|\mathcal{V}|$ knapsack constraints, so that the problem becomes a submodular maximization problem under $|\mathcal{L}| + |\mathcal{V}|$ knapsack constraints. However, this naive approach is not practical for large-scale scenarios because the running time of such algorithms is exponential in the number of knapsack constraints. Instead, if we opt for algorithms for submodular maximization under $k$ knapsack constraints and $P$ matroid constraints, the best approximation factor achieved by polynomial time algorithms is $\frac{1}{P+2k+1}$ (Badanidiyuru and Vondrák, 2014). However, this is not good enough yet, since in our problem $k = |\mathcal{L}|$ can be large, though $P = 1$ is small.

Here, we will design an algorithm that achieves a better approximation factor by exploiting the following key observation about the structure of the problem defined by Equation (19): the knapsack constraints are over different groups $\mathcal{Z}_{i*}$ of the whole ground set, and the objective function is a sum of submodular functions over these different groups. The details of the algorithm, called BudgetMax, are described in Algorithm 1.

Algorithm 1: Density Threshold Enumeration
Input: parameter $\delta$; objective $f$ or its approximation $\hat{f}$; assignment cost $c(z)$, $z \in \mathcal{Z}$
1 Set $d = \max\{f(\{z\}) : z \in \mathcal{Z}\}$;
2 for $\rho \in \left\{\frac{2d}{P+2k+1}, (1+\delta)\frac{2d}{P+2k+1}, \ldots, \frac{2|\mathcal{Z}|d}{P+2k+1}\right\}$ do
3     Call Algorithm 2 to get $S_\rho$;
Output: $\operatorname{argmax}_{S_\rho} f(S_\rho)$
BudgetMax enumerates different values of a so-called density threshold $\rho$, which quantifies the cost-effectiveness of assigning a particular product to a specific user, runs a subroutine to find a solution for each $\rho$, and finally outputs the solution with the maximum objective value. Intuitively, the algorithm restricts the search space to be the set of most cost-effective allocations. The details of the subroutine to find a solution for a fixed density threshold $\rho$ are described in Algorithm 2. Inspired by the lazy evaluation heuristic (Leskovec et al., 2007), the algorithm maintains a working set $G$ and a marginal gain threshold $w_t$, which geometrically decreases by a factor of $1+\delta$ until it is sufficiently small to be set to zero. At each $w_t$, the subroutine selects each new element $z$ that satisfies the following properties:
1. It is feasible and the density ratio (the ratio between the marginal gain and the cost) is above the current density threshold;
2. Its marginal gain $f(z \mid G) := f(G \cup \{z\}) - f(G)$ is above the current marginal gain threshold.

The term "density" comes from the knapsack problem, where the marginal gain is the mass and the cost is the volume. A large density means gaining a lot without paying much. In short, the algorithm considers only high-quality assignments and repeatedly selects feasible ones with marginal gain ranging from large to small.

Remark 1. The traditional lazy evaluation heuristic also keeps a threshold; however, it only uses the threshold to speed up selecting the element with maximum marginal gain. Instead, Algorithm 2 can add multiple elements $z$ from the ground set at each threshold, and thus reduces the number of rounds from the size of the solution to the number of thresholds $O\left(\frac{1}{\delta}\log\frac{N}{\delta}\right)$. This allows us to trade off between the runtime and the approximation ratio (refer to our theoretical guarantees in Section 4.2).
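The density threshold enumeration (Algorithm 1) and the adaptive threshold greedy subroutine for a fixed density (Algorithm 2) can be sketched in Python as follows. This is a minimal illustration under several assumptions: the objective `f` is an exact set function supplied by the caller (rather than the ConTinEst estimate), `cost` and `feasible` are caller-supplied callables, and the ground set is a list so that the scan order is deterministic. The function names are hypothetical.

```python
def threshold_greedy(ground, f, cost, feasible, rho, delta, d):
    """Adaptive threshold greedy for a fixed density threshold rho."""
    n = len(ground)
    # d_rho: largest singleton value among elements that are dense enough.
    dense = [f({z}) for z in ground if f({z}) >= cost(z) * rho]
    if not dense:
        return set()
    # Thresholds d_rho, d_rho/(1+delta), ... until <= delta*d/N, then 0.
    thresholds, w = [], max(dense)
    while w > delta * d / n:
        thresholds.append(w)
        w /= 1 + delta
    thresholds += [w, 0.0]
    G = set()
    for w in thresholds:
        for z in ground:
            if z in G or not feasible(G | {z}):
                continue
            gain = f(G | {z}) - f(G)  # marginal gain f(z | G)
            if gain >= cost(z) * rho and gain >= w:
                G.add(z)
    return G

def budget_max(ground, f, cost, feasible, delta, P, k):
    """Enumerate density thresholds and keep the best solution found."""
    d = max(f({z}) for z in ground)
    if d <= 0:
        return set()  # no element has positive value
    base = 2 * d / (P + 2 * k + 1)
    best, rho = set(), base
    while rho <= len(ground) * base:
        S = threshold_greedy(ground, f, cost, feasible, rho, delta, d)
        if f(S) > f(best):
            best = S
        rho *= 1 + delta
    return best
```

As a usage example, take one product, three users who cover the item sets {a, b, c}, {c, d} and {e}, a cost of 0.5 per assignment, and a normalized budget of 1. With `delta=0.5`, `P=1`, `k=1` and `f` defined as the size of the covered union, the sketch selects the first two users.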
Algorithm 2: Adaptive Threshold Greedy for Fixed Density
Input: parameters $\rho$, $\delta$; objective $f$ or its approximation $\hat{f}$; assignment cost $c(z)$, $z \in \mathcal{Z}$; set of feasible solutions $\mathcal{F}$; and $d$ from Algorithm 1.
1 Set $d_\rho = \max\{f(\{z\}) : z \in \mathcal{Z}, f(\{z\}) \ge c(z)\rho\}$;
2 Set $w_t = \frac{d_\rho}{(1+\delta)^t}$ for $t = 0, \ldots, L = \operatorname{argmin}_i \left[w_i \le \frac{\delta d}{N}\right]$ and $w_{L+1} = 0$;
3 Set $G = \emptyset$;
4 for $t = 0, 1, \ldots, L, L+1$ do
5     for $z \notin G$ with $G \cup \{z\} \in \mathcal{F}$ and $f(z \mid G) \ge c(z)\rho$ do
6         if $f(z \mid G) \ge w_t$ then
7             Set $G \leftarrow G \cup \{z\}$;
Output: $S_\rho = G$

Remark 2. Evaluating the influence of the assigned products $f$ is expensive. Therefore, we will use the randomized algorithm in Section 2.4.3 to compute an estimation $\hat{f}(\cdot)$ of the quantity $f(\cdot)$.

4.2 Theoretical Guarantees

Although our algorithm is quite intuitive, it is highly non-trivial to obtain the theoretical guarantees. For clarity, we first analyze the simpler case with uniform cost, which then provides the base for analyzing the general case.

4.2.1 Uniform Cost

As shown at the end of Section 3.4, the influence maximization, in this case, corresponds to the problem defined by Equation (19) with $P = 2$ and no knapsack constraints. Thus, we can simply run Algorithm 2 with $\rho = 0$ to obtain a solution $G$, which is roughly a $\frac{1}{P+1}$-approximation.

Intuition. The algorithm greedily selects feasible elements with sufficiently large marginal gain. However, it is unclear whether our algorithm will find good solutions and whether it will be robust to noise. Regarding the former, one might wonder whether the algorithm will select just a few elements while many elements in the optimal solution $O$ will become infeasible and will not be selected, in which case the greedy solution $G$ is a poor approximation. Regarding the latter, we only use the estimation $\hat{f}$ of the influence $f$ (i.e.
, $|\hat{f}(S) - f(S)| \le \epsilon$ for any $S \subseteq \mathcal{Z}$), which introduces additional error to the function value. A crucial question, which has not been addressed before (Badanidiyuru and Vondrák, 2014), is whether the adaptive threshold greedy algorithm is robust to such perturbations.

Fortunately, it turns out that the algorithm will provably select sufficiently many elements of high quality. First, the elements selected in the optimal solution $O$ but not selected in $G$ can be partitioned into $|G|$ groups, each of which is associated with an element in $G$. Thus, the number of elements in the groups associated with the first $t$ elements in $G$, by the property of the intersection of matroids, is bounded by $Pt$. See Figure 2 for an illustration. Second, the marginal gain of each element in $G$ is at least as large as that of any element in the group associated with it (up to some small error). This means that even if the submodular function evaluation is inexact, the quality of the elements in the greedy solution is still good. The two claims together show that the marginal gain of $O \setminus G$ is not much larger than the gain of $G$, and thus $G$ is a good approximation for the problem.

Figure 2: Notation for analyzing Algorithm 2. The elements in the greedy solution $G$ are arranged according to the order in which Algorithm 2 selects them in Step 3. The elements in the optimal solution $O$ but not in the greedy solution $G$ are partitioned into groups $\{C_t\}_{1\le t\le |G|}$, where $C_t$ are those elements in $O \setminus G$ that are still feasible before selecting $g_t$ but are infeasible after selecting $g_t$.
Formally, suppose we use an inexact evaluation of the influence $f$ such that $|\hat{f}(S) - f(S)| \le \epsilon$ for any $S \subseteq \mathcal{Z}$, let product $i \in \mathcal{L}$ spread according to a diffusion network $\mathcal{G}_i = (\mathcal{V}, \mathcal{E}_i)$, and $i^* = \operatorname{argmax}_{i\in\mathcal{L}} |\mathcal{E}_i|$. Then, we have:

Theorem 4 Suppose $\hat{f}$ is evaluated up to error $\epsilon = \delta/16$ with ConTinEst. For influence maximization with uniform cost, Algorithm 2 (with $\rho = 0$) outputs a solution $G$ with $f(G) \ge \frac{1-2\delta}{3} f(O)$ in expected time $\tilde{O}\left(\frac{|\mathcal{E}_{i^*}| + |\mathcal{V}|}{\delta^2} + \frac{|\mathcal{L}||\mathcal{V}|}{\delta^3}\right)$.

The parameter $\delta$ introduces a tradeoff between the approximation guarantee and the runtime: larger $\delta$ decreases the approximation ratio but results in fewer influence evaluations. Moreover, the runtime has a linear dependence on the network size and the number of products to propagate (ignoring some small logarithmic terms) and, as a consequence, the algorithm is scalable to large networks.

Analysis. Suppose $G = \{g_1, \ldots, g_{|G|}\}$ in the order of selection, and let $G^t = \{g_1, \ldots, g_t\}$. Let $C_t$ denote all those elements in $O \setminus G$ that satisfy the following: they are still feasible before selecting the $t$-th element $g_t$ but are infeasible after selecting $g_t$. Equivalently, $C_t$ are all those elements $j \in O \setminus G$ such that (1) $j \cup G^{t-1}$ does not violate the matroid constraints but (2) $j \cup G^t$ violates the matroid constraints. In other words, we can think of $C_t$ as the optimal elements "blocked" by $g_t$. Then, we proceed as follows.

By the property of the intersection of matroids, the size of the prefix $\bigcup_{i=1}^t C_i$ is bounded by $Pt$. As a consequence of this property, for any $Q \subseteq \mathcal{Z}$, the sizes of any two maximal independent subsets $T_1$ and $T_2$ of $Q$ can only differ by a multiplicative factor at most $P$. This can be realized with the following argument.
First, note that for any element $z \in T_1 \setminus T_2$, $\{z\} \cup T_2$ violates at least one of the matroid constraints since $T_2$ is maximal. Then, let $\{V_i\}_{1\le i\le P}$ denote all elements in $T_1 \setminus T_2$ that violate the $i$-th matroid, and partition $T_1 \cap T_2$ arbitrarily among these $V_i$'s so that they cover $T_1$. In this construction, the size of each $V_i$ must be at most $|T_2|$, since otherwise by the Exchange axiom, there would exist $z \in V_i \setminus T_2$ that can be added to $T_2$, without violating the $i$-th matroid, leading to a contradiction. Therefore, $|T_1|$ is at most $P$ times $|T_2|$.

Next, we apply the above property as follows. Let $Q$ be the union of $G^t$ and $\bigcup_{i=1}^t C_i$. On one hand, $G^t$ is a maximal independent subset of $Q$, since no element in $\bigcup_{i=1}^t C_i$ can be added to $G^t$ without violating the matroid constraints. On the other hand, $\bigcup_{i=1}^t C_i$ is an independent subset of $Q$, since it is part of the optimal solution. Therefore, $\bigcup_{i=1}^t C_i$ has size at most $P$ times $|G^t|$, which is $Pt$. Note that the properties of matroids are crucial for this analysis, which justifies our formulation using matroids. In summary, we have

Claim 1 $\sum_{i=1}^t |C_i| \le Pt$, for $t = 1, \ldots, |G|$.

Now, we consider the marginal gain of each element in $C_t$ associated with $g_t$. First, suppose $g_t$ is selected at the threshold $\tau_t > 0$. Then, any $j \in C_t$ has marginal gain bounded by $(1+\delta)\tau_t + 2\epsilon$, since otherwise $j$ would have been selected at a larger threshold before $\tau_t$ by the greedy criterion. Second, suppose $g_t$ is selected at the threshold $w_{L+1} = 0$. Then, any $j \in C_t$ has marginal gain approximately bounded by $\frac{\delta}{N} d$. Since the greedy algorithm must pick $g_1$ with $\hat{f}(g_1) = d$ and $d \le f(g_1) + \epsilon$, any $j \in C_t$ has marginal gain bounded by $\frac{\delta}{N} f(G) + O(\epsilon)$. Putting everything together we have:

Claim 2 Suppose $g_t$ is selected at the threshold $\tau_t$.
Then $f(j \mid G^{t-1}) \le (1+\delta)\tau_t + 4\epsilon + \frac{\delta}{N} f(G)$ for any $j \in C_t$.

Since the evaluation of the marginal gain of $g_t$ should be at least $\tau_t$, this claim essentially indicates that the marginal gain of $j$ is approximately bounded by that of $g_t$.

Since there are not many elements in $C_t$ (Claim 1) and the marginal gain of each of its elements is not much larger than that of $g_t$ (Claim 2), we can conclude that the marginal gain of $O \setminus G = \bigcup_{i=1}^{|G|} C_i$ is not much larger than that of $G$, which is just $f(G)$.

Claim 3 The marginal gain of $O \setminus G$ satisfies

$$\sum_{j\in O\setminus G} f(j \mid G) \le [(1+\delta)P + \delta] f(G) + (6+2\delta)\epsilon P |G|.$$

Finally, since by submodularity, $f(O) \le f(O \cup G) \le f(G) + \sum_{j\in O\setminus G} f(j \mid G)$, Claim 3 shows that $f(G)$ is close to $f(O)$ up to a multiplicative factor roughly $(1+P)$ and additive factor $O(\epsilon P |G|)$. Given that $f(G) > |G|$, it leads to roughly a $1/3$-approximation for our influence maximization problem by setting $\epsilon = \delta/16$ when evaluating $\hat{f}$ with ConTinEst. Combining the above analysis and the runtime of the influence estimation algorithm, we have our final guarantee in Theorem 4. Appendix D.1 presents the complete proofs.

4.2.2 General Case

In this section, we consider the general case, in which users may have different associated costs. Recall that this case corresponds to the problem defined by Equation (19) with $P = 1$ matroid constraints and $k = |\mathcal{L}|$ group-knapsack constraints. Here, we will show that there is a step in Algorithm 1 which outputs a solution $S_\rho$ that is a good approximation.

Intuition. The key idea behind Algorithm 1 and Algorithm 2 is simple: spend the budgets efficiently and spend them as much as possible. By spending them efficiently, we mean to only select those elements whose density ratio between the marginal gain and the cost is above the threshold $\rho$.
That is, we assign product $i$ to user $j$ only if the assignment leads to large marginal gain without paying too much. By spending the budgets as much as possible, we mean to stop assigning product $i$ only if its budget is almost exhausted or no more assignments are possible without violating the matroid constraints. Here we make use of the special structure of the knapsack constraints on the budgets: each constraint is only related to the assignment of the corresponding product and its budget, so that when the budget of one product is exhausted, it does not affect the assignment of the other products. In the language of submodular optimization, the knapsack constraints are on a partition $\mathcal{Z}_{i*}$ of the ground set and the objective function is a sum of submodular functions over the partition.

However, there seems to be a hidden contradiction between spending the budgets efficiently and spending them as much as possible. On one hand, efficiency means the density ratio should be large, so the threshold $\rho$ should be large; on the other hand, if $\rho$ is large, there are just a few elements that can be considered, and thus the budget might not be exhausted. After all, if we set $\rho$ to be even larger than the maximum possible value, then no element is considered and no gain is achieved. In the other extreme, if we set $\rho = 0$ and consider all the elements, then a few elements with large costs may be selected, exhausting all the budgets and leading to a poor solution.

Fortunately, there exists a suitable threshold $\rho$ that achieves a good tradeoff between the two and leads to a good approximation. On one hand, the threshold is sufficiently small, so that the optimal elements we abandon (i.e., those with low-density ratio) have a total gain at most a fraction of the optimum; on the other hand, it is also sufficiently large, so that the elements selected are of high quality (i.e.
, of high-density ratio), and we achieve sufficient gain even if the budgets of some items are exhausted.

Theorem 5 Suppose $\hat{f}$ is evaluated up to error $\epsilon = \delta/16$ with ConTinEst. In Algorithm 1, there exists a $\rho$ such that

$$f(S_\rho) \ge \frac{\max\{k_a, 1\}}{(2|\mathcal{L}| + 2)(1 + 3\delta)} f(O)$$

where $k_a$ is the number of active knapsack constraints:

$$k_a = |\{i : S_\rho \cup \{z\} \notin \mathcal{F}, \ \forall z \in \mathcal{Z}_{i*}\}|.$$

The expected running time is $\tilde{O}\left(\frac{|\mathcal{E}_{i^*}| + |\mathcal{V}|}{\delta^2} + \frac{|\mathcal{L}||\mathcal{V}|}{\delta^4}\right)$.

Importantly, the approximation factor improves over the best known guarantee $\frac{1}{P+2k+1} = \frac{1}{2|\mathcal{L}|+2}$ for efficiently maximizing submodular functions over $P$ matroids and $k$ general knapsack constraints. Moreover, since the runtime has a linear dependence on the network size, the algorithm easily scales to large networks. As in the uniform cost case, the parameter $\delta$ introduces a tradeoff between the approximation and the runtime.

Analysis. The analysis follows the intuition. Pick $\rho = \frac{2 f(O)}{P+2k+1}$, where $O$ is the optimal solution, and define

$$O_- := \{z \in O \setminus S_\rho : f(z \mid S_\rho) < c(z)\rho + 2\epsilon\}, \qquad O_+ := \{z \in O \setminus S_\rho : z \notin O_-\}.$$

Note that, by submodularity, $O_-$ is a superset of the elements in the optimal solution that we abandon due to the density threshold and, by construction, its marginal gain is small:

$$f(O_- \mid S_\rho) \le \rho\, c(O_-) + O(\epsilon |S_\rho|) \le k\rho + O(\epsilon |S_\rho|),$$

where the small additive term $O(\epsilon |S_\rho|)$ is due to inexact function evaluations. Next, we proceed as follows.

First, if no knapsack constraints are active, then the algorithm runs as if there were no knapsack constraints (but only on elements with density ratio above $\rho$). Therefore, we can apply the same argument as in the case of uniform cost (refer to the analysis up to Claim 3 in Section 4.2.1); the only caveat is that we apply the argument to $O_+$ instead of $O \setminus S_\rho$.
Formally, similar to Claim 3, the marginal gain of $O_+$ satisfies

$$f(O_+ \mid S_\rho) \le [(1+\delta)P + \delta] f(S_\rho) + O(P\epsilon |S_\rho|),$$

where the small additive term $O(P\epsilon |S_\rho|)$ is due to inexact function evaluations. Using that $f(O) \le f(S_\rho) + f(O_- \mid S_\rho) + f(O_+ \mid S_\rho)$, we can conclude that $S_\rho$ is roughly a $\frac{1}{P+2k+1}$-approximation.

Second, suppose $k_a > 0$ knapsack constraints are active and the algorithm discovers that the budget of product $i$ is exhausted when trying to add element $z$ to the set $G_i = G \cap \mathcal{Z}_{i*}$ of selected elements at that time. Since $c(G_i \cup \{z\}) > 1$ and each of these elements has density above $\rho$, the gain of $G_i \cup \{z\}$ is above $\rho$. However, only $G_i$ is included in our final solution, so we need to show that the marginal gain of $z$ is not large compared to that of $G_i$. To do so, we first realize that the algorithm greedily selects elements with marginal gain above a decreasing threshold $w_t$. Then, since $z$ is the last element selected and $G_i$ is nonempty (otherwise adding $z$ will not exhaust the budget), the marginal gain of $z$ must be bounded by roughly that of $G_i$, which is at least roughly $\frac{1}{2}\rho$. Since this holds for all active knapsack constraints, the solution has value at least $\frac{k_a}{2}\rho$, which is a $\frac{k_a}{P+2k+1}$-approximation.

Finally, combining both cases, and setting $k = |\mathcal{L}|$ and $P = 1$ as in our problem, we have our final guarantee in Theorem 5. Appendix D.2 presents the complete proofs.

5. Experiments on Synthetic and Real Data

In this section, we first evaluate the accuracy of the estimated influence given by ConTinEst and then investigate the performance of influence maximization on synthetic and real networks by incorporating ConTinEst into the framework of BudgetMax. We show that our approach significantly outperforms the state-of-the-art methods in terms of both speed and solution quality.
[Figure 3 panels: rows for core-periphery, random, and hierarchy networks; (a) Influence vs. time, (b) Influence vs. #samples, (c) Error vs. #labels; curves: NS, ConTinEst.]

Figure 3: Influence estimation for core-periphery, random, and hierarchical networks with 1,024 nodes and 2,048 edges. Column (a) shows estimated influence by NS (near ground truth), and ConTinEst for increasing time window $T$; Column (b) shows ConTinEst's relative error against the number of samples with 5 random labels and $T = 10$; Column (c) reports ConTinEst's relative error against the number of random labels with 10,000 random samples and $T = 10$.

5.1 Experiments on Synthetic Data

We generate three types of Kronecker networks (Leskovec et al., 2010), which are synthetic networks generated by a recursive Kronecker product of a base 2-by-2 parameter matrix with itself to generate self-similar graphs. By tuning the base parameter matrix, we are able to generate Kronecker networks which can mimic different structural properties of many real networks.
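The recursive Kronecker construction can be illustrated with a short pure-Python sketch. It follows the textbook stochastic Kronecker model, in which the $k$-fold Kronecker power of the base matrix gives an edge probability for every ordered node pair and each edge is drawn independently. This is an assumption made for illustration: the generator actually used for the experiments may differ (for instance, by fixing the number of edges), and the exclusion of self-loops as well as the function names are choices made here.

```python
import random

def kronecker_power(base, k):
    """k-fold Kronecker product of a square matrix (list of lists) with itself."""
    result = [[1.0]]
    for _ in range(k):
        # Row (i, r) of result ⊗ base is every entry of result's row i
        # multiplied by every entry of base's row r.
        result = [
            [a * b for a in row_r for b in row_b]
            for row_r in result
            for row_b in base
        ]
    return result

def sample_kronecker_graph(base, k, rng):
    """Draw directed edges independently with the Kronecker-power probabilities."""
    probs = kronecker_power(base, k)
    n = len(probs)
    return [
        (u, v)
        for u in range(n)
        for v in range(n)
        if u != v and rng.random() < probs[u][v]  # self-loops skipped (assumption)
    ]
```

With a 2-by-2 base matrix and `k=10`, the probability matrix has $2^{10} = 1{,}024$ rows, matching the number of nodes used in the synthetic experiments.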
In the following, we consider networks of three different types of structures: (i) core-periphery networks (parameter matrix: [0.9 0.5; 0.5 0.3]), which mimic the information diffusion traces in real-world networks (Gomez-Rodriguez et al., 2011), (ii) random networks ([0.5 0.5; 0.5 0.5]), typically used in physics and graph theory (Easley and Kleinberg, 2010) and (iii) hierarchical networks ([0.9 0.1; 0.1 0.9]) (Clauset et al., 2008). Next, we assign a pairwise transmission function for every directed edge in each type of network and set its parameters at random. In our experiments, we use the Weibull distribution from (Aalen et al., 2008),
$$f(t; \alpha, \beta) = \frac{\beta}{\alpha}\left(\frac{t}{\alpha}\right)^{\beta-1} e^{-(t/\alpha)^{\beta}}, \quad t \ge 0, \tag{20}$$
where $\alpha > 0$ is a scale parameter and $\beta > 0$ is a shape parameter. The Weibull distribution (Wbl) has often been used to model lifetime events in survival analysis, providing more flexibility than an exponential distribution. We choose $\alpha$ and $\beta$ from 0 to 10 uniformly at random for each edge in order to have heterogeneous temporal dynamics. Finally, for each type of Kronecker network, we generate 10 sample networks, each of which has different $\alpha$ and $\beta$ chosen for every edge.

5.1.1 Influence Estimation

To the best of our knowledge, there is no analytical solution to the influence estimation given Weibull transmission function. Therefore, we compare ConTinEst with the Naive Sampling (NS) approach by considering the highest degree node in a network as the source, and draw 1,000,000 samples for NS to obtain near ground truth. In Figure 3, Column (a) compares ConTinEst with the ground truth provided by NS at different time window $T$, from 0.1 to 10 in networks of different structures. For ConTinEst, we generate up to 10,000 random samples (or sets of random waiting times), and 5 random labels in the inner loop.
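The Weibull density in Equation (20) and the naive sampling baseline can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: each sample draws one Weibull delay per edge, a node counts as infected if its shortest-path delay from the source is at most $T$, and the source itself is counted. The names `weibull_pdf` and `naive_sampling_influence` and the edge dictionary layout are hypothetical.

```python
import heapq
import math
import random


def weibull_pdf(t, alpha, beta):
    """Density of Equation (20) for t > 0 (scale alpha, shape beta).

    The value at exactly t = 0 is returned as 0 to avoid the singularity
    that appears when beta < 1.
    """
    if t <= 0:
        return 0.0
    return (beta / alpha) * (t / alpha) ** (beta - 1) * math.exp(-((t / alpha) ** beta))


def naive_sampling_influence(edges, source, T, n_samples, seed=0):
    """Monte Carlo estimate of the expected number of nodes infected by time T.

    edges maps a directed pair (u, v) to its Weibull parameters (alpha, beta).
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_samples):
        # Draw one transmission delay per edge for this sample.
        adj = {}
        for (u, v), (alpha, beta) in edges.items():
            adj.setdefault(u, []).append((v, rng.weibullvariate(alpha, beta)))
        # Dijkstra from the source, pruned at the time window T.
        dist = {source: 0.0}
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, math.inf):
                continue  # stale heap entry
            for v, delay in adj.get(u, []):
                nd = d + delay
                if nd <= T and nd < dist.get(v, math.inf):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, v))
        total += len(dist)  # nodes reached within T, source included
    return total / n_samples
```

Each sample costs one shortest-path computation per source, which is why a near ground truth requires a very large number of samples.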
In all three networks, the estimation provided by ConTinEst fits the ground truth accurately, and the relative error decreases quickly as we increase the number of samples and labels (Column (b) and Column (c)). For 10,000 random samples with 5 random labels, the relative error is smaller than 0.01.

5.1.2 Influence Maximization with Uniform Cost

In this section, we first evaluate the effectiveness of ConTinEst on the classic influence maximization problem where we have only one product to assign with the simple cardinality constraint on the users. We compare to other influence maximization methods developed based on discrete-time diffusion models: traditional greedy by (Kempe et al., 2003), with discrete-time Linear Threshold Model (LT) and Independent Cascade Model (IC) diffusion models, and the heuristic methods SP1M, PMIA and MIA-M by (Chen et al., 2009, 2010a, 2012).

[Figure 4 panels: (a) Core-periphery, (b) Random, (c) Hierarchy; top row "Influence by Size" (influence vs. #sources), bottom row "Influence by Time" (influence vs. T); methods: ConTinEst(Wbl), ConTinEst(Exp), Greedy(IC), SP1M, PMIA, MIA-M.]

Figure 4: Influence $\sigma(A, T)$ achieved by varying number of sources $|A|$ and observation window $T$ on the networks of different structures with 1,024 nodes, 2,048 edges and heterogeneous Weibull transmission functions. Top row: influence against #sources by $T = 5$; Bottom row: influence against the time window $T$ using 50 sources.

[Figure 5 panels (Uniform Cost): (a) By products, (b) By product constraints, (c) By user constraints; methods: BudgetMax, GreedyDegree, Random.]

Figure 5: Over the 64 product-specific diffusion networks, each of which has 1,048,576 nodes, the estimated influence (a) for increasing the number of products by fixing the product-constraint at 8 and user-constraint at 2; (b) for increasing product-constraint by fixing user-constraint at 2; and (c) for increasing user-constraint by fixing product-constraint at 8. In all experiments, the time window is $T = 5$.

For Influmax, since it only supports exponential pairwise transmission functions, we fit an exponential distribution per edge by NetRate (Gomez-Rodriguez et al., 2011). Furthermore, Influmax is not scalable. When the average network density (defined as the average degree per node) of the synthetic networks is approximately 2.0, the runtime for Influmax is more than 24 hours. In consequence, we present the results of ConTinEst using fitted exponential distributions (Exp). For the discrete-time IC model, we learn the infection probability within time window $T$ using Netrapalli's method (Netrapalli and Sanghavi, 2012). The learned pairwise infection probabilities are also used by SP1M and PMIA, which approximately calculate the influence based on the IC model. For the discrete-time LT model, we set the weight of each incoming edge to a node $u$ to the inverse of its in-degree, as in previous work (Kempe et al., 2003), and choose each node's threshold uniformly at random.
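The LT parameterization just described can be sketched as follows. The code is illustrative and assumes a list of distinct directed edges `(v, u)` meaning that $v$ can influence $u$; the helper names `lt_parameters` and `lt_spread` are hypothetical, and thresholds are drawn uniformly from $[0, 1)$.

```python
import random
from collections import defaultdict


def lt_parameters(edges, seed=0):
    """Edge weights 1 / in-degree(u) and uniform random node thresholds."""
    rng = random.Random(seed)
    in_nbrs = defaultdict(list)
    nodes = set()
    for v, u in edges:
        in_nbrs[u].append(v)
        nodes.update((v, u))
    weights = {
        (v, u): 1.0 / len(parents)
        for u, parents in in_nbrs.items()
        for v in parents
    }
    thresholds = {u: rng.random() for u in sorted(nodes, key=repr)}
    return weights, thresholds


def lt_spread(edges, seeds, weights, thresholds):
    """Run the Linear Threshold process to its fixed point."""
    in_nbrs = defaultdict(list)
    for v, u in edges:
        in_nbrs[u].append(v)
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for u, parents in in_nbrs.items():
            if u in active:
                continue
            # A node activates once the weight of its active in-neighbors
            # reaches its threshold.
            mass = sum(weights[(v, u)] for v in parents if v in active)
            if mass >= thresholds[u]:
                active.add(u)
                changed = True
    return active
```

With these weights, the incoming weights of every node sum to one, so a node whose in-neighbors are all active is always activated.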
The top row of Figure 4 compares the expected number of infected nodes against the source set size for different methods. ConTinEst outperforms the rest, and its competitive advantage becomes more pronounced as the source set grows. The bottom row of Figure 4 shows the expected number of infected nodes against the time window for 50 selected sources. Again, ConTinEst performs the best for all three types of networks.

Next, using ConTinEst as a subroutine for influence estimation, we evaluate the performance of BudgetMax with the uniform-cost constraints on the users. In our experiments we consider up to 64 products, each of which diffuses over one of the above three different types of Kronecker networks with approximately one million nodes. Further, we randomly select a subset of 512 nodes $V_S \subseteq V$ as our candidate target users, who will receive the given products, and evaluate the potential influence of an allocation over the underlying one-million-node networks. For BudgetMax, we set the adaptive threshold $\delta$ to 0.01 and the cost per user and product to 1. For ConTinEst, we use 2,048 samples with 5 random labels on each of the product-specific diffusion networks. We repeat our experiments 10 times and report the average performance.

We compare BudgetMax with a degree-based heuristic, which we refer to as GreedyDegree, where the degree is treated as a natural measure of influence, and with a baseline method that assigns the products to the target nodes randomly. We opt for the degree-based heuristic since, in practice, large-degree nodes, such as users with millions of followers on Twitter, are often the targeted users, who receive a considerable payment if they agree to post the adoption of some products (or ads) from merchants. GreedyDegree proceeds as follows.
It first sorts the list of all pairs of products $i$ and nodes $j \in V_S$ in descending order of node $j$'s degree in the diffusion network associated with product $i$. Then, starting from the beginning of the list, it considers each pair one by one: if the addition of the current pair to the existing solution does not violate the predefined matroid constraints, it is added to the solution, and otherwise, it is skipped. This process continues until the end of the list is reached. In other words, we greedily assign products to nodes with the largest degree. Due to the large size of the underlying diffusion networks, we do not apply other more expensive node centrality measures such as the clustering coefficient and betweenness.

Figure 5 summarizes the results. Panel (a) shows the achieved influence against the number of products, fixing the budget per product to 8 and the budget per user to 2. As the number of products increases, on the one hand, more and more nodes become assigned, so the total influence will increase. Yet, on the other hand, the competition among products for a few existing influential nodes also increases. GreedyDegree achieves a modest performance, since high-degree nodes may have many overlapping children. In contrast, BudgetMax, by taking the submodularity of the problem, the network structure and the diffusion dynamics of the edges into consideration, achieves a superior performance, especially as the number of products (i.e., the competition) increases. Panel (b) shows the achieved influence against the budget per product, considering 64 products and fixing the budget per user to 2. We find that, as the budget per product increases, the performance of GreedyDegree tends to flatten and the competitive advantage of BudgetMax becomes more pronounced.
Finally, Panel (c) shows the achieved influence against the budget per user, considering 64 products and fixing the budget per product to 8. We find that, as the budget per user increases, the influence only increases slowly. This is due to the fixed budget per product, which prevents additional new nodes from being assigned. This meets our intuition: by making a fixed number of people watch more ads per day, we can hardly boost the popularity of the product. Additionally, even though the same node can be assigned to more products, there is hardly ever a node that is the perfect source from which all products can efficiently spread.

5.1.3 Influence Maximization with Non-Uniform Cost

In this section, we evaluate the performance of BudgetMax under non-uniform cost constraints, using again ConTinEst as a subroutine for influence estimation. Our design of the user costs aims to mimic a real scenario, where advertisers pay much more money to celebrities with millions of social network followers than to normal citizens. To do so, we let $c_i \propto d_i^{1/n}$, where $c_i$ is the cost, $d_i$ is the degree, and $n \ge 1$ controls how fast the cost increases with the degree. In our experiments, we use $n = 3$ and normalize $c_i$ to be within $(0, 1]$. Moreover, we set the product-budget to a base value from 1 to 10 and add a random adjustment drawn from a uniform distribution $U(0, 1)$.

We compare our method to two modified versions of the above-mentioned degree-based heuristic GreedyDegree and to the same baseline method. The first modified version of the heuristic, which we still refer to as GreedyDegree, takes both the degree and the corresponding cost into consideration.
In particular, it sorts the list of all pairs of products $i$ and nodes $j \in V_S$ in descending order of the degree-cost ratio $d_j / c_j$ in the diffusion network associated with product $i$, instead of simply node $j$'s degree, and then proceeds similarly as before. In the second modified version of the heuristic, which we refer to as GreedyLocalDegree, we use the same degree-cost ratio but allow the target users to be partitioned into distinct groups (or communities) and pick the most cost-effective pairs within each group locally instead.

Figure 6 compares the performance of our method with the competing methods against four factors: (a) the number of products, (b) the budget per product, (c) the budget per user and (d) the time window $T$, while fixing the other factors. In all cases, BudgetMax significantly outperforms the other methods, and the achieved influence increases monotonically with respect to the factor value, as one may have expected. In addition, in Figure 6(e), we study the effect of the Laminar matroid combined with group knapsack constraints, which is the most general type of constraint we handle in this paper (refer to Section 3.4). The selected target users are further partitioned into $K$ groups randomly, each of which has a limit $Q_i$, $i = 1, \ldots, K$, that constrains the maximum number of allocations allowed in the group. In practical scenarios, each group might correspond to a geographical community or organization. In our experiment, we divide the users into 8 equal-size groups and set $Q_i = 16$, $i = 1, \ldots, K$, to indicate that we want a balanced allocation in each group. Figure 6(e) shows the achieved influence against the budget per user for $K = 8$ equally-sized groups and $Q_i = 16$, $i = 1, \ldots, K$.
[Figure 6 panels (Non-uniform cost): (a) By products, (b) By product budgets, (c) By user constraints, (d) By time, (e) By group limits; methods: BudgetMax, GreedyDegree, Random, plus GreedyLocalDegree in panel (e).]

Figure 6: Over the 64 product-specific diffusion networks, each of which has a total of 1,048,576 nodes, the estimated influence (a) for increasing the number of products by fixing the product-budget at 1.0 and user-constraint at 2; (b) for increasing product-budget by fixing user-constraint at 2; (c) for increasing user-constraint by fixing product-budget at 1.0; (d) for different time window $T$; and (e) for increasing user-constraint with group-limit 16 by fixing product-budget at 1.0.

[Figure 7 panels (Uniform cost): (a) $\delta$ vs. accuracy, (b) $\delta$ vs. time; methods: BudgetMax(Adaptive), BudgetMax(Lazy).]

Figure 7: The relative accuracy and the run-time for different threshold parameter $\delta$.

In contrast to Figure 6(b), the total estimated influence does not increase significantly with respect to the budget (i.e., number of slots) per user. This is due to the fixed budget per group, which prevents additional new nodes from being assigned, even though the number of available slots per user increases.
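The degree-based baseline of Sections 5.1.2 and 5.1.3 can be sketched as follows. This is a sketch under assumptions rather than the authors' code: the constraints are modeled as a per-user limit on the number of assigned products and a per-product budget on total cost, all degrees are assumed to be at least 1, and the names `degree_costs` and `greedy_degree` are hypothetical. Setting every cost to 1 recovers the uniform-cost variant.

```python
def degree_costs(degrees, n=3):
    """Costs proportional to degree**(1/n), normalized so the maximum is 1."""
    raw = {j: d ** (1.0 / n) for j, d in degrees.items()}
    top = max(raw.values())
    return {j: c / top for j, c in raw.items()}


def greedy_degree(degree, cost, budget, user_limit):
    """Assign products to nodes in descending order of degree-cost ratio.

    degree[i][j] : degree of node j in the diffusion network of product i
    cost[j]      : cost of assigning any product to node j
    budget[i]    : total cost allowed for product i
    user_limit   : maximum number of products per node
    """
    pairs = [(degree[i][j] / cost[j], i, j) for i in degree for j in degree[i]]
    pairs.sort(key=lambda p: p[0], reverse=True)
    spent = {i: 0.0 for i in degree}
    load = {}
    allocation = []
    for _, i, j in pairs:
        if load.get(j, 0) >= user_limit:
            continue  # user already holds the maximum number of products
        if spent[i] + cost[j] > budget[i]:
            continue  # product budget would be exceeded
        allocation.append((i, j))
        spent[i] += cost[j]
        load[j] = load.get(j, 0) + 1
    return allocation
```

A group-limited variant in the spirit of GreedyLocalDegree would add one more counter per group and skip pairs whose group has reached its limit $Q_i$.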
5.1.4 Effects of Adaptive Thresholding

In Figure 7, we investigate the impact that the threshold value $\delta$ has on the accuracy and runtime of our adaptive thresholding algorithm and compare it with the lazy evaluation method. Note that the performance and runtime of lazy evaluation do not change with respect to $\delta$ because it does not depend on it. Panel (a) shows the achieved influence against the threshold $\delta$. As expected, the larger the $\delta$ value, the lower the accuracy. However, our method is relatively robust to the particular choice of $\delta$, since its relative accuracy always stays above 90 percent, even for large $\delta$. Panel (b) shows the runtime against the threshold $\delta$. In this case, the larger the $\delta$ value, the lower the runtime. In other words, Figure 7 verifies the intuition that $\delta$ trades off the solution quality of the allocation against the runtime.

5.1.5 Scalability

In this section, we start by evaluating the scalability of the proposed algorithms on the classic influence maximization problem where we only have one product with the cardinality constraint on the users.

We compare it to the state-of-the-art method Influmax (Gomez-Rodriguez and Schölkopf, 2012) and the Naive Sampling (NS) method in terms of runtime for the continuous-time influence estimation and maximization. For ConTinEst, we draw 10,000 samples in the outer loop, each having 5 random labels in the inner loop. We plug ConTinEst as a subroutine into the classic greedy algorithm by (Nemhauser et al., 1978). For NS, we also draw 10,000 samples. The first two experiments are carried out on a single 2.4 GHz processor.
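For the cardinality-constrained setting used here, the classic greedy rule and a threshold-based variant can be contrasted in a few lines. The sketch below is not BudgetMax: it is a simplified threshold greedy in the style of Badanidiyuru and Vondrák (2014) for a single cardinality constraint, with the threshold divided by $(1 + \delta)$ after each pass and a stopping rule of $\delta d / n$, both of which are assumptions. The set function `f` stands in for an influence oracle such as ConTinEst.

```python
def greedy(f, ground, k):
    """Classic greedy: add the element with the largest marginal gain, k times."""
    selected = []
    for _ in range(k):
        base = f(selected)
        best, best_gain = None, 0.0
        for e in ground:
            if e in selected:
                continue
            gain = f(selected + [e]) - base
            if gain > best_gain:
                best, best_gain = e, gain
        if best is None:
            break  # no element has positive marginal gain
        selected.append(best)
    return selected


def threshold_greedy(f, ground, k, delta):
    """Add any element whose marginal gain meets a geometrically decreasing threshold."""
    selected = []
    d = max(f([e]) for e in ground)  # largest singleton value
    if d <= 0:
        return selected
    n = len(ground)
    w = d
    while w >= delta * d / n and len(selected) < k:
        for e in ground:
            if len(selected) >= k:
                break
            if e not in selected and f(selected + [e]) - f(selected) >= w:
                selected.append(e)
        w /= 1.0 + delta  # a larger delta means fewer passes and coarser choices
    return selected
```

A larger `delta` reduces the number of passes over the ground set at the price of accepting elements whose gain is further from the current maximum, which is the trade-off examined in Section 5.1.4.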
[Figure 8 panels: (a) Core-Periphery, (b) Random, (c) Hierarchy; time(s) vs. #sources for ConTinEst, NS and Influmax, with runs over 24 hours marked.]

Figure 8: Runtime of selecting increasing number of sources on Kronecker networks of 128 nodes and 320 edges with $T = 10$.

[Figure 9 panels: (a) Core-Periphery, (b) Random, (c) Hierarchy; time(s) vs. density for ConTinEst, NS and Influmax, with runs over 24 hours marked.]

Figure 9: Runtime of selecting 10 sources in networks of 128 nodes with increasing density by $T = 10$.

Figure 8 compares the runtime of selecting an increasing number of sources (from 1 to 10) on small Kronecker networks. When the number of selected sources is 1, the different algorithms essentially spend their time estimating the influence of each node. ConTinEst outperforms the other methods by orders of magnitude, and when the number of sources is larger than 1, it can efficiently reuse computations for estimating the influence of individual nodes. Dashed lines mean that a method did not finish in 24 hours, and the estimated runtime is plotted.

Next, we compare the runtime for selecting 10 sources with increasing densities (or the number of edges) in Figure 9. Again, Influmax and NS are orders of magnitude slower due to their respective exponential and quadratic computational complexity in network density. In contrast, the runtime of ConTinEst only increases slightly with the increasing density since its computational complexity is linear in the number of edges.
We evaluate the speed on large core-periphery networks, ranging from 100 to 1,000,000 nodes with density 1.5, in Figure 10. We report the parallel runtime only for ConTinEst and NS (both implemented with MPI and running on 192 cores at 2.4 GHz) since Influmax is not scalable. In contrast to NS, the runtime of ConTinEst increases linearly with the network size, and the method can easily scale up to one million nodes.

[Figure 10 plot: time(s) vs. #nodes for ConTinEst and NS, with runs over 48 hours marked.]

Figure 10: For core-periphery networks by $T = 10$, runtime of selecting 10 sources with increasing network size from 100 to 1,000,000 by fixing 1.5 network density.

[Figure 11 panels (Uniform Cost): (a) Speed by products, (b) Speed by nodes.]

Figure 11: Over the 64 product-specific diffusion networks, each of which has 1,048,576 nodes, the runtime (a) of allocating increasing number of products and (b) of allocating 64 products to 512 users on networks of varying size. For all experiments, we have $T = 5$ time window and fix product-constraint at 8 and user-constraint at 2.

[Figure 12 plot: MAE vs. T for ConTinEst, IC, LT, SP1M and PMIA.]

Figure 12: In MemeTracker dataset, comparison of the accuracy of the estimated influence in terms of mean absolute error.

Finally, we investigate the performance of BudgetMax in terms of runtime when using ConTinEst as a subroutine to estimate the influence. We can precompute the data structures and store the samples needed to estimate the influence function in advance. Therefore, we focus only on the runtime for the constrained influence maximization algorithm. BudgetMax runs on 64 cores at 2.4 GHz, using OpenMP to accelerate the first round of the optimization.
We report the allocation time for increasing number of products in Figure 11(a), which clearly shows a linear time complexity with respect to the size of the ground set. Figure 11(b) evaluates the runtime of allocation by varying the size of the network from 16,384 to 1,048,576 nodes.

5.2 Experiments on Real Data

In this section, we first quantify how well our proposed algorithm can estimate the true influence in the real-world dataset. Then, we evaluate the solution quality of the selected sources for influence maximization under different constraints. We have used the public MemeTracker dataset (Leskovec et al., 2009), which contains more than 172 million news articles and blog posts from 1 million mainstream media sites and blogs.

5.2.1 Influence Estimation

We first trace the flow of information from one site to another by using the hyperlinks among articles and posts as in the work of Gomez-Rodriguez et al. (2011); Du et al. (2012). In detail, we extracted 10,967 hyperlink cascades among the top 600 media sites. We then evaluate the accuracy of ConTinEst as follows. First, we repeatedly split all cascades into an 80% training set and a 20% test set at random, five times. On each training set, we learn one continuous-time model, which we use for ConTinEst, and a discrete-time model, which we use for the competing methods: IC, SP1M, PMIA and MIA-M. For the continuous-time model, we opt for NetRate (Gomez-Rodriguez et al., 2011) with exponential transmission functions (fixing the shape parameter of the Weibull family to be one) to learn the diffusion networks by maximizing the likelihood of the observed cascades. For the discrete-time model, we learn the infection probabilities using the method by Netrapalli and Sanghavi (2012).

[Figure 13 panels: (a) Influence vs. #sources, (b) Influence vs. time; methods: ConTinEst, Greedy(IC), SP1M, PMIA, MIA-M.]

Figure 13: In MemeTracker dataset, (a) comparison of the influence of the selected nodes by fixing the observation window $T = 5$ and varying the number of sources, and (b) comparison of the influence of the selected nodes by fixing the number of sources to 50 and varying the time window.

Second, let $C(u)$ be the set of all cascades where $u$ was the source node. By counting the total number of distinct nodes infected before $T$ in $C(u)$, we can quantify the real influence of node $u$ up to time $T$. Thus, we can evaluate the quality of the influence estimation by computing the average (across nodes) Mean Absolute Error (MAE) between the real and the estimated influence on the test set, which we show in Figure 12. Clearly, ConTinEst performs the best statistically. Since the length of real cascades empirically conforms to a power-law distribution, where most cascades are very short (2-4 nodes), the gap of the estimation error is not too large. However, we emphasize that such accuracy improvement is critical for maximizing long-term influence since the estimation error for individuals will accumulate along the spreading paths. Hence, any consistent improvement in influence estimation can lead to significant improvement in the overall influence estimation and maximization task, which is further confirmed in the following sections.

5.2.2 Influence Maximization with Uniform Cost

We first apply ConTinEst to the continuous-time influence maximization task with the simple cardinality constraint on the users.
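The cascade-based ground truth of Section 5.2.1 can be sketched as follows. The data layout is an assumption made for illustration: each cascade is a time-ordered list of `(node, time)` pairs whose first entry is the source, a node counts as infected if it appears within $T$ of the source's time, and the source itself is counted. The names `empirical_influence` and `mean_absolute_error` are hypothetical.

```python
def empirical_influence(cascades, source, T):
    """Number of distinct nodes infected within T over all cascades started by source."""
    infected = set()
    for cascade in cascades:
        if not cascade or cascade[0][0] != source:
            continue  # keep only the cascades in C(source)
        start = cascade[0][1]
        infected.update(node for node, t in cascade if t - start <= T)
    return len(infected)


def mean_absolute_error(cascades, estimates, T):
    """Average over nodes of |real influence - estimated influence|.

    estimates maps each evaluated source node to its model-based influence.
    """
    errors = [
        abs(empirical_influence(cascades, u, T) - est)
        for u, est in estimates.items()
    ]
    return sum(errors) / len(errors)
```

In the protocol described above, `cascades` would be the held-out test set and `estimates` the influence values produced by a model learned on the training set.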
We evaluate the influence of the selected nodes in the same spirit as influence estimation: the true influence is calculated as the total number of distinct nodes infected before $T$ based on $C(u)$ of the selected nodes. Figure 13 shows that the selected sources given by ConTinEst achieve the best performance as we vary the number of selected sources and the observation time window.

Next, we evaluate the performance of BudgetMax on cascades from MemeTracker traced from quotes, which are short textual phrases spreading through the websites. Because all published documents containing a particular quote are time-stamped, a cascade induced by the same quote is the collection of times at which each media site first mentioned it. In detail, we use the public dataset released by Gomez-Rodriguez et al. (2013), which splits the original MemeTracker dataset into groups, each associated with a topic or real-world event. Each group consists of cascades built from quotes which were mentioned in posts containing particular keywords. We considered 64 groups, with at least 100,000 cascades, which play the role of products. Therein, we distinguish well-known topics, such as "Apple" and "Occupy Wall-Street", or real-world events, such as the Fukushima nuclear disaster in 2011 and the marriage between Kate Middleton and Prince William in 2011.

We then evaluate the accuracy of BudgetMax in the following way. First, we evenly split each group into a training and a test set and then learn one continuous-time model and a discrete-time model per group using the training sets. As previously, for the continuous-time model, we opt for NetRate (Gomez-Rodriguez et al., 2011) with exponential transmission functions, and for the discrete-time model, we learn the infection probabilities using the method by Netrapalli and Sanghavi (2012), where the step-length is set to one.
Second, we run BudgetMax using both the continuous-time model and the discrete-time model. We refer to BudgetMax with the discrete-time model as the Greedy(discrete) method. Since we have no ground-truth information about the cost of each node, we focus our experiments on a uniform cost. Third, once we have found an allocation over the learned networks, we evaluate the performance of the two methods using the cascades in the test set as follows: given a group-node pair $(i, j)$, let $C(j)$ denote the set of cascades induced by group $i$ that contain node $j$. Then, we take the average number of nodes coming after $j$ over all the cascades in $C(j)$ as a proxy of the average influence induced by assigning group $i$ to node $j$. Finally, the influence of an allocation is just the sum of the average influence of each group-node pair in the solution. In our experiments, we randomly select 128 nodes as our target users.

Figure 14 summarizes the achieved influence against four factors: (a) the number of products, (b) the budget per product, (c) the budget per user and (d) the time window $T$, while fixing the other factors. In comparison with Greedy(discrete) and a random allocation, BudgetMax finds an allocation that indeed induces the largest diffusion in the test data, with an average 20-percent improvement overall.

In the end, Figure 15 investigates qualitatively the actual allocations of groups (topics or real-world events; in red) and sites (in black). Here, we find examples that one would intuitively expect: "japantoday.com" is assigned to the Fukushima nuclear disaster and "finance.yahoo.com" is assigned to "Occupy Wall-street".
Moreover, because we consider several topics and real-world events with different underlying diffusion networks, the selected nodes are not only very popular media sites such as nytimes.com or cnn.com but also several modest sites (Bakshy et al., 2011), often specialized or local, such as freep.com or localnews8.com.

[Figure 14 panels (Uniform cost): (a) By products, (b) By product constraints, (c) By user constraints, (d) By time; methods: BudgetMax, Greedy(discrete), Random.]

Figure 14: Over the inferred 64 product-specific diffusion networks, the true influence estimated from separated testing data (a) for increasing the number of products by fixing the product-constraint at 8 and user-constraint at 2; (b) for increasing product-constraint by fixing user-constraint at 2; (c) for increasing user-constraint by fixing product-constraint at 8; (d) for different time window $T$.

6. Conclusion

We have studied the influence estimation and maximization problems in the continuous-time diffusion model. We first propose a randomized influence estimation algorithm, ConTinEst, which can scale up to networks of millions of nodes while significantly improving over previous state-of-the-art methods in terms of the accuracy of the estimated influence.
Once we have a subroutine for efficient influence estimation in large networks, we then tackle the problem of maximizing the influence of multiple types of products (or information) in realistic continuous-time diffusion networks subject to various practical constraints: different products can have different diffusion structures; only influence within a given time window is considered; each user can only be recommended a small number of products; and each product has a limited campaign budget, and assigning it to users incurs costs. We provide a novel formulation as a submodular maximization under an intersection of matroid constraints and group-knapsack constraints, and then design an efficient adaptive threshold greedy algorithm with provable approximation guarantees, which we call BudgetMax. Experimental results show that the proposed algorithm performs remarkably better than other scalable alternatives in both synthetic and real-world datasets.

[Figure 15 graph: topics and real-world events (Fukushima Nuclear, Navy Seals, Syrian Uprise, Al-Qaeda, Occupy Wallstreet, Prince William's Wedding, Steve Jobs, Mass Protests, Europe Debt) linked to the media sites assigned to them.]

Figure 15: The allocation of memes to media sites.

There are also a few interesting open problems. For example, when the influence is estimated using ConTinEst, its error is a random variable. How does this affect the submodularity of the influence function? Is there an influence maximization algorithm that has better tolerance to the random error? These questions are left for future work.
34 Continuous-Time Influence Maximiza tion of Mul tiple Items Ac kno wledgmen ts Nan Du is supported b y the F aceb o ok Graduate F ello wship 2014-2015. Maria-Florina Bal- can and Yingyu Liang are supp orted in part b y NSF gran t CCF-1101283 and CCF-0953192, AF OSR gran t F A9550-09-1-0538, ONR gran t N00014-09-1-075, a Microsoft F acult y F el- lo wship, and a Raytheon F aculty F ellowship. Le Song is supp orted in part by NSF/NIH BIGD A T A 1R01GM108341, ONR N00014-15-1-2340, NSF IIS-1218749, NSF CAREER IIS- 1350983, Nvidia and In tel. Hongyuan Zha is supported in part by NSF/NIH BIGDA T A 1R01GM108341, NSF DMS-1317424 and NSF CNS-1409635. References Odd Aalen, Oern ulf Borgan, and H ˚ ak on K Gjessing. Survival and event history analysis: a pr o c ess p oint of view . Springer, 2008. Ash winkumar Badanidiyuru and Jan V ondr´ ak. F ast algorithms for maximizing submo dular functions. In SOD A . SIAM, 2014. Eytan Baksh y , Jak e M. Hofman, Winter A. Mason, and Duncan J. W atts. Every one’s an inﬂuencer: Quantifying inﬂuence on t witter. In WSDM , pages 65–74, 2011. Christian Borgs, Mic hael Brautbar, Jennifer Cha yes, and Brendan Lucier. Inﬂuence max- imization in so cial net works: T ow ards an optimal algorithmic solution. arXiv pr eprint arXiv:1212.0884 , 2012. W ei Chen, Y a jun W ang, and Siyu Y ang. Eﬃcien t inﬂuence maximization in so cial net- w orks. In Pr o c e e dings of the 15th A CM SIGKDD international c onfer enc e on Know le dge disc overy and data mining , pages 199–208. ACM, 2009. W ei Chen, Chi W ang, and Y a jun W ang. Scalable inﬂuence maximization for prev alent viral mark eting in large-scale social netw orks. In Pr o c e e dings of the 16th ACM SIGKDD in- ternational c onfer enc e on Know le dge disc overy and data mining , pages 1029–1038. A CM, 2010a. W ei Chen, Yifei Y uan, and Li Zhang. Scalable inﬂuence maximization in so cial net w orks under the linear threshold model. 
In Data Mining (ICDM), 2010 IEEE 10th International Conference on, pages 88–97. IEEE, 2010b.

Wei Chen, Alex Collins, Rachel Cummings, Te Ke, Zhenming Liu, David Rincon, Xiaorui Sun, Yajun Wang, Wei Wei, and Yifei Yuan. Influence maximization in social networks when negative opinions may emerge and propagate. In SDM, pages 379–390. SIAM, 2011.

Wei Chen, Wei Lu, and Ning Zhang. Time-critical influence maximization in social networks with time-delayed diffusion process. In AAAI, 2012.

Aaron Clauset, Cristopher Moore, and M.E.J. Newman. Hierarchical structure and the prediction of missing links in networks. Nature, 453(7191):98–101, 2008.

Edith Cohen. Size-estimation framework with applications to transitive closure and reachability. Journal of Computer and System Sciences, 55(3):441–453, 1997.

Samik Datta, Anirban Majumder, and Nisheeth Shrivastava. Viral marketing for multiple products. In Proceedings of the 2010 IEEE International Conference on Data Mining, pages 118–127, 2010.

Nan Du, Le Song, Alex Smola, and Ming Yuan. Learning networks of heterogeneous influence. In Advances in Neural Information Processing Systems 25, pages 2789–2797, 2012.

Nan Du, Le Song, Manuel Gomez-Rodriguez, and Hongyuan Zha. Scalable influence estimation in continuous time diffusion networks. In Advances in Neural Information Processing Systems 26, 2013a.

Nan Du, Le Song, Hyenkyun Woo, and Hongyuan Zha. Uncover topic-sensitive information diffusion networks. In Artificial Intelligence and Statistics (AISTATS), 2013b.

David Easley and Jon Kleinberg. Networks, Crowds, and Markets: Reasoning About a Highly Connected World. Cambridge University Press, 2010.

Satoru Fujishige. Submodular functions and optimization, volume 58. Elsevier Science Limited, 2005.

Manuel Gomez-Rodriguez and Bernhard Schölkopf.
Influence maximization in continuous time diffusion networks. In Proceedings of the 29th International Conference on Machine Learning, pages 313–320, 2012.

Manuel Gomez-Rodriguez, David Balduzzi, and Bernhard Schölkopf. Uncovering the temporal dynamics of diffusion networks. In Proceedings of the 28th International Conference on Machine Learning, 2011.

Manuel Gomez-Rodriguez, Jure Leskovec, and Bernhard Schölkopf. Structure and Dynamics of Information Pathways in On-line Media. In Proceedings of the 6th International Conference on Web Search and Web Data Mining, 2013.

Amit Goyal, Wei Lu, and Laks V. S. Lakshmanan. Celf++: optimizing the greedy algorithm for influence maximization in social networks. In WWW (Companion Volume), pages 47–48, 2011a.

Amit Goyal, Wei Lu, and Laks V. S. Lakshmanan. Simpath: An efficient algorithm for influence maximization under the linear threshold model. In ICDM, pages 211–220, 2011b.

Dino Ienco, Francesco Bonchi, and Carlos Castillo. The meme ranking problem: Maximizing microblogging virality. In ICDM Workshops, 2010.

David Kempe, Jon Kleinberg, and Éva Tardos. Maximizing the spread of influence through a social network. In Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 137–146. ACM, 2003.

Jure Leskovec, Andreas Krause, Carlos Guestrin, Christos Faloutsos, Jeanne VanBriesen, and Natalie Glance. Cost-effective outbreak detection in networks. In P. Berkhin, R. Caruana, and X. Wu, editors, Conference on Knowledge Discovery and Data Mining, pages 420–429. ACM, 2007. URL http://doi.acm.org/10.1145/1281192.1281239.

Jure Leskovec, Lars Backstrom, and Jon Kleinberg. Meme-tracking and the dynamics of the news cycle.
In Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 497–506. ACM, 2009.

Jure Leskovec, Deepayan Chakrabarti, Jon Kleinberg, Christos Faloutsos, and Zoubin Ghahramani. Kronecker graphs: An approach to modeling networks. Journal of Machine Learning Research, 11(Feb):985–1042, 2010.

Wei Lu, Francesco Bonchi, Amit Goyal, and Laks V. S. Lakshmanan. The bang for the buck: fair competitive viral marketing from the host perspective. In KDD, pages 928–936, 2013.

Ramasuri Narayanam and Amit A Nanavati. Viral marketing for product cross-sell through social networks. In Machine Learning and Knowledge Discovery in Databases. 2012.

George Nemhauser, Laurence Wolsey, and Marshall Fisher. An analysis of the approximations for maximizing submodular set functions. Mathematical Programming, 14:265–294, 1978.

Praneeth Netrapalli and Sujay Sanghavi. Learning the graph of epidemic cascades. In SIGMETRICS/PERFORMANCE, pages 211–222. ACM, 2012. ISBN 978-1-4503-1097-0.

Matthew Richardson and Pedro Domingos. Mining knowledge-sharing sites for viral marketing. In Proceedings of the eighth ACM SIGKDD international conference on Knowledge discovery and data mining, pages 61–70. ACM, 2002.

Alexander Schrijver. Combinatorial Optimization, volume 24 of Algorithms and Combinatorics. Springer, 2003.

Tasuku Soma, Naonori Kakimura, Kazuhiro Inaba, and Ken-ichi Kawarabayashi. Optimal budget allocation: Theoretical guarantee and efficient algorithm. In Proceedings of The 31st International Conference on Machine Learning, pages 351–359, 2014.

Tao Sun, Wei Chen, Zhenming Liu, Yajun Wang, Xiaorui Sun, Ming Zhang, and Chin-Yew Lin. Participation maximization based on social influence in online discussion forums. In Proceedings of the International AAAI Conference on Weblogs and Social Media, 2011.

Maxim Sviridenko.
A note on maximizing a submodular set function subject to knapsack constraint. Operations Research Letters, 32:41–43, 2004.

Appendix A. Naive Sampling Algorithm

The graphical model perspective described in Section 2.2 suggests a naive sampling (NS) algorithm for approximating $\sigma(\mathcal{A}, T)$:

1. Draw $n$ samples, $\left\{ \{\tau_{ji}^{l}\}_{(j,i)\in\mathcal{E}} \right\}_{l=1}^{n}$, i.i.d. from the waiting time product distribution $\prod_{(j,i)\in\mathcal{E}} f_{ji}(\tau_{ji})$;
2. For each sample $\{\tau_{ji}^{l}\}_{(j,i)\in\mathcal{E}}$ and for each node $i$, find the shortest path from the source nodes to node $i$; count the number of nodes with $g_i\left(\{\tau_{ji}^{l}\}_{(j,i)\in\mathcal{E}}\right) \le T$;
3. Average the counts across the $n$ samples.

Although the naive sampling algorithm can handle arbitrary transmission functions, it does not scale to networks with millions of nodes. We need to compute the shortest path for each node and each sample, which results in a computational complexity of $O(n|\mathcal{E}| + n|\mathcal{V}|\log|\mathcal{V}|)$ for a single source node. The problem is even more pressing in influence maximization, where we need to estimate the influence of source nodes at different locations and with an increasing number of source nodes. To do this, the algorithm needs to be repeated, adding a multiplicative factor of $C|\mathcal{V}|$ to the computational complexity ($C$ is the number of nodes to select). The algorithm then becomes quadratic in the network size. When the network size is on the order of thousands to millions of nodes, as is typical in modern social network analysis, the naive sampling algorithm becomes prohibitively expensive. Additionally, we may need to draw thousands of samples ($n$ is large), which makes the algorithm even less practical for large-scale problems.

Appendix B. Least Label List

The notation “argsort(($r_1$, . . .
, $r_{|\mathcal{V}|}$), ascend)” in line 2 of Algorithm 3 means that we sort the collection of random labels in ascending order and return the argument of the sort as an ordered list.

• Node labeling: $e(0.2) < b(0.3) < d(0.4) < a(1.5) < c(1.8) < g(2.2) < f(3.7)$
• Neighborhoods: $\mathcal{N}(c, 2) = \{a, b, c, e\}$; $\mathcal{N}(c, 3) = \{a, b, c, d, e, f\}$;
• Least-label list: $r_*(c)$: $(2, 0.2), (1, 0.3), (0.5, 1.5), (0, 1.8)$
• Query: $r_*(c, 0.8) = r(a) = 1.5$

Figure 16: Graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, edge weights $\{\tau_{ji}\}_{(j,i)\in\mathcal{E}}$, and node labeling $\{r_i\}_{i\in\mathcal{V}}$ with the associated output from Algorithm 3.

Algorithm 3: Least Label List
Input: a reversed directed graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with edge weights $\{\tau_{ji}\}_{(j,i)\in\mathcal{E}}$, a node labeling $\{r_i\}_{i\in\mathcal{V}}$
Output: a list $r_*(s)$ for each $s \in \mathcal{V}$
 1  for each $s \in \mathcal{V}$ do $d_s \leftarrow \infty$, $r_*(s) \leftarrow \emptyset$
 2  for $i$ in argsort(($r_1, \ldots, r_{|\mathcal{V}|}$), ascend) do
 3      empty heap $H \leftarrow \emptyset$;
 4      set all nodes except $i$ as unvisited;
 5      push $(0, i)$ into heap $H$;
 6      while $H \ne \emptyset$ do
 7          pop $(d^*, s)$ with the minimum $d^*$ from $H$;
 8          add $(d^*, r_i)$ to the end of list $r_*(s)$;
 9          $d_s \leftarrow d^*$;
10          for each unvisited out-neighbor $j$ of $s$ do
11              set $j$ as visited;
12              if $(d, j)$ in heap $H$ then
13                  pop $(d, j)$ from heap $H$;
14                  push $(\min\{d, d^* + \tau_{js}\}, j)$ into heap $H$;
15              else if $d^* + \tau_{js} < d_j$ then
16                  push $(d^* + \tau_{js}, j)$ into heap $H$;

Figure 16 shows an example of the Least-Label-List. The nodes from $a$ to $g$ are assigned exponentially distributed labels with mean one, shown in parentheses. Given a query distance 0.8 for node $c$, we can binary-search its Least-label-list $r_*(c)$ to find that node $a$ belongs to this range with the smallest label $r(a) = 1.5$.

Appendix C.
Theorem 1

Theorem 1. Sample the following number of sets of random transmission times

$$n \ge \frac{C\Lambda}{\epsilon^2} \log\left(\frac{2|\mathcal{V}|}{\alpha}\right)$$

where $\Lambda := \max_{\mathcal{A}: |\mathcal{A}| \le C} 2\sigma(\mathcal{A}, T)^2/(m-2) + 2\,\mathrm{Var}(|\mathcal{N}(\mathcal{A}, T)|)(m-1)/(m-2) + 2a\epsilon/3$, $|\mathcal{N}(\mathcal{A}, T)| \le a$, and for each set of random transmission times, sample $m$ sets of random labels. Then we can guarantee that

$$|\hat{\sigma}(\mathcal{A}, T) - \sigma(\mathcal{A}, T)| \le \epsilon$$

simultaneously for all $\mathcal{A}$ with $|\mathcal{A}| \le C$, with probability at least $1 - \alpha$.

Proof. Let $S_\tau := |\mathcal{N}(\mathcal{A}, T)|$ for a fixed set of $\{\tau_{ji}\}$, so that $\sigma(\mathcal{A}, T) = \mathbb{E}_\tau[S_\tau]$. The randomized algorithm with $m$ randomizations produces an unbiased estimator $\hat{S}_\tau = (m-1)/\left(\sum_{u=1}^{m} r_*^u\right)$ for $S_\tau$, i.e., $\mathbb{E}_{r|\tau}[\hat{S}_\tau] = S_\tau$, with variance $\mathbb{E}_{r|\tau}[(\hat{S}_\tau - S_\tau)^2] = S_\tau^2/(m-2)$ (Cohen, 1997).

Then $\hat{S}_\tau$ is also an unbiased estimator for $\sigma(\mathcal{A}, T)$, since $\mathbb{E}_{\tau,r}[\hat{S}_\tau] = \mathbb{E}_\tau \mathbb{E}_{r|\tau}[\hat{S}_\tau] = \mathbb{E}_\tau[S_\tau] = \sigma(\mathcal{A}, T)$. Its variance is

$$\begin{aligned}
\mathrm{Var}(\hat{S}_\tau) &:= \mathbb{E}_{\tau,r}[(\hat{S}_\tau - \sigma(\mathcal{A}, T))^2] = \mathbb{E}_{\tau,r}[(\hat{S}_\tau - S_\tau + S_\tau - \sigma(\mathcal{A}, T))^2] \\
&= \mathbb{E}_{\tau,r}[(\hat{S}_\tau - S_\tau)^2] + 2\,\mathbb{E}_{\tau,r}[(\hat{S}_\tau - S_\tau)(S_\tau - \sigma(\mathcal{A}, T))] + \mathbb{E}_{\tau,r}[(S_\tau - \sigma(\mathcal{A}, T))^2] \\
&= \mathbb{E}_\tau[S_\tau^2/(m-2)] + 0 + \mathrm{Var}(S_\tau) \\
&= \sigma(\mathcal{A}, T)^2/(m-2) + \mathrm{Var}(S_\tau)(m-1)/(m-2).
\end{aligned}$$

Then, using Bernstein's inequality, we have, for our final estimator $\hat{\sigma}(\mathcal{A}, T) = \frac{1}{n}\sum_{l=1}^{n} \hat{S}_{\tau^l}$, that

$$\Pr\{|\hat{\sigma}(\mathcal{A}, T) - \sigma(\mathcal{A}, T)| \ge \epsilon\} \le 2\exp\left(-\frac{n\epsilon^2}{2\,\mathrm{Var}(\hat{S}_\tau) + 2a\epsilon/3}\right) \qquad (21)$$

where $\hat{S}_\tau < a \le |\mathcal{V}|$. Setting the right-hand side of relation (21) to $\alpha$, we have that, with probability $1 - \alpha$, sampling the following number of sets of random transmission times

$$n \ge \frac{2\,\mathrm{Var}(\hat{S}_\tau) + 2a\epsilon/3}{\epsilon^2} \log\left(\frac{2}{\alpha}\right) = \frac{2\sigma(\mathcal{A}, T)^2/(m-2) + 2\,\mathrm{Var}(S_\tau)(m-1)/(m-2) + 2a\epsilon/3}{\epsilon^2} \log\left(\frac{2}{\alpha}\right)$$

we can guarantee that our estimator has error $|\hat{\sigma}(\mathcal{A}, T) - \sigma(\mathcal{A}, T)| \le \epsilon$.
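The two properties of the estimator $\hat{S}_\tau = (m-1)/\sum_{u=1}^{m} r_*^u$ used in this proof (unbiasedness and variance $S_\tau^2/(m-2)$) can be checked numerically. The following is a minimal sketch under stated assumptions, not the ConTinEst implementation: it fixes a hypothetical neighborhood of size `true_size`, gives every node an independent Exp(1) label in each of `m` labelings, and takes the least label of each labeling directly instead of querying a least-label list. The function names and parameter values are illustrative.

```python
import random
import statistics


def least_label_estimate(true_size, m, rng):
    """Estimate a neighborhood size from m least labels (Cohen, 1997).

    Each node in the neighborhood receives an independent Exp(1) label per
    labeling; the estimate is (m - 1) / (sum of the m least labels).
    """
    least_labels = [
        min(rng.expovariate(1.0) for _ in range(true_size)) for _ in range(m)
    ]
    return (m - 1) / sum(least_labels)


def run_trials(true_size=50, m=20, trials=2000, seed=0):
    """Return the sample mean and sample variance of the estimator."""
    rng = random.Random(seed)
    estimates = [least_label_estimate(true_size, m, rng) for _ in range(trials)]
    return statistics.mean(estimates), statistics.variance(estimates)


if __name__ == "__main__":
    mean, var = run_trials()
    # Theory: mean = true_size, variance = true_size**2 / (m - 2).
    print(f"mean estimate: {mean:.2f} (true size 50)")
    print(f"sample variance: {var:.1f} (theory: {50 ** 2 / 18:.1f})")
```

With these settings the sample mean should land close to the true size and the sample variance close to $S_\tau^2/(m-2)$, up to Monte Carlo noise; increasing `m` shrinks the variance, which is the trade-off that $\Lambda$ captures in Theorem 1.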
If we want to ensure that $|\hat{\sigma}(\mathcal{A}, T) - \sigma(\mathcal{A}, T)| \le \epsilon$ holds simultaneously for all $\mathcal{A}$ such that $|\mathcal{A}| \le C \ll |\mathcal{V}|$, we can first use the union bound with relation (21). In this case, we have that, with probability $1 - \alpha$, sampling the following number of sets of random transmission times

$$n \ge \frac{C\Lambda}{\epsilon^2} \log\left(\frac{2|\mathcal{V}|}{\alpha}\right)$$

we can guarantee that our estimator has error $|\hat{\sigma}(\mathcal{A}, T) - \sigma(\mathcal{A}, T)| \le \epsilon$ for all $\mathcal{A}$ with $|\mathcal{A}| \le C$. Note that we have defined the constant $\Lambda := \max_{\mathcal{A}: |\mathcal{A}| \le C} 2\sigma(\mathcal{A}, T)^2/(m-2) + 2\,\mathrm{Var}(S_\tau)(m-1)/(m-2) + 2a\epsilon/3$.

Appendix D. Complete Proofs for Section 4

D.1 Uniform Cost

In this section, we first prove a theorem for the general problem defined by Equation (19), considering a normalized monotonic submodular function $f(S)$, general $P$, and $k = 0$ (Theorem 7), and then obtain the guarantee for our influence maximization problem (Theorem 4).

Suppose $G = \{g_1, \ldots, g_{|G|}\}$ in the order of selection, and let $G_t = \{g_1, \ldots, g_t\}$. Let $C_t$ denote all those elements in $O \setminus G$ that satisfy the following: they are still feasible before selecting the $t$-th element $g_t$ but are infeasible after selecting $g_t$. Formally,

$$C_t = \left\{ z \in O \setminus G : \{z\} \cup G_{t-1} \in \mathcal{F},\ \{z\} \cup G_t \notin \mathcal{F} \right\}.$$

In the following, we will prove three claims and then use them to prove Theorems 7 and 4. Recall that for any $z \in \mathcal{Z}$ and $S \subseteq \mathcal{Z}$, the marginal gain of $z$ with respect to $S$ is denoted as

$$f(z \mid S) := f(S \cup \{z\}) - f(S)$$

and its approximation is denoted by

$$\hat{f}(z \mid S) = \hat{f}(S \cup \{z\}) - \hat{f}(S).$$

Also, when $|f(S) - \hat{f}(S)| \le \epsilon$ for any $S \subseteq \mathcal{Z}$, we have $|\hat{f}(z \mid S) - f(z \mid S)| \le 2\epsilon$ for any $z \in \mathcal{Z}$ and $S \subseteq \mathcal{Z}$.

Claim 1. $\sum_{i=1}^{t} |C_i| \le Pt$, for $t = 1, \ldots, |G|$.
Proof. We first show the following property about matroids: for any $Q \subseteq \mathcal{Z}$, the sizes of any two maximal independent subsets $T_1$ and $T_2$ of $Q$ can only differ by a multiplicative factor of at most $P$. Here, $T$ is a maximal independent subset of $Q$ if and only if:

• $T \subseteq Q$;
• $T \in \mathcal{F} = \bigcap_{p=1}^{P} \mathcal{I}_p$;
• $T \cup \{z\} \notin \mathcal{F}$ for any $z \in Q \setminus T$.

To prove the property, note that for any element $z \in T_1 \setminus T_2$, $\{z\} \cup T_2$ violates at least one of the matroid constraints since $T_2$ is maximal. Let $\{V_i\}_{1 \le i \le P}$ denote all elements in $T_1 \setminus T_2$ that violate the $i$-th matroid, and then partition $T_1 \cap T_2$ using these $V_i$'s so that they cover $T_1$. Note that the size of each $V_i$ must be at most that of $T_2$, since otherwise by the Exchange axiom, there would exist $z \in V_i \setminus T_2$ that can be added to $T_2$ without violating the $i$-th matroid, leading to a contradiction. Therefore, $|T_1|$ is at most $P$ times $|T_2|$.

Now we apply the property to prove the claim. Let $Q$ be the union of $G_t$ and $\bigcup_{i=1}^{t} C_i$. On one hand, $G_t$ is a maximal independent subset of $Q$, since no element in $\bigcup_{i=1}^{t} C_i$ can be added to $G_t$ without violating the matroid constraints. On the other hand, $\bigcup_{i=1}^{t} C_i$ is an independent subset of $Q$, since it is part of the optimal solution. Therefore, $\bigcup_{i=1}^{t} C_i$ has size at most $P$ times $|G_t|$, which is $Pt$.

Claim 2. Suppose $g_t$ is selected at the threshold $\tau_t$. Then, $f(j \mid G_{t-1}) \le (1 + \delta)\tau_t + 4\epsilon + \frac{\delta}{N} f(G)$, $\forall j \in C_t$.

Proof. First, consider $\tau_t > w_{L+1} = 0$. Since $g_t$ is selected at the threshold $\tau_t$, we have that $\hat{f}(g_t \mid G_{t-1}) \ge \tau_t$ and thus $f(g_t \mid G_{t-1}) \ge \tau_t - 2\epsilon$. Any $j \in C_t$ could have been selected at an earlier stage, since adding $j$ to $G_{t-1}$ would not have violated the constraint. However, since $j \notin G_{t-1}$, that means that $\hat{f}(j \mid G_{t-1}) \le (1 + \delta)\tau_t$. Then, $f(j \mid G_{t-1}) \le (1 + \delta)\tau_t + 2\epsilon$.
Second, consider $\tau_t = w_{L+1} = 0$. For each $j \in C_t$, we have $\hat{f}(j \mid G) < \frac{\delta}{N} d$. Since the greedy algorithm must pick $g_1$ with $\hat{f}(g_1) = d$ and $d \le f(g_1) + \epsilon$, then $f(j \mid G) < \frac{\delta}{N} f(G) + 4\epsilon$. The claim follows by combining the two cases.

Claim 3. The marginal gain of $O \setminus G$ satisfies

$$\sum_{j \in O \setminus G} f(j \mid G) \le [(1 + \delta)P + \delta] f(G) + (6 + 2\delta)\epsilon P |G|.$$

Proof. Combining Claim 1 and Claim 2, we have:

$$\begin{aligned}
\sum_{j \in O \setminus G} f(j \mid G) = \sum_{t=1}^{|G|} \sum_{j \in C_t} f(j \mid G) &\le (1 + \delta) \sum_{t=1}^{|G|} |C_t| \tau_t + \delta f(G) + 4\epsilon \sum_{t=1}^{|G|} |C_t| \\
&\le (1 + \delta) \sum_{t=1}^{|G|} |C_t| \tau_t + \delta f(G) + 4\epsilon P |G|.
\end{aligned}$$

Further, $\sum_{t=1}^{|G|} |C_t| \tau_t \le P \sum_{t=1}^{|G|} \tau_t$ by Claim 1 and a technical lemma (Lemma 6). Finally, the claim follows from the fact that $f(G) = \sum_t f(g_t \mid G_{t-1}) \ge \sum_t (\tau_t - 2\epsilon)$.

Lemma 6. If $\sum_{i=1}^{t} \sigma_{i-1} \le t$ for $t = 1, \ldots, K$ and $\rho_{i-1} \ge \rho_i$ for $i = 1, \ldots, K-1$ with $\rho_i, \sigma_i \ge 0$, then $\sum_{i=1}^{K} \rho_i \sigma_i \le \sum_{i=1}^{K} \rho_{i-1}$.

Proof. Consider the linear program

$$V = \max_{\sigma} \sum_{i=1}^{K} \rho_i \sigma_i \quad \text{s.t.} \quad \sum_{i=1}^{t} \sigma_{i-1} \le t,\ t = 1, \ldots, K, \qquad \sigma_i \ge 0,\ i = 1, \ldots, K-1$$

with dual

$$W = \min_{u} \sum_{t=1}^{K} t\,u_{t-1} \quad \text{s.t.} \quad \sum_{t=i}^{K-1} u_t \ge \rho_i,\ i = 0, \ldots, K-1, \qquad u_t \ge 0,\ t = 0, \ldots, K-1.$$

As $\rho_i \ge \rho_{i+1}$, the solution $u_i = \rho_i - \rho_{i+1}$, $i = 0, \ldots, K-1$ (where $\rho_K = 0$) is dual feasible with value $\sum_{t=1}^{K} t(\rho_{t-1} - \rho_t) = \sum_{i=1}^{K} \rho_{i-1}$. By weak linear programming duality, $\sum_{i=1}^{K} \rho_i \sigma_i \le V \le W \le \sum_{i=1}^{K} \rho_{i-1}$.

Theorem 7. Suppose we use Algorithm 2 to solve the problem defined by Equation (19) with $k = 0$, using $\rho = 0$ and $\hat{f}$ to estimate the function $f$, where $|\hat{f}(S) - f(S)| \le \epsilon$ for all $S \subseteq \mathcal{Z}$. It holds that the algorithm returns a greedy solution $G$ with

$$f(G) \ge \frac{1}{(1 + 2\delta)(P + 1)} f(O) - \frac{4P|G|}{P + c_f}\,\epsilon$$

where $O$ is the optimal solution, using $O\left(\frac{N}{\delta} \log \frac{N}{\delta}\right)$ evaluations of $\hat{f}$.
Proof. By submodularity and Claim 3, we have:

$$f(O) \le f(O \cup G) \le f(G) + \sum_{j \in O \setminus G} f(j \mid G) \le (1 + \delta)(P + 1) f(G) + (6 + 2\delta)\epsilon P |G|,$$

which leads to the bound in the theorem.

Since there are $O\left(\frac{1}{\delta} \log \frac{N}{\delta}\right)$ thresholds, and there are $O(N)$ evaluations at each threshold, the number of evaluations is bounded by $O\left(\frac{N}{\delta} \log \frac{N}{\delta}\right)$.

Theorem 7 essentially shows that $f(G)$ is close to $f(O)$ up to a factor of roughly $(1 + P)$, which then leads to the following guarantee for our influence maximization problem. Suppose product $i \in \mathcal{L}$ spreads according to diffusion network $\mathcal{G}_i = (\mathcal{V}, \mathcal{E}_i)$, and let $i^* = \mathrm{argmax}_{i \in \mathcal{L}} |\mathcal{E}_i|$.

Theorem 4. In the influence maximization problem with uniform cost, Algorithm 2 (with $\rho = 0$) is able to output a solution $G$ that satisfies $f(G) \ge \frac{1 - 2\delta}{3} f(O)$ in expected time $\tilde{O}\left(\frac{|\mathcal{E}_{i^*}| + |\mathcal{V}|}{\delta^2} + \frac{|\mathcal{L}||\mathcal{V}|}{\delta^3}\right)$.

Proof. In the influence maximization problem, the number of matroids is $P = 2$. Also note that $|G| \le f(G) \le f(O)$, which leads to $4|G|\epsilon \le 4\epsilon f(O)$. The approximation guarantee then follows from setting $\epsilon \le \delta/16$ when using ConTinEst (Du et al., 2013a) to estimate the influence.

The runtime is bounded as follows. In Algorithm 2, we need to estimate the marginal gain of adding one more product to the current solution. In ConTinEst (Du et al., 2013a), building the initial data structure takes time

$$O\left( \left(|\mathcal{E}_{i^*}| \log |\mathcal{V}| + |\mathcal{V}| \log^2 |\mathcal{V}|\right) \frac{1}{\delta^2} \log \frac{|\mathcal{V}|}{\delta} \right)$$

and afterwards each function evaluation takes time

$$O\left( \frac{1}{\delta^2} \log \frac{|\mathcal{V}|}{\delta} \log\log |\mathcal{V}| \right).$$

As there are $O\left(\frac{N}{\delta} \log \frac{N}{\delta}\right)$ evaluations where $N = |\mathcal{L}||\mathcal{V}|$, the runtime of our algorithm follows.
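The evaluation count used above comes from sweeping a geometric sequence of thresholds and testing every feasible element once per threshold. The following is a minimal sketch of a generic descending-threshold greedy, not a transcription of Algorithm 2: it assumes exact function evaluations ($\epsilon = 0$), a toy coverage objective, a single partition matroid as the feasibility constraint, and a stopping threshold of $\delta d / N$. All names (`coverage`, `threshold_greedy`, `group_of`, `capacity`) are hypothetical.

```python
def coverage(selected, sets):
    """Toy monotone submodular objective: number of distinct items covered."""
    covered = set()
    for z in selected:
        covered |= sets[z]
    return len(covered)


def threshold_greedy(sets, group_of, capacity, delta):
    """Generic descending-threshold greedy under one partition matroid.

    sets:     dict mapping each element to the set of items it covers
    group_of: dict mapping each element to its block in the partition
    capacity: dict mapping each block to the number of elements allowed
    delta:    threshold decay parameter, 0 < delta < 1
    Returns the selected elements and the number of marginal-gain evaluations.
    """
    ground = list(sets)
    n = len(ground)
    solution, used, evaluations = [], {}, 0
    d = max(coverage([z], sets) for z in ground)  # best singleton value
    if d == 0:
        return solution, evaluations
    threshold = float(d)
    # Sweep thresholds d, d/(1+delta), d/(1+delta)^2, ... down to delta*d/n.
    while threshold >= delta * d / n:
        for z in ground:
            block = group_of[z]
            if z in solution or used.get(block, 0) >= capacity[block]:
                continue  # already chosen, or its block is full
            gain = coverage(solution + [z], sets) - coverage(solution, sets)
            evaluations += 1
            if gain >= threshold:
                solution.append(z)
                used[block] = used.get(block, 0) + 1
        threshold /= 1 + delta
    return solution, evaluations


if __name__ == "__main__":
    sets = {"a": {1, 2, 3}, "b": {3, 4}, "c": {5}, "d": {1, 2}}
    group_of = {"a": 0, "b": 0, "c": 1, "d": 1}
    capacity = {0: 1, 1: 1}
    chosen, evals = threshold_greedy(sets, group_of, capacity, delta=0.5)
    print(chosen, coverage(chosen, sets), evals)
```

On this toy instance the sweep selects `a` at the first threshold and `c` once the threshold drops below 1, which matches the best feasible coverage of 4. The number of thresholds grows like $\log_{1+\delta}(N/\delta)$ and each threshold costs at most $N$ evaluations, which is where the $O\left(\frac{N}{\delta}\log\frac{N}{\delta}\right)$ bound comes from.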
D.2 General case

As in the previous section, we first prove a theorem for the problem defined by Equation (19) with a general normalized monotonic submodular function $f(S)$ and general $P$ (Theorem 8), and then obtain the guarantee for our influence maximization problem (Theorem 5).

Theorem 8. Suppose Algorithm 1 uses $\hat{f}$ to estimate the function $f$, which satisfies $|\hat{f}(S) - f(S)| \le \epsilon$ for all $S \subseteq \mathcal{Z}$. Then, there exists a $\rho$ such that

$$f(S_\rho) \ge \frac{\max\{1, |A_\rho|\}}{(P + 2k + 1)(1 + 2\delta)} f(O) - 8\epsilon |S_\rho|$$

where $A_\rho$ is the set of active knapsack constraints:

$$A_\rho = \{ i : S_\rho \cup \{z\} \notin \mathcal{F},\ \forall z \in \mathcal{Z}_{i*} \}.$$

Proof. Consider the optimal solution $O$ and set $\rho^* = \frac{2}{P + 2k + 1} f(O)$. By submodularity, we have $d \le f(O) \le |\mathcal{Z}| d$, so $\rho^* \in \left[ \frac{2d}{P + 2k + 1}, \frac{2|\mathcal{Z}| d}{P + 2k + 1} \right]$, and there is a run of Algorithm 2 with $\rho$ such that $\rho^* \in [\rho, (1 + \delta)\rho]$. In the following we consider this run.

Case 1. Suppose $|A_\rho| = 0$. The key observation in this case is that since no knapsack constraints are active, the algorithm runs as if there were only matroid constraints. Then, the argument for matroid constraints can be applied. More precisely, let

$$O_+ := \{ z \in O \setminus S_\rho : f(z \mid S_\rho) \ge c(z)\rho + 2\epsilon \}$$
$$O_- := \{ z \in O \setminus S_\rho : z \notin O_+ \}.$$

Note that all elements in $O_+$ are feasible. Following the argument of Claim 3 in Theorem 7, we have:

$$f(O_+ \mid S_\rho) \le ((1 + \delta)P + \delta) f(S_\rho) + (4 + 2\delta)\epsilon P |S_\rho|. \qquad (22)$$

Also, by definition, the marginal gain of $O_-$ is:

$$f(O_- \mid S_\rho) \le k\rho + 2\epsilon |O_-| \le k\rho + 2\epsilon P |S_\rho|, \qquad (23)$$

where the last inequality follows from the fact that $S_\rho$ is a maximal independent subset, $O_-$ is an independent subset of $O \cup S_\rho$, and the sizes of any two maximal independent subsets in the intersection of $P$ matroids can differ by a factor of at most $P$.
Plugging (22) and (23) into $f(O) \le f(O_+ \mid S_\rho) + f(O_- \mid S_\rho) + f(S_\rho)$, we obtain the bound

$$f(S_\rho) \ge \frac{f(O)}{(P + 2k + 1)(1 + \delta)} - \frac{(6 + 2\delta)\epsilon P |S_\rho|}{(P + 1)(1 + \delta)}.$$

Case 2. Suppose $|A_\rho| > 0$. For any $i \in A_\rho$ (i.e., the $i$-th knapsack constraint is active), consider the step when $i$ is added to $A_\rho$. Let $G_i = G \cap \mathcal{Z}_{i*}$, and we have $c(G_i) + c(z) > 1$.

Every element $g$ we include in $G_i$ satisfies $\hat{f}(g \mid G) \ge c(g)\rho$ with respect to the solution $G_i$ when $g$ is added. Then $f(g \mid G) = f_i(g \mid G_i) \ge c(g)\rho - 2\epsilon$, and we have:

$$f_i(G_i \cup \{z\}) \ge \rho[c(G_i) + c(z)] - 2\epsilon(|G_i| + 1) > \rho - 2\epsilon(|G_i| + 1). \qquad (24)$$

Note that $G_i$ is non-empty, since otherwise the knapsack constraint would not be active. Any element in $G_i$ is selected before or at $w_t$, so $f_i(G_i) \ge w_t - 2\epsilon$. Also, note that $z$ is not selected in previous thresholds before $w_t$, so $f_i(\{z\} \mid G_i) \le (1 + \delta) w_t + 2\epsilon$ and thus,

$$f_i(\{z\} \mid G_i) \le (1 + \delta) f_i(G_i) + 2\epsilon(2 + \delta). \qquad (25)$$

Combining Eqs. (24) and (25) with $f_i(G_i \cup \{z\}) = f_i(G_i) + f_i(\{z\} \mid G_i)$ leads to

$$\begin{aligned}
f_i(G_i) &\ge \frac{\rho}{2 + \delta} - \frac{2\epsilon(|G_i| + 3 + \delta)}{2 + \delta} \ge \frac{1}{2(1 + 2\delta)} \rho^* - \frac{2\epsilon(|G_i| + 3 + \delta)}{2 + \delta} \\
&\ge \frac{f(O)}{(P + 2k + 1)(1 + 2\delta)} - 5\epsilon |G_i|.
\end{aligned}$$

Summing up over all $i \in A_\rho$ leads to the desired bound.

Suppose item $i \in \mathcal{L}$ spreads according to the diffusion network $\mathcal{G}_i = (\mathcal{V}, \mathcal{E}_i)$. Let $i^* = \mathrm{argmax}_{i \in \mathcal{L}} |\mathcal{E}_i|$. By setting $\epsilon = \delta/16$ in Theorem 8, we have:

Theorem 5. In Algorithm 1, there exists a $\rho$ such that

$$f(S_\rho) \ge \frac{\max\{k_a, 1\}}{(2|\mathcal{L}| + 2)(1 + 3\delta)} f(O)$$

where $k_a$ is the number of active knapsack constraints. The expected runtime to obtain the solution is $\tilde{O}\left(\frac{|\mathcal{E}_{i^*}| + |\mathcal{V}|}{\delta^2} + \frac{|\mathcal{L}||\mathcal{V}|}{\delta^4}\right)$.
