Point process modeling for directed interaction networks
Network data often take the form of repeated interactions between senders and receivers tabulated over time. A primary question to ask of such data is which traits and behaviors are predictive of interaction. To answer this question, a model is introduced for treating directed interactions as a multivariate point process: a Cox multiplicative intensity model using covariates that depend on the history of the process. Consistency and asymptotic normality are proved for the resulting partial-likelihood-based estimators under suitable regularity conditions, and an efficient fitting procedure is described. Multicast interactions–those involving a single sender but multiple receivers–are treated explicitly. The resulting inferential framework is then employed to model message sending behavior in a corporate e-mail network. The analysis gives a precise quantification of which static shared traits and dynamic network effects are predictive of message recipient selection.
💡 Research Summary
The paper tackles the problem of modeling directed interaction networks—datasets in which a sender repeatedly contacts one or more receivers over time—by treating each interaction as an event in a multivariate point‑process framework. The authors adopt a Cox multiplicative intensity model, where the instantaneous hazard (or intensity) for a particular sender‑receiver (or sender‑receiver set) pair at time t is expressed as a baseline intensity λ₀(t) multiplied by an exponential of a linear combination of covariates: λᵢⱼ(t)=λ₀(t)·exp{βᵀXᵢⱼ(t)}. Two families of covariates are incorporated. Static covariates capture immutable traits such as department, rank, or tenure, while dynamic covariates are functions of the past event history, including recent reciprocity, the number of common contacts, triangle formation, and the size of a multicast group. By allowing covariates to depend on the filtration generated by past events, the model naturally accounts for temporal dependence and network evolution, something that conventional static graph models cannot do.
Parameter estimation proceeds via the partial‑likelihood approach familiar from survival analysis. The baseline intensity λ₀(t) is left non‑parametric, while the regression coefficients β are estimated by maximizing the partial likelihood, which only involves the covariate‑dependent part of the intensity. The authors prove that, under standard regularity conditions (boundedness and continuity of the intensity, predictability of covariates, and sufficiently fine observation of the process), the partial‑likelihood estimator is consistent and asymptotically normal. These results extend the classical Cox theory to the multivariate, history‑dependent setting and provide the usual Wald‑type inference tools.
A major methodological contribution concerns multicast interactions, where a single event involves one sender and multiple receivers simultaneously. Treating such an event as a collection of independent single‑receiver events would violate the independence assumptions underlying the partial likelihood. To resolve this, the authors introduce a weighted correction: each multicast event contributes to the partial likelihood with a weight that depends on the number of receivers, effectively adjusting for the combinatorial multiplicity. They show that the corrected estimator retains the same asymptotic properties as in the unicast case, thereby offering a principled way to handle group communications.
From a computational standpoint, the paper proposes an efficient algorithm that updates sufficient statistics incrementally as events are processed in chronological order. By maintaining cumulative sums of covariate contributions and exploiting sparsity in the interaction matrix, the algorithm scales roughly as O(N log N) where N is the number of events, making it feasible for large corporate email logs or online messaging platforms.
The empirical application uses a three‑month email dataset from a large corporation, comprising over 200,000 messages exchanged among roughly 1,000 employees. The model incorporates static covariates (department, job level, tenure) and dynamic covariates (recency of prior contact, number of shared contacts, reciprocity, and multicast size). Key findings include: (1) intra‑department communication is significantly stronger than inter‑department communication; (2) messages are more likely to be sent to recent contacts, confirming a strong “re‑contact” effect; (3) the presence of a common third party (triangle formation) raises the probability of selecting a receiver by about 20%; and (4) multicast events have an intensity roughly 1.3 times that of comparable unicast events, with a non‑linear increase as the number of recipients grows. These results quantitatively demonstrate that both immutable traits and evolving network structures jointly shape communication patterns.
In summary, the paper delivers a rigorous statistical framework for directed, time‑stamped interaction data, extending Cox proportional‑hazards modeling to multivariate point processes with history‑dependent covariates and multicast events. The theoretical guarantees (consistency, asymptotic normality) and the practical fitting algorithm together make the approach applicable to a wide range of domains—corporate email, social media messaging, phone call records, and online forums—where understanding the drivers of who contacts whom, when, and why is essential for both scientific insight and operational decision‑making. Future extensions could incorporate non‑linear covariate effects, time‑varying coefficients, or Bayesian hierarchical structures to capture even richer dynamics.
Comments & Academic Discussion
Loading comments...
Leave a Comment