Self-exciting Point Processes: Infections and Implementations
This is a comment on Reinhart’s “Review of Self-Exciting Spatio-Temporal Point Processes and Their Applications” (arXiv:1708.02647v1). I contribute some experiences from modelling the spread of infectious diseases. Furthermore, I try to complement the review with regard to the availability of software for the described models, which I think is essential in “paving the way for new uses”.
💡 Research Summary
Sebastian Meyer’s short note is a focused commentary on Reinhart’s 2017 review of self‑exciting spatio‑temporal point processes, with a particular emphasis on how these models have been, and can be, applied to infectious disease epidemiology. The author first reminds readers that routine public‑health surveillance data are usually available only as aggregated time‑series of case counts, often analyzed with negative‑binomial autoregressive models, Gaussian‑transformed ARIMA, or Prophet. While these approaches capture overall trends, they ignore the spatial heterogeneity and the mechanistic transmission dynamics that point‑process models can represent.
Meyer then discusses three practical challenges that arise when trying to fit self‑exciting point‑process models to real‑world infection data. The first is limited spatial resolution due to privacy constraints (“areal censoring”). When cases are reported only at the postcode level, multiple events may appear to occur at exactly the same coordinates, violating the simple point‑process assumption of distinct locations. He recommends adding random jitter of a magnitude comparable to the censoring level, repeating the analysis over several seeds, and performing a sensitivity analysis. This procedure reduces artificial spikes in the residual diagnostics described by Ogata (1988).
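The jittering procedure can be sketched in a few lines. This is a minimal illustration, not code from Meyer's note or from any package: the coordinates, the censoring radius, and the helper names `jitter_points` and `count_ties` are all hypothetical, and a uniform shift within a disc is only one possible choice of jitter distribution.

```python
import math
import random


def jitter_points(points, radius, seed):
    """Shift each (x, y) by a random offset drawn uniformly from a disc."""
    rng = random.Random(seed)
    jittered = []
    for x, y in points:
        # The square root makes the offset uniform over the disc's area.
        r = radius * math.sqrt(rng.random())
        theta = rng.uniform(0.0, 2.0 * math.pi)
        jittered.append((x + r * math.cos(theta), y + r * math.sin(theta)))
    return jittered


def count_ties(points):
    """Number of points that duplicate the coordinates of another point."""
    return len(points) - len(set(points))


# Hypothetical postcode centroids: three cases share one location.
cases = [(10.0, 20.0), (10.0, 20.0), (10.0, 20.0), (12.5, 18.0)]
censoring_radius = 0.5  # assumed size of a postcode area, in coordinate units

# One jittered data set per seed; the model would be refitted to each one.
replicates = {seed: jitter_points(cases, censoring_radius, seed) for seed in range(5)}

for seed, pts in replicates.items():
    print(seed, count_ties(pts))
```

Fitting the model to each replicate and comparing the resulting estimates gives the sensitivity analysis: conclusions that change markedly between seeds depend on the jitter rather than on the data.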
The second challenge concerns temporal censoring and reporting delays. Observed timestamps correspond to specimen collection or notification dates, not to actual infection times. Variable latent periods and reporting lags can scramble the true order of infections, leading to biased estimates of the triggering function and, consequently, of the background intensity. Meyer stresses that any estimate of the basic reproduction number (R₀), which is obtained by integrating the triggering kernel over space and time, will be a lower bound when under‑reporting is present.
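To make the integration step concrete, the sketch below computes the expected number of offspring per case for one assumed kernel: an exponential decay in time multiplied by an isotropic Gaussian in space. The kernel forms, the parameter values, and the function names are illustrative assumptions and are not taken from Meyer's note.

```python
import math


def midpoint_integral(func, lower, upper, steps):
    """Approximate the integral of func over [lower, upper] by the midpoint rule."""
    width = (upper - lower) / steps
    return width * sum(func(lower + (k + 0.5) * width) for k in range(steps))


def expected_offspring(scale, alpha, sigma, t_max, r_max, steps=20000):
    """Integrate scale * exp(-alpha * t) * exp(-r^2 / (2 sigma^2)) over time and the plane."""
    time_part = midpoint_integral(lambda t: math.exp(-alpha * t), 0.0, t_max, steps)
    # Polar coordinates: an isotropic kernel f(r) contributes f(r) * 2 * pi * r.
    space_part = midpoint_integral(
        lambda r: 2.0 * math.pi * r * math.exp(-r * r / (2.0 * sigma * sigma)),
        0.0, r_max, steps,
    )
    return scale * time_part * space_part


# Assumed parameter values, chosen only for illustration.
scale, alpha, sigma = 0.05, 0.5, 1.0

r0_numeric = expected_offspring(scale, alpha, sigma, t_max=60.0, r_max=10.0)
# Closed form for this kernel: scale * (1 / alpha) * (2 * pi * sigma^2).
r0_closed = scale / alpha * 2.0 * math.pi * sigma ** 2

print(r0_numeric, r0_closed)
```

Both the numerical and the closed-form value integrate over unbounded time and space. An estimate of this quantity from under-reported data only counts the transmissions that were actually observed.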
The third issue is the mismatch between geographic distance and true contact patterns. While many epidemiological point‑process studies (e.g., Diggle 2006; Scheel et al. 2007; Höhle 2009) have used distance‑based kernels to model livestock disease spread, human infections are more accurately driven by social or movement networks. In a continuous‑space model λ(s,t) the location s is a proxy for exposure, but it cannot capture heterogeneous contact rates without additional network information. Multivariate point‑process formulations λ_i(t), which operate on a discrete set of individuals or farms, are better suited for incorporating contact matrices, as Reinhart’s examples from social‑network analysis illustrate.
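A multivariate intensity λ_i(t) with a contact matrix might look like the following sketch. The additive form, the exponential decay, the contact matrix, and the function name `intensity` are assumptions made for illustration. They do not reproduce a specific model from Meyer's note or Reinhart's review.

```python
import math


def intensity(i, t, events, contact, background, decay):
    """Conditional intensity of unit i at time t.

    lambda_i(t) = background[i]
                  + sum over past events (t_j, j) of
                    contact[j][i] * decay * exp(-decay * (t - t_j))
    """
    total = background[i]
    for t_j, j in events:
        if t_j < t:  # only earlier events can trigger new ones
            total += contact[j][i] * decay * math.exp(-decay * (t - t_j))
    return total


# Hypothetical network of three units (e.g. farms): 0-1 and 1-2 are in contact.
contact = [
    [0.0, 0.8, 0.0],
    [0.8, 0.0, 0.3],
    [0.0, 0.3, 0.0],
]
background = [0.1, 0.1, 0.1]
events = [(1.0, 0)]  # (time, unit): unit 0 is infected at time 1.0

for unit in range(3):
    print(unit, intensity(unit, 2.0, events, contact, background, decay=1.0))
```

Because the temporal kernel integrates to one, each entry contact[j][i] is the expected number of direct transmissions from unit j to unit i. Unit 2 has no contact with unit 0, so its intensity stays at the background level regardless of how close the two units are geographically.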
Having outlined the data‑related limitations, Meyer turns to software. He notes that the majority of publicly available implementations focus on the ETAS (Epidemic Type Aftershock Sequence) model, originally developed for earthquake catalogs. Implementations include the Fortran program etas_solve (Kasahara et al. 2016), R packages SAPP, PtProcess, and bayesianETAS, and a C/C++ port called ETAS. Purely spatial cluster models are supported by the R package spatstat. However, most of these tools assume the classic Omori decay for the triggering function, limiting their applicability to other domains such as crime or disease where alternative kernels (Gaussian, power‑law, Student‑t, piecewise constant) are more appropriate.
For epidemiological applications, Meyer highlights the R package surveillance (Meyer, Held & Höhle 2017). This package implements the spatio‑temporal conditional intensity model of Meyer, Elias & Höhle 2012, allowing a variety of spatial kernels, custom kernel definitions, and efficient computation of boundary‑corrected integrals via the polyCub package. Unlike many ETAS tools that approximate the integral of the spatial kernel over the observation window by one, surveillance evaluates the integral over the actual observation region, using a one‑dimensional cubature method for isotropic kernels. This reduces bias near the edges of the study region and for heavy‑tailed kernels.
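The effect of approximating the kernel integral by one can be seen in a simple special case. The sketch below is not the method used by polyCub or surveillance: it assumes a normalised isotropic Gaussian kernel and an axis-aligned rectangular window, for which the integral has a closed form in terms of the error function. The window, the event locations, and the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def kernel_mass_in_rectangle(event, window, sigma):
    """Integral over the window of a Gaussian density centred at the event.

    window = (x_min, x_max, y_min, y_max); the kernel integrates to one
    over the whole plane, so the result is the share inside the window.
    """
    ex, ey = event
    x_min, x_max, y_min, y_max = window
    mass_x = normal_cdf((x_max - ex) / sigma) - normal_cdf((x_min - ex) / sigma)
    mass_y = normal_cdf((y_max - ey) / sigma) - normal_cdf((y_min - ey) / sigma)
    return mass_x * mass_y


window = (0.0, 10.0, 0.0, 10.0)  # hypothetical study region
sigma = 1.0                      # assumed kernel standard deviation

centre = kernel_mass_in_rectangle((5.0, 5.0), window, sigma)
edge = kernel_mass_in_rectangle((0.0, 5.0), window, sigma)
corner = kernel_mass_in_rectangle((0.0, 0.0), window, sigma)

print(centre, edge, corner)
```

For an event in the interior the approximation by one is harmless, but an event on the boundary keeps only about half of its kernel mass inside the window and an event at a corner about a quarter. Setting the integral to one in those cases overstates the triggering contribution in the likelihood.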
Additional software mentioned includes SEDA (a MATLAB GUI wrapping Fortran ETAS routines, limited to macOS) and etasFLP (an R implementation of the semi‑parametric estimation described in the review). Both are powerful but lack flexibility in specifying alternative triggering functions.
In his closing remarks, Meyer argues that open‑source, well‑documented software is essential for “paving the way for new uses” of self‑exciting point processes in epidemiology. The availability of flexible tools such as surveillance lowers the barrier for researchers to experiment with different kernels, incorporate network information, and perform rigorous model diagnostics. He calls for further development of frameworks that can seamlessly integrate custom contact networks, handle censored spatial data, and provide robust inference in the presence of under‑reporting. By doing so, the community can move beyond the early, earthquake‑centric applications of ETAS and fully exploit the potential of self‑exciting point processes for understanding and forecasting infectious disease spread.