Darknet-Based Inference of Internet Worm Temporal Characteristics

Internet worm attacks pose a significant threat to network security and management. In this work, we coin the term Internet worm tomography to denote inferring the characteristics of Internet worms from the observations of Darknets or network telescopes that monitor a routable but unused IP address space. Under the framework of Internet worm tomography, we attempt to infer Internet worm temporal behaviors, i.e., the host infection time and the worm infection sequence, and thus pinpoint patient zero or initially infected hosts. Specifically, we introduce statistical estimation techniques and propose method of moments, maximum likelihood, and linear regression estimators. We show analytically and empirically that our proposed estimators can better infer worm temporal characteristics than a naive estimator that has been used in previous work. We also demonstrate that our estimators can be applied to worms using different scanning strategies such as random scanning and localized scanning.

💡 Research Summary

The paper introduces a novel framework called “Internet worm tomography,” which leverages Darknet (unused but routable IP address space) observations to infer the temporal characteristics of Internet‑wide worms. Traditional worm detection relies on direct traffic capture or host‑based logs, both of which suffer from limited coverage and delayed visibility. By contrast, Darknet sensors receive unsolicited scan packets from infected hosts, providing a passive, global view of worm propagation without interfering with the network.

The authors formalize two inference goals: (1) estimating the infection time of each compromised host, and (2) reconstructing the infection sequence to identify patient zero and early spreaders. They model worm scanning behavior as a stochastic process. For random scanning worms, the packet arrivals at a Darknet are approximated by a homogeneous Poisson process; for localized scanning worms, a non‑uniform spatial distribution is assumed, leading to a thinned Poisson process with region‑dependent intensity.
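The random-scanning model described above can be illustrated with a short simulation. This is a minimal sketch, not the authors' simulator: it assumes a single infected host that scans the IPv4 space uniformly at a constant rate, so that its hits on the Darknet form a homogeneous Poisson process. The function name, parameter names, and numeric values are hypothetical.

```python
import random

IPV4_SPACE = 2 ** 32  # total number of IPv4 addresses


def simulate_darknet_hits(t0, scan_rate, darknet_size, duration, rng):
    """Return the sorted times at which one infected host hits the Darknet.

    Assumption (not taken from the paper): the host scans uniformly at
    random at `scan_rate` scans per second from its infection time `t0`,
    so hits on a Darknet of `darknet_size` addresses arrive as a Poisson
    process with rate scan_rate * darknet_size / 2**32.
    """
    hit_rate = scan_rate * darknet_size / IPV4_SPACE
    hits = []
    t = t0
    while True:
        t += rng.expovariate(hit_rate)  # exponential gap between hits
        if t > t0 + duration:
            break
        hits.append(t)
    return hits


# Hypothetical values: a /16-sized Darknet and a fast-scanning host.
rng = random.Random(1)
hits = simulate_darknet_hits(t0=100.0, scan_rate=4000.0,
                             darknet_size=2 ** 16, duration=600.0, rng=rng)
print(len(hits), "hits; first hit at", hits[0])
```

The first hit always arrives after the true infection time `t0`, which is why treating the first observed packet as the infection time is biased late.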

Three statistical estimators are proposed:

Method of Moments (MoM) – uses the first and second moments of inter‑arrival times to solve for infection time and scan rate. It is computationally simple but can be biased when the sample size is small.
Maximum Likelihood Estimator (MLE) – derives the likelihood of observed arrival times under the Poisson model and maximizes it analytically. The MLE is shown to attain the Cramér‑Rao lower bound asymptotically: for large samples, its variance approaches the minimum achievable by any unbiased estimator.
Linear Regression (LR) – fits a straight line to the cumulative count of received packets over time; the intercept estimates the start time of scanning. This approach is robust to modest deviations from the pure Poisson assumption.
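To make the list above concrete, the following sketch shows simplified versions of the naive, moment-style, and regression estimators for a single host. It assumes the homogeneous Poisson model, under which the i-th hit time is expected at the infection time plus i mean gaps. The exact estimators in the paper may differ, the function names are hypothetical, and the maximum likelihood estimator is omitted because its form depends on model details not given in this summary.

```python
def naive_estimate(hits):
    """Naive estimator: treat the first observed hit as the infection time."""
    return hits[0]


def gap_corrected_estimate(hits):
    """Moment-style estimator: subtract one mean inter-arrival gap from
    the first hit time, since the first hit lags infection by one gap
    on average under the Poisson assumption."""
    n = len(hits)
    if n < 2:
        raise ValueError("need at least two hits to estimate the gap")
    mean_gap = (hits[-1] - hits[0]) / (n - 1)
    return hits[0] - mean_gap


def regression_estimate(hits):
    """Least-squares fit of hit time against hit index (1, 2, ..., n).

    The intercept is the fitted time at index 0, i.e. the estimated
    start of scanning.
    """
    n = len(hits)
    if n < 2:
        raise ValueError("need at least two hits to fit a line")
    idx = range(1, n + 1)
    mean_i = sum(idx) / n
    mean_t = sum(hits) / n
    sxy = sum((i - mean_i) * (t - mean_t) for i, t in zip(idx, hits))
    sxx = sum((i - mean_i) ** 2 for i in idx)
    slope = sxy / sxx
    return mean_t - slope * mean_i


# Perfectly regular hits every 2 s, consistent with scanning starting at t = 10.
hits = [12.0, 14.0, 16.0, 18.0]
print(naive_estimate(hits), gap_corrected_estimate(hits), regression_estimate(hits))
```

In this regular-gap example the naive estimate is late by exactly one gap, while the other two recover the start time.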

The paper provides a rigorous theoretical analysis of bias, variance, and mean‑square error for each estimator, demonstrating that the MLE consistently outperforms the naive “first‑packet” estimator used in earlier work. The analysis also quantifies how Darknet size (N) and worm scan rate (λ) affect estimation accuracy, establishing that even modestly sized Darknets (10⁴–10⁵ addresses) can yield sub‑second precision for fast‑scanning worms.
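As a worked illustration of how bias and mean-square error depend on N and λ, consider a simplified single-host model. This is a sketch under stated assumptions, not the paper's derivation: the host scans the IPv4 space uniformly at λ scans per second, hits on the Darknet form a homogeneous Poisson process, and the first n ≥ 2 hit times are observed. The symbols μ (Darknet hit rate), t_0 (true infection time), and t_1 < ... < t_n (hit times) are notation introduced here.

```latex
% Assumed model: gaps t_1 - t_0, t_2 - t_1, ..., t_n - t_{n-1}
% are i.i.d. exponential with rate \mu.
\begin{align*}
\mu &= \lambda \, \frac{N}{2^{32}} \\
\hat{t}_0^{\,\mathrm{naive}} &= t_1 \\
\mathrm{E}\big[\hat{t}_0^{\,\mathrm{naive}}\big] - t_0 &= \frac{1}{\mu} \\
\mathrm{MSE}\big(\hat{t}_0^{\,\mathrm{naive}}\big)
  &= \underbrace{\frac{1}{\mu^2}}_{\text{bias}^2}
   + \underbrace{\frac{1}{\mu^2}}_{\text{variance}}
   = \frac{2}{\mu^2} \\
\hat{t}_0^{\,\mathrm{gap}} &= t_1 - \frac{t_n - t_1}{n - 1} \\
\mathrm{E}\big[\hat{t}_0^{\,\mathrm{gap}}\big] - t_0
  &= \frac{1}{\mu} - \frac{1}{\mu} = 0 \\
\mathrm{MSE}\big(\hat{t}_0^{\,\mathrm{gap}}\big)
  &= \frac{1}{\mu^2} + \frac{1}{(n-1)\,\mu^2}
   = \frac{n}{(n-1)\,\mu^2}
\end{align*}
```

Under these assumptions, subtracting one mean gap removes the bias of the first-packet estimator and gives a smaller mean-square error whenever n > 2. Both errors shrink as the hit rate μ grows, that is, with a larger Darknet or a faster-scanning worm.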

Empirical validation is carried out in two parts. First, extensive simulations vary N, λ, and scanning strategies (random vs. localized). Results show that the MLE reduces root‑mean‑square error by 30–45% compared with the naive estimator, while MoM and LR also achieve significant improvements. Second, real‑world data from the 2001 Code Red and Nimda outbreaks are replayed against a historic Darknet dataset. The proposed estimators recover infection times that closely match known patient‑zero timestamps (within minutes), and they successfully rank early infected hosts, confirming practical applicability.
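A toy Monte Carlo experiment in the spirit of the simulations described above can be sketched as follows. It assumes a single host whose Darknet hits follow a homogeneous Poisson process, and it compares the naive first-packet estimator with a simple gap-corrected estimator. All names and parameter values are hypothetical, and the sketch does not attempt to reproduce the figures reported in the paper.

```python
import math
import random


def first_hits(t0, hit_rate, n, rng):
    """Draw the first n Darknet hit times after infection time t0,
    assuming exponential gaps with the given hit rate (hits per second)."""
    t, out = t0, []
    for _ in range(n):
        t += rng.expovariate(hit_rate)
        out.append(t)
    return out


def rmse(errors):
    """Root-mean-square of a list of estimation errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def compare(n_hits=20, hit_rate=0.05, trials=5000, seed=7):
    """Return (naive RMSE, gap-corrected RMSE) over repeated trials."""
    rng = random.Random(seed)
    t0 = 0.0  # true infection time in every trial
    naive_err, corrected_err = [], []
    for _ in range(trials):
        hits = first_hits(t0, hit_rate, n_hits, rng)
        mean_gap = (hits[-1] - hits[0]) / (n_hits - 1)
        naive_err.append(hits[0] - t0)                # first-packet estimator
        corrected_err.append(hits[0] - mean_gap - t0)  # subtract one mean gap
    return rmse(naive_err), rmse(corrected_err)


naive_rmse, corrected_rmse = compare()
print(f"naive RMSE: {naive_rmse:.2f} s, gap-corrected RMSE: {corrected_rmse:.2f} s")
```

Varying `n_hits` and `hit_rate` shows how the gap between the two estimators depends on the number of observed packets and on the effective Darknet hit rate.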

The discussion acknowledges several limitations. Darknet placement can introduce spatial sampling bias; worms employing adaptive or feedback‑driven scanning may violate the Poisson assumption; and network routing changes can distort arrival patterns. To mitigate these issues, the authors suggest multi‑Darknet collaboration, incorporation of Bayesian priors to handle uncertainty, and development of online streaming algorithms for real‑time inference.

In conclusion, the study demonstrates that Darknet‑based statistical tomography provides a cost‑effective, non‑intrusive method for pinpointing worm infection times and early spreaders. By outperforming existing naive techniques and handling both random and localized scanning strategies, the proposed estimators lay groundwork for rapid incident response and for future research into more sophisticated, possibly machine‑learning‑enhanced, worm‑tracking systems.

📜 Original Paper Content