An iterative tomogravity algorithm for the estimation of network traffic
This paper introduces an iterative tomogravity algorithm for the estimation of a network traffic matrix based on one snapshot observation of the link loads in the network. The proposed method does not require complete observation of the total load on individual edge links or proper tuning of a penalty parameter as existing methods do. Numerical results are presented to demonstrate that the iterative tomogravity method controls the estimation error well when the link data is fully observed and produces robust results with a moderate amount of missing link data.
💡 Research Summary
The paper addresses the long‑standing problem of estimating a network traffic matrix (the source‑to‑destination, or SD, flow) from link‑level measurements. In the classic network tomography formulation y = A x, the number of observed link loads (the dimension of y) is comparable to the number of network nodes, while the number of SD pairs (the dimension of x) grows quadratically with the number of nodes. Consequently the linear system is severely under‑determined. Existing approaches either require multiple independent snapshots of y, need the total inbound and outbound traffic for every edge node (the “gravity” information), or introduce a penalty term whose tuning is delicate (e.g., entropy‑regularized tomogravity).
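To make the under‑determination concrete, here is a minimal Python sketch. The 2‑link, 3‑flow routing matrix and both traffic vectors are hypothetical values chosen for illustration, not data from the paper. It shows two different traffic vectors x that produce exactly the same link loads y = A x.

```python
def link_loads(A, x):
    """Return y = A x for a routing matrix A (list of rows) and a flow vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]

# Hypothetical toy routing matrix: 2 links, 3 SD flows (illustration only).
A = [
    [1, 1, 0],  # link 0 carries flows 0 and 1
    [0, 1, 1],  # link 1 carries flows 1 and 2
]

# Two distinct candidate traffic vectors (assumed values).
x_a = [1, 2, 3]
x_b = [2, 1, 4]

# Both candidates explain the same observation, so y alone cannot tell them apart.
y_a = link_loads(A, x_a)
y_b = link_loads(A, x_b)
```

With more unknown flows than observed links, extra structure such as a gravity model is needed to single out one solution.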
The authors propose an Iterative Tomogravity (ITG) algorithm that works with a single snapshot y* and does not require full edge‑node traffic information or any penalty‑parameter tuning. The key idea is to treat the estimation problem as a joint optimization over two probability‑vector spaces: (i) the tomographic space T* defined by the linear constraints A* f = y* and 1ᵀf = 1, and (ii) the gravity space G consisting of rank‑one matrices g = p qᵀ, where p and q are the normalized inbound and outbound traffic distributions over source and destination nodes. The algorithm alternates between (a) minimizing the Kullback–Leibler (KL) divergence K(f, g_old) over f ∈ T* while keeping the current gravity estimate g_old fixed, and (b) updating the gravity estimate by minimizing K(f_new, g) over g ∈ G. The KL divergence is defined as K(f,g)=∑_j f_j log(f_j/g_j).
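The KL divergence K(f, g) defined above can be sketched in a few lines of Python. The function name `kl_divergence` and the handling of zero entries (0 · log 0 treated as 0, infinity when f has mass where g has none) are assumptions made for this illustration; the summary does not say how the paper treats those edge cases.

```python
import math

def kl_divergence(f, g):
    """K(f, g) = sum_j f_j * log(f_j / g_j) for two probability vectors.

    Assumed conventions: terms with f_j == 0 contribute 0, and the
    divergence is infinite if f_j > 0 while g_j == 0.
    """
    total = 0.0
    for f_j, g_j in zip(f, g):
        if f_j == 0.0:
            continue  # zero-mass terms contribute nothing
        if g_j == 0.0:
            return math.inf  # f puts mass where g has none
        total += f_j * math.log(f_j / g_j)
    return total
```

The divergence is zero when f and g coincide and positive otherwise, which is why each alternating step can only lower (or keep) the objective.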
Step (a) is solved using Krupp’s (1979) relaxation algorithm: a set of auxiliary variables v is introduced, the objective becomes concave in v, and a Newton–Raphson loop yields the optimal v. The resulting f_new is given by f_new_j = g_old_j exp(∑_i h_ij v_i − 1), where the matrix h encodes the linear constraints. Step (b) has a closed‑form solution: g_new_sd = (∑_d′ f_new_sd′)(∑_s′ f_new_s′d)/N, i.e., the product of the marginal sums of f_new normalized by the total flow N. The algorithm repeats these two steps until the KL divergence no longer decreases, at which point a locally optimal pair (f, g) is obtained. The final traffic estimate is x̂ = b_N f, where b_N rescales f to match the observed total link load: b_N = (1ᵀy*)/(1ᵀA* f).
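The closed‑form gravity update of step (b) and the final rescaling by b_N can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the function names are hypothetical, f is stored as a nested list indexed by [source][destination] for the update and as a flat vector for the rescaling, and N is taken to be the total mass of f. The Newton–Raphson solve of step (a) is deliberately left out.

```python
def gravity_update(f):
    """Step (b): g[s][d] = (row sum of f at s) * (column sum of f at d) / total mass."""
    n_src, n_dst = len(f), len(f[0])
    row = [sum(f[s]) for s in range(n_src)]
    col = [sum(f[s][d] for s in range(n_src)) for d in range(n_dst)]
    total = sum(row)  # assumed to play the role of N in the text
    return [[row[s] * col[d] / total for d in range(n_dst)] for s in range(n_src)]

def rescale(f, A_star, y_star):
    """Return x_hat = b_N * f with b_N = (1^T y*) / (1^T A* f)."""
    Af = [sum(a_ij * f_j for a_ij, f_j in zip(row, f)) for row in A_star]
    b_n = sum(y_star) / sum(Af)
    return [b_n * f_j for f_j in f]

# Hypothetical 2x2 flow distribution (sums to 1).
f_new = [[0.1, 0.2], [0.3, 0.4]]
g_new = gravity_update(f_new)

# Hypothetical flat example for the rescaling step.
x_hat = rescale([0.2, 0.3, 0.5], [[1, 1, 0], [0, 1, 1]], [30, 50])
```

By construction g_new is rank one and has the same row and column marginals as f_new, and the rescaled x_hat reproduces the observed total link load.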
Several important properties follow. Because g is treated as an unknown rather than a fixed “simple gravity” solution, the method does not need the exact inbound/outbound totals N_in^s, N_out^d. Consequently, even if some edge‑link loads are missing (i.e., A* contains only a subset of rows of A), the algorithm can still be applied by simply using the reduced A*. Moreover, there is no penalty parameter φ to tune, unlike the entropy‑regularized tomogravity (ER‑TG) method, making ITG attractive for operational deployment.
The authors evaluate ITG on real data from the Abilene backbone network (12 nodes, 144 SD pairs, 30 interior links, 24 edge links). Traffic matrices were collected every five minutes over 19 weeks in 2004. Four non‑overlapping three‑day periods were selected, yielding 72 hours (288 time slots) per dataset. Four estimation procedures were compared: (1) ITG; (2) Simple Tomogravity (STG), which solves min_x ‖x − e_x‖ subject to A x = y, where e_x is the simple gravity estimate; (3) Generalized Tomogravity (GTG), which incorporates “access” vs. “peering” link status; and (4) Entropy‑regularized Tomogravity (ER‑TG), which solves min_x ‖y − A x‖² + φ K(x/N, e_x/N). Self‑traffic (s = d) is directly observed via self‑links, so all methods recover those entries exactly. For non‑self SD pairs, the relative total error ‖x̂ − x‖₁ / ‖x‖₁ was computed.
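The relative total error over non‑self SD pairs can be computed as in the sketch below. The function name and the square nested‑list layout indexed by [source][destination] are assumptions for this illustration, and the numbers are made up rather than taken from the Abilene data.

```python
def relative_total_error(x_hat, x_true):
    """||x_hat - x||_1 / ||x||_1, summed over non-self SD pairs (s != d) only."""
    n = len(x_true)
    pairs = [(s, d) for s in range(n) for d in range(n) if s != d]
    num = sum(abs(x_hat[s][d] - x_true[s][d]) for s, d in pairs)
    den = sum(abs(x_true[s][d]) for s, d in pairs)
    return num / den

# Hypothetical 2-node example; diagonal (self-traffic) entries are ignored.
x_true = [[5, 10], [30, 7]]
x_hat = [[5, 12], [27, 7]]
err = relative_total_error(x_hat, x_true)  # (2 + 3) / (10 + 30)
```

Excluding the diagonal keeps the exactly recovered self‑traffic entries from deflating the reported error.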
Results (Table 1) show average errors of 0.3001 for ITG, 0.2995 for ER‑TG (with φ = 10⁻³, the empirically optimal value), 0.3026 for GTG, and 0.3139 for STG. Thus ITG comes within 0.0006 of the best‑tuned ER‑TG while requiring no tuning, and it outperforms both GTG and STG. Error curves over time (Figures 3‑6) confirm that ITG’s performance is stable across the four datasets, each representing different traffic patterns (weekday vs. weekend). Further analysis (Table 2) groups SD pairs by total traffic volume and shows that estimation error decreases as the pair’s traffic magnitude increases, a behavior common to all methods.
Robustness to missing data was also examined. The authors artificially removed varying numbers of edge‑link measurements, noting that only ITG can operate without full edge‑link observations (the other three methods need all edge loads). Even with a moderate fraction of missing links, ITG’s error increased only modestly, demonstrating its suitability for networks where some link counters are unavailable or unreliable.
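The missing‑data setting amounts to deleting the rows of A (and the matching entries of y) for links without measurements, which gives the reduced pair (A*, y*) that ITG takes as input. A minimal sketch, with a hypothetical helper name and toy values:

```python
def drop_unobserved_links(A, y, missing):
    """Return (A_star, y_star), keeping only links whose index is not in `missing`."""
    keep = [i for i in range(len(A)) if i not in missing]
    A_star = [A[i] for i in keep]
    y_star = [y[i] for i in keep]
    return A_star, y_star

# Hypothetical 3-link, 3-flow example where link 1 has no measurement.
A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
y = [3, 5, 4]
A_star, y_star = drop_unobserved_links(A, y, missing={1})
```

The remaining rows still form a valid set of tomographic constraints, only fewer of them.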
In summary, the paper contributes a practical, parameter‑free algorithm for single‑snapshot traffic matrix estimation that leverages a gravity‑based regularization in a probabilistic KL‑divergence framework. Its ability to handle incomplete link data and to avoid delicate parameter tuning makes it a strong candidate for real‑time network monitoring, capacity planning, and anomaly detection. Future work suggested includes extending ITG to time‑varying routing, incorporating multiple snapshots for improved accuracy, and testing on larger ISP‑scale topologies.