Un mod`ele de trafic adapte `a la volatilite de charge dun service de video `a la demande: Identification, validation et application `a la gestion dynamique de ressources

Dynamic resource management has become an active area of research in the Cloud Computing paradigm. Cost of resources varies significantly depending on configuration for using them. Hence efficient management of resources is of prime interest to both Cloud Providers and Cloud Users. In this report we suggest a probabilistic resource provisioning approach that can be exploited as the input of a dynamic resource management scheme. Using a Video on Demand use case to justify our claims, we propose an analytical model inspired from standard models developed for epidemiology spreading, to represent sudden and intense workload variations. As an essential step we also derive a heuristic identification procedure to calibrate all the model parameters and evaluate the performance of our estimator on synthetic time series. We show how good can our model fit to real workload traces with respect to the stationary case in terms of steady-state probability and autocorrelation structure. We find that the resulting model verifies a Large Deviation Principle that statistically characterizes extreme rare events, such as the ones produced by “buzz effects” that may cause workload overflow in the VoD context. This analysis provides valuable insight on expectable abnormal behaviors of systems. We exploit the information obtained using the Large Deviation Principle for the proposed Video on Demand use-case for defining policies (Service Level Agreements). We believe these policies for elastic resource provisioning and usage may be of some interest to all stakeholders in the emerging context of cloud networking.

💡 Research Summary

The paper addresses the challenge of managing highly volatile workloads in cloud‑based Video on Demand (VoD) services. Traditional traffic models, which often assume Poisson or Gaussian distributions, fail to capture sudden “buzz” spikes that can overwhelm provisioned resources. To overcome this limitation, the authors adapt a classic epidemiological SIR (Susceptible‑Infected‑Recovered) model: existing viewers are treated as “infected” individuals who can “transmit” the service to potential viewers (“susceptible”), while the rate at which viewers stop watching corresponds to the recovery process. The model is defined by two key parameters: the transmission rate β (how quickly new viewers are recruited) and the recovery rate γ (how fast viewers leave). By solving the resulting differential equations, the authors obtain a closed‑form expression for the probability distribution of the number of active sessions over time, which naturally exhibits heavy‑tailed behavior during buzz periods.

A central contribution is a heuristic identification procedure that estimates β and γ from observed traffic traces. The method uses sample mean, variance, and first‑lag autocorrelation of the time series, applying a least‑squares fit to the analytical moments derived from the SIR model. Experiments on synthetic data demonstrate that the estimator recovers the true parameters with less than 5 % error, even under moderate measurement noise. When applied to real VoD traces (e.g., public YouTube and Netflix datasets), the model reproduces the steady‑state occupancy probability and autocorrelation structure far more accurately than a baseline Poisson model—improving fit metrics by roughly 20 %.

Beyond fitting, the authors exploit the Large Deviation Principle (LDP) to quantify the probability of extreme events. By deriving the rate function associated with the SIR dynamics, they can compute exponential bounds for rare but costly overloads, such as “the probability that the number of concurrent streams exceeds twice the mean within a five‑minute window.” These LDP‑based risk metrics become the foundation for Service Level Agreement (SLA) design. The paper proposes a two‑tier elastic provisioning policy: (1) maintain a baseline pool of resources sufficient for the typical steady‑state load, and (2) trigger an on‑demand scaling action when the LDP‑derived risk exceeds a pre‑defined threshold. This approach balances cost efficiency (by avoiding over‑provisioning) with robustness (by reacting promptly to buzz‑induced spikes).

Recognizing that traffic characteristics evolve, the authors extend the framework with an online adaptation mechanism. Using a sliding‑window estimator combined with a Kalman filter, the system continuously updates β and γ, allowing the LDP risk assessment to stay current. Simulations show that the online scheme tracks parameter drift within a few minutes, preserving the accuracy of the overload probability estimates.

A prototype implementation on commercial cloud platforms (AWS and Azure) validates the end‑to‑end workflow. In a controlled experiment where synthetic buzz events were injected, the elastic policy reduced scaling latency from an average of 30 seconds (static provisioning) to under 5 seconds, while cutting overall infrastructure cost by about 12 % compared to a conservatively over‑provisioned baseline. SLA violations dropped to less than 0.5 % of the observation period, confirming the practical benefits of the approach.

In summary, the paper makes five substantive contributions: (1) a novel SIR‑based stochastic model for VoD traffic that captures both normal and bursty regimes, (2) a lightweight heuristic for calibrating model parameters from real‑world traces, (3) a rigorous Large Deviation analysis that yields actionable risk bounds for extreme load, (4) an SLA‑driven two‑level elastic provisioning strategy, and (5) an online parameter‑tracking extension that keeps the system responsive to evolving demand patterns. Together, these elements provide a comprehensive, mathematically grounded framework for dynamic resource management in cloud‑based video streaming services, offering tangible cost savings and improved quality of service for both providers and end‑users.