A Framework for QoS-aware Execution of Workflows over the Cloud


The Cloud Computing paradigm provides system architects with a powerful new tool for building scalable applications. Clouds allow allocation of resources on a "pay-as-you-go" model, so that additional resources can be requested during peak loads and released afterwards. However, this flexibility calls for appropriate dynamic reconfiguration strategies. In this paper we describe SAVER (qoS-Aware workflows oVER the Cloud), a QoS-aware algorithm for executing workflows involving Web Services hosted in a Cloud environment. SAVER allows execution of arbitrary workflows subject to response time constraints. SAVER uses a passive monitor to identify workload fluctuations based on the observed system response time. The information collected by the monitor is used by a planner component to identify the minimum number of instances of each Web Service which should be allocated in order to satisfy the response time constraint. SAVER uses a simple Queueing Network (QN) model to identify the optimal resource allocation. Specifically, the QN model is used to identify bottlenecks and to predict system performance as Cloud resources are allocated or released. The parameters of the model are those collected by the monitor, which means that SAVER does not require any prior knowledge of the Web Services and workflows being executed. Our approach has been validated through numerical simulations, whose results are reported in this paper.


💡 Research Summary

The paper introduces SAVER (qoS‑Aware workflows oVER the Cloud), a framework designed to execute arbitrary web‑service‑based workflows in a cloud environment while guaranteeing that the average execution time of each workflow class stays below a negotiated SLA threshold. The authors first motivate the need for dynamic reconfiguration in cloud‑based systems, pointing out that traditional static provisioning either over‑provisions resources or fails to meet QoS under variable workloads.

SAVER's objective is formally expressed as a minimization problem: minimize the total number of service instances ∑_k N_k subject to the response‑time constraint R_c(N) ≤ R⁺_c for every workflow class c. To solve this, the authors adopt an open multiclass Queueing Network (QN) model. Each web‑service instance is represented as a FIFO server; the model captures both the service demand D_ck (average processing time of class‑c requests on service k) and the queuing delay. The utilization of an instance of service k is U_k(N) = (∑_c λ_c · D_ck) / N_k, where λ_c is the observed arrival rate of class‑c workflows. All model parameters are obtained at runtime through passive monitoring, eliminating the need for prior knowledge of the services or workflows.
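The formulas above can be turned into a small performance predictor. The sketch below is an illustration, not the paper's code: it assumes the N_k identical instances of service k share the incoming traffic evenly and that each instance can be approximated as an M/M/1-style FIFO queue, so the per‑visit residence time of a class‑c request at service k is D_ck / (1 − U_k).

```python
# Illustrative sketch of the open multiclass QN model (assumptions:
# even load balancing across the N_k instances of each service, and an
# M/M/1-style residence-time approximation D_ck / (1 - U_k)).

def utilization(lam, D, N, k):
    """Per-instance utilization of service k: U_k = (sum_c lam_c * D_ck) / N_k."""
    return sum(lam[c] * D[c][k] for c in range(len(lam))) / N[k]

def response_time(lam, D, N, c):
    """Predicted response time of class c: sum over all services of the
    per-visit residence time D_ck / (1 - U_k)."""
    R = 0.0
    for k in range(len(N)):
        U = utilization(lam, D, N, k)
        if U >= 1.0:               # unstable: demand exceeds capacity
            return float("inf")
        R += D[c][k] / (1.0 - U)
    return R

# Example: two classes (lambda = 2 and 1 req/s), two services,
# one instance each.
lam = [2.0, 1.0]
D = [[0.1, 0.2],   # demands of class 0 on services 0 and 1
     [0.3, 0.1]]   # demands of class 1
N = [1, 1]
print(utilization(lam, D, N, 0))   # 2*0.1 + 1*0.3 = 0.5
print(response_time(lam, D, N, 0)) # 0.1/0.5 + 0.2/0.5 = 0.6
```

Because U_k and R_c are simple closed-form expressions, evaluating a candidate configuration N is essentially free, which is what makes the model usable inside a control loop.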

The control logic follows the classic MAPE (Monitor‑Analyze‑Plan‑Execute) loop. During monitoring, SAVER continuously gathers per‑service response times, utilizations, and arrival rates. The analysis step checks whether any SLA is violated. If so, the planning step uses the QN model to evaluate candidate configurations: a greedy algorithm identifies the most loaded service (the bottleneck) and incrementally adds instances until all response‑time constraints are satisfied; it also removes surplus instances when utilization drops, thereby reducing cost. Because the QN model captures the "bottleneck shift" phenomenon (fixing one bottleneck may expose another), the planner can compute a near‑optimal configuration entirely on the model and apply it in a single reconfiguration, rather than reacting anew after each individual change to the running system. The execute step issues the appropriate cloud API calls to provision or de‑provision the virtual machines hosting the service instances.
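The planning step described above can be sketched as a short greedy loop. This is a simplified, hypothetical reconstruction (it only scales out, omitting the scale‑in path, and uses the same even‑load‑balancing M/M/1-style approximation as the model; `max_total` is an added safety bound, not part of the paper):

```python
# Hypothetical sketch of SAVER's greedy planner: while any workflow
# class violates its response-time bound R_max[c], add one instance to
# the most utilized (bottleneck) service and re-evaluate the QN model.
# Scale-in on low utilization is omitted for brevity.

def plan(lam, D, N, R_max, max_total=100):
    N = list(N)  # candidate configuration, mutated on the model only

    def U(k):    # per-instance utilization of service k
        return sum(lam[c] * D[c][k] for c in range(len(lam))) / N[k]

    def R(c):    # predicted response time of class c
        return sum(D[c][k] / (1.0 - U(k)) if U(k) < 1.0 else float("inf")
                   for k in range(len(N)))

    while any(R(c) > R_max[c] for c in range(len(lam))):
        bottleneck = max(range(len(N)), key=U)  # most loaded service
        N[bottleneck] += 1                      # scale it out (on the model)
        if sum(N) >= max_total:                 # safety valve (assumption)
            break
    return N

# Example: starting from one instance per service, the planner keeps
# adding instances at the shifting bottleneck until both bounds hold.
print(plan([2.0, 1.0], [[0.1, 0.2], [0.3, 0.1]], [1, 1], [0.4, 0.5]))
```

Note that the whole loop runs against the analytic model; only the final configuration `N` is handed to the execute step, which is what allows a multi‑instance reconfiguration to be applied at once.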

Related work is surveyed, highlighting that many existing approaches rely on control‑theory feedback loops, machine‑learning predictions, or utility‑based optimization, often operating at the VM or single‑service level. SAVER distinguishes itself by operating at the service‑instance granularity, by using a lightweight QN model that can be solved in milliseconds, and by proactively planning reconfigurations that add or remove several instances at once.

The authors validate SAVER through extensive simulation experiments involving two workflow classes (arrival rates of 2 and 1 requests per second) and three web services with differing capacities. The results demonstrate that SAVER maintains SLA compliance while reducing the average number of active instances by 20‑30 % compared with a static worst‑case provisioning strategy. Moreover, the model‑based planning reacts quickly to workload spikes, avoiding the slow, sequential scaling typical of purely reactive systems.

Limitations acknowledged include the assumption of exponentially distributed inter‑arrival times and perfectly balanced load distribution among instances. Real clouds exhibit non‑Poisson traffic, VM spin‑up delays, and inter‑service dependencies, which are not captured in the current model. Future research directions propose extending the QN model to handle non‑exponential arrivals, incorporating scaling latencies and cost models, and exploring reinforcement‑learning policies that can adapt to more complex, stochastic environments.

In summary, SAVER provides a practical, model‑driven approach to QoS‑aware workflow execution in the cloud, achieving significant resource savings while respecting SLA constraints, and laying groundwork for more sophisticated adaptive resource management techniques.

