Dynamic scheduling in a partially fluid, partially lossy queueing system

We consider a single server queueing system with two classes of jobs: eager jobs with small sizes that require service to begin almost immediately upon arrival, and tolerant jobs with larger sizes that can wait for service. While blocking probability is the relevant performance metric for the eager class, the tolerant class seeks to minimize its mean sojourn time. In this paper, we discuss the performance of each class under dynamic scheduling policies, where the scheduling of both classes depends on the instantaneous state of the system. This analysis is carried out under a certain fluid limit, where the arrival rate and service rate of the eager class are scaled to infinity, holding the offered load constant. Our performance characterizations reveal a (dynamic) pseudo-conservation law that ties the performance of both the classes to the standalone blocking probabilities of the eager class. Further, the performance is robust to other specifics of the scheduling policies. We also characterize the Pareto frontier of the achievable region of performance vectors under the same fluid limit, and identify a (two-parameter) class of Pareto-complete scheduling policies.

💡 Research Summary

The paper studies a single‑server queueing system that simultaneously serves two heterogeneous job classes: “eager” jobs that are small and must begin service almost immediately upon arrival, and “tolerant” jobs that are larger and can wait in an infinite‑capacity queue. For the eager class the performance metric is the blocking probability (the long‑run fraction of arrivals that are rejected), while for the tolerant class the metric is the mean sojourn time. The authors focus on dynamic scheduling policies whose decisions (admission control for eager jobs and service allocation for both classes) depend on the instantaneous state of the system, i.e., on the current numbers of eager and tolerant jobs.

To obtain tractable results the authors introduce the “short‑frequent‑jobs” (SFJ) fluid limit. In this limit the arrival rate λγ and the service rate μγ of the eager class are scaled to infinity proportionally, keeping the offered load ργ = λγ/μγ fixed. This creates a clear time‑scale separation: eager jobs evolve on a fast time scale and behave like a loss system, whereas tolerant jobs evolve on the original slower time scale. The eager class therefore experiences a partially fluid, partially lossy environment, while the tolerant class sees a time‑varying service capacity that depends on the current eager occupancy.

Dynamic policies are represented as a hierarchy of “eager sub‑policies”. For each possible tolerant‑queue occupancy j, a specific sub‑policy j is selected; that sub‑policy determines (i) an admission rule for newly arriving eager jobs and (ii) how the server capacity is split among the currently present eager jobs. The tolerant class always uses any leftover capacity in a work‑conserving, non‑preemptive fashion (e.g., FCFS, LCFS, random order). Several technical assumptions (A.1–A.4) guarantee that (a) when the tolerant queue changes state all eager jobs are flushed (simplifying transitions), (b) each sub‑policy depends only on the number of eager jobs, (c) admitted eager jobs receive at least a minimal service rate cmin > 0 (ensuring a loss‑system behavior), and (d) the second moment of eager busy cycles is uniformly bounded across sub‑policies. Complementary assumptions (B.1–B.4) impose work‑conservation on the tolerant scheduler and require stability of the tolerant queue under any single sub‑policy.
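One way to picture this hierarchy is as a lookup from the tolerant occupancy j to a small object describing the eager sub‑policy. The sketch below is only an illustration under stated assumptions: the class `EagerSubPolicy`, its fields, the equal split of a fixed eager share, and the rule of reusing the last sub‑policy for large j are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class EagerSubPolicy:
    """Hypothetical sub-policy that depends only on the eager count n."""

    max_eager: int         # admit an eager arrival only while n < max_eager
    eager_capacity: float  # server fraction handed to eager jobs when n > 0

    def admits(self, n: int) -> bool:
        # Admission rule: a function of the number of eager jobs only.
        return n < self.max_eager

    def rate_per_eager_job(self, n: int) -> float:
        # Assumed equal split of the eager share among the n jobs present.
        return self.eager_capacity / n if n > 0 else 0.0

    def tolerant_capacity(self, n: int) -> float:
        # The tolerant class uses whatever capacity the eager jobs leave over.
        return 1.0 - self.eager_capacity if n > 0 else 1.0

    def min_rate(self) -> float:
        # Smallest rate an admitted eager job can receive (compare with cmin).
        if self.max_eager == 0:
            return float("inf")
        return self.eager_capacity / self.max_eager


def select_sub_policy(policies: Sequence[EagerSubPolicy], j: int) -> EagerSubPolicy:
    """Pick the sub-policy for tolerant occupancy j (last entry reused for large j)."""
    return policies[min(j, len(policies) - 1)]
```

In this toy representation, a check such as `policy.min_rate() >= c_min` plays the role of the minimal‑service‑rate assumption, with `c_min` chosen by the user.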

The central theoretical contribution is a dynamic pseudo‑conservation law. Let PBj denote the blocking probability of the eager class when sub‑policy j is used in a τ‑static manner (i.e., irrespective of the tolerant state). The pseudo‑conservation law shows that, under the SFJ limit, the overall blocking probability of the eager class and the mean sojourn time of the tolerant class depend only on the collection {PBj}. In other words, the intricate state‑dependent dynamics collapse to a set of scalar blocking probabilities that can be computed from classic Erlang‑B type loss models for each sub‑policy. Consequently, once the PBj values are known, the performance of both classes follows directly, and the specific details of the tolerant scheduler (as long as it is work‑conserving and non‑anticipative) become irrelevant.
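As a minimal sketch of how the standalone blocking probabilities might be computed, assume (hypothetically) that sub‑policy j simply caps the number of eager jobs in service at some K_j, so that the eager class in isolation behaves like a classical Erlang loss system with K_j servers. The standard Erlang‑B recursion then gives each PBj. The function name `erlang_b`, the caps in `caps`, and the load value are illustrative and do not come from the paper.

```python
def erlang_b(offered_load: float, servers: int) -> float:
    """Erlang-B blocking probability via the standard recursion.

    B(0) = 1 and B(k) = a * B(k-1) / (k + a * B(k-1)), where a is the offered load.
    """
    if offered_load < 0 or servers < 0:
        raise ValueError("offered_load and servers must be non-negative")
    blocking = 1.0
    for k in range(1, servers + 1):
        blocking = offered_load * blocking / (k + offered_load * blocking)
    return blocking


# Hypothetical sub-policies: sub-policy j admits at most caps[j] eager jobs.
caps = {0: 4, 1: 2, 2: 0}
rho_eager = 1.0  # offered load of the eager class (illustrative value)

# One scalar per sub-policy, playing the role of the collection {PBj}.
standalone_blocking = {j: erlang_b(rho_eager, k) for j, k in caps.items()}
```

Because the Erlang‑B formula depends on the arrival and service rates only through the offered load, scaling both rates by the same factor leaves these values unchanged, which is consistent with the SFJ scaling described earlier.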

With this closed‑form relationship in hand, the authors characterize the achievable performance region in the two‑dimensional space (eager blocking probability, tolerant mean sojourn time). They prove that this region is convex and identify its Pareto frontier, i.e., the set of points where one cannot improve one metric without worsening the other. Importantly, they exhibit a two‑parameter family of Pareto‑complete policies parameterized by (L, d): L ∈ ℕ is a threshold on the tolerant queue length, and d ∈ (0, 1) is a mixing probability applied when the tolerant occupancy equals L. The policy works as follows:

If the tolerant queue length j < L, the eager class is admitted with the minimum possible blocking probability (the most permissive sub‑policy).
If j > L, the eager class is admitted with the maximum blocking probability (the most restrictive sub‑policy, often rejecting all arrivals).
If j = L, the eager class is admitted with probability d (a convex combination of the two extremes).
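The three cases above can be sketched as a small selection function. This is an illustration only: the labels "permissive" and "restrictive", the function name, and the convention that the permissive sub‑policy is the one chosen with probability d at j = L are assumptions made here, since the summary does not say which extreme receives weight d.

```python
import random


def choose_sub_policy(j: int, threshold: int, d: float, rng: random.Random) -> str:
    """Select the eager sub-policy for tolerant occupancy j under an (L, d) rule.

    threshold plays the role of L; d is the mixing probability used at j == L.
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie in [0, 1]")
    if j < threshold:
        return "permissive"    # minimum blocking for the eager class
    if j > threshold:
        return "restrictive"   # maximum blocking for the eager class
    # j == threshold: randomise between the two extremes (assumed convention).
    return "permissive" if rng.random() < d else "restrictive"
```

For example, `choose_sub_policy(2, 3, 0.4, random.Random(1))` returns "permissive" because the occupancy is below the threshold, regardless of d.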

By varying L and d, any point on the Pareto frontier can be attained, establishing that the family is Pareto‑complete. This result demonstrates that dynamic policies dramatically enlarge the feasible performance set compared with static policies (where a single sub‑policy is used regardless of tolerant state); static policies correspond to a strict subset of the dynamic region.
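To make the notion of a Pareto frontier concrete, the sketch below filters a finite set of candidate performance pairs down to the non‑dominated ones, with both coordinates to be minimised. It is a generic sweep, not the paper's analytical characterization, and any input points are assumed to come from the user's own calculations or simulations.

```python
from typing import List, Tuple

# (eager blocking probability, tolerant mean sojourn time)
Point = Tuple[float, float]


def pareto_frontier(points: List[Point]) -> List[Point]:
    """Return the non-dominated points when both coordinates are minimised."""
    frontier: List[Point] = []
    best_sojourn = float("inf")
    # Sort by blocking (ties broken by sojourn), then keep a point only if it
    # strictly improves on the best sojourn time seen so far.
    for blocking, sojourn in sorted(set(points)):
        if sojourn < best_sojourn:
            frontier.append((blocking, sojourn))
            best_sojourn = sojourn
    return frontier
```

A point survives the sweep only if every point with no larger blocking probability has a strictly larger mean sojourn time, which is the trade‑off described above.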

The paper validates the theoretical approximations through extensive Monte‑Carlo simulations. Even when the scaling parameter μγ is moderate (e.g., 10–100), the SFJ‑based predictions of both blocking probability and mean sojourn time deviate by less than 2% from the simulated values. This empirical evidence confirms that the fluid limit provides accurate approximations for realistic system sizes, making the results practically useful.
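The kind of comparison described here can be illustrated on a much simpler toy model. The sketch below simulates only a standalone single‑server loss system (Poisson arrivals, exponential service, no waiting room) and compares the simulated blocking fraction with the known closed form ρ/(1 + ρ). It does not reproduce the paper's experiments or its 2% figure; the function names and parameter values are assumptions chosen for illustration.

```python
import random


def simulate_loss_blocking(lam: float, mu: float, arrivals: int, seed: int = 0) -> float:
    """Estimate the blocking fraction of a single-server loss system by simulation."""
    rng = random.Random(seed)
    now = 0.0          # time of the current arrival
    busy_until = 0.0   # time at which the job in service (if any) departs
    blocked = 0
    for _ in range(arrivals):
        now += rng.expovariate(lam)      # exponential inter-arrival time
        if now < busy_until:
            blocked += 1                 # server busy: the arrival is lost
        else:
            busy_until = now + rng.expovariate(mu)  # admit and draw a service time
    return blocked / arrivals


def relative_deviation(predicted: float, simulated: float) -> float:
    """Relative gap between a predicted value and a simulated one."""
    return abs(predicted - simulated) / simulated


lam, mu = 50.0, 50.0                     # illustrative rates with offered load 1
rho = lam / mu
predicted = rho / (1.0 + rho)            # closed form for this toy loss system
simulated = simulate_loss_blocking(lam, mu, arrivals=200_000)
gap = relative_deviation(predicted, simulated)
```

With a fixed seed the run is repeatable, and `gap` reports how far the formula is from the simulated estimate for the chosen number of arrivals.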

In summary, the contributions are:

Introduction of the SFJ fluid limit that yields a clean time‑scale separation for heterogeneous loss/queue systems.
Derivation of a dynamic pseudo‑conservation law that reduces performance analysis to a set of standalone blocking probabilities.
Complete characterization of the achievable performance region and its Pareto frontier.
Identification of a simple two‑parameter Pareto‑complete scheduling family that spans the entire frontier.
Demonstration of the high accuracy of the fluid approximations via simulation, supporting applicability to real‑world systems such as cellular networks handling voice (loss) and data (queue) traffic, or retail environments with express counters.

These insights provide both a theoretical framework for analyzing mixed loss‑queue systems and practical guidelines for designing dynamic admission and service allocation policies that can be tuned to meet desired quality‑of‑service trade‑offs.
