Stochastic Optimization for Markov Modulated Networks with Application to Delay Constrained Wireless Scheduling
We consider a wireless system with a small number of delay constrained users and a larger number of users without delay constraints. We develop a scheduling algorithm that reacts to time varying channels and maximizes throughput utility (to within a desired proximity), stabilizes all queues, and satisfies the delay constraints. The problem is solved by reducing the constrained optimization to a set of weighted stochastic shortest path problems, which act as natural generalizations of max-weight policies to Markov decision networks. We also present approximation results for the corresponding shortest path problems, and discuss the additional complexity and delay incurred as compared to systems without delay constraints. The solution technique is general and applies to other constrained stochastic decision problems.
💡 Research Summary
The paper addresses a wireless downlink system in which a small set of users have strict delay constraints while a larger set of users are delay‑agnostic. The wireless channels evolve according to a finite‑state Markov chain, and each user maintains a separate data queue. The objective is to maximize a network‑wide throughput utility (e.g., a concave log‑utility) while guaranteeing queue stability for all users and ensuring that the probability of violating any delay bound does not exceed a pre‑specified threshold ε.
To achieve this, the authors first formulate the problem as a constrained stochastic optimization: the decision at each time slot is which user to serve, given the current queue lengths and channel state. Classical max‑weight scheduling, which selects the user with the largest product of queue length and instantaneous channel rate, cannot enforce delay constraints because it only reacts to instantaneous backlog and channel quality. The paper therefore transforms the constrained problem into a collection of weighted stochastic shortest‑path (SSP) problems, which act as natural generalizations of max‑weight policies to Markov decision networks.
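For reference, the classical max‑weight rule mentioned above can be sketched in a few lines. This is a minimal illustration rather than code from the paper; the function name `max_weight_user` and the sample backlogs and rates are hypothetical.

```python
from typing import Sequence


def max_weight_user(queues: Sequence[float], rates: Sequence[float]) -> int:
    """Return the index of the user with the largest queue-length x rate product."""
    if len(queues) != len(rates):
        raise ValueError("queues and rates must have the same length")
    if not queues:
        raise ValueError("at least one user is required")
    weights = [q * r for q, r in zip(queues, rates)]
    # Ties are broken in favour of the lowest user index.
    return max(range(len(weights)), key=weights.__getitem__)


# Hypothetical slot: three users, backlogs in packets, rates in packets per slot.
chosen = max_weight_user(queues=[4, 10, 2], rates=[3.0, 1.0, 5.0])
```

The rule looks only at the current backlog and rate, which is why it carries no notion of how close a packet is to its deadline.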
In the SSP formulation, a system state s comprises the vector of queue lengths and the current channel state. An action a corresponds to selecting a particular user for transmission. Each action incurs an immediate cost c(s,a) that combines the negative of the utility gain (so that minimizing cost maximizes utility) and a penalty term that grows when a packet of a delay‑constrained user approaches or exceeds its deadline D. The “target” states are those in which all delay‑constrained packets have been served within their deadlines. The SSP objective is to find a stationary policy π that minimizes the expected cumulative cost until a target state is reached. This leads to the Bellman optimality equation
V(s) = min_a { c(s,a) + Σ_{s′} P(s′|s,a) V(s′) }
where V(s) is the optimal value function and P(·) denotes the Markov transition probabilities of the joint queue‑channel process. Solving this equation yields a policy that can be interpreted as a “weighted max‑weight” rule: the usual weight (queue length × channel rate) is augmented by a factor reflecting the urgency of meeting the deadline.
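The Bellman equation above can be solved on small instances by value iteration. The sketch below is a generic textbook procedure applied to a hypothetical three‑state example, not the paper's algorithm; the names `ssp_value_iteration`, `P`, `c`, and `targets`, and all numbers, are assumptions made for illustration. Target states are treated as absorbing with value zero.

```python
def ssp_value_iteration(P, c, targets, tol=1e-10, max_iter=10_000):
    """Value iteration for V(s) = min_a { c(s,a) + sum_s' P(s'|s,a) V(s') }.

    P[a][s][t] is the probability of moving from s to t under action a,
    c[s][a] is the immediate cost, and targets is the set of target states.
    """
    n_states = len(c)
    n_actions = len(c[0])
    V = [0.0] * n_states

    def q_value(s, a, values):
        return c[s][a] + sum(P[a][s][t] * values[t] for t in range(n_states))

    for _ in range(max_iter):
        V_new = [
            0.0 if s in targets else min(q_value(s, a, V) for a in range(n_actions))
            for s in range(n_states)
        ]
        delta = max(abs(x - y) for x, y in zip(V_new, V))
        V = V_new
        if delta < tol:
            break

    # Greedy policy with respect to the converged values (None at target states).
    policy = [
        None if s in targets else min(range(n_actions), key=lambda a: q_value(s, a, V))
        for s in range(n_states)
    ]
    return V, policy


# Hypothetical example: state 2 is the target.
# Action 0 moves one step toward the target at cost 1.
# Action 1 jumps to the target with probability 0.5, otherwise stays put.
P = [
    [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.0, 0.0, 1.0]],  # action 0
    [[0.5, 0.0, 0.5], [0.0, 0.5, 0.5], [0.0, 0.0, 1.0]],  # action 1
]
c = [[1.0, 2.0], [1.0, 0.4], [0.0, 0.0]]
V, policy = ssp_value_iteration(P, c, targets={2})
```

In this toy instance the cheaper but riskier action is preferred in state 1 and the deterministic step is preferred in state 0, giving V(1) = 0.8 and V(0) = 1.8.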
The authors prove two key properties of the resulting policy: (1) all queues are strongly stable, i.e., the time‑average total backlog is bounded, and (2) the probability that any delay‑constrained packet exceeds its deadline is ≤ ε. The proof combines Lyapunov drift‑plus‑penalty analysis with the regenerative structure of the underlying Markov chain.
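Written out, the two properties take the following standard form. The symbols Q_i(τ) for the backlog of user i in slot τ and W_k for the delay of a packet of delay‑constrained user k are notation assumed here, not taken from the paper.

```latex
\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{i=1}^{N} \mathbb{E}\left[ Q_i(\tau) \right] < \infty,
\qquad
\Pr\left[ W_k > D \right] \le \varepsilon \quad \text{for each delay-constrained user } k .
```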
Because the exact solution of the SSP is computationally prohibitive—the state space grows exponentially with the number of delay‑constrained users (each can have up to D pending packets) and with the number of channel states—the paper introduces two approximation schemes. The first uses linear function approximation together with temporal‑difference (TD) learning to estimate V(s) from sample trajectories, thereby avoiding explicit enumeration of all states. The second reduces the state space by identifying a set of “core” states (e.g., aggregated backlog levels) and solving a reduced SSP on this compressed graph. Both schemes are shown to produce policies whose utility loss relative to the optimal SSP solution is bounded by a user‑specified δ, while the computational complexity drops from exponential to polynomial in the number of users, channel states, and the deadline D.
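The first scheme can be illustrated with a minimal TD(0) sketch using a linear value approximation V(s) ≈ w · φ(s). This is a generic policy‑evaluation example under stated assumptions, not the paper's implementation: it estimates the cost‑to‑go of a fixed policy rather than the optimal V, the chain is deterministic so the result is easy to check by hand, and the names `td0_linear`, `features`, and `episode` are hypothetical.

```python
def td0_linear(episodes, features, alpha=0.05, sweeps=2000):
    """TD(0) policy evaluation with V(s) ~ w . phi(s).

    episodes: list of trajectories, each a list of (state, cost, next_state);
              next_state is None when the target state is reached (value 0).
    features: dict mapping each state to its feature vector phi(s).
    """
    dim = len(next(iter(features.values())))
    w = [0.0] * dim

    def value(s):
        if s is None:  # target state: cost-to-go is zero by definition
            return 0.0
        return sum(wi * xi for wi, xi in zip(w, features[s]))

    for _ in range(sweeps):
        for episode in episodes:
            for s, cost, s_next in episode:
                td_error = cost + value(s_next) - value(s)
                w = [wi + alpha * td_error * xi for wi, xi in zip(w, features[s])]
    return w


# Hypothetical chain 0 -> 1 -> 2 -> target with unit cost per step,
# so the true cost-to-go is V(s) = 3 - s, representable with features [1, s].
features = {0: [1.0, 0.0], 1: [1.0, 1.0], 2: [1.0, 2.0]}
episode = [(0, 1.0, 1), (1, 1.0, 2), (2, 1.0, None)]
w = td0_linear([episode], features)
```

Only the two weights in `w` are stored, regardless of how many states the trajectories visit, which is the sense in which function approximation avoids enumerating the state space.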
A detailed complexity analysis quantifies the overhead introduced by delay constraints. In the unconstrained case, a max‑weight scheduler operates in O(N·|C|) time per slot (N users, |C| channel states). With K delay‑constrained users, the exact SSP would require O(N·|C|·D^K) operations, but the proposed approximations achieve O(N·|C|·poly(D,K)). The authors also examine the trade‑off between throughput and delay: enforcing deadlines inevitably reduces the raw throughput because some transmission opportunities must be reserved for urgent packets. Simulations show a typical throughput reduction of 5–10 % compared with the unconstrained max‑weight policy, while the average packet delay for constrained users drops by more than 30 % and the deadline‑violation probability falls below the target ε.
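To give a feel for the D^K factor in the operation counts quoted above, the following sketch evaluates the two expressions with constants ignored. The function names and the parameter values (20 users, 4 channel states, deadline 10) are hypothetical and chosen only for illustration.

```python
def max_weight_ops(n_users, n_channel_states):
    """Order-of-magnitude operation count N * |C| for plain max-weight."""
    return n_users * n_channel_states


def exact_ssp_ops(n_users, n_channel_states, deadline, n_constrained):
    """Order-of-magnitude operation count N * |C| * D^K for the exact SSP."""
    return max_weight_ops(n_users, n_channel_states) * deadline ** n_constrained


# Growth of the exact-SSP count as the number of delay-constrained users K increases.
for k in (1, 2, 4, 8):
    print(k, exact_ssp_ops(n_users=20, n_channel_states=4, deadline=10, n_constrained=k))
```

Each additional delay‑constrained user multiplies the count by D, which is why the exact approach is only practical for a small number of such users.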
Extensive numerical experiments explore a range of system parameters: varying channel transition probabilities, traffic arrival rates, the proportion of delay‑constrained users, and deadline lengths. Across all scenarios, the weighted‑SSP policy consistently meets the delay‑violation target, stabilizes all queues, and attains utility within a few percent of the unconstrained optimum.
Finally, the paper argues that the methodology is not limited to wireless scheduling. Any stochastic decision problem that can be modeled as a Markov decision process with multiple linear constraints (e.g., energy‑budget constraints in sensor networks, resource caps in cloud computing, or due‑date constraints in manufacturing) can be cast into a family of weighted SSPs. The presented reduction and approximation techniques thus provide a general template for solving constrained stochastic control problems where traditional Lyapunov‑drift or greedy policies fall short.
In summary, the work makes three major contributions: (i) a novel reduction of delay‑constrained wireless scheduling to weighted stochastic shortest‑path problems, (ii) provably near‑optimal approximation algorithms that keep computational complexity tractable, and (iii) a thorough performance and complexity analysis that quantifies the cost of meeting strict delay guarantees. This advances the state of the art in both theoretical stochastic optimization and practical wireless network design.