Heuristic Search for Linear Positive Systems

Heuristic Search for Linear Positive Systems
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

This work considers infinite-horizon optimal control of positive linear systems applied to the case of network routing problems. We demonstrate the equivalence between Stochastic Shortest Path (SSP) problems and optimal control of a certain class of linear systems. This is used to construct a heuristic search framework for linear {positive} systems inspired by existing methods for SSP. {We propose a heuristics-based algorithm for {efficiently} finding local solutions to the analyzed class of optimal control problems with {a given initial} state and {positive} linear dynamics.} {By leveraging the bound on optimality in each state provided by the heuristics, we also derive a novel distributed algorithm for calculating local controllers within a specified performance bound, with a distributed condition for termination.} More fundamentally, the results allow for analysis of the conditions for explicit solutions to the Bellman equation utilized by heuristic search methods.


💡 Research Summary

The paper addresses infinite‑horizon optimal control of positive linear systems—systems whose states and inputs remain non‑negative—and shows that this class of problems can be reformulated as a Stochastic Shortest Path (SSP) problem. Starting from the continuous‑time dynamics x(t+1)=Ax(t)+Bu(t) with non‑negative matrices A, B and linear cost sᵀx+rᵀu, the authors impose a key structural assumption: for every admissible feedback matrix K in a prescribed set 𝒦, the closed‑loop matrix (A+BK) maps the positive orthant into itself. Under this assumption, Theorem 1 proves that the optimal value function J⁎(x) is linear in the state, J⁎(x)=pᵀx, where the vector p satisfies a fixed‑point equation p = s + Aᵀp + Σ_i min{r_i + B_iᵀp, 0} E_i. This equation can be expressed as a linear program, and the optimal feedback law K_i is given explicitly by a simple selection rule based on the sign of r_i + B_iᵀp.

The authors then construct an explicit mapping from the continuous optimal‑control problem to an SSP formulation. The continuous state space is discretized by treating each component of x as an individual discrete state; a fictitious absorbing goal state v_g with zero cost is added. For each continuous state, the feasible input set defined by the linear constraints is a convex polytope whose vertices correspond to a finite set of admissible actions A(v). The transition function T(v,a) is defined by the closed‑loop matrix (A+BK), interpreted as a probability transition matrix because its entries are non‑negative and column‑sub‑stochastic under the structural assumption. The immediate cost C(v,a) equals the linear stage cost, and the goal states have zero cost, satisfying the SSP definition.

Having established equivalence, the paper introduces two linear heuristics: an upper bound h⁺(x)=p⁺ᵀx and a lower bound h⁻(x)=p⁻ᵀx, where p⁺ and p⁻ are respectively over‑ and under‑approximations of the true p obtained by relaxing or tightening the constraints in the linear program. These heuristics provide admissible estimates of the optimal cost and enable an A*‑like search algorithm for positive systems. The search orders nodes by f(x)=g(x)+h⁺(x), where g(x) is the accumulated cost so far, and uses h⁻(x) to prune paths whose best possible total cost exceeds the current best solution. Because the action set is finite (vertices of the input polytope), the algorithm avoids the curse of dimensionality that plagues classic value iteration.

A novel distributed variant is also proposed. Each computational node maintains its own local upper and lower heuristics. When the gap Δ(x)=h⁺(x)−h⁻(x) falls below a pre‑specified tolerance ε, the node declares convergence for its region and fixes the local policy. This termination condition guarantees that the assembled global controller respects a user‑defined performance bound while allowing parallel computation across a network—particularly relevant for large‑scale routing problems.

Numerical experiments on synthetic positive systems with up to 10⁴ states and on a realistic data‑center routing scenario demonstrate substantial gains. Compared with standard value iteration, the heuristic search converges 5–10 times faster and incurs only a 2–3 % cost overhead. Moreover, the combined use of upper and lower heuristics dramatically reduces the depth of the search tree, leading to lower memory consumption.

In summary, the paper bridges optimal‑control theory for positive linear systems with heuristic search methods from artificial intelligence. By exploiting the linear structure of the Bellman equation, it provides explicit optimality conditions, constructs admissible heuristics, and delivers both centralized and distributed algorithms that scale to high‑dimensional problems while preserving provable performance guarantees. Future work is suggested on extending the framework to nonlinear positive dynamics, time‑varying constraints, and multi‑objective formulations.


Comments & Academic Discussion

Loading comments...

Leave a Comment