Cost Sensitive Reachability Heuristics for Handling State Uncertainty
While POMDPs provide a general platform for non-deterministic conditional planning under a variety of quality metrics, they have limited scalability. On the other hand, non-deterministic conditional planners scale very well, but many lack the ability to optimize plan quality metrics. We present a novel generalization of planning-graph-based heuristics that helps conditional planners both scale and generate high-quality plans when using actions with non-uniform costs. We make empirical comparisons with two state-of-the-art planners to show the benefit of our techniques.
💡 Research Summary
The paper addresses a fundamental tension in planning under uncertainty: probabilistic models such as Partially Observable Markov Decision Processes (POMDPs) can optimize rich quality metrics but scale poorly, whereas non‑deterministic conditional planners scale to large problems but typically ignore action costs. To bridge this gap, the authors introduce a cost‑sensitive reachability heuristic that augments planning‑graph‑based heuristics with explicit handling of non‑uniform action costs while still operating on belief states.
The core technical contribution is a two‑phase cost propagation over an extended planning graph. In the forward phase, each layer of the graph represents a set of belief states (encoded as bit‑vectors). For every applicable action, the algorithm combines the action’s fixed cost with the expected cost of its probabilistic successors, yielding an “expected cost” for each successor belief. In the backward phase, the algorithm computes a lower bound on the minimum cumulative cost required to reach the goal from any belief by recursively applying
h_cost(b) = min_{a∈Applicable(b)} [ c(a) + Σ_{b′} Pr(b′ | b, a) · h_cost(b′) ]
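To make the recursion concrete, here is a minimal Python sketch of this cost-to-go computation over bit-vector beliefs. The toy action model (`ACTIONS`, `GOAL`, the precondition/successor encoding, and the depth bound) is entirely illustrative and not from the paper; it only demonstrates the shape of the backward-phase recursion.

```python
# Hypothetical action model: each action has a fixed cost, a precondition
# (bits that must hold in the belief), and probabilistic successor beliefs.
# All names and encodings here are illustrative assumptions.
ACTIONS = {
    "a1": {"cost": 2.0, "pre": 0b001, "succ": [(0b011, 0.7), (0b001, 0.3)]},
    "a2": {"cost": 5.0, "pre": 0b001, "succ": [(0b111, 1.0)]},
    "a3": {"cost": 1.0, "pre": 0b011, "succ": [(0b111, 1.0)]},
}
GOAL = 0b111  # goal belief encoded as a bit-vector

def applicable(belief):
    # An action applies when its precondition bits are contained in the belief.
    return [a for a, d in ACTIONS.items() if d["pre"] & belief == d["pre"]]

def h_cost(belief, depth=10):
    """Lower-bound cost-to-go: min over applicable actions of the action's
    fixed cost plus the probability-weighted cost of its successor beliefs."""
    if belief & GOAL == GOAL:
        return 0.0
    if depth == 0:
        return float("inf")  # bound the recursion on cyclic belief graphs
    best = float("inf")
    for a in applicable(belief):
        d = ACTIONS[a]
        expected = d["cost"] + sum(p * h_cost(b2, depth - 1)
                                   for b2, p in d["succ"])
        best = min(best, expected)
    return best
```

For the belief 0b001, the cheap-but-unreliable action a1 is preferred over the expensive-but-certain a2, which is exactly the kind of trade-off a cost-insensitive reachability heuristic cannot express.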