Proximity-Based Non-uniform Abstractions for Approximate Planning

In a deterministic world, a planning agent can be certain of the consequences of its planned sequence of actions. Not so, however, in dynamic, stochastic domains where Markov decision processes are commonly used. Unfortunately, these suffer from the curse of dimensionality: if the state space is a Cartesian product of many small sets (dimensions), planning is exponential in the number of those dimensions. Our new technique exploits the intuitive strategy of selectively ignoring various dimensions in different parts of the state space. The resulting non-uniformity has strong implications, since the approximation is no longer Markovian, requiring the use of a modified planner. We also use a spatial and temporal proximity measure, which responds to continued planning as well as movement of the agent through the state space, to dynamically adapt the abstraction as planning progresses. We present qualitative and quantitative results across a range of experimental domains showing that an agent exploiting this novel approximation method successfully finds solutions to the planning problem using much less than the full state space. We assess and analyse the features of domains which our method can exploit.


💡 Research Summary

The paper tackles the well‑known curse of dimensionality that plagues planning in stochastic domains modeled as Markov Decision Processes (MDPs). When the state space is a Cartesian product of many small sets, the number of possible states grows exponentially with the number of dimensions, making exact dynamic programming infeasible. Traditional approaches mitigate this problem by applying a uniform abstraction—reducing the same set of dimensions everywhere—or by ignoring a fixed subset of variables. While these techniques lower computational demands, they also discard potentially critical information in regions where fine‑grained detail is essential for good decision making.
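
To make the growth concrete, the arithmetic above can be sketched in a few lines of Python. This is only an illustration: the function names and the example domain (10 dimensions of 4 values each) are assumptions chosen for the sketch and do not come from the paper.

```python
from math import prod


def state_count(dim_sizes):
    """Number of states in a Cartesian product of dimensions with the given sizes."""
    return prod(dim_sizes)


def uniform_abstraction_count(dim_sizes, ignored):
    """State count when the same dimensions (given by index) are ignored everywhere."""
    return prod(size for i, size in enumerate(dim_sizes) if i not in ignored)


# Hypothetical domain (an assumption for illustration): 10 dimensions, 4 values each.
sizes = [4] * 10
full = state_count(sizes)                              # 4**10 states
coarse = uniform_abstraction_count(sizes, {7, 8, 9})   # 4**7 states
print(full, coarse)
```

Each extra dimension multiplies the state count by its size, which is why exact dynamic programming becomes infeasible. Dropping three dimensions uniformly shrinks this hypothetical space by a factor of 64, but it does so everywhere, including in regions where those dimensions may matter.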

To address this limitation, the authors introduce Proximity‑Based Non‑Uniform Abstraction (PNUA), a method that selectively retains or discards dimensions depending on the agent’s current location, its likely future trajectories, and the stage of planning. The core idea is simple: allocate modeling resources where they matter most and coarsen the representation elsewhere. The approach consists of two tightly coupled components:

  1. Spatial‑Temporal Proximity Measure – For every concrete state (s), a proximity score (p(s)) is computed by combining (a) the probability that the current policy will visit (s), (b) a distance metric (e.g., Manhattan distance) from the agent’s present state, and (c) a planning‑phase weight that gradually shifts emphasis from exploration to exploitation. This score is recomputed after each planning iteration, allowing the abstraction to evolve as the agent moves and as the policy improves.

  2. Non‑Markovian Corrected Planner – Ignoring dimensions destroys the Markov property, because the transition dynamics of the abstracted state no longer depend solely on the abstract state and action. The authors therefore augment the standard value‑iteration update with a correction term that estimates the expected contribution of the omitted dimensions. The correction term is learned from transition samples collected in the “proximate” region, where the full set of dimensions is still observed. Formally, the update becomes
    \