POrTAL: Plan-Orchestrated Tree Assembly for Lookahead

Reading time: 5 minute
...

📝 Original Info

  • Title: POrTAL: Plan-Orchestrated Tree Assembly for Lookahead
  • ArXiv ID: 2512.06002
  • Date: 2025-12-02
  • Authors: Evan Conway, David Porfirio, David Chan, Mark Roberts, Laura M. Hiatt

📝 Abstract

Assigning tasks to robots often involves supplying the robot with an overarching goal, such as through natural language, and then relying on the robot to uncover and execute a plan to achieve that goal. In many settings common to human-robot interaction, however, the world is only partially observable to the robot, requiring that it create plans under uncertainty. Although many probabilistic planning algorithms exist for this purpose, these algorithms can be inefficient if executed with the robot's limited computational resources, or may require more steps than expected to achieve the goal. We thereby created a new, lightweight, probabilistic planning algorithm, Plan-Orchestrated Tree Assembly for Lookahead (POrTAL), that combines the strengths of two baseline planning algorithms, FF-Replan and POMCP. In a series of case studies, we demonstrate POrTAL's ability to quickly arrive at solutions that outperform these baselines in terms of number of steps. We additionally demonstrate how POrTAL performs under varying temporal constraints.

💡 Deep Analysis

Figure 1

📄 Full Content

The ability of modern robots to respond to arbitrary user requests has advanced considerably in recent years. This advancement is in large part due to robots' ability to autonomously plan their own actions. When receiving a goal such as "bring me a cup of coffee," for example, a robot can calculate the minimum number of steps required to achieve this goal: obtain the coffee grinds, proceeding to the coffee maker, load the grinds, and so on.

In many scenarios common to human-robot interaction, however, this planning must be performed under considerable uncertainty. Such uncertainty can arise from uncertainty over the outcome of the robot’s actions or uncertainty in the environment. This paper focuses on the latter case. Specifically, task-critical entities in the robot’s environment may have uncertain locations, such as if the location of the coffee grinds is unknown. In this case, the robot may still possess a distribution of possible entity locations, sourced from prior exploration of its surroundings or provided directly from the end user [12]. The robot must then create a plan that adheres to some objective function, which may be to minimize the expected number of steps to achieve the goal, or maximize the expected reward that it receives along the way. Fortunately, many task anytime planning approaches exist for this purpose, such as FF-Replan [16] and Partially Observable Monte Carlo Planning (POMCP) [14], which enable the robot to continuously improve its plan while it executes its task.

Choosing a planning algorithm can drastically affect the outcome of a task. Consider the household delivery robot shown in Figure 2 tasked with delivering a cup and a plate to the kitchen table. Suppose the robot knows that the cup has an 80% chance of being in the bathroom and a 20% chance of already being in the kitchen. The robot must decide where to look first-should it guess that the cup will be in the bathroom, where the cup is most likely to be? If the cup is not in the bathroom, the cost of backtracking to its starting point, from which it can proceed to the kitchen, is high. Alternatively, given that the robot is already so close to the kitchen, should the robot quickly check the kitchen? Although the cup is less likely to be in the kitchen, the cost of backtracking is much lower.

In terms of minimizing expected number of plan steps, FF-Replan performs well in practice [9], but is not optimal. While Monte Carlo approaches such as POMCP maximizes the robot’s expected reward, they struggle without a strong reward signal, such as if the robot only receives reward from achieving its goal. Furthermore, these approaches require the rapid construction of large search trees, which is computationally intensive, and potentially even prohibitive if computation is performed solely onboard the robot.

We thereby created a new anytime planning algorithm, Partially Observable Lookahead Planning (POrTAL). POr-TAL works by combining the strengths of both POMCP and FF-Replan. Like POMCP, POrTAL iteratively generates a search tree during task execution. Unlike POMCP, POrTAL utilizes a classical planner to limit the breadth- . The cup has a 20% chance of being in the kitchen already, and an 80% chance of being in the bathroom. Although there is a smaller chance of the cup being in the kitchen, a policy that minimizes the number of anticipated steps to deliver the cup may dictate that the robot check in the kitchen first.

wise expansion of the tree and more rapidly facilitate its depth-wise expansion towards the goal. To expand the tree depth-wise, POrTAL draws inspiration from FF-Replan by sampling a determinized variant of the problem, using a classical planner to generate a linear plan, and inserting the entire plan into the tree at once as a new branch. Inclusive of the root, treenodes that might diverge from the expectations of the classical plan are labeled as interesting and marked for further exploration. We compared POrTAL to POMCP and FF-Replan and found that POrTAL often produces plans of lower length for scenarios that are common to human-robot interaction.

Our contributions are as follows:

• Technical-The POrTAL algorithm, which balances efficiency and plan length for realistic human-robot interaction tasking scenarios. • Empirical-An evaluation of POrTAL against FF-Replan and POMCP baselines.

We provide a brief background in automated planning, followed by several existing approaches for online probabilistic planning.

In decision making for artificial intelligence (AI), automated planning is the process of deciding what actions an agent needs to perform [4]. Within the scope of task planning, classical planning involves the robot being given a deterministic domain and a goal to achieve, which the robot uses to plan its actions prior to task execution. A key limitation of classical planners is the assumption that all object locations are known to the robot. This assumption is often not the case, such as in h

📸 Image Gallery

domains.png results.png teaser.png

Reference

This content is AI-processed based on open access ArXiv data.

Start searching

Enter keywords to search articles

↑↓
ESC
⌘K Shortcut