Soft Goals Can Be Compiled Away
Soft goals extend the classical model of planning with a simple model of preferences. The best plans are then not the ones with least cost but the ones with maximum utility, where the utility of a plan is the sum of the utilities of the soft goals achieved minus the plan cost. Finding plans with high utility appears to involve two linked problems: choosing a subset of soft goals to achieve and finding a low-cost plan to achieve them. New search algorithms and heuristics have been developed for planning with soft goals, and a new track has been introduced in the International Planning Competition (IPC) to test their performance. In this note, we show however that these extensions are not needed: soft goals do not increase the expressive power of the basic model of planning with action costs, as they can easily be compiled away. We apply this compilation to the problems of the net-benefit track of the most recent IPC, and show that optimal and satisficing cost-based planners do better on the compiled problems than optimal and satisficing net-benefit planners on the original problems with explicit soft goals. Furthermore, we show that penalties, or negative preferences expressing conditions to avoid, can also be compiled away using a similar idea.
💡 Research Summary
The paper tackles a fundamental question in automated planning: does the introduction of soft goals (preferences) and penalties truly extend the expressive power of the classical planning model that already handles action costs? In the classical model, a plan is evaluated solely by its total execution cost, while soft goals add a utility component: each achieved soft goal contributes a positive reward, and each penalised condition that a plan brings about subtracts its penalty value. The net‑benefit planning paradigm, which was given its own track in the International Planning Competition (IPC), treats the objective as maximizing total utility (rewards minus costs).
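As a concrete illustration of the objective (the soft-goal names and numbers here are illustrative, not from the paper), the net benefit of a plan can be computed as:

```python
def net_benefit(plan_cost, achieved, utilities):
    """Net benefit of a plan: utilities of achieved soft goals minus plan cost."""
    return sum(utilities[g] for g in achieved) - plan_cost

# Two soft goals with illustrative utilities.
utilities = {"deliver-pkg1": 10, "deliver-pkg2": 4}

# A plan of cost 7 that achieves only the first soft goal.
print(net_benefit(7, {"deliver-pkg1"}, utilities))  # 10 - 7 = 3
```

Note that a plan achieving more soft goals is not automatically better: achieving both goals at cost 12 yields a net benefit of only 2, so the single-goal plan above wins.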
The authors argue that this extension is unnecessary because any planning problem with soft goals can be transformed—or “compiled away”—into a pure cost‑minimisation problem. Their compilation proceeds as follows. For every soft goal g with utility u(g), a new Boolean fluent ĝ is introduced and added to the (hard) goal set of the transformed problem. Two new actions can achieve ĝ: a “collect” action that requires g to hold and has cost 0, and a “forgo” action that is applicable when g does not hold and has cost u(g). Conceptually, achieving a soft goal is worth a reward of u(g); rather than encoding this reward as a negative cost, the compilation charges the same amount for *not* achieving it. Because soft goals are evaluated at the end of the plan, the collect and forgo actions are restricted to occur after a distinguished end action. Since every plan for the compiled problem must achieve each ĝ, its cost equals the original plan cost plus the total utility of the soft goals it fails to achieve, so minimising cost in the compiled problem is equivalent to maximising net benefit in the original one.
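A minimal sketch of this compilation step, assuming a toy STRIPS-like action representation (the `Action` class and the `not-g` complement-fluent encoding are illustrative conveniences, not the paper's notation):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    pre: frozenset   # fluents required to hold
    add: frozenset   # fluents made true
    cost: int = 0

def compile_soft_goals(soft_goals):
    """For each soft goal g with utility u(g), introduce a hard goal g-hat and
    two achieving actions: collect(g) (requires g, cost 0) and forgo(g)
    (requires g to be false, cost u(g)).  All costs stay non-negative."""
    actions, hard_goals = [], set()
    for g, u in soft_goals.items():
        g_hat = f"{g}-hat"
        hard_goals.add(g_hat)
        actions.append(Action(f"collect-{g}", frozenset({g}), frozenset({g_hat}), 0))
        # "not-g" stands for an explicit complement fluent of g.
        actions.append(Action(f"forgo-{g}", frozenset({f"not-{g}"}), frozenset({g_hat}), u))
    return actions, hard_goals

actions, goals = compile_soft_goals({"deliver-pkg1": 10, "deliver-pkg2": 4})
```

A plan for the compiled problem must include, for each soft goal, exactly one of its collect/forgo actions, which is what makes the cost accounting work out.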
Penalties are handled symmetrically. For each penalised condition c with penalty value v(c), a new fluent ĉ is introduced that records whether c has held, and making ĉ true carries a positive cost of v(c). The planner is steered away from making ĉ true, because doing so increases the total cost. Thus, avoiding a penalised condition becomes equivalent to a cost‑minimisation decision.
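A small numeric illustration of this idea (all values made up): once the penalty is folded into the cost function, avoiding the condition is simply the cheaper option whenever the detour costs less than the penalty.

```python
penalty_value = 6       # penalty for making the condition "in-danger-zone" true
risky_plan_cost = 9     # shorter plan that triggers the condition
safe_plan_cost = 12     # longer detour that avoids it

compiled_risky = risky_plan_cost + penalty_value  # penalty paid as ordinary cost: 15
compiled_safe = safe_plan_cost                    # penalty fluent never set: 12

# The cost-based planner prefers the detour, exactly as net-benefit semantics demands.
print(compiled_safe < compiled_risky)  # True
```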
The transformation preserves the solution space: any plan that is optimal for the original net‑benefit problem maps to a plan of minimal cost in the compiled problem, and vice‑versa. Consequently, existing cost‑based planners—both optimal and satisficing—can be applied off the shelf, without any modification to their algorithmic core.
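This value-preservation claim can be checked on toy numbers: under the collect/forgo-style compilation, the compiled cost of any plan equals the total soft-goal utility (a fixed constant) minus its net benefit, so minimizing one objective maximizes the other (utilities and plan costs below are illustrative):

```python
utilities = {"g1": 10, "g2": 4}
total_utility = sum(utilities.values())  # fixed offset shared by all plans: 14

def net_benefit(plan_cost, achieved):
    return sum(utilities[g] for g in achieved) - plan_cost

def compiled_cost(plan_cost, achieved):
    # original cost plus one forgo action (cost u(g)) per unachieved soft goal
    return plan_cost + sum(u for g, u in utilities.items() if g not in achieved)

# Three candidate plans: (plan cost, soft goals achieved)
plans = [(3, set()), (7, {"g1"}), (12, {"g1", "g2"})]
for cost, achieved in plans:
    assert compiled_cost(cost, achieved) == total_utility - net_benefit(cost, achieved)
```

Because the offset is the same for every plan, ranking plans by ascending compiled cost is exactly ranking them by descending net benefit.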
To validate the approach, the authors take the benchmark set from the net‑benefit track of the most recent IPC (IPC‑6, 2008). They compile each instance using the method described above and run a suite of state‑of‑the‑art cost‑based planners on the compiled problems. Their results show a consistent advantage for the compiled approach:
- Higher coverage – Cost‑based planners solve a larger fraction of instances than the dedicated net‑benefit planners, especially on larger, more complex domains where the latter often time out.
- Reduced runtime – Because the heuristic guidance is purely cost‑oriented, search is more focused, and the compiled problems are typically solved faster than the originals are by dedicated net‑benefit planners.
- Equal or better utility – The total utility of the plans returned by the compiled approach matches or exceeds that of the original net‑benefit planners, confirming that optimality is preserved through the compilation.
The authors also experiment with penalty‑only instances and observe the same pattern: after compiling penalties into positive costs, cost‑based planners avoid the penalised fluents more efficiently than planners that treat penalties as separate preference constructs.
The paper discusses several practical considerations. Most planners do not accept negative action costs; the compilation sidesteps this by charging u(g) for forgoing a soft goal rather than rewarding its achievement with a cost of −u(g). Since every compiled plan contains exactly one collect‑or‑forgo action per soft goal, the two formulations differ only by the constant Σ u(g) and induce the same ordering over plans. The authors also note that extremely large utility values can cause scaling issues for heuristics that assume bounded cost ranges, suggesting a preprocessing normalization step. Finally, they acknowledge that while the compilation eliminates the need for new algorithms, it does not automatically guarantee that existing heuristics will be perfectly tuned for every domain; further heuristic engineering may still be beneficial.
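One subtlety worth making explicit: adding a constant k to every action's cost is order-preserving only when all candidate plans contain the same number of shifted actions, which is why the compilation attaches the shift to exactly one bookkeeping action per soft goal rather than to arbitrary actions. A toy check (numbers illustrative):

```python
k = 5  # a blanket per-action cost shift

plan_a = [1, 1, 1, 1]   # four cheap actions, total cost 4
plan_b = [6]            # one expensive action, total cost 6

assert sum(plan_a) < sum(plan_b)           # plan_a is cheaper originally

shifted_a = sum(c + k for c in plan_a)     # 4 + 4*5 = 24
shifted_b = sum(c + k for c in plan_b)     # 6 + 1*5 = 11

assert shifted_a > shifted_b               # the ordering flips after the shift
```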
In conclusion, the study demonstrates that soft goals and penalties do not increase the theoretical expressive power of cost‑based planning. By compiling them away, one can leverage the mature ecosystem of optimal and satisficing cost‑based planners, achieving better performance on standard benchmarks. This insight simplifies the research agenda for net‑benefit planning: rather than inventing new search strategies, the community can focus on improving cost‑oriented heuristics and scaling techniques, knowing that any preference‑rich problem can be reduced to a pure cost problem without loss of optimality.