Active Epistemic Control for Query-Efficient Verified Planning

Notice: This research summary and analysis were automatically generated using AI technology. For authoritative details, please refer to the original arXiv source.

Planning in interactive environments is challenging under partial observability: task-critical preconditions (e.g., object locations or container states) may be unknown at decision time, yet grounding them through interaction is costly. Learned world models can cheaply predict missing facts, but prediction errors can silently induce infeasible commitments. We present **Active Epistemic Control (AEC)**, an epistemic-categorical planning layer that integrates model-based belief management with categorical feasibility checks. AEC maintains a strict separation between a *grounded fact store* used for commitment and a *belief store* used only for pruning candidate plans. At each step, it either queries the environment to ground an unresolved predicate when uncertainty is high or predictions are ambiguous, or simulates the predicate to filter hypotheses when confidence is sufficient. Final commitment is gated by grounded precondition coverage and an SQ-BCP pullback-style compatibility check, so simulated beliefs affect efficiency but cannot directly certify feasibility. Experiments on ALFWorld and ScienceWorld show that AEC achieves competitive success with fewer replanning rounds than strong LLM-agent baselines.


💡 Research Summary

The paper tackles the classic problem of planning under partial observability, where critical preconditions (e.g., object locations, container states) are often unknown at decision time. While learned world models can cheaply predict missing facts, treating those predictions as ground truth can silently cause infeasible plans. To address this, the authors introduce Active Epistemic Control (AEC), a planning layer that explicitly separates “grounded” knowledge from “beliefs” and uses a categorical feasibility check to guarantee safety.

Key components of AEC

  1. Epistemic State – represented as a triple (w, ŵ, H).
    * *w* (grounded fact store) contains only predicates obtained from the initial observation or from explicit environment queries.
    * *ŵ* (belief store) holds model‑generated predictions together with an epistemic uncertainty σ. These beliefs are never used to certify a plan.
    * *H* is a finite set of candidate plans, each annotated with its precondition set Pre(h) and expected truth values for those preconditions.

  2. Query vs. Simulation Decision – For each plan h, the set of unresolved predicates U(w, ŵ, h) is computed. The controller selects a predicate p ∈ U and asks a predictor Mθ to return (µ, σ). If |µ − 0.5| is smaller than a margin ε, the predicate is deemed ambiguous and a QUERY(p) macro‑action is executed, grounding p in w (and possibly revealing additional facts). Otherwise, the predicate is simulated: (p, v̂, σ) is stored in ŵ and H is pruned based on the simulated outcome. Crucially, predictions are conditioned only on w, preventing self‑reinforcing belief loops.

  3. Verification Layer – A plan is committed only after passing a Sound Verifier V that depends solely on w. V combines (i) a precondition coverage check (grounded or entailed) and (ii) a categorical pull‑back compatibility test inherited from SQ‑BCP. Because V never reads ŵ, simulation errors cannot directly certify feasibility; they only affect which candidates survive pruning.

  4. Theoretical Guarantee – Theorem 3.4 shows that, assuming each query result is correct with probability at least 1 − ε_oracle(p), the probability that the finally committed plan is feasible is at least 1 − ∑_{p∈Q} ε_oracle(p), where Q is the set of predicates grounded after initialization. Errors from belief‑only predicates do not appear in the bound, confirming that simulation influences efficiency but not safety.
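The four components above can be sketched as a small Python control loop. All names, data layouts, and the margin value are illustrative assumptions, not the paper's implementation; the point is the structural separation between the grounded store `w`, the belief store `w_hat`, and a verifier that reads only `w`:

```python
EPS_MARGIN = 0.15  # ambiguity margin ε (hypothetical value)

class EpistemicState:
    """Epistemic triple (w, ŵ, H): grounded facts, beliefs, candidate plans."""
    def __init__(self, initial_facts, candidates):
        self.w = dict(initial_facts)  # grounded fact store: {predicate: bool}
        self.w_hat = {}               # belief store: {predicate: (v_hat, sigma)}
        self.H = list(candidates)     # plans, each with a precondition dict "pre"

def unresolved(state, plan):
    """Preconditions of `plan` not yet grounded in w nor simulated in ŵ."""
    return [p for p in plan["pre"]
            if p not in state.w and p not in state.w_hat]

def step(state, predictor, env_query):
    """One AEC step: ground an ambiguous predicate, or simulate and prune."""
    for plan in state.H:
        for p in unresolved(state, plan):
            mu, sigma = predictor(p, state.w)  # conditioned on w only
            if abs(mu - 0.5) < EPS_MARGIN:
                state.w[p] = env_query(p)      # QUERY macro-action grounds p
            else:
                v_hat = mu > 0.5
                state.w_hat[p] = (v_hat, sigma)  # belief only, never certifies
                # prune candidates whose expected value contradicts v_hat
                state.H = [h for h in state.H
                           if h["pre"].get(p, v_hat) == v_hat]
            return

def verify(state, plan):
    """Sound verifier V: reads only the grounded store w, never ŵ."""
    return all(state.w.get(p) == v for p, v in plan["pre"].items())

def feasibility_bound(eps_oracle, queried):
    """Theorem 3.4-style bound: P(feasible) >= 1 - sum of oracle errors."""
    return 1.0 - sum(eps_oracle[p] for p in queried)
```

Note that simulated beliefs in `w_hat` only shrink `H`; commitment still requires `verify` to succeed on grounded facts alone, which is what keeps the safety bound free of belief-prediction errors.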

Experimental Evaluation
The authors evaluate AEC on two embodied benchmarks: ALFWorld (household tasks) and ScienceWorld (science‑experiment scenarios). Baselines include strong LLM‑based agents such as ReAct, Reflexion, and the neurosymbolic agent WALL‑E. Metrics comprise success rate, number of environment queries, replanning rounds, and total token usage. Results indicate that AEC achieves comparable or slightly higher success rates while reducing query counts by roughly 30–40% and cutting replanning rounds substantially. The benefit is most pronounced in tasks with many latent preconditions (e.g., checking whether a microwave is open, whether an apple is hot), where AEC's uncertainty‑driven query policy avoids unnecessary interaction yet still grounds the truly critical facts.

Contributions

  1. Epistemic–categorical integration – merges uncertainty‑guided information gathering with a categorical pull‑back feasibility check.
  2. Separation principle – formalizes and enforces a strict boundary between model‑based belief pruning and grounded‑only commitment, guaranteeing that simulation errors cannot compromise plan correctness.
  3. Empirical validation – demonstrates that the approach scales to diverse interactive domains and yields tangible efficiency gains over state‑of‑the‑art LLM agents.

Limitations and Future Work
AEC relies on well‑calibrated uncertainty estimates; poor calibration could lead to over‑querying or under‑querying. The categorical verifier currently requires hand‑crafted entailment rules, which may limit portability to new domains. Future directions include automated uncertainty calibration, learning richer categorical constraints, and deploying AEC on real‑world robotic platforms where query costs translate to physical actions.

In summary, Active Epistemic Control offers a principled, theoretically grounded control layer that lets agents decide when to verify missing information, preserving safety through grounded‑only commitment while exploiting learned world models for cheap hypothesis pruning. This bridges the gap between purely symbolic planners (which assume full observability) and purely neural approaches (which often ignore verification), providing a practical recipe for query‑efficient, verified planning in partially observable interactive environments.
