Temporal Knowledge-Graph Memory in a Partially Observable Environment


Agents in partially observable environments require persistent memory to integrate observations over time. While KGs (knowledge graphs) provide a natural representation for such evolving state, existing benchmarks rarely expose agents to environments where both the world dynamics and the agent's memory are explicitly graph-shaped. We introduce the Room Environment v3, a configurable environment whose hidden state is an RDF KG and whose observations are RDF triples. The agent may extend these observations into a temporal KG when storing them in long-term memory. The environment is configurable in grid size, number of rooms, inner walls, and moving objects. We define a lightweight temporal KG memory for agents, based on RDF-star-style qualifiers (time_added, last_accessed, num_recalled), and evaluate several symbolic baselines that maintain and query this memory under different capacity constraints. Two neural sequence models (LSTM and Transformer) serve as contrasting baselines without explicit KG structure. Agents train on one layout and are evaluated on a held-out layout with the same dynamics but a different query order, exposing train-test generalization gaps. In this setting, temporal qualifiers lead to more stable performance, and the symbolic TKG (temporal knowledge graph) agent achieves roughly fourfold higher test QA (question-answering) accuracy than the neural baselines under the same environment and query conditions. The environment, agent implementations, and experimental scripts are released for reproducible research at https://github.com/humemai/agent-room-env-v3 and https://github.com/humemai/room-env.


💡 Research Summary

The paper tackles the problem of long‑term memory for agents operating in partially observable environments, where the world evolves over time and the agent must integrate sparse observations to answer location queries. To study this, the authors introduce Room Environment v3, a deterministic, fully configurable grid‑world whose hidden state is an RDF knowledge graph (KG) describing rooms, walls, static objects, moving objects, and the agent itself. At each timestep the agent receives a symbolic observation consisting of the RDF subgraph induced by its current room (local adjacency and visible objects). The environment also defines a periodic schedule for wall toggling and deterministic movement rules for each moving object, ensuring that the world dynamics are structured yet non‑trivial and that the agent cannot answer queries from a single observation alone.
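To make the observation format concrete, the snippet below sketches what one timestep's observation might look like as a list of RDF-style triples. The entity and predicate names here are illustrative assumptions, not the paper's actual schema.

```python
# Hypothetical sketch of one timestep's observation: the RDF subgraph
# induced by the agent's current room. Names are illustrative, not
# taken from the paper's actual vocabulary.
observation = [
    ("agent", "located_in", "room_3"),
    ("room_3", "north_of", "room_1"),   # local adjacency
    ("room_3", "has_wall", "east"),     # a currently toggled inner wall
    ("ball", "located_in", "room_3"),   # a visible moving object
]

# The agent never sees the full hidden KG, so a location query such as
# ("ball", "located_in", ?) may require triples observed at earlier steps.
```

Because walls toggle and objects move on schedules, a triple observed now may be stale later, which is exactly what the temporal memory described next is designed to handle.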

The core contribution is a temporal KG (TKG) memory model. When an observation is stored, the agent inserts the corresponding RDF triples into its long‑term memory as RDF‑star embedded triples, attaching three qualifiers: time_added, last_accessed, and num_recalled. These qualifiers capture when a fact entered memory, when it was last used to answer a question, and how often it has been recalled. By exposing this temporal metadata, the model enables simple, interpretable eviction policies (e.g., LRU, LFU) when memory capacity is limited, while remaining fully compatible with standard SPARQL‑like querying.
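A minimal sketch of this memory model, assuming a simple dict-based store rather than a real RDF-star triple store; the class and method names are hypothetical, not the released API:

```python
class TemporalKGMemory:
    """Minimal sketch of an RDF-star-style temporal KG memory.

    Each stored triple carries three qualifiers: time_added,
    last_accessed, and num_recalled, as described in the paper.
    """

    def __init__(self):
        self.memory = {}  # (subject, predicate, object) -> qualifier dict

    def store(self, triple, t):
        # New facts get a fresh time_added; re-observed facts keep theirs.
        if triple not in self.memory:
            self.memory[triple] = {
                "time_added": t,
                "last_accessed": t,
                "num_recalled": 0,
            }

    def recall(self, subject, predicate, t):
        """Answer (subject, predicate, ?) by pattern matching, updating
        last_accessed and num_recalled on the matched triple."""
        for (s, p, o), quals in self.memory.items():
            if s == subject and p == predicate:
                quals["last_accessed"] = t
                quals["num_recalled"] += 1
                return o
        return None
```

For example, `store(("ball", "located_in", "room_3"), 5)` followed by `recall("ball", "located_in", 7)` returns `"room_3"` and bumps the triple's `last_accessed` and `num_recalled` qualifiers, giving the eviction policies below something to rank on.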

Two families of agents are evaluated:

  1. Symbolic TKG agents that deterministically update the TKG using the qualifiers and answer queries by pattern matching. Variants include unlimited memory and capacity‑limited memory with FIFO, LRU, or LFU eviction.
  2. Neural sequence models – an LSTM and a Transformer – that receive exactly the same symbolic observations but store them in a fixed‑length sequence buffer, relying on hidden states rather than explicit graph structure.
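The three eviction policies fall out directly from the stored qualifiers. Below is a hedged sketch, assuming `memory` maps each triple to its qualifier dict; the function name is an assumption, not the released code.

```python
def evict_one(memory, policy):
    """Remove one triple from a full memory according to the policy.

    memory: dict mapping (s, p, o) -> {"time_added", "last_accessed",
            "num_recalled"} qualifier dicts.
    """
    if policy == "fifo":    # oldest stored fact first
        key = min(memory, key=lambda k: memory[k]["time_added"])
    elif policy == "lru":   # least recently used to answer a query
        key = min(memory, key=lambda k: memory[k]["last_accessed"])
    elif policy == "lfu":   # least frequently recalled
        key = min(memory, key=lambda k: memory[k]["num_recalled"])
    else:
        raise ValueError(f"unknown policy: {policy}")
    del memory[key]
```

Each policy is a one-line `min` over a different qualifier, which is what makes the temporal metadata cheap to maintain yet directly useful for capacity-limited variants.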

Training is performed on one layout; testing uses a held‑out layout that shares the same wall and object schedules but presents the location queries in a different order, thereby probing generalization. Experiments vary memory capacity (e.g., 50, 100, 200 triples).

Results show that the symbolic TKG agent dramatically outperforms the neural baselines. With a modest capacity of 100 triples, the TKG agent achieves ~78 % test accuracy, rising to ~88 % with 200 triples, whereas the LSTM and Transformer plateau around 55 % on the training layout and collapse to 15–20 % on the test layout. Temporal qualifiers improve stability: policies that consider last_accessed and num_recalled yield the most consistent performance, reducing variance by roughly 12 % compared with naïve FIFO. Moreover, the TKG updates and queries run in near‑constant time, demonstrating practical efficiency.

The authors claim four main contributions: (i) the first environment where both world state and agent memory are explicit RDF KGs; (ii) a lightweight temporal KG memory using RDF‑star qualifiers; (iii) a systematic comparison of symbolic graph‑based memory versus unstructured neural sequence memory under identical interfaces; and (iv) a fully released codebase and reproducibility package.

In conclusion, the study provides strong empirical evidence that explicit, temporally annotated graph memories enable agents to retain and retrieve crucial information in partially observable, dynamically changing domains, while neural sequence memories struggle with generalization when structural cues are absent. Future work is suggested on scaling to multimodal observations, learning adaptive eviction strategies via meta‑reinforcement learning, and integrating distributed KG stores for larger‑scale environments.

