HabitatAgent: An End-to-End Multi-Agent System for Housing Consultation
Housing selection is a high-stakes and largely irreversible decision problem. We study housing consultation as a decision-support interface for housing selection. Existing housing platforms and many LLM-based assistants often reduce this process to ranking or recommendation, resulting in opaque reasoning, brittle multi-constraint handling, and limited guarantees on factuality. We present HabitatAgent, the first LLM-powered multi-agent architecture for end-to-end housing consultation. HabitatAgent comprises four specialized agent roles: Memory, Retrieval, Generation, and Validation. The Memory Agent maintains multi-layer user memory through internal stages for constraint extraction, memory fusion, and verification-gated updates; the Retrieval Agent performs hybrid vector–graph retrieval (GraphRAG); the Generation Agent produces evidence-referenced recommendations and explanations; and the Validation Agent applies multi-tier verification and targeted remediation. Together, these agents provide an auditable and reliable workflow for end-to-end housing consultation. We evaluate HabitatAgent on 100 real user consultation scenarios (300 multi-turn question–answer pairs) under an end-to-end correctness protocol. A strong single-stage baseline (Dense+Rerank) achieves 75% accuracy, while HabitatAgent reaches 95%.
💡 Research Summary
HabitatAgent addresses the high‑stakes problem of housing selection by constructing a closed‑loop, multi‑agent pipeline that integrates memory management, evidence retrieval, grounded generation, and multi‑tier validation. The system consists of four specialized agents:
- Memory Agent – extracts hard constraints (district, price, bedrooms) and soft preferences (priority scores, thresholds) from user utterances, structures them as JSON, and stores them in a four‑layer memory hierarchy (conversational, entity, bias, retrieval). Crucially, updates to long‑term memory are gated by a verification step, ensuring that only information that passes downstream validation is persisted, thereby preventing error propagation and preference drift across turns.
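The verification gate on long‑term memory can be sketched as follows. This is a minimal illustration, not the paper's implementation; the `MemoryStore` class and its method names are invented for this example. The key idea is that extracted constraints are staged first and committed only if the downstream Validation Agent signs off.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryStore:
    """Illustrative two-layer slice of the four-layer memory hierarchy.

    `conversational` is short-term scratch space; `entity` stands in for
    the long-term layers, whose updates are verification-gated.
    """
    conversational: list = field(default_factory=list)
    entity: dict = field(default_factory=dict)

    def propose_update(self, constraints: dict) -> dict:
        # Stage extracted constraints without persisting them yet.
        self.conversational.append(constraints)
        return {"staged": constraints, "verified": False}

    def commit(self, staged: dict, validation_passed: bool) -> bool:
        # Verification gate: persist only if downstream validation passed,
        # so unverified extractions never drift into long-term memory.
        if not validation_passed:
            return False
        self.entity.update(staged["staged"])
        return True
```

A failed validation thus leaves long‑term memory untouched, which is what blocks error propagation across turns.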
- Retrieval Agent – implements hybrid vector–graph retrieval (GraphRAG). Hard constraints are enforced through SQL filters, soft preferences guide dense vector similarity search, and relational constraints (e.g., “near Line 10”, “within 30 min to CBD”) are satisfied by traversing a property knowledge graph (Neo4j). Results are re‑ranked with Reciprocal Rank Fusion and packaged as an “evidence snapshot” that records source IDs, timestamps, and provenance for later auditing.
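Reciprocal Rank Fusion is a standard technique for merging the ranked lists produced by the SQL, vector, and graph channels; a compact sketch (the constant `k = 60` is the conventional default, not a value reported by the paper):

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked lists: score(d) = sum over lists of 1 / (k + rank_d).

    Documents ranked highly by several channels accumulate the largest
    fused scores, regardless of each channel's raw score scale.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)
```

Because RRF uses only ranks, it needs no calibration between the heterogeneous retrieval channels, which is why it suits a hybrid SQL + vector + graph setup.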
- Generation Agent – consumes the evidence snapshot and produces a natural‑language recommendation using a large language model (e.g., Qwen‑7B or DeepSeek). Every numeric or factual claim is tagged with its evidence reference (ev:price, ev:subway, etc.), sharply reducing the risk of hallucination. The output includes a concise property description, a match score, and explicit justification aligned with the user’s weighted preferences.
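A simple consequence of evidence tagging is that outputs become machine-checkable: every `ev:` reference in the generated text can be resolved against the snapshot before the response moves on to validation. The helper below is a hypothetical sketch of that check, not code from the paper:

```python
import re


def check_evidence_refs(text: str, snapshot: dict) -> tuple[bool, list[str]]:
    """Require every ev:<key> tag in the text to resolve to the snapshot.

    Returns (ok, missing_keys). A response with no tags at all fails,
    since each factual claim is supposed to carry a reference.
    """
    tags = re.findall(r"ev:([a-z_]+)", text)
    missing = [t for t in tags if t not in snapshot]
    return len(tags) > 0 and not missing, missing
```

Unresolvable tags surface exactly the claims that have no grounding, which is the precondition for the Validation Agent's factual checks.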
- Validation Agent – performs three layers of verification: (i) factual consistency (price, size, distance within ±3–5% of evidence), (ii) entity accuracy (correct property, school, transit names), and (iii) compliance checking (no prohibited marketing language). Scores above predefined thresholds (0.85 for facts, 0.90 for entities) result in a “PASS” and the response is delivered. If any layer fails, a Failure‑Type‑Aware Remediation module diagnoses the error type (fact, entity, compliance) and triggers a targeted regeneration rather than a blind re‑run.
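The tiered check and failure-type routing can be sketched as below. This is an illustrative reconstruction under stated assumptions: the 5% tolerance matches the upper end of the ±3–5% band mentioned above, and the remediation action names are invented for the example.

```python
def check_facts(claimed: dict, evidence: dict, tol: float = 0.05) -> list[str]:
    """Tier (i): numeric claims must lie within a relative tolerance
    of the evidence snapshot; returns the keys that fail."""
    failures = []
    for key, value in claimed.items():
        ref = evidence.get(key)
        if ref is None or abs(value - ref) > tol * abs(ref):
            failures.append(key)
    return failures


def remediate(fact_failures: list[str], entity_ok: bool, compliant: bool) -> str:
    """Failure-type-aware routing: regenerate only the failing aspect
    instead of blindly re-running the whole pipeline."""
    if fact_failures:
        return "regenerate_facts"
    if not entity_ok:
        return "regenerate_entities"
    if not compliant:
        return "rewrite_compliance"
    return "PASS"
```

Routing on the diagnosed failure type keeps remediation cheap: a single wrong price triggers a fact-grounded regeneration rather than a full re-run of retrieval and generation.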
The authors evaluated HabitatAgent on 100 real‑world consultation scenarios comprising 300 multi‑turn Q&A pairs. Under an end‑to‑end correctness protocol, HabitatAgent achieved 95% accuracy, a substantial improvement over a strong single‑stage baseline (Dense + Rerank) that scored 75%. Ablation studies showed that removing any of the three core mechanisms (verification‑gated memory, adaptive hybrid retrieval, or error‑type‑aware remediation) reduced performance by 8–12 percentage points, confirming their complementary contributions.
Key contributions are: (1) reframing buyer‑side housing consultation as a full decision‑support task rather than a ranking problem; (2) introducing a closed‑loop multi‑agent architecture that tightly couples memory gating, hybrid retrieval routing, and targeted remediation; (3) demonstrating, on realistic data, that this design yields markedly higher end‑to‑end correctness than dense‑only, graph‑only, or self‑correcting baselines.
The work highlights how LLM‑driven systems can achieve the reliability required in domains where factual errors have costly consequences. By combining structured memory, graph‑aware evidence retrieval, and systematic validation, HabitatAgent offers a blueprint for trustworthy AI assistants in real‑estate, finance, healthcare, and other high‑risk verticals. Future directions include scaling the agent collaboration framework, incorporating user feedback loops for continual learning, and extending compliance modules to cover jurisdiction‑specific regulations.