GeoGR: A Generative Retrieval Framework for Spatio-Temporal Aware POI Recommendation

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Next Point-of-Interest (POI) prediction is a fundamental task in location-based services (LBS), and it is especially critical for large-scale navigation platforms like AMAP that serve billions of users across diverse lifestyle scenarios. While recent POI recommendation approaches based on semantic IDs (SIDs) have achieved promising results, they struggle in complex, sparse real-world environments due to two key limitations: (1) inadequate modeling of high-quality SIDs that capture cross-category spatio-temporal collaborative relationships, and (2) poor alignment between large language models (LLMs) and the POI recommendation task. To this end, we propose GeoGR, a geographic generative recommendation framework tailored for navigation-based LBS like AMAP, which perceives users’ contextual state changes and enables intent-aware POI recommendation. GeoGR features a two-stage design: (i) a geo-aware SID tokenization pipeline that explicitly learns spatio-temporal collaborative semantic representations via geographically constrained co-visited POI pairs, contrastive learning, and iterative refinement; and (ii) a multi-stage LLM training strategy that aligns non-native SID tokens through continued pre-training (CPT) on multiple templates and enables autoregressive POI generation via supervised fine-tuning (SFT). Extensive experiments on multiple real-world datasets demonstrate GeoGR’s superiority over state-of-the-art baselines. Moreover, deployment on the AMAP platform, where it serves millions of users and improves multiple online metrics, confirms its practical effectiveness and scalability in production.


💡 Research Summary

GeoGR introduces a generative retrieval framework tailored for large‑scale navigation‑centric location‑based services such as AMAP. The paper identifies two fundamental shortcomings of existing point‑of‑interest (POI) recommendation approaches: (1) inadequate modeling of semantic identifiers (SIDs) that fail to capture cross‑category spatio‑temporal collaborative relationships, and (2) poor alignment between off‑the‑shelf large language models (LLMs) and the POI recommendation task, especially when new, non‑native tokens are introduced. To address these issues, GeoGR adopts a two‑stage pipeline.

In the first stage, a geo‑aware SID tokenization pipeline is built. POI textual attributes (name, category, address, etc.) are combined with geographic coordinates and contextual metadata. Geographically constrained co‑visited POI pairs are sampled (e.g., pairs visited by the same user within a short distance) and used to construct positive and negative examples. Contrastive learning, powered by an LLM encoder, forces embeddings of co‑visited POIs to be close while pushing unrelated POIs apart, thereby injecting spatio‑temporal collaborative signals into the semantic space. The resulting dense vectors are quantized using Residual Quantization‑Kmeans (RQ‑Kmeans) to obtain hierarchical discrete tokens (SIDs). An EM‑style iterative refinement further optimizes token assignments by maximizing the likelihood of observed user‑POI interactions, ensuring that the final SID set reflects both semantic similarity and collaborative patterns.
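The quantization step can be illustrated with a toy residual k-means sketch. This is not the paper's implementation: the function names (`kmeans`, `nearest`, `fit_rq`, `encode`), the two-level depth, the codebook size, and the 2-D toy embeddings are all assumptions chosen for readability, and the EM-style refinement is omitted. It only shows how quantizing the residual left by each level yields a hierarchical code that can serve as an SID.

```python
import random

def nearest(p, centroids):
    """Index of the centroid with the smallest squared Euclidean distance to p."""
    return min(range(len(centroids)),
               key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's k-means over lists of floats; returns k centroids."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        buckets = [[] for _ in range(k)]
        for p in points:
            buckets[nearest(p, centroids)].append(p)
        for j, members in enumerate(buckets):
            if members:  # keep the old centroid if a cluster goes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def fit_rq(embeddings, levels=2, k=2):
    """Fit one codebook per level on the residuals left by the previous level."""
    residuals = [list(e) for e in embeddings]
    codebooks = []
    for level in range(levels):
        cb = kmeans(residuals, k, seed=level)
        codebooks.append(cb)
        # Subtract each point's assigned centroid; the next level quantizes what is left.
        residuals = [[a - b for a, b in zip(r, cb[nearest(r, cb)])] for r in residuals]
    return codebooks

def encode(embedding, codebooks):
    """Map one embedding to a hierarchical code: one codebook index per level."""
    residual = list(embedding)
    sid = []
    for cb in codebooks:
        j = nearest(residual, cb)
        sid.append(j)
        residual = [a - b for a, b in zip(residual, cb[j])]
    return tuple(sid)

# Toy embeddings (assumed): two well-separated groups of "POIs".
toy = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
codebooks = fit_rq(toy, levels=2, k=2)
sids = [encode(e, codebooks) for e in toy]
```

In this sketch the first code separates the two coarse groups and the second code distinguishes POIs within a group, which is the sense in which the tokens are hierarchical. A production pipeline would presumably use a vectorized k-means over high-dimensional LLM embeddings rather than pure Python.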

The second stage aligns the LLM with the recommendation domain. Because SIDs are not part of the original LLM vocabulary, a Continued Pre‑Training (CPT) phase is performed on a large corpus of template‑based “text‑to‑SID” pairs, effectively teaching the model the new tokens and their semantic connections to POIs. After CPT, a Supervised Fine‑Tuning (SFT) stage uses instruction‑style data that encodes user history, real‑time context (time, location, search query, action type), and the target next‑POI SID sequence. The model learns to autoregressively generate SID tokens conditioned on the provided context, mirroring the generation process of standard LLMs but specialized for POI recommendation.
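As a rough illustration of the two data formats described above, the sketch below renders SIDs as special tokens and builds one CPT-style and one SFT-style example. Everything here is hypothetical: the `<sid_level_code>` token format, the template wording, and the field names (`name`, `category`, `address`, `time`, `location`, `query`, `action`) are assumptions made for the example and are not taken from the paper.

```python
def sid_to_tokens(sid):
    """Render a hierarchical SID such as (12, 7, 3) as one special token per level."""
    return "".join(f"<sid_{level}_{code}>" for level, code in enumerate(sid))

def new_special_tokens(levels, codebook_size):
    """List every SID token that would have to be added to the LLM vocabulary."""
    return [f"<sid_{level}_{code}>"
            for level in range(levels)
            for code in range(codebook_size)]

def build_cpt_example(poi, sid):
    """One template-based text-to-SID string for continued pre-training (assumed wording)."""
    return (f"POI name: {poi['name']}; category: {poi['category']}; "
            f"address: {poi['address']}. Semantic ID: {sid_to_tokens(sid)}")

def build_sft_example(history_sids, context, target_sid):
    """One instruction-style pair: history and context in the prompt, next-POI SID as target."""
    history = " ".join(sid_to_tokens(s) for s in history_sids)
    prompt = (f"Visit history: {history}\n"
              f"Time: {context['time']}\n"
              f"Location: {context['location']}\n"
              f"Search query: {context['query']}\n"
              f"Action type: {context['action']}\n"
              f"Predict the next POI:")
    return {"prompt": prompt, "completion": sid_to_tokens(target_sid)}

# Hypothetical records, for illustration only.
poi = {"name": "Example Airport Hotel", "category": "hotel", "address": "1 Example Road"}
cpt_text = build_cpt_example(poi, (1, 0))
example = build_sft_example(
    history_sids=[(4, 2), (0, 3)],
    context={"time": "Fri 21:30", "location": "airport", "query": "hotel", "action": "search"},
    target_sid=(1, 0),
)
```

Because the model is trained to emit only the `completion` tokens, decoding at inference time produces an SID sequence that is then mapped back to a POI.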

Extensive offline experiments on public benchmarks (e.g., Gowalla, Foursquare) and proprietary AMAP logs show that GeoGR outperforms state‑of‑the‑art baselines such as STAN, LLM4POI, and OneRec‑V2 across Hit@10, NDCG@10, and MAP metrics, with gains ranging from 5 % to 12 %. Notably, the approach excels in sparse, cross‑category scenarios (e.g., airport → hotel → parking) where collaborative SID information is most beneficial. The hierarchical tokenization dramatically reduces vocabulary size (to <0.5 % of the total POI count) and cuts model parameters by roughly 30 %, leading to inference latencies under 40 ms—crucial for real‑time services.
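For readers unfamiliar with the ranking metrics named above, here is a minimal sketch of Hit@K and NDCG@K for the next-POI setting, assuming exactly one ground-truth POI per test case (in which case the ideal DCG is 1). The function names and the sample data are illustrative, not the paper's evaluation code.

```python
import math

def hit_at_k(ranked, target, k=10):
    """1.0 if the ground-truth POI appears in the top-k list, else 0.0."""
    return 1.0 if target in ranked[:k] else 0.0

def ndcg_at_k(ranked, target, k=10):
    """With a single relevant item, NDCG@k is 1 / log2(rank + 1) for a 1-indexed rank."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item == target:
            return 1.0 / math.log2(rank + 1)
    return 0.0

def evaluate(predictions, targets, k=10):
    """Average both metrics over all test cases."""
    n = len(targets)
    hit = sum(hit_at_k(p, t, k) for p, t in zip(predictions, targets)) / n
    ndcg = sum(ndcg_at_k(p, t, k) for p, t in zip(predictions, targets)) / n
    return {"hit": hit, "ndcg": ndcg}

# Two hypothetical test cases: the target is ranked 2nd in the first, missing in the second.
scores = evaluate(
    predictions=[["poi_a", "poi_b", "poi_c"], ["poi_x", "poi_y", "poi_z"]],
    targets=["poi_b", "poi_q"],
    k=10,
)
```

Hit@K only records whether the target was retrieved, while NDCG@K also rewards placing it nearer the top of the list.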

A production deployment on the AMAP platform validates the method in an industrial setting. A four‑week A/B test reports significant lifts: click‑through‑rate (+3.4 %), average session duration (+5.1 %), and reservation conversion (+2.8 %). Moreover, the compact token set reduces server‑side memory and compute costs by about 18 %.

In summary, GeoGR contributes (1) a novel geo‑aware SID learning mechanism that fuses semantic and spatio‑temporal collaborative signals, (2) a multi‑stage LLM alignment strategy (CPT + SFT) that seamlessly incorporates non‑native tokens for generative recommendation, and (3) thorough offline and online evaluations demonstrating both algorithmic superiority and practical scalability. The work establishes a new paradigm for intent‑aware, next‑POI prediction in large‑scale, multi‑scenario navigation services.

