Geometry of Interest (GOI): Spatio-Temporal Destination Extraction and Partitioning in GPS Trajectory Data
Large amounts of GPS trajectory data are now continuously collected by GPS-enabled devices such as vehicle navigation systems and mobile phones. GPS trajectory data is useful for applications such as traffic management, location forecasting, and itinerary planning. Such applications often need to extract the time-stamped Sequence of Visited Locations (SVL) of each mobile object. The nearest neighbor query (NNQ) is the most widely applied method for labeling the visited locations with the IDs of the points of interest (POIs) during SVL generation. In some scenarios, however, NNQ is not accurate enough. To improve the quality of the extracted SVLs, instead of using NNQ, we label the visited locations with the IDs of the POIs that geometrically intersect the GPS observations. The intersection operator requires the accurate geometries of the POIs, which we refer to as Geometries of Interest (GOIs). In some application domains (e.g. movement trajectories of animals), adequate information about the POIs and their GOIs may not be available a priori, or may not be publicly accessible, and therefore needs to be derived from the GPS trajectory data. In this paper we propose a novel method for estimating the POIs and their GOIs, which consists of three phases: (i) extracting the geometries of the stay regions; (ii) constructing the geometries of the destination regions based on the extracted stay regions; and (iii) constructing the GOIs based on the geometries of the destination regions. Using geometric similarity to known GOIs as the major evaluation criterion, the experiments we performed on long-term GPS trajectory data show that our method outperforms the existing approaches.
💡 Research Summary
The paper addresses the problem of extracting accurate geometries of points of interest (POIs), termed Geometries of Interest (GOIs), solely from GPS trajectory data, without relying on any external spatial database. Conventional approaches label each GPS observation with the ID of the nearest POI using a nearest-neighbor query (NNQ). This method often fails in dense environments where POIs are close together, which puts incorrect labels into the Sequence of Visited Locations (SVL). To overcome this, the authors propose a three-phase pipeline that replaces NNQ with a purely geometric intersection test against automatically derived GOIs.
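The difference between the two labeling strategies can be illustrated with a minimal sketch. It is not the authors' implementation: the POI records, the `center` and `polygon` fields, and the planar coordinates are all hypothetical, and a standard ray-casting test stands in for the intersection operator.

```python
from math import hypot

def point_in_polygon(x, y, polygon):
    """Ray-casting test; polygon is a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def label_by_nearest(x, y, pois):
    """NNQ-style labeling: pick the POI whose reference point is closest."""
    nearest = min(pois, key=lambda p: hypot(x - p["center"][0], y - p["center"][1]))
    return nearest["id"]

def label_by_intersection(x, y, pois):
    """Geometry-based labeling: pick the POI whose polygon contains the point."""
    for p in pois:
        if point_in_polygon(x, y, p["polygon"]):
            return p["id"]
    return None  # the observation falls outside every known geometry

# Hypothetical scene: a large building next to a small kiosk.
pois = [
    {"id": "mall", "center": (50, 50),
     "polygon": [(0, 0), (100, 0), (100, 100), (0, 100)]},
    {"id": "kiosk", "center": (104, 42),
     "polygon": [(102, 40), (106, 40), (106, 44), (102, 44)]},
]
```

For an observation at (98, 42), just inside the large building's wall, the nearest-neighbor rule returns the kiosk while the intersection rule returns the building, which is the kind of mislabeling the summary attributes to NNQ in dense settings.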
Phase 1 – Stay‑region extraction: The authors introduce two novel concepts. “Time‑value” quantifies the total duration a GPS point represents, while “time‑weighted centroid” computes a centroid of a set of points weighted by their time‑values. Using a minimum stay duration (T_min) and a maximum spatial radius (D_max), the algorithm groups consecutive GPS points that satisfy both temporal and spatial constraints into stay regions. This approach is more robust to irregular sampling rates and low‑density data than earlier methods that only used raw distance or time thresholds.
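A rough sketch of this phase follows. The summary does not give the exact definitions, so two assumptions are made here: a point's time-value is taken to be half the gap to its previous sample plus half the gap to its next sample, and grouping uses a common anchor-based rule (extend the group while points stay within D_max of its first point, keep it if it lasts at least T_min). Function names and the `(t, x, y)` tuple layout in planar units are illustrative only.

```python
from math import hypot

def time_values(points):
    """Assumed definition: each point represents half of the time gap
    to its previous sample plus half of the gap to its next sample."""
    n = len(points)
    values = []
    for i, (t, _, _) in enumerate(points):
        before = (t - points[i - 1][0]) / 2 if i > 0 else 0.0
        after = (points[i + 1][0] - t) / 2 if i < n - 1 else 0.0
        values.append(before + after)
    return values

def time_weighted_centroid(points, weights):
    """Centroid of (t, x, y) points weighted by their time-values."""
    total = sum(weights)
    if total == 0:
        # Degenerate case: fall back to the unweighted mean.
        n = len(points)
        return (sum(p[1] for p in points) / n, sum(p[2] for p in points) / n)
    cx = sum(w * p[1] for p, w in zip(points, weights)) / total
    cy = sum(w * p[2] for p, w in zip(points, weights)) / total
    return (cx, cy)

def extract_stay_regions(points, t_min, d_max):
    """Group consecutive points that stay within d_max of an anchor point
    for at least t_min. points must be sorted by timestamp."""
    tv = time_values(points)
    regions = []
    i, n = 0, len(points)
    while i < n:
        j = i + 1
        while j < n and hypot(points[j][1] - points[i][1],
                              points[j][2] - points[i][2]) <= d_max:
            j += 1
        if points[j - 1][0] - points[i][0] >= t_min:
            members = points[i:j]
            regions.append({
                "start": members[0][0],
                "end": members[-1][0],
                "centroid": time_weighted_centroid(members, tv[i:j]),
                "points": members,
            })
            i = j  # continue after the stay
        else:
            i += 1  # no stay anchored here; try the next point
    return regions

# Hypothetical trace: seconds and metres, with an irregular sampling gap.
trace = [(0, 0, 0), (60, 1, 0), (120, 0, 1), (600, 1, 1),
         (660, 500, 500), (720, 1000, 1000)]
stays = extract_stay_regions(trace, t_min=300, d_max=50)
```

In this trace the two samples on either side of the long gap carry most of the weight, so the centroid is pulled toward where the object spent its time rather than toward where it happened to be sampled most often.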
Phase 2 – Destination construction: Extracted stay regions are merged into destination polygons using a hierarchical agglomerative clustering algorithm that relies on geometric similarity (e.g., Jaccard overlap) rather than pure density. The clustering respects the time‑weighted centroids, preserving temporal continuity while merging spatially overlapping stay regions. By setting a similarity threshold, the method avoids both over‑clustering (which would merge distinct POIs) and under‑clustering (which would split a single POI into many fragments). The result is a set of non‑overlapping destination polygons that approximate real‑world POI footprints.
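The merging idea can be sketched as follows, under simplifying assumptions that are not from the paper: stay regions are reduced to axis-aligned rectangles `(x0, y0, x1, y1)`, a merged cluster is represented by the bounding box of its members, and the best pair is found by a naive exhaustive search. The role of the time-weighted centroids is left out, and this simplified version does not by itself guarantee non-overlapping output.

```python
from itertools import combinations

def area(rect):
    x0, y0, x1, y1 = rect
    return max(0, x1 - x0) * max(0, y1 - y0)

def jaccard(a, b):
    """Intersection area over union area of two rectangles."""
    inter = area((max(a[0], b[0]), max(a[1], b[1]),
                  min(a[2], b[2]), min(a[3], b[3])))
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def merge_regions(rects, threshold):
    """Greedy agglomerative merging: repeatedly merge the most similar
    pair until no pair reaches the similarity threshold."""
    clusters = list(rects)
    while len(clusters) > 1:
        i, j = max(combinations(range(len(clusters)), 2),
                   key=lambda ij: jaccard(clusters[ij[0]], clusters[ij[1]]))
        a, b = clusters[i], clusters[j]
        if jaccard(a, b) < threshold:
            break
        merged = (min(a[0], b[0]), min(a[1], b[1]),
                  max(a[2], b[2]), max(a[3], b[3]))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters

# Two heavily overlapping stays and one distant stay (hypothetical values).
stay_rects = [(0, 0, 10, 10), (1, 1, 11, 11), (100, 100, 110, 110)]
destinations = merge_regions(stay_rects, threshold=0.5)
```

Raising the threshold keeps nearby regions apart (more, smaller destinations), while lowering it merges more aggressively, which mirrors the over- versus under-clustering trade-off described above.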
Phase 3 – GOI partitioning: Destination polygons are used to build a non‑uniform grid covering the minimum bounding rectangle of the trajectory. The grid consists of two cell types: fixed‑size auxiliary cells and variable‑size cells that exactly match each destination polygon. This guarantees that each GOI occupies a unique cell and eliminates ambiguity in the intersection test. Consequently, labeling a GPS point reduces to checking which cell (or GOI) it intersects, removing the need for NNQ or Voronoi diagrams.
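A possible shape for the resulting lookup is sketched below. The summary does not describe how the cells are actually constructed, so this is only an illustration: destinations are assumed to be non-overlapping rectangles keyed by a hypothetical GOI ID, and everything else falls into fixed-size auxiliary cells indexed from the grid origin.

```python
def locate_cell(x, y, destinations, origin, cell_size):
    """Return ("goi", goi_id) if the point lies in a destination cell,
    otherwise ("aux", (col, row)) for the fixed-size auxiliary cell."""
    for goi_id, (x0, y0, x1, y1) in destinations.items():
        if x0 <= x <= x1 and y0 <= y <= y1:
            return ("goi", goi_id)
    col = int((x - origin[0]) // cell_size)
    row = int((y - origin[1]) // cell_size)
    return ("aux", (col, row))

# Hypothetical layout: one destination inside a grid of 50 m auxiliary cells.
goi_cells = {"home": (10, 10, 30, 25)}
inside = locate_cell(20, 15, goi_cells, origin=(0, 0), cell_size=50)
outside = locate_cell(120, 60, goi_cells, origin=(0, 0), cell_size=50)
```

Because the destination cells are assumed not to overlap, at most one GOI can match a point, so the lookup never has to break ties the way a nearest-neighbor rule does.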
The authors provide a computational complexity analysis: stay‑region extraction runs in O(n log n) where n is the number of GPS points; hierarchical clustering runs in O(m log m) with m stay regions; grid construction is linear in the number of destinations. Hence the entire pipeline scales to millions of points.
Experimental evaluation uses long-term GPS traces containing thousands of trips and hundreds of thousands of points. Ground-truth GOIs are obtained from public GIS datasets. The primary evaluation metric is geometric similarity (Jaccard Index) between derived GOIs and ground-truth polygons, supplemented by the precision and recall of the generated SVL. The proposed method achieves an average Jaccard Index of 0.78, compared with roughly 0.62 for the baseline methods. Precision and recall of SVL labeling reach 0.85 and 0.82 respectively, indicating that the derived GOIs lead to more accurate visited-location sequences. Additionally, the number of detected stay regions and destinations correlates strongly with the true number of POIs, indicating reduced over- and under-segmentation.
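For reference, the standard definitions of these metrics are written out below. The symbols are introduced here for illustration: G is a derived GOI, G* its ground-truth polygon, and TP, FP, FN count correct, spurious, and missed labels. The summary does not state how SVL entries are matched to the ground truth, so the counting unit is left open.

```latex
J(G, G^{*}) = \frac{\operatorname{area}(G \cap G^{*})}{\operatorname{area}(G \cup G^{*})},
\qquad 0 \le J \le 1

\text{precision} = \frac{TP}{TP + FP},
\qquad
\text{recall} = \frac{TP}{TP + FN}
```

A Jaccard Index of 1 means the derived and ground-truth geometries coincide exactly, and 0 means they do not overlap at all.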
Key contributions are: (1) a time‑value and time‑weighted centroid based stay‑region detector, (2) a geometry‑similarity driven hierarchical clustering for destination formation, (3) a non‑uniform grid partitioning that yields non‑overlapping GOIs, and (4) a thorough theoretical and empirical validation.
Limitations include sensitivity to several user‑defined thresholds (T_min, D_max, similarity cut‑off), potential difficulty handling highly overlapping POIs in very dense urban settings, and the current focus on offline batch processing rather than real‑time streaming. Future work suggested by the authors involves automatic parameter tuning, adaptation to streaming GPS feeds, and integration of additional sensor modalities (e.g., Wi‑Fi, BLE) to further refine GOI boundaries.
Overall, the paper presents a comprehensive, geometry‑centric framework that significantly improves the fidelity of POI extraction and subsequent SVL generation from raw GPS trajectories, offering a practical solution for applications ranging from traffic management to wildlife movement analysis where external POI databases are unavailable.