Contiguous Storage of Grid Data for Heterogeneous Computing

Reading time: 5 minutes

📝 Original Info

  • Title: Contiguous Storage of Grid Data for Heterogeneous Computing
  • ArXiv ID: 2512.11473
  • Date: 2025-12-12
  • Authors: Fan Gu, Xiangyu Hu (Technical University of Munich, Germany)

📝 Abstract

Structured Cartesian grids are a fundamental component in numerical simulations. Although these grids facilitate straightforward discretization schemes, their naïve use in sparse domains leads to excessive memory overhead and inefficient computation. Existing frameworks that address this problem are primarily optimized for CPU execution and exhibit performance bottlenecks on GPU architectures due to limited parallelism and high memory access latency. This work presents a redesigned storage architecture optimized for GPU compatibility and efficient execution across heterogeneous platforms. By abstracting low-level GPU-specific details and adopting a unified programming model based on SYCL, the proposed data structure enables seamless integration across host and device environments. This architecture simplifies GPU programming for end-users while improving scalability and portability in sparse-grid and grid-particle coupling numerical simulations.

💡 Deep Analysis

Figure 1

📄 Full Content

Figure 1: Particle generation from an extruded prism. The gray surface indicates the clipped zero level set, the framed boxes indicate the core data package blocks used to describe the surface, and the small spheres indicate the SPH particles generated and physically relaxed for later numerical simulation.
arXiv:2512.11473v1 [cs.CE] 12 Dec 2025
Contiguous Storage of Grid Data for Heterogeneous Computing
Fan Gu, Xiangyu Hu∗
Technical University of Munich, Garching 85748, Germany
∗Corresponding author. Email addresses: ge86gur@mytum.de (Fan Gu), xiangyu.hu@tum.de (Xiangyu Hu)
1. Introduction
Many numerical methods in scientific computing, computer graphics, and computational physics rely on structured Cartesian grids to discretize scalar and vector fields. These grids are widely used in level set methods [6], computational fluid dynamics (CFD) solvers [2, 3], and volumetric data processing pipelines [4].
While Cartesian grids offer simplicity and regularity for finite difference and finite volume methods, large-scale simulations often involve highly sparse domains where only a small fraction of the grid is actively used at any time. Naively storing and updating such sparse data results in excessive memory consumption and suboptimal memory access patterns, particularly on modern hardware architectures with hierarchical memory systems and parallel processing constraints. To address this, data structures such as OpenVDB [4] and SPGrid [7] have been developed to reduce memory overhead and improve computational performance. These frameworks leverage hierarchical spatial partitioning or page-based layouts to exploit sparsity. However, they are primarily optimized for CPU-based execution and often suffer from performance degradation when ported to Graphics Processing Unit (GPU) architectures. Key challenges include high random-access latency and limited concurrency during data updates.
GPUs have become prevalent in accelerating a wide range of numerical simulations due to their high parallel processing capability. Despite this computing power, leveraging GPUs typically requires adopting specialized programming models, such as CUDA (NVIDIA), HIP (AMD), SYCL, and others, which presents a steep learning curve for end-users.
In our prior implementations in the open-source SPH (Smoothed Particle Hydrodynamics) multi-physics library SPHinXsys [3, 9], memory allocation for activated grid regions was managed via dynamic memory pools. While this approach yields efficient memory usage, the resulting data structures are not inherently compatible with GPU execution. In this work, we present a redesigned storage architecture aimed at improving computational efficiency and enabling GPU compatibility. The new design is optimized for unified usage across both host and device platforms, abstracting underlying implementation details from the user.
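The idea of one variable serving both host and device code can be illustrated with a small sketch. This is not the SPHinXsys interface: the names `HostPolicy`, `DevicePolicy`, `DualVariable`, and `scale` are hypothetical, and the "device" copy is simulated with ordinary host memory so that the example compiles without a SYCL toolchain. It only shows how a policy tag could select the matching copy while the kernel body is written once.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical execution-policy tags; the names are illustrative only.
struct HostPolicy {};
struct DevicePolicy {};

// A variable that owns a host copy and a simulated device copy.
// Assumption: in a real SYCL build the device copy would live in
// device memory; here it is a plain std::vector so the sketch runs anywhere.
template <typename T>
class DualVariable {
public:
    explicit DualVariable(std::size_t n, T init = T{})
        : host_(n, init), device_(n, init) {}

    // Overload resolution on the policy tag selects which copy is returned.
    T* data(HostPolicy) { return host_.data(); }
    T* data(DevicePolicy) { return device_.data(); }

    std::size_t size() const { return host_.size(); }

    // Explicit synchronization between the two copies.
    void syncToDevice() { device_ = host_; }
    void syncToHost() { host_ = device_; }

private:
    std::vector<T> host_;
    std::vector<T> device_;
};

// One kernel body, written once, operating on whichever copy the policy selects.
template <typename Policy>
void scale(DualVariable<double>& v, double factor, Policy policy) {
    double* p = v.data(policy);
    for (std::size_t i = 0; i < v.size(); ++i) {
        p[i] *= factor;
    }
}
```

Calling `scale(v, 2.0, HostPolicy{})` modifies only the host copy, and the device copy stays stale until `syncToDevice()` is called. Keeping synchronization explicit is a choice made for this sketch; a production design could track which copy is current and synchronize on demand.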
This work introduces an optimized data structure tailored for GPU execution, along with newly developed computing kernels. The key contributions are:
1. Abstraction of GPU-specific and SYCL-specific implementation details, thereby simplifying usage for end-users.
2. A unified codebase enabling seamless execution on both CPU and GPU platforms through a single code logic.
3. Heterogeneous computing for both sparse-grid and grid-particle coupling numerical simulations.
The proposed method incorporates computing kernels, data structures, and execution strategies. The computing kernel orchestrates the computational process, while the data structures manage storage and synchronization of data across CPU and GPU. Upon execution, the computing kernel automatically retrieves the appropriate host or device variable based on the selected execution strategy, thus ensuring consistency and portability.
2. Literature Overview
OpenVDB [4] employs a shallow B+ tree to store data efficiently, using various strategies to accelerate both sequential and stencil-based data access. By dynamically organizing data with a hier
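To make the sparse-storage problem from the introduction concrete, the following is a minimal sketch of a flat, contiguous layout for activated grid regions. It is neither OpenVDB's tree nor the authors' data structure: the 2D setting, the block size `kBlock`, and the names `SparseGrid`, `Package`, and `kInactive` are assumptions made for illustration. A small dense index holds one slot number per block, and cell data exists only for activated blocks, packed back to back in a single array.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 2D sparse grid: the domain is tiled into blocks of
// kBlock x kBlock cells, and only activated blocks receive storage.
constexpr int kBlock = 4;
constexpr std::int32_t kInactive = -1;
using Package = std::array<double, kBlock * kBlock>;

class SparseGrid {
public:
    SparseGrid(int blocks_x, int blocks_y, double background)
        : blocks_x_(blocks_x), background_(background),
          index_(static_cast<std::size_t>(blocks_x) * blocks_y, kInactive) {}

    // Append one package to the contiguous array and record its slot.
    void activate(int bx, int by) {
        std::int32_t& slot = index_[flat(bx, by)];
        if (slot != kInactive) return;  // already active
        slot = static_cast<std::int32_t>(packages_.size());
        Package p;
        p.fill(background_);
        packages_.push_back(p);
    }

    // Write a cell; returns false if the enclosing block is inactive.
    bool set(int x, int y, double value) {
        const std::int32_t slot = index_[flat(x / kBlock, y / kBlock)];
        if (slot == kInactive) return false;
        packages_[static_cast<std::size_t>(slot)][local(x, y)] = value;
        return true;
    }

    // Read a cell; inactive regions return the background value.
    double get(int x, int y) const {
        const std::int32_t slot = index_[flat(x / kBlock, y / kBlock)];
        if (slot == kInactive) return background_;
        return packages_[static_cast<std::size_t>(slot)][local(x, y)];
    }

    std::size_t activePackages() const { return packages_.size(); }
    std::size_t totalBlocks() const { return index_.size(); }

private:
    std::size_t flat(int bx, int by) const {
        return static_cast<std::size_t>(by) * blocks_x_ + bx;
    }
    static std::size_t local(int x, int y) {
        return static_cast<std::size_t>((y % kBlock) * kBlock + (x % kBlock));
    }

    int blocks_x_;
    double background_;
    std::vector<std::int32_t> index_;  // one entry per block: dense but small
    std::vector<Package> packages_;    // cell data for active blocks only
};
```

For an 8 x 8 tiling (32 x 32 cells) with a single activated block, this layout stores 64 index entries plus one package of 16 values, whereas a dense grid would store all 1024 cell values. Because the packages sit in one contiguous array, the whole payload could in principle be copied to a device in a single transfer, which is the property the paper's title refers to.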

Reference

This content is AI-processed based on open access ArXiv data.
