GPU Accelerated Finite Element Assembly with Runtime Compilation

📝 Original Info

  • Title: GPU Accelerated Finite Element Assembly with Runtime Compilation
  • ArXiv ID: 1802.03433
  • Date: 2023-06-15
  • Authors: John Smith, Jane Doe, Robert Johnson

📝 Abstract

In recent years, high-performance scientific computing on graphics processing units (GPUs) has gained widespread acceptance. These devices are designed to offer massively parallel threads for running general-purpose code. Many studies focus on the finite element method on GPUs. However, most of this work is specific to certain problems and applications. Some works propose finite element assembly methods that are general across a wide range of finite element models. But the development of finite element code depends on the hardware architecture, and using the libraries provided by hardware vendors is usually complicated and error-prone. In this paper, we present the architecture and implementation of finite element assembly for partial differential equations (PDEs) based on symbolic computation and runtime compilation on GPUs. A user-friendly programming interface with symbolic computation is provided. At the same time, high computational efficiency is achieved through runtime compilation. As far as we know, this is the first work to use this technique to accelerate finite element assembly for solving PDEs. Experiments show that a speedup of one to two orders of magnitude is achieved for the problems studied in the paper.

💡 Deep Analysis

Figure 1

📄 Full Content

General-purpose computing on graphics processing units (GPGPU) has developed rapidly in recent years. Applications in many areas, such as machine learning, molecular dynamics, computational chemistry, medical imaging and seismic exploration, are taking advantage of GPGPU. Most current research on the finite element method on GPUs focuses on solving the system of linear equations [1][2][3][4][5][6][7]. Some works focus on solving specific problems with GPU-accelerated finite element methods [8][9][10][11][12]. Several general algorithms for a wide range of finite element models are proposed in [13] and the papers cited therein. But the development of finite element code depends on the hardware architecture, and using the libraries provided by hardware vendors is usually complicated and error-prone. In this paper, we present the architecture and implementation of finite element assembly for partial differential equations (PDEs) based on symbolic computation and runtime compilation on GPUs. This symbolic-numeric architecture was first proposed in [16,17] for solving PDEs on CPUs. Our work adopts the symbolic-numeric paradigm and extends it to GPGPU with runtime compilation. In our system, a user-friendly programming interface with symbolic computation is provided. At the same time, high computational efficiency is achieved by using NVRTC, the runtime compilation library for GPU kernels provided with CUDA, which was first released with CUDA 7 in 2015. CUDA is a parallel computing platform and application programming interface (API) model created by Nvidia. As far as we know, this is the first work to use this technique on GPUs to accelerate finite element assembly for solving PDEs.

The main contributions of the paper are:

  1. The mathematical functions and finite element weak forms are expressed in symbols that are close to their mathematical expressions. This form is human-friendly and easy to understand.

  2. The expressions in the finite element method are simplified and optimized by symbolic manipulation before compilation. The final expressions are compiled at runtime, and further optimization can be achieved on the target devices (CPU, GPU) through runtime compilation.

  3. A device-independent implementation with high portability is achieved. End users only need to focus on the mathematical details of the partial differential equations and the finite element method. They are not required to know the compilation details of the different device architectures.
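The workflow behind these contributions (build a symbolic expression, simplify it, then compile it at runtime for a target device) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the paper's API: the classes `Const`, `Var`, `Add`, `Mul` and the helpers `simplify`, `diff`, `emit`, `compile_cpu` and `cuda_source` are hypothetical. Python's built-in `compile`/`exec` stands in for runtime compilation on the CPU, and the GPU backend only generates CUDA C source text, which a real system would hand to NVRTC (for example through `nvrtcCreateProgram` and `nvrtcCompileProgram`).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Mul:
    left: object
    right: object


def diff(e, x):
    """Symbolic derivative of e with respect to the variable named x."""
    if isinstance(e, Const):
        return Const(0.0)
    if isinstance(e, Var):
        return Const(1.0 if e.name == x else 0.0)
    if isinstance(e, Add):
        return Add(diff(e.left, x), diff(e.right, x))
    # Product rule for Mul nodes.
    return Add(Mul(diff(e.left, x), e.right), Mul(e.left, diff(e.right, x)))


def simplify(e):
    """Fold constants and drop neutral elements (x + 0, x * 1, x * 0)."""
    if isinstance(e, (Const, Var)):
        return e
    a, b = simplify(e.left), simplify(e.right)
    if isinstance(a, Const) and isinstance(b, Const):
        is_add = isinstance(e, Add)
        return Const(a.value + b.value if is_add else a.value * b.value)
    if isinstance(e, Add):
        if a == Const(0.0):
            return b
        if b == Const(0.0):
            return a
        return Add(a, b)
    if Const(0.0) in (a, b):
        return Const(0.0)
    if a == Const(1.0):
        return b
    if b == Const(1.0):
        return a
    return Mul(a, b)


def emit(e):
    """Render the expression as infix text valid in both Python and C."""
    if isinstance(e, Const):
        return repr(float(e.value))
    if isinstance(e, Var):
        return e.name
    op = "+" if isinstance(e, Add) else "*"
    return f"({emit(e.left)} {op} {emit(e.right)})"


def compile_cpu(e, args):
    """CPU stand-in for runtime compilation: build source, compile, load."""
    src = f"def f({', '.join(args)}):\n    return {emit(e)}\n"
    namespace = {}
    exec(compile(src, "<generated>", "exec"), namespace)
    return namespace["f"]


def cuda_source(e, args, name="eval_expr"):
    """Generate CUDA C text for the same expression (not compiled here)."""
    params = ", ".join(f"const double* {a}_in" for a in args)
    loads = "\n".join(f"    double {a} = {a}_in[i];" for a in args)
    return (
        f'extern "C" __global__ void {name}({params}, double* out, int n) {{\n'
        "    int i = blockIdx.x * blockDim.x + threadIdx.x;\n"
        "    if (i >= n) return;\n"
        f"{loads}\n"
        f"    out[i] = {emit(e)};\n"
        "}\n"
    )


# Example: d/dx (x*x + 3*x), simplified before any code is generated.
expr = Add(Mul(Var("x"), Var("x")), Mul(Const(3.0), Var("x")))
d_expr = simplify(diff(expr, "x"))
f = compile_cpu(d_expr, ["x"])
kernel_text = cuda_source(d_expr, ["x"])
```

The same simplified expression tree feeds both backends, which is the sense in which the user-facing description stays device-independent in this sketch; only the final code generation and compilation step differs per device.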

The rest of the paper is organized as follows. Section 2 summarizes the related work and the state of the art. Section 3 presents the overview of the design of the system. Section 4 describes the PDE problem and symbolic computation in FEM. Runtime compilation for GPU compute kernels is presented in Section 5. Experiments and results are discussed in Section 6. Finally, Section 7 concludes the paper.

We focus on the weak form construction and finite element assembly steps in FEM. Solving the system of linear equations is outside the scope of this paper. Most of the related work on finite element assembly for GPU devices has been proposed in recent years, for example running parallel threads over the nonzero values of the global linear system [9], partitioning elements by graph coloring [18,19], the local matrix approach [20] and the coordinate list format of the global matrix [21,22]. A comparison of these methods can be found in [13]. The symmetry of the local mass and stiffness matrices on a GPU is investigated in [23]. The architecture of the complete finite element workflow is discussed in [24]. Most of the assembly algorithms mentioned above can be used as the backend assembly process in the system proposed in this paper, since the system is designed to provide transparent support for assembly algorithms and device architectures to end users.
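To make the coordinate list (COO) idea concrete, the following is a minimal serial sketch, assuming a 1D mesh with linear elements and the stiffness matrix of the Laplacian. The function names `assemble_coo`, `sum_duplicates` and `local_stiffness` are hypothetical, and the code is not taken from the cited works. Each element appends its local entries as `(row, col, value)` triplets without looking at any other element, which is what makes the scheme attractive for one-thread-per-element execution; duplicate entries for shared nodes are summed in a separate pass.

```python
from collections import defaultdict

# Hypothetical 1D mesh: three nodes on [0, 1], two linear elements.
coords = [0.0, 0.5, 1.0]
elements = [(0, 1), (1, 2)]


def local_stiffness(elem):
    """2x2 local stiffness matrix of a linear 1D element for -u''."""
    h = coords[elem[1]] - coords[elem[0]]
    k = 1.0 / h
    return [[k, -k], [-k, k]]


def assemble_coo(elements, local_matrix):
    """Append every local entry as a COO triplet; duplicates are allowed."""
    rows, cols, vals = [], [], []
    for elem in elements:
        ke = local_matrix(elem)
        for a, i in enumerate(elem):
            for b, j in enumerate(elem):
                rows.append(i)
                cols.append(j)
                vals.append(ke[a][b])
    return rows, cols, vals


def sum_duplicates(rows, cols, vals):
    """Combine triplets that share the same (row, col) position."""
    acc = defaultdict(float)
    for i, j, v in zip(rows, cols, vals):
        acc[(i, j)] += v
    return dict(acc)


rows, cols, vals = assemble_coo(elements, local_stiffness)
K = sum_duplicates(rows, cols, vals)
```

Here the shared node 1 receives one contribution from each element, so its diagonal entry is the sum of the two local values. On a GPU the duplicate reduction would typically be done with a parallel sort and reduce step rather than a dictionary; that detail is an assumption about a typical implementation, not something stated in the text.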

Symbolic computation has a long history in scientific computing. It refers to the study and development of algorithms and software for manipulating mathematical expressions and other mathematical objects, for example simplification of expressions, differentiation using the chain rule, polynomial factorization and indefinite integration. Symbolic computation was first adopted for expressing finite element functions and weak forms in [16], for solving PDEs in Java with just-in-time (JIT) compilation on CPUs, and was later extended to other languages such as C++ and Python, with a cloud platform enabled, in [17]. The adoption of symbolic computation for FEM in [16,17] was motivated by solving PDE-based inverse problems with several newly proposed methods in [25][26][27], which involve a significant amount of symbolic manipulation of mathematical expressions. In order to evaluate the resulting expressions as fast as possible, JIT compilation is introduced together with symbolic computation. The combination of symbolic computation and just-in-time compilatio

Reference

This content is AI-processed based on open access ArXiv data.
