Surface Networks

Introduction

3D geometry analysis, manipulation and synthesis play an important role in applications ranging from engineering to computer animation to medical imaging. Despite the vast amount of high-quality 3D geometric data available, data-driven approaches to problems involving complex geometry have yet to become mainstream, in part due to the lack of the data representation regularity required by traditional convolutional neural network approaches. While in computer vision problems inputs are typically sampled on regular two- or three-dimensional grids, surface geometry is represented in a more complex form and, in general, cannot be converted to an image-like format by parametrizing the shape with a single planar chart. Most commonly, an irregular triangle mesh is used to represent shapes, capturing their main topological and geometric properties.

Similarly to the regular grid case (used for images or videos), we are interested in data-driven representations that strike the right balance between expressive power and sample complexity. In the case of CNNs, this is achieved by exploiting the inductive bias that most computer vision tasks are locally stable to deformations, leading to localized, multiscale, stationary features. In the case of surfaces, we face a fundamental modeling choice between extrinsic versus intrinsic representations. Extrinsic representations rely on the specific embedding of surfaces within a three-dimensional ambient space, whereas intrinsic representations only capture geometric properties specific to the surface, irrespective of its parametrization. Whereas the former offer arbitrary representation power, they are unable to easily exploit inductive priors such as stability to local deformations and invariance to global transformations.

A particularly simple and popular extrinsic method represents shapes as point clouds in $`\R^3`$ of variable size, and leverages recent deep learning models that operate on input sets . Despite its advantages in terms of ease of data acquisition (they no longer require a mesh triangulation) and good empirical performance on shape classification and segmentation tasks, one may wonder whether this simplification comes at a loss of precision as one considers more challenging prediction tasks.

In this paper, we develop an alternative pipeline that applies neural networks directly on triangle meshes, building on geometric deep learning. These models provide data-driven intrinsic graph and manifold representations with inductive biases analogous to CNNs on natural images. Models based on Graph Neural Networks and their spectral variants have been successfully applied to geometry processing tasks such as shape correspondence . In their basic form, these models learn a deep representation over the discretized surface by combining a latent representation at a given node with a local linear combination of its neighbors’ latent representations, and a point-wise nonlinearity. Different models vary in their choice of linear operator and point-wise nonlinearity, which notably includes the graph Laplacian, leading to spectral interpretations of those models.

Our contributions are three-fold. First, we extend the model to support extrinsic features. More specifically, we exploit the fact that surfaces in $`\R^3`$ admit a first-order differential operator, the Dirac operator, which is stable to discretization, provides a direct generalization of Laplacian-based propagation models, and is able to detect principal curvature directions . Next, we prove that the models resulting from either Laplace or Dirac operators are stable to deformations and to discretization, two major sources of variability in practical applications. Last, we introduce a generative model for surfaces based on the variational autoencoder framework , that is able to exploit non-Euclidean geometric regularity.

By combining the Dirac operator with input coordinates, we obtain a fully differentiable, end-to-end feature representation that we apply to several challenging tasks. The resulting Surface Networks, using either the Dirac or the Laplace operator, inherit the stability and invariance properties of these operators, thus providing data-driven representations with certified stability to deformations. We demonstrate the model's efficiency on a temporal prediction task involving complex dynamics, based on a physical simulation of elastic shells, which confirms that whenever geometric information (in the form of a mesh) is available, it can be leveraged to significantly outperform point-cloud based models.

Our main contributions are summarized as follows:

  • We demonstrate that Surface Networks provide accurate temporal prediction of surfaces under complex non-linear dynamics, motivating the use of geometric shape information.

  • We prove that Surface Networks define shape representations that are stable to deformation and to discretization.

  • We introduce a generative model for 3D surfaces based on the variational autoencoder.

A reference implementation of our algorithm is available at https://github.com/jiangzhongshi/SurfaceNetworks.

Preliminary Study: Metric Learning for Dense Correspondence

As an interesting extension, we apply the architecture built in Experiment 6.2 directly to a dense shape correspondence problem.

Similarly to the graph correspondence model from , we consider a Siamese Surface Network, consisting of two identical models with the same architecture and shared parameters. For a pair of input surfaces $`\mathcal{M}_1, \mathcal{M}_2`$ with $`N_1`$ and $`N_2`$ points respectively, the network produces embeddings $`E_1 \in \R^{N_1 \times d}`$ and $`E_2 \in \R^{N_2 \times d}`$. These embeddings define a trainable similarity between points given by

\begin{equation}
\label{formulaa}
s_{i,j} = \frac{e^{\langle E_{1,i}, E_{2,j} \rangle}}{\sum_{j'} e^{\langle E_{1,i}, E_{2,j'} \rangle}},
\end{equation}

which can be trained by minimizing the cross-entropy relative to ground truth pairs. A diagram of the architecture is provided in Figure 1.
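The similarity above is a row-wise softmax over embedding inner products, and the predicted match is its mode. A minimal NumPy sketch (array names and shapes are ours, not from the paper's implementation):

```python
import numpy as np

def correspondence_softmax(E1, E2):
    """Soft correspondence map: s[i, j] = softmax over j of <E1_i, E2_j>.

    E1: (N1, d) embedding of surface M1; E2: (N2, d) embedding of M2.
    """
    logits = E1 @ E2.T                            # (N1, N2) inner products
    logits -= logits.max(axis=1, keepdims=True)   # numerical stability
    expl = np.exp(logits)
    return expl / expl.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
E1, E2 = rng.normal(size=(5, 8)), rng.normal(size=(7, 8))
s = correspondence_softmax(E1, E2)
pred = s.argmax(axis=1)  # mode of the softmax: predicted match for each node of M1
```

Training then minimizes the cross-entropy between `s` and the ground-truth matches.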

In general, dense shape correspondence is a task that requires a blend of intrinsic and extrinsic information, motivating the use of data-driven models that can obtain such tradeoffs automatically. Following the setup of Experiment 6.2, we use models with 15 ResNet-v2 blocks with 128 output features each, and alternate Laplace- and Dirac-based blocks with Average Pooling blocks to cover a larger context. The input to our network consists of vertex positions only.

We tested our architecture on a reconstructed version (i.e. with changed mesh connectivity) of the real scans of the FAUST dataset. The FAUST dataset contains 100 real scans and their corresponding ground truth registrations. The ground truth is based on a deformable template mesh with the same ordering and connectivity, which is fitted to the scans. In order to eliminate the bias of using the same template connectivity, as well as the need for a single connected component, the scans are reconstructed again with . To foster replicability, we release the processed dataset in the additional material. In our experiment, we use 80 models for training and 20 models for testing.

Since the ground truth correspondence is implied only through the common template mesh, we compute the correspondence between our meshes with a nearest-neighbor search between the point cloud and the reconstructed mesh. Consequently, due to the drastic change in vertex placement after the remeshing, only 60-70 percent of the labeled matches are usable. Although this makes the task more challenging, we believe this setup is close to a real-world scenario, where acquisition noise and occlusions are unavoidable.
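The transfer step can be sketched with a brute-force nearest-neighbor search; the threshold `max_dist` and the function name are our own illustration, not the exact procedure used in the paper:

```python
import numpy as np

def transfer_labels(template_pts, template_labels, remeshed_pts, max_dist):
    """Transfer ground-truth labels from the registered template to a
    remeshed scan via nearest-neighbor search; matches farther than
    max_dist are discarded (hence only a fraction of labels survive)."""
    # pairwise distances between remeshed vertices and template vertices
    d = np.linalg.norm(remeshed_pts[:, None, :] - template_pts[None, :, :], axis=-1)
    nn = d.argmin(axis=1)                              # closest template vertex
    keep = d[np.arange(len(remeshed_pts)), nn] <= max_dist
    return template_labels[nn], keep

rng = np.random.default_rng(1)
template = rng.normal(size=(20, 3))
remeshed = template + 1e-3 * rng.normal(size=(20, 3))  # mild resampling noise
labels, keep = transfer_labels(template, np.arange(20), remeshed, max_dist=0.01)
```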

Siamese network pipeline: the two networks take vertex coordinates of the input models and generate a high-dimensional feature vector per vertex, which is then used to define a map from $`\mathcal{M}_1`$ to $`\mathcal{M}_2`$. Here, the map is visualized by taking a color map on $`\mathcal{M}_2`$ and transferring it to $`\mathcal{M}_1`$.
Additional results from our setup. The plot in the middle shows the rate of correct correspondences with respect to geodesic error. We observe that the Laplace-based model performs similarly to the Dirac-based one in this scenario. We believe the reason is that the FAUST dataset contains only isometric deformations, so the two operators have access to the same information. We also provide a visual comparison, with the transfer of a higher-frequency colormap from the reference shape to another pose.
Heat map illustrating the point-wise geodesic distance between the predicted correspondence point and the ground truth. The unit is proportional to the geodesic diameter, and saturated at 10%.
A failure case of applying the Laplace network to a new pose in the FAUST benchmark dataset. The network confuses the left and right arms. We show the correspondence visualization for the front and back of this pair.

Our preliminary results are reported in Figure 2. For simplicity, we generate predicted correspondences by simply taking the mode of the softmax distribution for each reference node $`i`$: $`\hat{j}(i) = \arg\max_j s_{i,j}`$, thus avoiding the refinement step that is standard in other shape correspondence pipelines. The MLP model uses no context whatsoever and provides a baseline that captures the prior information contained in the input coordinates alone. Using contextual information (even extrinsically, as in the point-cloud model) brings significant improvements, but these results may be substantially improved by encoding further prior knowledge. An example of a current failure of our model is depicted in Figure 4, illustrating that our current architecture does not have a sufficiently large spatial context to disambiguate between locally similar (but globally inconsistent) parts.

We postulate that the FAUST dataset is not an ideal fit for our contribution for two reasons: (1) it is small (100 models), and (2) it contains only near-isometric deformations, which do not require the generality offered by our network. As demonstrated in , correspondence performance can be dramatically improved by constructing bases that are invariant to the deformations. We look forward to the emergence of new geometric datasets, and we are currently developing a capture setup that will allow us to acquire a more challenging dataset for this task.

Conclusions

We have introduced Surface Networks, a deep neural network architecture designed to naturally exploit the non-Euclidean geometry of surfaces. We have shown how a first-order differential operator (the Dirac operator) can detect and adapt to geometric features beyond the local mean curvature, which is the limit of what Laplacian-based methods can exploit. This distinction is important in practice, since areas with high directional curvature are perceptually important, as shown in the experiments. That said, the Dirac operator comes at an increased computational cost due to the quaternion calculus, and it would be interesting to instead learn the operator, akin to recent Message-Passing NNs, and explore whether the Dirac operator is recovered.

Whenever the data contains good-quality meshes, our experiments demonstrate that using intrinsic geometry offers vastly superior performance to point-cloud based models. While there are not many such datasets currently available, we expect them to become common in the coming years, as scanning and reconstruction technology advances and 3D sensors are integrated in consumer devices. SNs provide efficient inference, with predictable runtime, which makes them appealing across many areas of computer graphics, where a fixed, per-frame cost is required to ensure a stable framerate, especially in VR applications. Our future plans include applying Surface Networks to automated, data-driven mesh processing, and generalizing the generative model to arbitrary meshes, which will require an appropriate multi-resolution pipeline.

Surface Networks

This section presents our surface neural network model and its basic properties. We start by introducing the problem setup and notations using the Laplacian formalism (Section 15.1), and then introduce our model based on the Dirac operator (Section 15.2).

Laplacian Surface Networks

Our first goal is to define a trainable representation of discrete surfaces. Let $`\M=\{V,E,F\}`$ be a triangular mesh, where $`V=(v_i \in \R^3)_{i \leq N}`$ contains the node coordinates, $`E = (e_{i,j} )`$ corresponds to edges, and $`F`$ is the set of triangular faces. We denote as $`\Delta`$ the discrete Laplace-Beltrami operator (we use the popular cotangent weights formulation, see for details).

This operator can be interpreted as a local, linear high-pass filter in $`\M`$ that acts on signals $`x \in \R^{d \times |V|}`$ defined on the vertices as a simple matrix multiplication $`\tilde{x} = \Delta x`$. By combining $`\Delta`$ with an all-pass filter and learning generic linear combinations followed by a point-wise nonlinearity, we obtain a simple generalization of localized convolutional operators in $`\M`$ that update a feature map from layer $`k`$ to layer $`k+1`$ using trainable parameters $`A_k`$ and $`B_k`$:

\begin{equation}
\label{laplacenet}
x^{k+1} = \rho \left( A_k \Delta x^k + B_k x^k \right)~,~A_k,B_k \in \R^{d_{k+1} \times d_k} ~.
\end{equation}

By observing that the Laplacian itself can be written in terms of the graph weight similarity by diagonal renormalization, this model is a specific instance of the graph neural network and a generalization of the spectrum-free Laplacian networks from . As shown in these previous works, convolutional-like layers ([laplacenet]) can be combined with graph coarsening or pooling layers.
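The layer in Equation ([laplacenet]) can be sketched in a few lines of NumPy; here a plain graph Laplacian of a 4-cycle stands in for the cotangent Laplacian, and the weights are random (a minimal illustration of the shapes and data flow, not the trained model):

```python
import numpy as np

def laplace_layer(x, L, A, B):
    """One layer of Eq. (laplacenet): x^{k+1} = rho(A (Delta x^k) + B x^k).

    x: (d_k, V) features on vertices; L: (V, V) discrete Laplacian;
    A, B: (d_{k+1}, d_k) trainable weights; rho is ReLU.
    Delta acts channel-wise, i.e. Delta x corresponds to x @ L.T.
    """
    return np.maximum(A @ (x @ L.T) + B @ x, 0.0)

# toy Laplacian: unnormalized graph Laplacian of a 4-cycle
W = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]], float)
L = np.diag(W.sum(1)) - W

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))                         # 3 input channels on 4 vertices
A, B = rng.normal(size=(6, 3)), rng.normal(size=(6, 3))
x_next = laplace_layer(x, L, A, B)                  # (6, 4) output features
```

Note that on a constant signal the Laplacian term vanishes (it is a high-pass filter), so only the all-pass branch `B x` contributes.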

In contrast to general graphs, meshes come with a low-dimensional Euclidean embedding that carries potentially useful information in many practical tasks, despite being extrinsic and thus not invariant to the global position of the surface. A simple strategy to strike a good balance between expressivity and invariance is to include the node canonical coordinates as input channels to the network: $`x^{1}:=V \in \R^{|V| \times 3}`$. The mean curvature can be computed by applying the Laplace operator to the coordinates of the vertices:

\begin{equation}
\label{laplacenorm}
\Delta x^{1} = -2 H \bf{n}~,
\end{equation}

where $`H`$ is the mean curvature function and $`{\bf n}(u)`$ is the normal vector of the surface at point $`u`$. As a result, the Laplacian neural model ([laplacenet]) has access to mean curvature and normal information. Feeding Euclidean embedding coordinates into graph neural network models is related to the use of generalized coordinates from . By cascading $`K`$ layers of the form ([laplacenet]) we obtain a representation $`\Phi_{\Delta}(\M)`$ that contains generic features at each node location. When the number of layers $`K`$ is of the order of $`\text{diam}(\M)`$, the diameter of the graph determined by $`\M`$, then the network is able to propagate and aggregate information across the whole surface.
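As an illustration of this relation, the cotangent Laplacian can be assembled per face and applied to the vertex coordinates. The following is a minimal NumPy sketch (dense matrices, no mass-matrix normalization; sign and scaling conventions vary between formulations):

```python
import numpy as np

def cotan_laplacian(V, F):
    """Assemble the cotangent-weight Laplacian L (dense, unnormalized)."""
    n = len(V)
    L = np.zeros((n, n))
    for tri in F:
        for a in range(3):
            # k is the vertex opposite the edge (i, j)
            i, j, k = tri[(a + 1) % 3], tri[(a + 2) % 3], tri[a]
            u, w = V[i] - V[k], V[j] - V[k]
            cot = u.dot(w) / np.linalg.norm(np.cross(u, w))
            L[i, j] -= 0.5 * cot
            L[j, i] -= 0.5 * cot
            L[i, i] += 0.5 * cot
            L[j, j] += 0.5 * cot
    return L

# tetrahedron: applying L to the coordinates yields, per vertex, a vector
# proportional to the mean-curvature normal (cf. the equation above)
V = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]], float)
F = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])
L = cotan_laplacian(V, F)
HN = L @ V
```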

Equation ([laplacenorm]) illustrates that a Laplacian layer is only able to extract isotropic high-frequency information, corresponding to the mean variations across all directions. Although in general graphs there is no well-defined procedure to recover anisotropic local variations, in the case of surfaces some authors have considered anisotropic extensions . We describe next a particularly simple procedure to increase the expressive power of the network using a related operator from quantum mechanics: the Dirac operator, which has been previously used successfully in the context of surface deformation and shape analysis .

Dirac Surface Networks

The Laplace-Beltrami operator $`\Delta`$ is a second-order differential operator, constructed as $`\Delta = -\text{div} \nabla`$ by combining the gradient (a first-order differential operator) with its adjoint, the divergence operator. In Euclidean space, one has access to these first-order differential operators separately, enabling oriented high-pass filters.

For convenience, we embed $`\R^3`$ to the imaginary quaternion space $`\text{Im}(\HH)`$ (see Appendix A in the Suppl. Material for details). The Dirac operator is then defined as a matrix $`D \in \HH^{|F| \times |V|}`$ that maps (quaternion) signals on the nodes to signals on the faces. In coordinates,

\begin{equation}
D_{f,j} = \frac{-1}{2 | \ba_f | }e_j~,~f \in F, j \in V~,
\end{equation}

where $`e_j`$ is the opposing edge vector of node $`j`$ in the face $`f`$, and $`\ba_f`$ is the area (see Appendix A) using counter-clockwise orientations on all faces.

To apply the Dirac operator defined in quaternions to signals in vertices and faces defined in real numbers, we write the feature vectors as quaternions by splitting them into chunks of 4 real numbers representing the real and imaginary parts of a quaternion; see Appendix A. Thus, we always work with feature vectors with dimensionalities that are multiples of $`4`$. The Dirac operator provides first-order differential information and is sensitive to local orientations. Moreover, one can verify that

\begin{equation}
\mbox{Re } D^* D = \Delta~,
\end{equation}

where $`D^*`$ is the adjoint operator of $`D`$ in the quaternion space (see Appendix A). The adjoint matrix can be computed as $`D^* = M^{-1}_V D^H M_F`$ where $`D^H`$ is a conjugate transpose of $`D`$ and $`M_V`$, $`M_F`$ are diagonal mass matrices with one third of areas of triangles incident to a vertex and face areas respectively.

The Dirac operator can be used to define a new neural surface representation that alternates layers with signals defined over nodes with layers defined over faces. Given a $`d`$-dimensional feature representation over the nodes $`x^k \in \R^{d \times |V|}`$, and the faces of the mesh, $`y^k \in \R^{d \times |F|}`$, we define a $`d'`$-dimensional mapping to a face representation as

\begin{equation}
\label{eq:dir1}
y^{k + 1}=  \rho \left(C_k D x^k + E_k y^k \right), C_k, E_k \in \R^{d_{k+1} \times d_k},
\end{equation}

where $`C_k, E_k`$ are trainable parameters. Similarly, we define the adjoint layer that maps back to a $`\tilde{d}`$-dimensional signal over nodes as

\begin{equation}
\label{eq:dir2}
x^{k+1} = \rho \left(A_k D^*y^{k+1} + B_k x^k \right)~, A_k, B_k \in \R^{d_{k+1} \times d_k},
\end{equation}

where $`A_k, B_k`$ are trainable parameters. A surface neural network layer is thus determined by parameters $`\{A, B, C, E\}`$ using equations ([eq:dir1]) and ([eq:dir2]) to define $`x^{k+1} \in \R^{d_{k+1} \times |V|}`$. We denote by $`\Phi_D(\M)`$ the mesh representation resulting from applying $`K`$ such layers (that we assume fixed for the purpose of exposition).
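The alternation of Equations ([eq:dir1]) and ([eq:dir2]) can be sketched at the level of shapes, with a random real matrix standing in for the quaternion-valued Dirac operator and its transpose for the adjoint (identity mass matrices). This only illustrates the vertex-to-face-to-vertex data flow, not the actual quaternion calculus:

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def dirac_block(x, y, D, Dstar, C, E, A, B):
    """One vertex->face->vertex block, Eqs. (eq:dir1)-(eq:dir2).

    x: (d, V) vertex features; y: (d, F) face features;
    D: (F, V) stand-in for the Dirac operator; Dstar: (V, F) for its adjoint.
    """
    y_next = relu(C @ (x @ D.T) + E @ y)            # signal moves to the faces
    x_next = relu(A @ (y_next @ Dstar.T) + B @ x)   # and back to the vertices
    return x_next, y_next

rng = np.random.default_rng(0)
nV, nF, d, d2 = 5, 6, 4, 8
D = rng.normal(size=(nF, nV))
Dstar = D.T.copy()                                  # toy adjoint (identity mass matrices)
x, y = rng.normal(size=(d, nV)), rng.normal(size=(d, nF))
C, E = rng.normal(size=(d2, d)), rng.normal(size=(d2, d))
A, B = rng.normal(size=(d2, d2)), rng.normal(size=(d2, d))
x2, y2 = dirac_block(x, y, D, Dstar, C, E, A, B)
```

In the actual model, `D` and `Dstar` are quaternion matrices applied via the $`4 \times 4`$ real-block representation described in Appendix A, and feature dimensionalities are multiples of 4.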

The Dirac-based surface network is related to edge feature transforms proposed on general graphs in , although these edge measurements cannot be associated with derivatives due to lack of proper orientation. In general graphs, there is no notion of square root of $`\Delta`$ that recovers oriented first-order derivatives.

Stability of Surface Networks

Here we describe in what sense Surface Networks are geometrically stable: under the model, surface deformations act as bounded additive perturbations of the representation. Given a continuous surface $`S \subset \R^3`$ or a discrete mesh $`\M`$, and a smooth deformation field $`\tau: \R^3 \to \R^3`$, we are particularly interested in two forms of stability:

  • Given a discrete mesh $`\M`$ and a certain non-rigid deformation $`\tau`$ acting on $`\M`$, we want to certify that $`\| \Phi(\M) - \Phi(\tau(\M)) \|`$ is small if $`\| \nabla \tau ( \nabla \tau)^* - {\bf I} \|`$ is small, i.e when the deformation is nearly rigid; see Theorem [stabtheo].

  • Given two discretizations $`\M_1`$ and $`\M_2`$ of the same underlying surface $`S`$, we would like to control $`\| \Phi( \M_1) - \Phi(\M_2) \|`$ in terms of the resolution of the meshes; see Theorem [stabtheo2].

These stability properties are important in applications, since most tasks we are interested in are stable to deformation and to discretization. We shall see that the first property is a simple consequence of the fact that the mesh Laplacian and Dirac operators are themselves stable to deformations. The second property will require us to specify under which conditions the discrete mesh Laplacian $`\Delta_\M`$ converges to the Laplace-Beltrami operator $`\Delta_S`$ on $`S`$. Unless it is clear from the context, in the following $`\Delta`$ will denote the discrete Laplacian.

Let $`\M`$ be an $`N`$-node mesh and $`x,\,x' \in \R^{|V| \times d}`$ be input signals defined on the nodes. Assume the nonlinearity $`\rho(\,\cdot \,)`$ is non-expansive ($`| \rho(z) - \rho(z') | \leq | z - z'|`$). Then

  1. $`\| \Phi_\Delta(\M; x) - \Phi_\Delta(\M; x') \| \leq \alpha_\Delta \| x - x' \|~,`$ where $`\alpha_\Delta`$ depends only on the trained weights and the mesh.

  2. $`\| \Phi_D(\M; x) - \Phi_D(\M; x') \| \leq \alpha_D \| x - x' \|~,`$ where $`\alpha_D`$ depends only on the trained weights and the mesh.

  3. Let $`\taunorm := \sup_u \| \nabla \tau(u) (\nabla \tau(u))^* - {\bf 1} \|`$, where $`\nabla \tau(u)`$ is the Jacobian matrix of $`u \mapsto \tau(u)`$. Then $`\| \Phi_\Delta(\M; x) - \Phi_\Delta( \tau(\M); x) \| \leq \beta_\Delta \taunorm \|x \|~,`$ where $`\beta_\Delta`$ is independent of $`\tau`$ and $`x`$.

  4. Denote by $`\widetilde{|\nabla \tau |}_\infty := \sup_u \| \nabla \tau(u) - {\bf 1} \|`$. Then $`\| \Phi_D(\M; x) - \Phi_D( \tau(\M); x) \| \leq \beta_D \widetilde{|\nabla \tau |}_\infty \|x \|~,`$ where $`\beta_D`$ is independent of $`\tau`$ and $`x`$.

Properties (a) and (b) are not specific to surface representations, and are a simple consequence of the non-expansive property of our chosen nonlinearities. The constant $`\alpha`$ is controlled by the product of $`\ell_2`$ norms of the network weights at each layer and the norm of the discrete Laplacian operator. Properties (c) and (d) are based on the fact that the Laplacian and Dirac operators are themselves stable to deformations, a property that depends on two key aspects: first, the Laplacian/Dirac is localized in space, and next, that it is a high-pass filter and therefore only depends on relative changes in position.

One caveat of Theorem [stabtheo] is that the constants appearing in the bounds depend upon a bandwidth parameter given by the reciprocal of triangle areas, which increases as the size of the mesh increases. This corresponds to the fact that the spectral radius of $`\Delta_\M`$ diverges as the mesh size $`N`$ increases.

In order to overcome this problematic asymptotic behavior, it is necessary to exploit the smoothness of the signals incoming to the surface network. This can be measured with Sobolev norms defined using the spectrum of the Laplacian operator. Given a mesh $`\M`$ of $`N`$ nodes approximating an underlying surface $`S`$, and its associated cotangent Laplacian $`\Delta_\M`$, consider the spectral decomposition of $`\Delta_\M`$ (a symmetric, positive semidefinite operator):

\begin{equation}
\Delta_\M = \sum_{k \leq N} \lambda_k e_k e_k^T~,~e_k \in \R^N~,~0 \leq \lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_N~.
\end{equation}

Under normal uniform convergence 1 , the spectrum of $`\Delta_\M`$ converges to the spectrum of the Laplace-Beltrami operator $`\Delta_S`$ of $`S`$. If $`S`$ is bounded, it is known from the Weyl law that there exists $`\gamma(S)>0`$ such that $`k^{-\gamma(S)} \lesssim \lambda_k^{-1}`$, so the eigenvalues $`\lambda_k`$ do not grow too fast. The smoothness of a signal $`x \in \R^{|V| \times d}`$ defined on $`\M`$ is captured by how fast its spectral decomposition $`\hat{x}(k) = e_k^T x \in \R^d`$ decays. We define its Sobolev norm as $`\|x \|_{\Hc}^2:= \sum_k \lambda_k^2 \|\hat{x}(k) \|^2`$, and $`\beta(x,S) >1`$ as the largest rate such that its spectral decomposition coefficients satisfy

\begin{equation}
\label{betadef}
\|\hat{x}(k) \| \lesssim k^{-\beta} ~,~(k \to \infty)~.
\end{equation}
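The Sobolev norm above can be evaluated directly from the spectral decomposition; since the $`e_k`$ are orthonormal, it coincides with $`\| \Delta_\M x \|^2`$. A small NumPy check, with a path-graph Laplacian standing in for $`\Delta_\M`$:

```python
import numpy as np

# graph Laplacian of a path on 6 nodes (stand-in for the cotangent Laplacian)
n = 6
W = np.diag(np.ones(n - 1), 1)
W += W.T
L = np.diag(W.sum(1)) - W

lam, Evec = np.linalg.eigh(L)        # Delta = sum_k lam_k e_k e_k^T, lam ascending
rng = np.random.default_rng(0)
x = rng.normal(size=(n, 2))          # a 2-channel signal on the vertices

xhat = Evec.T @ x                    # spectral coefficients xhat(k) = e_k^T x
sobolev_sq = np.sum(lam[:, None] ** 2 * xhat ** 2)   # sum_k lam_k^2 ||xhat(k)||^2
```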

If $`x \in \R^{|V| \times d}`$ is the input to the Laplace Surface Network of $`R`$ layers, we denote by $`(\beta_0, \beta_1, \dots, \beta_{R-1})`$ the smoothness rates of the feature maps $`x^{(r)}`$ defined at each layer $`r \leq R`$.

Consider a surface $`S`$ and a finite-mesh approximation $`\M_N`$ of $`N`$ points, and $`\Phi_\Delta`$ a Laplace Surface Network with parameters $`\{(A_r, B_r)\}_{r \leq R}`$. Denote by $`d(S, \M_N)`$ the uniform normal distance, and let $`x_1,x_2`$ be piece-wise polyhedral approximations of $`\bar{x}(t)`$, $`t \in S`$ in $`\M_N`$, with $`\| \bar{x} \|_{\Hc(S)} < \infty`$. Assume $`\| \bar{x}^{(r)} \|_{\Hc(S)} < \infty`$ for $`r \leq R`$.

  1. If $`x_1,x_2`$ are two functions such that the $`R`$ feature maps $`x_l^{(r)}`$ have rates $`(\beta_0, \beta_1, \dots, \beta_{R-1})`$, then

    \begin{equation}
    \label{pony1}
    \| \Phi_\Delta(x_1;\M_N) - \Phi_\Delta(x_2;\M_N) \|^2 \leq C(\beta)  \| x_1 - x_2\|^{h(\beta)} ~,
    \end{equation}
    

    with $`h(\beta) = {\prod_{r=1}^R \frac{\beta_r-1}{\beta_r-1/2}}`$, and where $`C(\beta)`$ does not depend upon $`N`$.

  2. If $`\tau`$ is a smooth deformation field, then $`\| \Phi_\Delta(x; \M_N) - \Phi_\Delta(x; \tau(\M_N)) \| \leq C \taunorm^{h(\beta)}~,`$ where $`C`$ does not depend upon $`N`$.

  3. Let $`\M`$ and $`\M'`$ be $`N`$-point discretizations of $`S`$. If $`\max(d(\M, S), d(\M',S) ) \leq \epsilon`$, then $`\| \Phi_\Delta(\M;x) - \Phi_\Delta(\M', x') \| \leq C \epsilon^{h(\beta)} ~,`$ where $`C`$ is independent of $`N`$.

This result ensures that if we use as generator of the SN an operator that is consistent as the mesh resolution increases, the resulting surface representation is also consistent. Although our present result only concerns the Laplacian, the Dirac operator also has a well-defined continuous counterpart that generalizes the gradient operator in quaternion space. Also, our current bounds depend explicitly upon the smoothness of feature maps across different layers, which may be controlled in terms of the original signal if one considers nonlinearities that demodulate the signal, such as $`\rho(x) = |x|`$ or $`\rho(x) = \text{ReLU}(x)`$. These extensions are left for future work. Finally, a specific setup that we use in experiments is to use as input signal the canonical coordinates of the mesh $`\M`$. In that case, an immediate application of the previous theorem yields

Denote $`\Phi(\M) := \Phi_{\Delta}(\M; V)`$, where $`V`$ are the node coordinates of $`\M`$. Then, if $`A_1 =0`$,

\begin{equation}
\| \Phi(\M) - \Phi(\tau(\M)) \| \leq \kappa \max(\taunorm, \| \nabla^2 \tau \|)^{h(\beta)}~.
\end{equation}

The Dirac Operator

The quaternions $`\HH`$ are an extension of the complex numbers. A quaternion $`q \in \HH`$ can be written in the form $`q=a + bi + cj + dk`$, where $`a,b,c,d`$ are real numbers and $`i,j,k`$ are quaternion units satisfying $`i^2 = j^2 = k^2 = ijk = -1`$.

As mentioned in Section 3.1, the Dirac operator used in the model can be conveniently represented as a quaternion matrix:

\begin{equation}
D_{f,j} = \frac{-1}{2 | \ba_f | }e_j~,~f \in F, j \in V~,
\end{equation}

where $`e_j`$ is the opposing edge vector of node $`j`$ in the face $`f`$, and $`\ba_f`$ is the area, as illustrated in Fig. 5, using counter-clockwise orientations on all faces.

The deep learning library PyTorch, which we used to implement the models, does not support quaternions. Nevertheless, quaternion-valued matrix multiplication can be replaced with real-valued matrix multiplication where each entry $`q = a + bi + cj + dk`$ is represented as the $`4 \times 4`$ block

\begin{equation}
\begin{bmatrix}
    a & -b & -c & -d \\
    b & \phantom{-}a & -d &  \phantom{-}c \\
    c &  \phantom{-}d &  \phantom{-}a & -b \\
    d & -c &  \phantom{-}b &  \phantom{-}a
\end{bmatrix}
\end{equation}

and the conjugate $`q^*=a-bi-cj-dk`$ corresponds to the transpose of this real-valued matrix:

\begin{equation}
\begin{bmatrix}
     \phantom{-}a &  \phantom{-}b &  \phantom{-}c &  \phantom{-}d \\
    -b &  \phantom{-}a &  \phantom{-}d & -c \\
    -c & -d &  \phantom{-}a &  \phantom{-}b \\
    -d &  \phantom{-}c & -b &  \phantom{-}a
\end{bmatrix}.
\end{equation}
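This block trick is easy to verify numerically. The sketch below (our own helper names) checks that the real block of a product is the product of blocks, and that conjugation corresponds to transposition:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions given as arrays [a, b, c, d]."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def qblock(q):
    """4x4 real block representing left multiplication by q = a + bi + cj + dk."""
    a, b, c, d = q
    return np.array([[a, -b, -c, -d],
                     [b,  a, -d,  c],
                     [c,  d,  a, -b],
                     [d, -c,  b,  a]])

p = np.array([1.0, 2.0, 3.0, 4.0])
P = qblock(p)   # acts on a quaternion [a, b, c, d] as left multiplication by p
```

In practice, a quaternion matrix in $`\HH^{m \times n}`$ thus becomes a real matrix in $`\R^{4m \times 4n}`$ built from such blocks, and quaternion matrix products reduce to ordinary real matrix products.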

Further Numerical Experiments

Qualitative comparison of different models (ground truth, MLP, AvgPool, Laplace, Dirac). We plot the 1st, 10th, 20th, 30th and 40th predicted frames.
Qualitative comparison of different models (ground truth, MLP, AvgPool, Laplace, Dirac). We plot the 1st, 10th, 20th, 30th and 40th predicted frames.
Qualitative comparison of different models (ground truth, MLP, AvgPool, Laplace, Dirac). We plot the 1st, 10th, 20th, 30th and 40th predicted frames.
Qualitative comparison of different models (ground truth, MLP, AvgPool, Laplace, Dirac). We plot the 1st, 10th, 20th, 30th and 40th predicted frames.
Qualitative comparison of ground truth with Laplace- and Dirac-based predictions: the Dirac-based model visually outperforms the Laplace-based model in regions of high mean curvature.

From left to right: Laplace-based model, ground truth, and Dirac-based model. Color corresponds to the mean squared error between ground truth and prediction: green indicates smaller error, red larger error.

From left to right: set-to-set model, ground truth, and Dirac-based model. Color corresponds to the mean squared error between ground truth and prediction: green indicates smaller error, red larger error.

  1. which controls how the normals of the mesh align with the surface normals; see . ↩︎