Unavoidable Canonical Nonlinearity Induced by Gaussian Measures Discretization
When we consider canonical averages for classical discrete systems, typically referred to as substitutional alloys, the map φ from many-body interatomic interactions to thermodynamic equilibrium configurations generally exhibits complicated nonlinearity. This canonical nonlinearity is fundamentally rooted in deviations of the discrete configurational density of states (CDOS) from continuous Gaussian families, and has conventionally been characterized by the Kullback-Leibler (KL) divergence on the discrete statistical manifold. Consequently, previous works inevitably missed intrinsic nonlinearities induced by discretization of Gaussian families, which remain invisible within conventional information-geometric descriptions. In the present work, we identify and quantify such unavoidable canonical nonlinearity by employing the 2-Wasserstein distance with a cost function aligned with the Fisher metric for Gaussian families. We derive an explicit expression for the Wasserstein distance in the limit of vanishing discretization scale d → 0. We further show that this limiting Wasserstein distance admits a clear geometric interpretation on the statistical manifold, equivalent to a KL divergence associated with the expected parallel translations of the continuous Gaussian. Our framework thus provides a transport-information-geometric characterization of discretization-induced nonlinearity in classical discrete systems. In addition, we confirm that this W₂-KL equivalence admits a natural generalization beyond Gaussian families. The correspondence reveals that the irreversible geometric distortion of the local measure induced by discretization, while extrinsic to information geometry alone, can generically be characterized by a standard KL divergence.
💡 Research Summary
The paper addresses a subtle source of non‑linearity in the mapping from many‑body interaction parameters to canonical averages in classical discrete systems such as substitutional alloys. Traditionally, this “canonical non‑linearity” has been quantified by the Kullback‑Leibler (KL) divergence between the actual configurational density of states (CDOS) and a reference continuous Gaussian distribution that shares the same mean and covariance. While this KL‑based measure captures deviations of the CDOS from Gaussianity, it inadvertently includes an additional contribution arising from the discretization of the continuous Gaussian itself. Because KL divergence is defined only on a common support, the geometric distortion introduced when a continuous Gaussian is projected onto the discrete lattice is invisible to the conventional information‑geometric framework.
To isolate and quantify this hidden contribution, the authors employ optimal transport theory, specifically the 2‑Wasserstein distance (W₂). They define a cost function aligned with the Fisher information metric of the Gaussian family:
c(x, y) = (x − y)ᵀ Γ⁻¹ (x − y),
where Γ is the covariance matrix of the CDOS. This choice replaces the usual Euclidean squared distance with the Fisher‑induced quadratic form, ensuring compatibility with the statistical manifold of Gaussians.
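This Fisher-aligned cost is simply the squared Mahalanobis distance. A minimal sketch of the cost function (the covariance values are illustrative stand-ins, not taken from the paper); note that for Γ = I it reduces to the ordinary squared Euclidean distance:

```python
import numpy as np

# Illustrative stand-in for the CDOS covariance Γ (not from the paper).
Gamma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
Gamma_inv = np.linalg.inv(Gamma)

def cost(x, y, Gamma_inv):
    """Fisher-aligned quadratic cost c(x, y) = (x - y)^T Gamma^{-1} (x - y)."""
    diff = np.asarray(x) - np.asarray(y)
    return float(diff @ Gamma_inv @ diff)

x, y = np.array([1.0, 2.0]), np.array([0.5, 1.0])
print(cost(x, y, Gamma_inv))        # Mahalanobis cost under Γ
print(cost(x, y, np.eye(2)))        # Γ = I: squared Euclidean distance, 1.25
```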
The continuous multivariate Gaussian P_c(μ, Γ) is discretized on a hypercubic lattice of side length d, producing a discrete counterpart P_d that assigns to each cell V_k the probability mass ∫_{V_k} P_c(q) dq. In the limit d → 0, the optimal transport plan collapses to a local projection that maps every point inside a cell to its representative point q′_k. By expanding P_c around q′_k and retaining only leading‑order terms in d, the authors derive a closed‑form expression for the squared Wasserstein distance:
W₂²(P_c, P_d) = (d² / 12) Tr(Γ⁻¹).
Remarkably, this result depends solely on the covariance matrix and the discretization scale, making it a universal measure of the “unavoidable canonical non‑linearity” (UCN) induced by discretization.
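To leading order in d, the optimal plan sends each sample of P_c to the center of its hypercubic cell, so W₂² can be estimated by Monte Carlo and compared against (d²/12) Tr(Γ⁻¹). A minimal numerical sketch (covariance values, sample count, and cell size are illustrative assumptions, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative covariance of the continuous Gaussian reference P_c.
Gamma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
Gamma_inv = np.linalg.inv(Gamma)
mu = np.zeros(2)
d = 0.05  # discretization scale (side length of a lattice cell)

# Sample from P_c and project each point to the center of its cell,
# mimicking the local transport plan that emerges in the d -> 0 limit.
q = rng.multivariate_normal(mu, Gamma, size=200_000)
q_center = (np.floor(q / d) + 0.5) * d

# Fisher-aligned (Mahalanobis) transport cost per sample.
diff = q - q_center
cost = np.einsum('ni,ij,nj->n', diff, Gamma_inv, diff)

w2_sq_mc = cost.mean()
w2_sq_theory = (d**2 / 12) * np.trace(Gamma_inv)
print(w2_sq_mc, w2_sq_theory)  # the two values agree to within MC error
```

The agreement improves as d shrinks relative to the Gaussian's scale, since the per-cell offset then becomes uniform to leading order.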
The paper then connects this transport‑based metric to the KL divergence. For a fixed Γ, the KL divergence between two Gaussians differing by a small mean shift δμ is
D(P_c(μ + δμ, Γ) || P_c(μ, Γ)) = ½ δμᵀ Γ⁻¹ δμ.
If δμ is modeled as an i.i.d. uniform random vector over a single cell, with variance d²/12 per component, taking the expectation yields
E[D(P_c(μ + δμ, Γ) || P_c(μ, Γ))] = ½ Tr(Γ⁻¹ E[δμ δμᵀ]) = (d² / 24) Tr(Γ⁻¹),
which matches the Wasserstein expression up to a constant factor, so the transport-based measure of discretization-induced nonlinearity is equivalent to the KL divergence associated with the expected translations of the continuous Gaussian.
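The expectation over a uniform cell-shift can likewise be checked numerically. A minimal sketch assuming δμ is uniform on a cell of side d, so that each component has variance d²/12 (all numeric values are illustrative, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative covariance; same role as Γ in the text.
Gamma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
Gamma_inv = np.linalg.inv(Gamma)
d = 0.05

# delta_mu: i.i.d. uniform shift over one lattice cell, variance d^2/12 per axis.
delta_mu = rng.uniform(-d / 2, d / 2, size=(200_000, 2))

# KL divergence between Gaussians with equal covariance and mean shift delta_mu:
# D = 0.5 * delta_mu^T Gamma^{-1} delta_mu
kl = 0.5 * np.einsum('ni,ij,nj->n', delta_mu, Gamma_inv, delta_mu)

expected_kl_mc = kl.mean()
expected_kl_theory = (d**2 / 24) * np.trace(Gamma_inv)
print(expected_kl_mc, expected_kl_theory)  # agree to within MC error
```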