Kernels for Measures Defined on the Gram Matrix of their Support

Kernels for Measures Defined on the Gram Matrix of their Support
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We present in this work a new family of kernels to compare positive measures on arbitrary spaces $\Xcal$ endowed with a positive kernel $\kappa$, which translates naturally into kernels between histograms or clouds of points. We first cover the case where $\Xcal$ is Euclidian, and focus on kernels which take into account the variance matrix of the mixture of two measures to compute their similarity. The kernels we define are semigroup kernels in the sense that they only use the sum of two measures to compare them, and spectral in the sense that they only use the eigenspectrum of the variance matrix of this mixture. We show that such a family of kernels has close bonds with the laplace transforms of nonnegative-valued functions defined on the cone of positive semidefinite matrices, and we present some closed formulas that can be derived as special cases of such integral expressions. By focusing further on functions which are invariant to the addition of a null eigenvalue to the spectrum of the variance matrix, we can define kernels between atomic measures on arbitrary spaces $\Xcal$ endowed with a kernel $\kappa$ by using directly the eigenvalues of the centered Gram matrix of the joined support of the compared measures. We provide explicit formulas suited for applications and present preliminary experiments to illustrate the interest of the approach.


💡 Research Summary

The paper introduces a novel family of kernels for comparing positive measures defined on an arbitrary space 𝓧 equipped with a positive definite kernel κ. Starting with the Euclidean case, the authors consider two measures μ and ν and form their mixture μ + ν. From this mixture they compute the covariance matrix Σ, which captures the spread of the combined support around its weighted mean. The key idea is to define a kernel K(μ, ν) that depends solely on the eigenvalue spectrum λ₁,…,λ_d of Σ, ignoring the eigenvectors. Such kernels are called “spectral” because they are invariant under orthogonal transformations of the data, and “semigroup” because they use only the sum of the two measures for comparison, satisfying K(μ, ν) = K(μ + ν, 0).

Mathematically, the authors show that any non‑negative function Φ defined on the cone of positive semidefinite matrices and invariant to the addition of zero eigenvalues can be expressed as a Laplace transform of a non‑negative function g on the same cone: Φ(S) = ∫ e^{‑⟨S,T⟩} g(T) dT. This representation guarantees that the resulting kernel K(μ, ν) = Φ(Σ) is positive definite. By selecting specific forms for g (e.g., exponential, power‑law), the authors derive closed‑form expressions such as K = det(I + α Σ)^{‑β} or K = exp(‑γ tr Σ), which are computationally tractable.

The paper then extends the construction to arbitrary spaces 𝓧 that are not necessarily Euclidean. Given a base kernel κ on 𝓧, one builds the Gram matrix G_{ij}=κ(x_i,x_j) for the union of the supports of μ and ν, centers it, and extracts its eigenvalues. Because Φ is invariant to adding zero eigenvalues, the kernel can be defined directly from these eigenvalues without needing an explicit embedding of the points. This yields a kernel between atomic measures (finite point clouds) that depends only on the spectrum of the centered Gram matrix of the combined support.

Experimental validation is performed on synthetic Gaussian mixtures and on real image histograms. The proposed spectral kernels are compared against classical histogram kernels (e.g., χ², intersection), the Bhattacharyya kernel, and kernel PCA‑based distances. Results show that the new kernels better capture the variance structure of the data, leading to higher classification accuracy, more robust ROC curves, and improved resistance to noise, especially when class distributions have overlapping means but distinct covariances.

The authors discuss broader implications: the semigroup property makes the kernels naturally suited for multi‑source data fusion and incremental learning where measures are continually added; the spectral nature provides invariance to rotations and permutations of the support points; and the Laplace‑transform framework opens the door to a systematic design of new positive definite kernels by choosing appropriate non‑negative functions g. Future work is suggested on extending the theory to non‑linear spectral functions, developing scalable eigenvalue approximation techniques for large datasets, and exploring applications in kernel density estimation, graph‑based clustering, and measure‑valued regression.


Comments & Academic Discussion

Loading comments...

Leave a Comment