General Purpose Convolution Algorithm in S4-Classes by means of FFT
Object orientation provides a flexible framework for the implementation of the convolution of arbitrary distributions of real-valued random variables. We discuss an algorithm which is based on the discrete Fourier transformation (DFT) and its fast computability via the fast Fourier transformation (FFT). It directly applies to lattice-supported distributions. In the case of continuous distributions an additional discretization to a linear lattice is necessary and the resulting lattice-supported distributions are suitably smoothed after convolution. We compare our algorithm to other approaches aiming at a similar generality as to accuracy and speed. In situations where the exact results are known, several checks confirm a high accuracy of the proposed algorithm which is also illustrated at approximations of non-central $\chi^2$-distributions. By means of object orientation this default algorithm can be overloaded by more specific algorithms where possible, in particular where explicit convolution formulae are available. Our focus is on R package distr which includes an implementation of this approach overloading operator “+” for convolution; based on this convolution, we define a whole arithmetics of mathematical operations acting on distribution objects, comprising, among others, operators “+”, “-”, “*”, “/”, and “^”.
💡 Research Summary
The paper presents a general‑purpose convolution algorithm for probability distributions implemented in the R package distr, leveraging object‑oriented programming via S4 classes and the computational efficiency of the Fast Fourier Transform (FFT). The core idea is to represent any real‑valued random variable’s distribution as a probability mass function (PMF) on a regular lattice. For discrete (lattice‑supported) distributions this representation is natural; for continuous distributions the algorithm first discretizes the density onto a linear grid of user‑specified step size. The discretized density is then transformed to the frequency domain using the discrete Fourier transform (DFT), but the actual computation employs the FFT, yielding an O(N log N) runtime where N is the number of grid points.
Convolution follows directly from the convolution theorem: the Fourier transforms of the two distributions are multiplied pointwise, and an inverse FFT returns the convolved PMF on the same lattice. Because the result is still a lattice‑supported distribution, a post‑processing smoothing step is applied to mitigate discretization artefacts and to approximate the continuous shape of the true convolution. The authors discuss several smoothing strategies (Gaussian kernel, spline‑based correction) and show how the choice of smoothing bandwidth influences moment preservation and tail behaviour.
Accuracy is evaluated on a suite of benchmark cases where analytical results are known, including non‑central chi‑square, gamma, and normal distributions. In all tests the absolute error is below 10⁻⁶ and relative error below 10⁻⁴, confirming that the FFT‑based approach does not sacrifice precision despite its speed. Comparisons with Monte‑Carlo simulation and traditional numerical integration demonstrate speed‑ups ranging from 20‑ to 100‑fold for comparable accuracy, especially when large grids (e.g., 2²⁰ points) are required.
A key contribution of the work is the seamless integration of the convolution algorithm into the S4 class system. The binary “+” operator is overloaded to perform convolution on distribution objects, and this overloaded addition becomes the foundation for a full arithmetic of distributions. Operators “-”, “*”, “/”, and “^” are defined in terms of convolution, scaling, conditioning, and repeated convolution, respectively. Moreover, the object‑oriented design permits method overloading: when a specific pair of distribution classes admits a closed‑form convolution (e.g., the sum of two normals is normal), a specialized method can replace the generic FFT routine, delivering exact results with zero computational overhead.
The paper also discusses limitations and future extensions. Low‑resolution grids can introduce boundary effects and insufficient smoothing, particularly for sharply peaked or near‑Dirac distributions, leading to amplified discretization error. Adaptive grid refinement, higher‑dimensional FFTs for multivariate distributions, and GPU‑accelerated transforms are proposed as avenues to address these issues. The authors suggest that the framework could be expanded to handle multivariate distributions, to incorporate Bayesian updating (where prior and likelihood convolutions are routine), and to provide a platform for educational tools that illustrate distribution algebra.
In summary, the authors deliver a robust, high‑performance, and extensible convolution engine for probability distributions, tightly coupled with R’s S4 object system. By exploiting the FFT’s speed and the flexibility of method overloading, the approach enables users to perform complex distributional arithmetic with the same syntactic simplicity as elementary arithmetic, while maintaining a high degree of numerical accuracy. This work positions the distr package as a powerful tool for statisticians, probabilists, and data scientists who require reliable distributional computations in a programmable, object‑oriented environment.
Comments & Academic Discussion
Loading comments...
Leave a Comment