Decomposing data sets into skewness modes
We derive the nonlinear equations satisfied by the coefficients of linear combinations that maximize their skewness when their variance is constrained to take a specific value. In order to numerically solve these nonlinear equations we develop a gradient-type flow that preserves the constraint. In combination with the Karhunen-Loève decomposition this leads to a set of orthogonal modes with maximal skewness. For illustration purposes we apply these techniques to atmospheric data; in this case the maximal-skewness modes correspond to strongly localized atmospheric flows. We show how these ideas can be extended, for example to maximal-flatness modes.
💡 Research Summary
The paper addresses a fundamental limitation of conventional linear dimensionality‑reduction techniques such as Principal Component Analysis (PCA) and the Karhunen‑Loève (KL) expansion: they are built on statistics up to second order (mean and covariance) and therefore ignore higher‑order characteristics such as skewness (asymmetry) and kurtosis (tail heaviness). To remedy this, the authors formulate an optimization problem that seeks linear combinations of the original variables that maximize skewness while the variance is held fixed at a prescribed value.
Mathematically, let \(x\in\mathbb{R}^N\) denote the data vector with its mean removed, \(C=\langle xx^{\mathsf T}\rangle\) its covariance matrix, and \(S_{ijk}=\langle x_i x_j x_k\rangle\) the third‑order central‑moment tensor. For a coefficient vector \(a\), the projected scalar \(y=a^{\mathsf T}x\) has variance \(a^{\mathsf T}Ca\) and third moment \(S_{ijk}a_i a_j a_k\) (summation over repeated indices). The skewness of \(y\) is then \(\frac{S_{ijk}a_i a_j a_k}{(a^{\mathsf T}Ca)^{3/2}}\). Imposing the variance constraint \(a^{\mathsf T}Ca=\sigma^2\) and introducing a Lagrange multiplier \(\lambda\), the necessary condition for an extremum becomes the nonlinear equation
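\[ 3\,S_{ijk}\,a_j a_k = 2\lambda\,C_{ij}\,a_j, \qquad i=1,\dots,N, \]
obtained by setting the gradient of \(S_{ijk}a_i a_j a_k-\lambda\,(a^{\mathsf T}Ca-\sigma^2)\) with respect to \(a\) to zero, with summation over repeated indices. It is solved together with the constraint \(a^{\mathsf T}Ca=\sigma^2\).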
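To make the optimization problem concrete, the following is a minimal NumPy sketch of one way to search numerically for a maximal‑skewness direction under the fixed‑variance constraint. It is not the authors' gradient‑type flow, whose exact form is not given here. It swaps in a simple projected gradient ascent: the gradient of the third moment is projected onto the tangent space of the constraint surface, and the vector is rescaled after each step so that \(a^{\mathsf T}Ca=\sigma^2\) holds exactly. The function names (`moments`, `skewness`, `maximize_skewness`) and the parameters `step` and `iters` are hypothetical choices made for illustration.

```python
import numpy as np

def moments(X):
    """Covariance C (N x N) and third central-moment tensor S (N x N x N).

    X has shape (M, N): M samples of N variables.
    """
    Xc = X - X.mean(axis=0)                      # remove the mean of each variable
    M = Xc.shape[0]
    C = Xc.T @ Xc / M
    S = np.einsum("mi,mj,mk->ijk", Xc, Xc, Xc) / M
    return C, S

def skewness(a, C, S):
    """Skewness of y = a^T x: third moment over variance^(3/2)."""
    third = np.einsum("ijk,i,j,k->", S, a, a, a)
    return third / (a @ C @ a) ** 1.5

def maximize_skewness(C, S, a_init, sigma2=1.0, step=0.05, iters=500):
    """Projected gradient ascent on S_ijk a_i a_j a_k with a^T C a = sigma2.

    Illustrative stand-in for a constraint-preserving flow (an assumption,
    not the paper's scheme). It may converge to a local maximum only.
    """
    a = np.asarray(a_init, dtype=float).copy()
    a *= np.sqrt(sigma2 / (a @ C @ a))           # start on the constraint surface
    for _ in range(iters):
        g = 3.0 * np.einsum("ijk,j,k->i", S, a, a)   # gradient of the third moment
        Ca = C @ a                                   # normal to the constraint (up to a factor 2)
        g_tan = g - (g @ Ca) / (Ca @ Ca) * Ca        # drop the part that changes a^T C a
        a = a + step * g_tan
        a *= np.sqrt(sigma2 / (a @ C @ a))           # rescale to restore the constraint
    return a
```

At a fixed point of this iteration the projected gradient vanishes, so the gradient of the third moment is parallel to \(Ca\), which is the stationarity condition stated above. A typical call would be `C, S = moments(X)` followed by `a = maximize_skewness(C, S, np.ones(X.shape[1]))`. The step size has to be small relative to the magnitude of the third moments, and different starting vectors may lead to different local maxima.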