Sparse Convolved Multiple Output Gaussian Processes

Notice: This research summary and analysis were automatically generated using AI. For full accuracy, please refer to the original arXiv source.

Recently there has been an increasing interest in methods that deal with multiple outputs. This has been motivated partly by frameworks like multitask learning, multisensor networks or structured output data. From a Gaussian processes perspective, the problem reduces to specifying an appropriate covariance function that, whilst being positive semi-definite, captures the dependencies between all the data points and across all the outputs. One approach to accounting for non-trivial correlations between outputs employs convolution processes. Under a latent function interpretation of the convolution transform, we establish dependencies between output variables. The main drawbacks of this approach are the associated computational and storage demands. In this paper we address these issues. We present different sparse approximations for dependent output Gaussian processes constructed through the convolution formalism, exploiting the conditional independencies present naturally in the model. This leads to a form of the covariance similar in spirit to the so-called PITC and FITC approximations for a single output. We show experimental results with synthetic and real data; in particular, we present results on pollution prediction, school exam score prediction, and gene expression data.


💡 Research Summary

The paper tackles two fundamental challenges in multi-output Gaussian process (MOGP) modeling: capturing rich inter-output dependencies and scaling inference to large datasets. Traditional multi-output GP approaches, such as the linear model of coregionalisation (LMC) or shared kernels, are limited in expressing complex, possibly non-linear relationships among outputs and become computationally prohibitive as the number of outputs (D) and data points (N) grow, because exact inference with the full ND × ND covariance matrix costs O(N³D³) time and O(N²D²) storage.
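As a rough illustration of this cubic growth, the sketch below (with hypothetical problem sizes, not taken from the paper) tabulates the dense storage and the approximate flop count of a Cholesky factorisation of the full ND × ND covariance:

```python
# Hypothetical sizes illustrating why exact multi-output GP inference
# becomes prohibitive: the joint covariance over D outputs with N points
# each is (N*D) x (N*D), so factorising it costs O(N^3 D^3) operations
# and storing it densely costs O(N^2 D^2) memory.
for N, D in [(100, 2), (500, 4), (1000, 10)]:
    size = N * D
    mem_gb = size**2 * 8 / 1e9   # dense float64 storage in GB
    flops = size**3              # rough Cholesky cost (up to a constant)
    print(f"N={N:5d}, D={D:3d}: {size}x{size} matrix, "
          f"~{mem_gb:.2f} GB, ~{flops:.1e} flops")
```

For N = 1000 and D = 10 this is already a 10000 × 10000 matrix (~0.8 GB dense), which motivates the sparse approximations developed in the paper.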

To overcome these limitations, the authors adopt the convolution process (CP) framework. Each output \(f_d(x)\) is expressed as a convolution of a set of latent functions \(u_q(z)\) (independent GPs) with output-specific smoothing kernels \(G_{d,q}(x - z)\):

\[
f_d(x) = \sum_{q=1}^{Q} \int_{\mathcal{X}} G_{d,q}(x - z)\, u_q(z)\, dz.
\]
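This construction can be sketched with a discretised (Riemann-sum) convolution of a single shared latent GP sample: the kernel widths and latent lengthscale below are illustrative values chosen for the sketch, not parameters from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Latent input grid over which the convolution integral is discretised.
z = np.linspace(-3, 3, 301)
dz = z[1] - z[0]

# Sample one latent function u ~ GP(0, k) with an RBF covariance
# (lengthscale 0.3 is an illustrative choice).
K = np.exp(-0.5 * (z[:, None] - z[None, :]) ** 2 / 0.3**2)
u = rng.multivariate_normal(np.zeros(len(z)), K + 1e-8 * np.eye(len(z)))

def output(width):
    """f_d(x) = ∫ G_d(x - z) u(z) dz, with a Gaussian smoothing kernel
    of the given width, approximated by a Riemann sum on the grid."""
    G = np.exp(-0.5 * (z[:, None] - z[None, :]) ** 2 / width**2)
    return G @ u * dz

f1 = output(0.2)  # narrow kernel: output stays close to u
f2 = output(0.8)  # wide kernel: heavily smoothed version of u

# Both outputs are driven by the same latent function, so they are
# dependent even though they were smoothed with different kernels.
corr = np.corrcoef(f1, f2)[0, 1]
print(f"empirical correlation between the two outputs: {corr:.2f}")
```

Because every output is a (different) filtering of the same latent process, non-trivial cross-covariances between outputs arise automatically, which is exactly the dependency structure the sparse approximations in the paper then exploit.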

