Semi-described and semi-supervised learning with Gaussian processes


Propagating input uncertainty through non-linear Gaussian process (GP) mappings is intractable. This hinders the task of training GPs using uncertain and partially observed inputs. In this paper we refer to this task as “semi-described learning”. We then introduce a GP framework that solves both the semi-described and the semi-supervised learning problems (where missing values occur in the outputs). Auto-regressive state space simulation is also recognised as a special case of semi-described learning. To achieve our goal we develop variational methods for handling semi-described inputs in GPs, and couple them with algorithms that allow for imputing the missing values while treating the uncertainty in a principled, Bayesian manner. Extensive experiments on simulated and real-world data study the problems of iterative forecasting and regression/classification with missing values. The results suggest that the principled propagation of uncertainty stemming from our framework can significantly improve performance in these tasks.


💡 Research Summary

This paper introduces a unified Gaussian process (GP) framework that simultaneously addresses two challenging scenarios: learning with uncertain or partially missing input features (termed “semi‑described learning”) and learning with missing output labels (semi‑supervised learning). The authors first highlight the intractability of propagating input uncertainty through non‑linear GP mappings, which has limited prior work to either ignore input noise or treat it only at test time with moment‑matching approximations.

To overcome this, they develop a variational approach built on the Bayesian GP‑LVM of Titsias and Lawrence (2010). The key innovation is a variational constraint that operates on the posterior distribution of the latent true inputs X given the noisy observations Z. When inputs are fully observed but noisy, the variational means are fixed to the observed values while the variational covariances capture the uncertainty. When inputs are partially observed, a mixture of Dirac‑like (approximated by sharply peaked Gaussians) and full Gaussians is used, allowing each data point and each dimension to have its own uncertainty level without introducing additional hyper‑parameters.
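As a concrete illustration, the per-point, per-dimension parameterisation described above could be initialised as in the following minimal NumPy sketch (the function name and the specific variance values are hypothetical choices, not taken from the paper; in practice the variances of the missing entries would then be refined by optimising the variational bound):

```python
import numpy as np

def init_variational_inputs(Z, obs_var=1e-6, miss_var=1.0):
    """Initialise q(X) = prod_n N(x_n | mu_n, diag(s_n)) from a
    partially observed input matrix Z (NaN marks a missing entry).

    Observed entries get their variational mean fixed to the observation
    and a sharply peaked variance (a Dirac-like Gaussian); missing
    entries get a broad variance and a column-mean initialisation.
    """
    missing = np.isnan(Z)
    mu = Z.copy()
    col_means = np.nanmean(Z, axis=0)          # per-dimension fallback mean
    mu[missing] = np.take(col_means, np.where(missing)[1])
    s = np.where(missing, miss_var, obs_var)   # per-point, per-dim variance
    return mu, s
```

Note that no extra hyper-parameters are introduced: each entry's uncertainty is a free variational parameter, set either from the observation or learned during optimisation.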

The framework further employs inducing points U to decouple the latent function values from the inputs. By setting the joint variational distribution q(F,U)=p(F|U,X)q(U), the problematic term p(F|U,X) cancels out of the bound, yielding a tractable lower bound that can be optimised with standard gradient-based methods. This construction preserves the Bayesian nature of the GP while, for a fixed number of inducing points, keeping the cost linear in the number of data points.
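In symbols, the cancellation takes the familiar sparse-GP form. The following is a sketch consistent with the Titsias and Lawrence construction (with q(X) the variational posterior over inputs given noisy observations Z), not a line-by-line reproduction of the paper's equations:

```latex
\log p(Y \mid Z)
\;\ge\;
\mathbb{E}_{q(X)\,q(F,U)}\!\left[
  \log \frac{p(Y \mid F)\, p(F \mid U, X)\, p(U)\, p(X \mid Z)}
            {p(F \mid U, X)\, q(U)\, q(X)}
\right]
\;=\;
\mathbb{E}_{q(X)\,q(U)\,p(F \mid U, X)}\!\left[\log p(Y \mid F)\right]
\;-\; \mathrm{KL}\!\left(q(U)\,\|\,p(U)\right)
\;-\; \mathrm{KL}\!\left(q(X)\,\|\,p(X \mid Z)\right)
```

Because p(F|U,X) appears in both numerator and denominator, it drops out, leaving only expectations of the likelihood and two KL terms, all of which are computable in closed form for standard kernels.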

Three application pipelines are built on top of this core: (1) semi‑supervised classification, where unlabelled data are imputed by sampling from the variational posterior of X and then propagating through a discriminative GP; (2) auto‑regressive iterative forecasting, where each predicted output becomes the next input and its posterior variance is carried forward, mitigating error accumulation; and (3) regression/classification with missing input features, where the variational constraint directly imputes missing dimensions and quantifies the associated uncertainty.
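The forecasting pipeline (2) can be illustrated with a toy Monte Carlo sketch. The paper propagates the predictive variance analytically through the GP; here simple sampling stands in for that step, and `predict` is a placeholder for any trained one-step model returning a predictive mean and variance:

```python
import numpy as np

rng = np.random.default_rng(0)

def iterative_forecast(predict, x0, horizon, n_samples=200):
    """Toy uncertainty-propagating iterative forecasting.

    `predict(x)` returns (mean, variance) of a one-step-ahead model,
    evaluated element-wise on an array of inputs. Each predicted output
    is fed back as the next input; its predictive variance is carried
    forward by sampling, so uncertainty accumulates over the horizon
    instead of being discarded at each step.
    """
    samples = np.full(n_samples, x0, dtype=float)
    means, variances = [], []
    for _ in range(horizon):
        mu, var = predict(samples)              # vectorised one-step model
        samples = rng.normal(mu, np.sqrt(var))  # propagate the uncertainty
        means.append(samples.mean())
        variances.append(samples.var())
    return np.array(means), np.array(variances)
```

With a stable linear toy model such as `predict = lambda x: (0.9 * x, 0.01 * np.ones_like(x))`, the forecast variance grows over the horizon, which is exactly the error-accumulation effect the paper's analytic propagation is designed to capture.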

Extensive experiments on synthetic datasets, sensor time‑series, and image classification tasks demonstrate that the proposed method consistently outperforms baseline approaches such as moment‑matching, local linearisation, and simple mean imputation. In particular, the method achieves lower mean‑squared error and higher log‑likelihood, and it provides well‑calibrated predictive uncertainties that improve downstream decision making.

Overall, the paper delivers a principled, Bayesian solution for handling both input uncertainty and output missingness within a single GP framework. Its variational constraint mechanism is flexible, scalable, and potentially applicable to other probabilistic models, marking a significant contribution to the fields of probabilistic machine learning and data imputation.

