Dirichlet Process Mixtures of Generalized Linear Models
We propose Dirichlet Process mixtures of Generalized Linear Models (DP-GLM), a new method of nonparametric regression that accommodates continuous and categorical inputs, and responses that can be modeled by a generalized linear model. We prove conditions for the asymptotic unbiasedness of the DP-GLM regression mean function estimate. We also give examples for when those conditions hold, including models for compactly supported continuous distributions and a model with continuous covariates and categorical response. We empirically analyze the properties of the DP-GLM and why it provides better results than existing Dirichlet process mixture regression models. We evaluate DP-GLM on several data sets, comparing it to modern methods of nonparametric regression like CART, Bayesian trees and Gaussian processes. Compared to existing techniques, the DP-GLM provides a single model (and corresponding inference algorithms) that performs well in many regression settings.
💡 Research Summary
The paper introduces Dirichlet Process mixtures of Generalized Linear Models (DP‑GLM), a non‑parametric regression framework that unifies the flexibility of Dirichlet process (DP) mixture models with the broad modeling capabilities of generalized linear models (GLMs). In a DP‑GLM each mixture component (or “cluster”) is itself a GLM, allowing the method to handle continuous and categorical covariates simultaneously and to model responses drawn from any exponential‑family distribution (e.g., Gaussian, Bernoulli, Poisson, multinomial). This contrasts with earlier DP‑based regression approaches that typically assume a Gaussian likelihood and therefore are limited to continuous outcomes.
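Each GLM component turns a linear predictor into a response mean through a link function, which is what lets the mixture cover Gaussian, Bernoulli, and Poisson responses in one framework. A minimal sketch of the standard link choices (the function name and toy parameters are illustrative, not from the paper):

```python
import math

def glm_mean(x, beta, family):
    """Component mean mu = g^{-1}(beta0 + beta1 * x) for three
    common exponential-family responses with their canonical links."""
    eta = beta[0] + beta[1] * x           # linear predictor
    if family == "gaussian":              # identity link
        return eta
    if family == "bernoulli":             # logit link
        return 1.0 / (1.0 + math.exp(-eta))
    if family == "poisson":               # log link
        return math.exp(eta)
    raise ValueError(f"unknown family: {family}")

glm_mean(0.0, (0.0, 1.0), "bernoulli")  # 0.5
```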
Model construction.
The generative process starts with a base measure $G_0$ over the joint space of input-distribution parameters $\phi$ and GLM coefficients $\beta$. A Dirichlet process draw $G \sim \mathrm{DP}(\alpha, G_0)$ is made, and for each observation a parameter $\theta_i \sim G$ is sampled; observations sharing the same atom of $G$ form a cluster. Conditional on $\theta_i = (\phi_i, \beta_i)$, the covariate vector $x_i$ is generated from a distribution $p(x \mid \phi_i)$ (e.g., a Gaussian for continuous covariates or a categorical distribution for discrete ones). The response $y_i$ is then generated via the GLM likelihood $p(y \mid x_i, \beta_i)$ with a chosen link function $g(\cdot)$. Because the DP allows an unbounded number of clusters, the model can automatically adapt its complexity to the data.
Theoretical contribution – asymptotic unbiasedness.
A central result is a theorem proving that the posterior predictive mean $\hat m(x) = \mathbb{E}[Y \mid X = x, \text{data}]$ is asymptotically unbiased: as the sample size grows, the estimate converges to the true mean function. The paper gives sufficient conditions on the base measure and the data-generating distribution under which this holds, and exhibits models satisfying them, including compactly supported continuous distributions and a model with continuous covariates and a categorical response.
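At prediction time, the DP-GLM mean estimate averages the per-cluster GLM predictions, weighting each cluster by how well it explains the query point $x$. A minimal sketch for the Gaussian identity-link case, assuming a hypothetical per-cluster summary from one posterior sample (the `weight`/`phi`/`beta` keys are illustrative names, not the paper's notation):

```python
import numpy as np

def predictive_mean(x, clusters):
    """Predictive mean at x for one posterior sample of a Gaussian
    DP-GLM (a sketch).  `clusters` is a list of dicts with keys
    'weight' (mixture weight), 'phi' (covariate mean), and 'beta'
    (GLM intercept and slope)."""
    # Responsibility of each cluster for x: mixture weight times a
    # unit-variance Gaussian kernel centered at phi_k; the shared
    # normalizing constant cancels when we renormalize.
    w = np.array([c["weight"] * np.exp(-0.5 * (x - c["phi"]) ** 2)
                  for c in clusters])
    w /= w.sum()
    # Each cluster predicts through its own GLM mean (identity link).
    preds = np.array([c["beta"][0] + c["beta"][1] * x for c in clusters])
    return float(w @ preds)

predictive_mean(3.0, [{"weight": 1.0, "phi": 0.0, "beta": (1.0, 2.0)}])  # 7.0
```

In practice one would average this quantity over many posterior samples (e.g., Gibbs sweeps); the locally linear, globally weighted structure is what makes the mean function nonlinear in $x$.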