A Unified Framework for Debiased Machine Learning: Riesz Representer Fitting under Bregman Divergence

Notice: This research summary and analysis were generated automatically using AI technology. For full accuracy, please refer to the original arXiv source.

Estimating the Riesz representer is central to debiased machine learning for causal and structural parameter estimation. We propose generalized Riesz regression, a unified framework for estimating the Riesz representer by fitting a representer model via Bregman divergence minimization. This framework includes various divergences as special cases, such as the squared distance and the Kullback–Leibler (KL) divergence, where the former recovers Riesz regression and the latter recovers tailored loss minimization. Under suitable pairs of divergence and model specifications (link functions), the dual problems of the Riesz representer fitting problem correspond to covariate balancing, which we call automatic covariate balancing. Moreover, under the same specifications, the sample average of outcomes weighted by the estimated Riesz representer satisfies Neyman orthogonality even without estimating the regression function, a property we call automatic Neyman orthogonalization. This property not only reduces the estimation error of Neyman orthogonal scores but also clarifies a key distinction between debiased machine learning and targeted maximum likelihood estimation (TMLE). Our framework can also be viewed as a generalization of density ratio fitting under Bregman divergences to Riesz representer estimation, and it applies beyond density ratio estimation. We provide convergence analyses for both reproducing kernel Hilbert space (RKHS) and neural network model classes. A Python package for generalized Riesz regression is released as genriesz and is available at https://github.com/MasaKat0/genriesz.


💡 Research Summary

The paper introduces a unified framework for estimating the Riesz representer, a key object in debiased machine learning for causal and structural parameter estimation. The authors propose “Generalized Riesz Regression” (GRR), which fits a representer model by minimizing a Bregman divergence between the model output (after an appropriate link transformation) and a target quantity derived from the functional of interest. By selecting different convex generators φ and link functions g, the framework subsumes several existing methods: squared‑error Riesz regression (SQ‑Riesz) when φ(u)=½‖u‖² and g is linear, Kullback–Leibler (KL)‑based loss (UKL‑Riesz) when φ is the KL generator and g is the log link, as well as a host of other loss–link pairs (BKL‑Riesz, BP‑Riesz, PU‑Riesz, etc.).
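Concretely, a Bregman divergence is generated by a convex function φ via D_φ(a, b) = φ(a) − φ(b) − φ′(b)(a − b). A minimal sketch (illustrative only, not the genriesz API) of how the squared and KL generators named above arise from this single formula:

```python
import numpy as np

def bregman_divergence(phi, dphi, a, b):
    """Pointwise Bregman divergence D_phi(a, b) = phi(a) - phi(b) - phi'(b) * (a - b)."""
    return phi(a) - phi(b) - dphi(b) * (a - b)

# Squared-distance generator phi(u) = u**2 / 2  =>  D(a, b) = (a - b)**2 / 2
sq = bregman_divergence(lambda u: u**2 / 2, lambda u: u, 3.0, 1.0)  # -> 2.0

# KL generator phi(u) = u*log(u) - u  =>  D(a, b) = a*log(a/b) - a + b
kl = bregman_divergence(lambda u: u * np.log(u) - u, np.log, 3.0, 1.0)
```

Plugging in other generators in the same way yields the remaining loss–link pairs in the family.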

A central theoretical contribution is the identification of “dual linearity” conditions under which the dual of the Bregman‑divergence minimization problem yields covariate‑balancing constraints as KKT conditions. This leads to the notion of automatic covariate balancing: the estimated Riesz representer automatically satisfies moment‑matching equations that balance the covariate distribution without any explicit balancing step.
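The moment-matching idea can be illustrated with a toy sketch (not the paper's estimator): linear weights on the treated group chosen to solve the mean-balancing moment equations exactly, so the weighted treated covariate mean matches the full-sample mean with no explicit balancing step afterwards.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))        # covariates
t = rng.binomial(1, 0.4, size=200)   # treatment indicator

# Linear balancing weights w_i = 1 + z_i @ lam, with z_i = x_i - full-sample mean.
# lam is chosen so the moment equations Z.T @ w = 0 hold exactly.
Z = X[t == 1] - X.mean(axis=0)
lam = np.linalg.solve(Z.T @ Z, -Z.T @ np.ones(Z.shape[0]))
w = 1.0 + Z @ lam

# Weighted treated covariate mean now coincides with the full-sample mean.
gap = np.abs((w[:, None] * X[t == 1]).sum(axis=0) / w.sum() - X.mean(axis=0)).max()
```

In the paper's framework, weights of this kind arise automatically as the estimated Riesz representer under suitable divergence–link pairs, rather than being computed in a separate step as above.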

Simultaneously, the same loss–link specifications guarantee that the sample average of outcomes weighted by the estimated representer, $\frac{1}{n}\sum_{i=1}^{n}\hat\alpha(X_i)Y_i$, coincides with a Neyman‑orthogonal score. Consequently, the estimator enjoys automatic Neyman orthogonalization: first‑order bias is eliminated even when the regression function is not estimated, in contrast with traditional debiased machine learning, which relies on separate regression estimation and cross‑fitting.
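As a toy illustration of this weighted-outcome estimator, the sketch below uses the oracle ATE representer α(t, x) = t/e(x) − (1 − t)/(1 − e(x)) with a known propensity score e(x), rather than an estimated representer; the data-generating process is made up for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
X = rng.normal(size=n)
e = 1.0 / (1.0 + np.exp(-X))                    # known propensity score e(x)
T = rng.binomial(1, e)                          # treatment assignment
Y = T * (1.0 + X) + 0.1 * rng.normal(size=n)    # true ATE = E[1 + X] = 1.0

# Oracle Riesz representer for the ATE functional.
alpha = T / e - (1 - T) / (1 - e)

# Sample average of representer-weighted outcomes.
theta_hat = np.mean(alpha * Y)
```

With an estimated representer, the same plug-in average attains first-order insensitivity to estimation error only under the loss–link pairs identified in the paper.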

The authors provide non‑asymptotic convergence analyses for two major model classes. In reproducing kernel Hilbert spaces (RKHS), they show that with appropriate regularization the estimation error of the representer attains the parametric rate $O_p(n^{-1/2})$. For deep neural networks, they derive rates based on Rademacher complexity and local complexity bounds, showing polynomial convergence that depends on network depth, width, and activation smoothness. The regularization parameter $\lambda$ controls the trade‑off between balancing accuracy and bias–variance.
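Schematically, the RKHS estimator analyzed here is a penalized empirical-risk minimizer of roughly the following form, where $\ell_{\varphi,g}$ denotes the loss induced by generator $\varphi$ and link $g$ (notation assumed for illustration; the exact objective and norm are specified in the paper):

```latex
\hat{\alpha}
  = \operatorname*{arg\,min}_{\alpha \in \mathcal{H}}
    \frac{1}{n}\sum_{i=1}^{n} \ell_{\varphi,g}\bigl(\alpha; Z_i\bigr)
    + \lambda \,\lVert \alpha \rVert_{\mathcal{H}}^{2},
\qquad
\lVert \hat{\alpha} - \alpha_0 \rVert = O_p\bigl(n^{-1/2}\bigr).
```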

Empirical evaluations on synthetic and semi‑synthetic datasets demonstrate that GRR outperforms a range of baselines—including entropy balancing, stable balancing, kernel balancing, LSIF, and conventional Riesz regression—across metrics such as mean squared error, mean absolute error, and final policy value estimation. Notably, the KL‑based UKL‑Riesz variant exhibits superior stability in high‑dimensional covariate settings.

A Python package, “genriesz”, is released to facilitate practical implementation. It supports multiple loss–link pairs, model choices (kernel methods, multilayer perceptrons), cross‑fitting pipelines, and diagnostic tools for checking automatic balancing.

In summary, the paper unifies disparate Riesz‑representer estimation techniques under a Bregman‑divergence optimization lens, introduces automatic covariate balancing and automatic Neyman orthogonalization as inherent properties of the framework, and provides both theoretical guarantees and practical tools. This work offers a significant step forward for robust, efficient debiased machine learning in causal inference, policy evaluation, and covariate‑shift adaptation.

