Is Memorization Helpful or Harmful? Prior Information Sets the Threshold
We examine the connection between training error and generalization error for arbitrary estimation procedures, working in an overparameterized linear model with general priors in a Bayesian setup. We identify determining factors inherent to the prior distribution $\pi$ and give explicit conditions under which optimal generalization requires the training error to be (i) nearly interpolating relative to the noise level (i.e., memorization is necessary), or (ii) close to the noise level (i.e., overfitting is harmful). Remarkably, these phenomena occur when the noise crosses thresholds determined by the Fisher information and the variance parameters of the prior $\pi$.
💡 Research Summary
The paper investigates the relationship between training error and generalization error in over‑parameterized linear regression under a Bayesian framework. The model is $y = X\theta + \sigma\tau$ with $\tau \sim \mathcal{N}(0, I_n)$, where $\theta$ is drawn from a prior $\pi$ on $\mathbb{R}^d$ and $d \ge n$. The design matrix $X$ is assumed to have full row rank and is treated as fixed (or conditioned on). The authors introduce two key quantities derived from the prior: the Fisher information $J_\pi = \mathbb{E}_\pi\!\left[\|\nabla \log \pi(\theta)\|^2\right]$ and the variance parameters of the prior, which together set the noise thresholds at which memorization becomes necessary or overfitting becomes harmful.
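Below is a minimal sketch of this setup, assuming for concreteness a Gaussian prior $\theta \sim \mathcal{N}(0, s^2 I_d)$ (for which the Fisher information works out to $J_\pi = d/s^2$) and using the minimum-norm interpolator as one example of a memorizing estimator; the dimensions, noise level, and choice of estimator are illustrative assumptions, not the paper's optimal procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (not from the paper): d >= n, overparameterized.
n, d = 50, 200
sigma = 0.5   # noise level
s2 = 1.0      # prior variance: theta ~ N(0, s2 * I_d)

# For a Gaussian prior N(0, s2*I_d), the score is grad log pi(theta) = -theta/s2,
# so the Fisher information E[||grad log pi(theta)||^2] equals d / s2.
J_pi = d / s2

# Draw one instance of the model: y = X theta + sigma * tau, tau ~ N(0, I_n).
X = rng.standard_normal((n, d))          # full row rank almost surely
theta = rng.normal(0.0, np.sqrt(s2), d)  # theta ~ N(0, s2 * I_d)
y = X @ theta + sigma * rng.standard_normal(n)

# Example memorizing estimator: the minimum-norm interpolator
# theta_hat = X^T (X X^T)^{-1} y, which drives the training error to zero.
theta_hat = X.T @ np.linalg.solve(X @ X.T, y)

train_err = np.mean((y - X @ theta_hat) ** 2)   # ~0: interpolation
param_err = np.mean((theta_hat - theta) ** 2)   # parameter-recovery error

print(f"Fisher information of prior: {J_pi:.1f}")
print(f"training error:  {train_err:.3e}")
print(f"parameter error: {param_err:.3e}")
```

Sweeping `sigma` in such a simulation is one way to probe the regime change the paper describes: whether an interpolating estimator remains competitive depends on how the noise level compares with the thresholds set by $J_\pi$ and the prior variance.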