Parameter estimation for computationally intensive nonlinear regression with an application to climate modeling


📝 Abstract

Nonlinear regression is a useful statistical tool, relating observed data and a nonlinear function of unknown parameters. When the parameter-dependent nonlinear function is computationally intensive, a straightforward regression analysis by maximum likelihood is not feasible. The method presented in this paper proposes to construct a faster-running surrogate for such a computationally intensive nonlinear function, and to use it in a related nonlinear statistical model that accounts for the uncertainty associated with this surrogate. A pivotal quantity in the Earth’s climate system is the climate sensitivity: the change in global temperature due to doubling of atmospheric $\mathrm{CO}_2$ concentrations. This, along with other climate parameters, is estimated by applying the statistical method developed in this paper, where the computationally intensive nonlinear function is the MIT 2D climate model.

📄 Content

arXiv:0901.3665v1 [stat.AP] 23 Jan 2009

The Annals of Applied Statistics 2008, Vol. 2, No. 4, 1217–1230. DOI: 10.1214/08-AOAS210. © Institute of Mathematical Statistics, 2008

PARAMETER ESTIMATION FOR COMPUTATIONALLY INTENSIVE NONLINEAR REGRESSION WITH AN APPLICATION TO CLIMATE MODELING

By Dorin Drignei¹, Chris E. Forest² and Doug Nychka¹

Oakland University, Massachusetts Institute of Technology and Pennsylvania State University, and National Center for Atmospheric Research

Nonlinear regression is a useful statistical tool, relating observed data and a nonlinear function of unknown parameters. When the parameter-dependent nonlinear function is computationally intensive, a straightforward regression analysis by maximum likelihood is not feasible. The method presented in this paper proposes to construct a faster-running surrogate for such a computationally intensive nonlinear function, and to use it in a related nonlinear statistical model that accounts for the uncertainty associated with this surrogate. A pivotal quantity in the Earth’s climate system is the climate sensitivity: the change in global temperature due to doubling of atmospheric $\mathrm{CO}_2$ concentrations. This, along with other climate parameters, is estimated by applying the statistical method developed in this paper, where the computationally intensive nonlinear function is the MIT 2D climate model.

Received December 2007; revised September 2008.
¹ Supported by NSF Grant DMS-03-55474.
² Supported in part by the NSF-CMG Grant DMS-04-26845.
Key words and phrases. Equilibrium climate sensitivity, observed and modeled climate, space–time modeling, statistical surrogate, temperature data.
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in The Annals of Applied Statistics, 2008, Vol. 2, No. 4, 1217–1230. This reprint differs from the original in pagination and typographic detail.

1. Introduction. A fundamental question in understanding the Earth’s climate system is quantifying the warming of the atmosphere due to increased greenhouse gases. This relationship is formalized by the climate sensitivity, a parameter defined as the increase in global mean surface temperature due to a doubling of $\mathrm{CO}_2$ in the atmosphere. Although climate sensitivity and other climate parameters are informed by observations, their impact can only be evaluated by simulations of climate with a numerical computer model. Such a model usually includes atmosphere, ocean, land and ice components and is called an atmosphere–ocean general circulation model (AOGCM). Because climate is defined as a long-term average of weather, an AOGCM is usually run (integrated) over many years in order to establish its mean behavior. Thus, numerical experiments with these models require extensive computational resources, and the number of runs (also termed integrations) is often limited. For example, the Community Climate System Model (CCSM) requires months of time on a supercomputer to simulate a few hundred model years. Typically, an AOGCM depends on unknown parameters which need to be estimated, and the statistical challenge is to estimate these parameters along with companion measures of uncertainty using a limited set of climate model experiments.
The general statistical problem addressed here is parameter estimation in a nonlinear regression model [e.g., Seber and Wild (1989)], where the nonlinear regression function, such as an AOGCM, is computationally intensive to evaluate. In particular, the example discussed here involves observed climate data and the MIT 2D climate model [Sokolov and Stone (1998)] as a nonlinear function of three uncertain parameters collectively denoted by $\theta$: the equilibrium climate sensitivity $S$, the diffusion of heat anomalies into the deep ocean $K_v$ and the net aerosol forcing $F_{\mathrm{aer}}$. Here we take a maximum likelihood approach and pay particular attention to the correlation structure and its effects on the uncertainty measures of the resulting estimates.

The standard nonlinear regression approach for this particular estimation problem assumes the following statistical model for the observed climate data:

$$Y = f_\theta + \varepsilon, \tag{1}$$

where the errors $\varepsilon$ are assumed normal with zero mean vector and covariance matrix $W$. The estimated parameters of this statistical model, including $\theta$, are then obtained by maximizing the likelihood

$$\left(\frac{1}{\sqrt{2\pi}}\right)^{N} \frac{1}{\sqrt{\det W}} \exp\left\{-\frac{1}{2}(Y - f_\theta)' W^{-1} (Y - f_\theta)\right\}. \tag{2}$$

This is usually achieved by using an iterative algorithm. Notice, however, that this requires computing $f_\theta$ for many values of $\theta$, or, equivalently, running the climate model for a possibly large number of $\theta$ values. Such an approach is not feasible for applications where $f_\theta$ is an AOGCM or even the simplified MIT 2D climate model. To overcome this computational difficulty, we will substitute a statistical surrogate for $f_\theta$, denoted $\tilde{f}_\theta$, which will result in a much faster estimation algorithm for the unknown p
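
The direct maximum likelihood approach of (1) and (2) can be illustrated with a small sketch that also counts how often the regression function is evaluated. Everything specific here is an assumption made for illustration and is not taken from the paper: `toy_model` is a hypothetical, cheap two-parameter stand-in for $f_\theta$ (not the MIT 2D climate model), the covariance $W$ is an assumed AR(1)-type matrix, the data are simulated, and Nelder-Mead is simply one convenient iterative optimizer.

```python
import numpy as np
from scipy.optimize import minimize


def neg_log_likelihood(theta, y, f, W):
    """Negative log of the Gaussian likelihood (2) for Y = f_theta + eps, eps ~ N(0, W)."""
    resid = y - f(theta)
    # The Cholesky factor W = L L' yields log det W and the quadratic form stably.
    L = np.linalg.cholesky(W)
    z = np.linalg.solve(L, resid)  # z'z equals resid' W^{-1} resid
    log_det = 2.0 * np.sum(np.log(np.diag(L)))
    n = y.size
    return 0.5 * (n * np.log(2.0 * np.pi) + log_det + z @ z)


# Hypothetical cheap stand-in for the expensive regression function f_theta.
t = np.linspace(0.0, 5.0, 40)
calls = {"n": 0}


def toy_model(theta):
    calls["n"] += 1  # count every evaluation of f_theta
    a, b = theta
    return a * (1.0 - np.exp(-b * t))


# Assumed AR(1)-type covariance: W_ij = sigma^2 * rho^|i-j|.
sigma, rho = 0.1, 0.5
idx = np.arange(t.size)
W = sigma**2 * rho ** np.abs(idx[:, None] - idx[None, :])

# Simulated "observations" from model (1) at known parameter values.
rng = np.random.default_rng(0)
true_theta = np.array([3.0, 0.8])
y = toy_model(true_theta) + np.linalg.cholesky(W) @ rng.standard_normal(t.size)

calls["n"] = 0  # reset so that only the optimizer's evaluations are counted
result = minimize(
    neg_log_likelihood,
    x0=np.array([1.0, 1.0]),
    args=(y, toy_model, W),
    method="Nelder-Mead",
)

print("estimate:", result.x)
print("evaluations of f_theta:", calls["n"])
```

Even for this two-parameter toy problem the optimizer evaluates `toy_model` many times. When each evaluation is a full climate model run, that count is what makes direct maximization impractical and motivates replacing $f_\theta$ with a faster surrogate.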

