Parameters defined via general estimating equations (GEE) can be estimated by maximizing the empirical likelihood (EL). Newey and Smith [Econometrica 72 (2004) 219--255] have recently shown that this EL estimator exhibits desirable higher-order asymptotic properties, namely, that its $O(n^{-1})$ bias is small and that bias-corrected EL is higher-order efficient. Although EL possesses these properties when the model is correctly specified, this paper shows that, in the presence of model misspecification, EL may cease to be root n convergent when the functions defining the moment conditions are unbounded (even when their expectations are bounded). In contrast, the related exponential tilting (ET) estimator avoids this problem. This paper shows that the ET and EL estimators can be naturally combined to yield an estimator called exponentially tilted empirical likelihood (ETEL) exhibiting the same $O(n^{-1})$ bias and the same $O(n^{-2})$ variance as EL, while maintaining root n convergence under model misspecification.
Point estimation with exponentially tilted empirical likelihood
1. Introduction. Statistical models defined via general estimating equations (GEE) of the form E[g(x, θ)] = 0, where g(x, θ) is a vector-valued nonlinear function of a random vector x and a parameter vector θ, are very common in statistics. In such models, the parameter vector θ is traditionally estimated using two-step efficient generalized method of moments (GMM) estimators [21]. Over the last two decades, various one-step alternatives to two-step GMM have been suggested. Perhaps the best known estimators of this class are the empirical likelihood (EL), exponential tilting (ET) and GMM with continuous updating (CU) estimators, which have been previously studied in the econometrics [22,26,27,35,47] and statistics [37,45,48,49,50,53] literatures. While all of these alternative estimators of θ share the first-order efficiency of efficient two-step GMM, their one-step nature provides them with desirable properties not enjoyed by GMM. In addition to bypassing the arbitrariness in the choice of the first-step estimate (since any consistent estimate of θ can, in principle, be used as a first step and lead to slightly different second-step estimates in finite samples), these one-step estimators are also invariant under general parameter-dependent linear transformations of the vector of moment conditions [30,50] and possess superior higher-order asymptotic properties [27,28,29,47].
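To make the one-step EL procedure concrete, the following sketch (illustrative only; the data and moment functions are not from the paper) profiles the empirical log-likelihood over θ by solving the convex inner dual problem in the Lagrange multiplier λ, for an overidentified pair of moment conditions that hold exactly for exponential data:

```python
import numpy as np
from scipy.optimize import minimize, minimize_scalar

rng = np.random.default_rng(0)
x = rng.exponential(scale=2.0, size=200)  # illustrative data, true mean theta0 = 2

def g(theta):
    # two moment conditions for one scalar parameter (overidentified):
    # E[x - theta] = 0 and E[x^2 - 2*theta^2] = 0, both exact for Exp(theta)
    return np.column_stack([x - theta, x**2 - 2.0 * theta**2])

def el_profile(theta):
    # inner (dual) problem: minimize over lambda of -sum_i log(1 + lambda' g_i),
    # a convex problem whose solution yields EL weights p_i = 1/(n(1 + lambda' g_i))
    gi = g(theta)
    def dual(lam):
        t = 1.0 + gi @ lam
        return 1e10 if np.any(t <= 1e-10) else -np.sum(np.log(t))
    res = minimize(dual, np.zeros(gi.shape[1]), method="Nelder-Mead")
    return res.fun  # profile empirical log-likelihood ratio (always <= 0)

# outer step: the EL estimator maximizes the profile over theta
res = minimize_scalar(lambda th: -el_profile(th), bounds=(0.5, 5.0), method="bounded")
theta_el = res.x
```

In a just-identified model the profile would be identically zero at its maximizer (the weights reduce to 1/n); the overidentified second moment is what forces a genuine trade-off between the conditions.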
Considerable effort has been devoted to identifying which of these alternative estimators, EL, ET or CU, is preferable. Since all of these estimators are asymptotically equivalent up to $O_p(n^{-1/2})$ when the overidentifying restrictions are valid, differences must reside in their higher-order asymptotic properties or in their behavior under potential model misspecification. The CU estimator is generally regarded as less desirable than EL and ET because its objective function has often been observed to possess multiple modes [22,30] and because it lacks the ability to generate likelihood ratio-based confidence regions whose shape adapts to the support of the data [4,50]. Comparing ET and EL proves to be more difficult. On the one hand, based on a stochastic expansion argument, Newey and Smith [47] have established that EL should typically have a lower finite-sample bias than both ET and CU. Also, they have shown that bias-corrected EL is more efficient, to higher order, than any other regular method of moments estimator. On the other hand, Imbens and co-workers [27,30] have indicated that EL, unlike ET, exhibits a singularity in its influence function, suggesting that ET should be better behaved than EL in the presence of model misspecification. In addition, ET admits a computationally convenient treatment of misspecified models [32].
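The "tilting" behind ET can be seen directly: for a given θ, ET reweights the sample with probabilities p_i ∝ exp(λ′g_i), with λ chosen so that the moment condition holds under the tilted weights, whereas EL uses p_i = 1/(n(1 + λ′g_i)), whose denominator can approach zero when g is unbounded. A minimal sketch with one moment condition and illustrative data (not from the paper):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(1)
g = rng.normal(0.5, 1.0, size=100)  # values g_i(theta) at a fixed theta; sample mean != 0

# ET dual: lambda minimizes the convex function mean(exp(lambda * g));
# its first-order condition is exactly sum_i p_i g_i = 0 for p_i ∝ exp(lambda * g_i)
lam = minimize_scalar(lambda l: np.mean(np.exp(l * g))).x

p_et = np.exp(lam * g)
p_et /= p_et.sum()

# the tilted weights satisfy the moment condition and remain strictly positive
# however large individual g_i are, unlike the EL weights
# p_i = 1/(n(1 + lambda * g_i)), which blow up as 1 + lambda * g_i -> 0
moment_after_tilt = float(p_et @ g)
```

The strict positivity of the exponential weights is the computational face of the robustness to misspecification discussed above.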
Although it can be argued that model misspecification can always be avoided through the use of specification tests, an alternative view is that most models are only approximations to the underlying phenomena and are therefore intrinsically misspecified. Accordingly, there exists a growing literature devoted to the study of so-called globally misspecified models (in which the misspecification does not vanish asymptotically). The classic theory of maximum likelihood estimators (MLE) when the distributional assumptions are misspecified can be found in [1,25,63,64]. In this context, MLE consistently estimates the so-called pseudo-true value of the parameter of interest [56], which is defined as the parameter value associated with the distribution which is the closest to the true data generating process according to the so-called Kullback-Leibler information criterion (KLIC) discrepancy.
In recent years, the analysis of misspecified models has been actively extended to various extremum estimators [2,13,44,51] and, in particular, to overidentified moment condition models [8,18,26,32,34,41]. Overidentified models arise naturally in a number of applications. For instance, consider a regression model y = x′θ + ε where ε is correlated with x (so that least squares cannot be used) but uncorrelated with a vector of so-called instruments (denoted z). This leads to a vector of restrictions of the form E[(y − x′θ)z] = 0, the dimension of which typically exceeds the dimension of θ. Given the overidentified (i.e., overdetermined) nature of the restrictions, it is then possible that no value of θ simultaneously satisfies all the moment restrictions exactly in the population, resulting in a misspecified model [41]. A more extensive discussion of misspecified models as well as many references to empirical studies that perform inference with models which fail standard specification tests can be found in [18].
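The possibility that no parameter value satisfies all the moment conditions can be seen in a toy example (constructed for illustration, not taken from the paper): with two deliberately incompatible conditions for a scalar θ, the GMM objective stays bounded away from zero at its minimizer, which plays the role of the pseudo-true value.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
x = rng.normal(size=500)

# two incompatible moment conditions for a single theta:
# E[x - theta] = 0 and E[x - theta - 1] = 0 cannot both hold
def gbar(theta):
    return np.array([np.mean(x) - theta, np.mean(x) - theta - 1.0])

# one-step GMM objective with identity weighting matrix
q = lambda th: float(gbar(th) @ gbar(th))

res = minimize_scalar(q)
theta_star = res.x   # pseudo-true value: mean(x) - 1/2
min_q = res.fun      # equals 1/2 exactly here, not 0: global misspecification
```

Writing u = mean(x) − θ, the objective is u² + (u − 1)², minimized at u = 1/2 with value 1/2; the gap between 0 and 1/2 is the population analogue of a failed overidentification test.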
The motivation behind this interest for misspecified models stems from two observations. First, the imperfections of a model, although statistically detectable, may nevertheless be small in absolute terms and consequently have little impact on the results ([42], pages 1168–1169). Second, a misspecified but parsimoniously parametrized
…(Full text truncated)…