In empirical research, when we have multiple estimators for the same parameter of interest, a central question arises: how do we combine unbiased but less precise estimators with biased but more precise ones to improve inference? In this setting, the point estimation problem has attracted considerable attention. In this paper, we focus on a less studied inference question: how can we conduct valid statistical inference in such settings with unknown bias? We propose a strategy to combine unbiased and biased estimators from a sensitivity analysis perspective. We derive a sequence of confidence intervals indexed by the magnitude of the bias, which enables researchers to assess how conclusions vary with the bias level. Importantly, we introduce the notion of the b-value, a critical value of the unknown maximum relative bias at which combining estimators no longer yields a significant result. We apply this strategy to three canonical combined estimators: the precision-weighted estimator, the pretest estimator, and the soft-thresholding estimator. For each estimator, we characterize the sequence of confidence intervals and determine the bias threshold at which the conclusion changes. Based on the theory, we recommend reporting the b-value based on the soft-thresholding estimator and its associated confidence intervals, which are robust to unknown bias and achieve the lowest worst-case risk among the alternatives.
In empirical research, it is common for researchers to employ different methods to estimate the same parameter of interest. These differences may arise from the use of distinct datasets or from imposing different model assumptions on the same dataset. We motivate our paper with the following two examples of combining estimators.
Example 1.1. Randomized controlled trials (RCTs) are the gold standard for estimating treatment effects due to their ability to eliminate unmeasured confounding. However, RCTs often suffer from limited sample sizes, as large-scale experiments can be costly or infeasible. In contrast, observational data are more readily available from the target population of interest. However, estimates using observational data may be biased in estimating treatment effects due to unmeasured confounding, raising concerns about the internal validity. See Brantner et al. (2023) and Colnet et al. (2024) for recent reviews on motivations and methods for combining RCTs and observational studies.
Example 1.2. The ordinary least squares (OLS) estimator is biased in estimating the unknown parameters under the linear model when the error term is correlated with the regressor. In contrast, the instrumental variables (IV) estimator can provide unbiased estimates for the parameters of interest with a valid instrumental variable that is uncorrelated with the error term but correlated with the regressor. However, the IV estimator is usually much less precise than the OLS estimator, especially when the IV is weakly correlated with the regressor (Bound et al., 1995). In empirical studies, e.g., Angrist and Krueger (1991), researchers often report results from both OLS and IV estimators. How to combine OLS and IV estimators is gaining increasing interest (Armstrong et al., 2025).
When the estimators are from different datasets, e.g., Example 1.1, the estimators are independent as long as the datasets are independent. When the estimators are from the same dataset but with different model assumptions, e.g., Example 1.2, the estimators are dependent in general. Given access to multiple potentially dependent estimators, some unbiased but less precise and others biased but more precise, a natural question is: How can we combine the unbiased and potentially biased estimators to improve the inference with unknown bias? From the point estimation perspective, this problem has been extensively studied (Bickel, 1984; Green and Strawderman, 1991; Giles and Giles, 1993; Chen et al., 2015; Athey et al., 2020; de Chaisemartin and D'Haultfoeuille, 2020; Rosenman et al., 2023a; Gao and Yang, 2023; Yang et al., 2025). Many methods have been proposed for constructing combined estimators that perform well when the bias is small and have bounded risks when the bias is large. From the statistical inference perspective, this problem is less studied. In this paper, we answer the following question: how can we conduct valid statistical inference after combining the estimators?
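To fix ideas, the simplest of the combination rules mentioned above is the precision-weighted estimator, which averages the two estimates with weights inversely proportional to their variances. The following is a minimal sketch, not the paper's exact construction; the numeric inputs are made-up illustrative values echoing Example 1.1 (a noisy unbiased RCT estimate combined with a precise but possibly biased observational estimate):

```python
def precision_weighted(est_unbiased, var_unbiased, est_biased, var_biased):
    """Combine two estimates with weights proportional to their
    precisions (1 / variance). The rule ignores any bias in the
    second estimate, so it is attractive only when that bias is
    believed to be small."""
    w1, w2 = 1.0 / var_unbiased, 1.0 / var_biased
    combined = (w1 * est_unbiased + w2 * est_biased) / (w1 + w2)
    combined_var = 1.0 / (w1 + w2)  # valid when the estimators are independent
    return combined, combined_var

# Illustrative numbers: unbiased estimate 2.0 with variance 1.0,
# biased estimate 1.0 with variance 0.25.
est, var = precision_weighted(2.0, 1.0, 1.0, 0.25)
```

Note that the variance formula above assumes independent estimators, as in Example 1.1; with dependent estimators, as in Example 1.2, the covariance must enter both the weights and the combined variance.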
This question has received considerably less attention. The primary difficulty lies in the impossibility of characterizing the distribution of the combined estimator with unknown bias (Armstrong et al., 2025). Once information about the bias is introduced, e.g., an upper bound on its magnitude, confidence intervals for the parameter of interest become possible. In the absence of such information, we construct a sequence of confidence intervals indexed by the assumed magnitude of the bias. Suppose the unbiased estimator alone does not reject the null hypothesis. If we have prior knowledge suggesting the bias is small, incorporating the biased estimator may yield a more precise estimator that rejects the null hypothesis. In such cases, the sequence of confidence intervals enables us to address the following question:
How large must the bias be to change the conclusion of a hypothesis test from rejection to non-rejection?
The idea of constructing the sequence of confidence intervals and examining how conclusions change as the assumed level of bias varies is related to sensitivity analysis in causal inference with unmeasured confounding, e.g., Cornfield et al. (1959); Rosenbaum and Rubin (1983); VanderWeele and Ding (2017). In observational studies, sensitivity analysis assesses how the causal conclusions change with respect to different degrees of unmeasured confounding by varying the sensitivity parameter (Rosenbaum, 2002; Ding and VanderWeele, 2016). Our proposed framework has a similar flavor: by indexing inference results over a continuum of bias levels, we can assess the robustness of statistical inference.
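The question above can be made concrete with a small sketch. Under an assumed bound B on the absolute bias, a bias-aware confidence interval widens the usual interval by B, and the critical bias level is the smallest B at which the interval stops excluding the null value. This is a simplified analogue of the paper's b-value (which is defined via the maximum relative bias), not its exact procedure:

```python
Z = 1.959963984540054  # 97.5% standard normal quantile, for a 95% CI

def bias_aware_ci(est, se, bias_bound):
    """Widen the usual 95% CI by the assumed bias bound B:
    [est - B - z * se, est + B + z * se]."""
    half = bias_bound + Z * se
    return est - half, est + half

def critical_bias(est, se, null_value=0.0):
    """Smallest bias bound B at which the bias-aware CI no longer
    excludes the null value; 0 means even B = 0 fails to reject."""
    margin = abs(est - null_value) - Z * se
    return max(margin, 0.0)

# Illustrative numbers: an estimate of 1.0 with standard error 0.3
# rejects the null of 0 until the assumed bias exceeds 1.0 - 1.96 * 0.3.
b = critical_bias(1.0, 0.3)
```

Scanning `bias_aware_ci` over a grid of bias bounds reproduces the sequence of confidence intervals described above, and `critical_bias` marks where the conclusion flips from rejection to non-rejection.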
We formalize two statistical inference questions, confidence intervals and hypothesis testing, in the context of combining unbiased and biased estimators. Under regularity conditions, the estimators satisfy a joint central limit theorem. Consequently, we present our formulation in a finite-sample Gaussian setting, assuming exact normality for both the unbiased and biased estimators. This reduction to a Gaussian model is motivated by Le Cam's classical asymptotic arguments.
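In this Gaussian setting, the soft-thresholding estimator recommended in the abstract can be sketched as follows. One common form treats the discrepancy between the biased and unbiased estimators as an estimate of the bias, soft-thresholds it, and corrects the biased estimator; the exact form analyzed in the paper may differ, so this is only an illustration:

```python
import math

def soft_threshold(x, lam):
    """Shrink x toward zero by lam: sign(x) * max(|x| - lam, 0)."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

def soft_threshold_combine(est_unbiased, est_biased, lam):
    """Soft-threshold the estimated discrepancy and correct the
    biased estimator. Small discrepancy: keep the precise biased
    estimate; large discrepancy: move to within lam of the
    unbiased estimate. The threshold lam is a tuning parameter."""
    discrepancy = est_biased - est_unbiased
    return est_biased - soft_threshold(discrepancy, lam)
```

Unlike the pretest estimator, which switches abruptly between the two estimates, this rule interpolates continuously, which is the intuition behind its favorable worst-case behavior.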