Confidence intervals in regression utilizing prior information
We consider a linear regression model with regression parameter beta = (beta_1, …, beta_p) and independent and identically distributed N(0, sigma^2) errors. Suppose that the parameter of interest is theta = a^T beta, where a is a specified vector. Define the parameter tau = c^T beta - t, where the vector c and the number t are specified and a and c are linearly independent. Also suppose that we have uncertain prior information that tau = 0. We present a new frequentist 1-alpha confidence interval for theta that utilizes this prior information. We require this confidence interval to (a) have endpoints that are continuous functions of the data and (b) coincide with the standard 1-alpha confidence interval when the data strongly contradict this prior information. This interval is optimal in the sense that it has minimum weighted average expected length, where the largest weight is given to the expected length when tau = 0. This minimization leads to an interval with the following desirable properties: its expected length (a) is relatively small when the prior information about tau is correct and (b) has a maximum value that is not too large. The following problem illustrates the application of this new confidence interval. Consider a 2-by-2 factorial experiment with 20 replicates. Suppose that the parameter of interest theta is a specified simple effect and that we have uncertain prior information that the two-factor interaction is zero. Our aim is to find a frequentist 0.95 confidence interval for theta that utilizes this prior information.
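The 2-by-2 factorial illustration can be set up numerically as in the minimal sketch below. It is not the authors' code: the ±1 coding of the factors, the choice of simple effect (factor 1 at the high level of factor 2), the simulated responses, and all variable names are assumptions made here for illustration. It builds the design matrix, the vectors a and c, and computes the estimates of theta and tau, the error variance estimate, and the correlation between the two estimators.

```python
import numpy as np

# Hypothetical +/-1 coding of the two factors; 4 cells x 20 replicates = 80 runs.
levels = np.array([(-1, -1), (-1, 1), (1, -1), (1, 1)], dtype=float)
cells = np.repeat(levels, 20, axis=0)          # shape (80, 2)
x1, x2 = cells[:, 0], cells[:, 1]
X = np.column_stack([np.ones(len(x1)), x1, x2, x1 * x2])  # columns: 1, x1, x2, x1*x2

# theta = a^T beta: simple effect of factor 1 at the high level of factor 2,
# E[Y | x1=+1, x2=+1] - E[Y | x1=-1, x2=+1] = 2*beta_1 + 2*beta_12 (assumed choice).
a = np.array([0.0, 2.0, 0.0, 2.0])
# tau = c^T beta - t: the two-factor interaction coefficient, with t = 0.
c = np.array([0.0, 0.0, 0.0, 1.0])
t = 0.0

# Simulated responses (illustrative values only, not data from the paper).
rng = np.random.default_rng(0)
beta_true = np.array([10.0, 1.5, -0.5, 0.0])   # interaction set to zero, so tau = 0
y = X @ beta_true + rng.normal(scale=1.0, size=X.shape[0])

# Least-squares fit and the statistics (theta_hat, tau_hat, s^2).
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
df = X.shape[0] - X.shape[1]                   # n - p = 76
s2 = resid @ resid / df
theta_hat = a @ beta_hat
tau_hat = c @ beta_hat - t

# Variance factors: Var(theta_hat) = sigma^2 * v_a, Var(tau_hat) = sigma^2 * v_c.
v_a = a @ XtX_inv @ a
v_c = c @ XtX_inv @ c
rho = (a @ XtX_inv @ c) / np.sqrt(v_a * v_c)   # correlation of theta_hat and tau_hat

print(f"theta_hat={theta_hat:.3f}, tau_hat={tau_hat:.3f}, s2={s2:.3f}, rho={rho:.4f}")
```

Under this assumed coding the design is orthogonal, so the correlation between the two estimators is exactly 1/√2 ≈ 0.707 whatever the data. A correlation this far from zero suggests why information about the interaction could plausibly shorten an interval for the simple effect.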
💡 Research Summary
The paper addresses the construction of a frequentist 1‑α confidence interval for a linear combination θ = aᵀβ of regression coefficients in the classical linear model Y = Xβ + ε, ε ∼ N(0,σ²I). In addition to θ, the authors consider a secondary linear combination τ = cᵀβ − t, where a and c are linearly independent. The practitioner possesses uncertain prior information suggesting that τ = 0 (for example, that a two‑factor interaction is absent). The goal is to incorporate this information without sacrificing the guaranteed coverage of a frequentist interval.
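For reference, the standard 1‑α interval for θ in this model can be written out as below. This is the textbook least-squares result, not a formula quoted from the paper. It assumes that X is an n × p matrix of full column rank with n > p; the symbols n and t_{n−p, 1−α/2} (the 1−α/2 quantile of Student's t distribution with n−p degrees of freedom) are notation introduced here.

```latex
\[
\hat\beta = (X^\top X)^{-1} X^\top Y, \qquad
\hat\theta = a^\top \hat\beta, \qquad
s^2 = \frac{\lVert Y - X\hat\beta \rVert^2}{n - p}
\]
\[
\Bigl[\, \hat\theta - t_{n-p,\,1-\alpha/2}\; s \sqrt{a^\top (X^\top X)^{-1} a},\;\;
        \hat\theta + t_{n-p,\,1-\alpha/2}\; s \sqrt{a^\top (X^\top X)^{-1} a} \,\Bigr]
\]
```

This interval ignores τ̂ = cᵀβ̂ − t entirely, so it makes no use of the prior information that τ = 0.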
To achieve this, the authors impose four design criteria: (i) the interval endpoints must be continuous functions of the sufficient statistics (θ̂, τ̂, s²); (ii) when the data strongly contradict the prior (i.e., |τ̂| is large) the interval must revert to the standard ordinary least‑squares confidence interval; (iii) the interval must maintain at least 1‑α coverage for every possible value of τ; and (iv) the weighted average expected length,
∫ Eτ