The Residual Information Criterion, Corrected
Shi and Tsai (JRSSB, 2002) proposed an interesting residual information criterion (RIC) for model selection in regression. Their RIC was motivated by the principle of minimizing the Kullback-Leibler discrepancy between the residual likelihoods of the true and candidate models. We show, however, that under this principle RIC would always choose the full (saturated) model. The residual likelihood, therefore, is not appropriate as a discrepancy measure for defining an information criterion. We explain why this is so and provide a corrected residual information criterion as a remedy.
💡 Research Summary
Shi and Tsai (2002) introduced the Residual Information Criterion (RIC) as a model‑selection tool for linear regression. Their motivation was to minimize the Kullback‑Leibler (KL) discrepancy between the residual likelihoods of the true model and a candidate model. In this paper we demonstrate that, when the KL discrepancy is defined in terms of residual likelihoods, RIC inevitably selects the saturated model, regardless of the underlying data‑generating process. The root cause is that the residual likelihood depends on the determinant of the residual covariance matrix, which does not penalize model dimension sufficiently; consequently the KL term always favors adding more parameters.
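To make the dimension dependence concrete, recall the standard error-contrast form of the residual (REML) likelihood for a Gaussian linear model; the derivation below is a textbook identity, not a result specific to this paper. For a candidate model \(M\) with \(n \times p_M\) design matrix \(X_M\), let \(A\) be an \(n \times (n - p_M)\) matrix whose orthonormal columns satisfy \(A^\top X_M = 0\). The error contrasts \(w = A^\top y \sim N(0, \sigma^2 I_{n-p_M})\) do not depend on \(\beta\), and \(w^\top w = y^\top (I - P_{X_M}) y = \mathrm{RSS}_M\), so

\[
\ell_R(\sigma^2) = -\frac{1}{2}\left[(n - p_M)\log(2\pi\sigma^2) + \frac{\mathrm{RSS}_M}{\sigma^2}\right],
\qquad
\hat{\sigma}^2_M = \frac{\mathrm{RSS}_M}{n - p_M},
\]

and the maximized value is

\[
-2\,\ell_R(\hat{\sigma}^2_M) = (n - p_M)\left[\log\!\left(2\pi\hat{\sigma}^2_M\right) + 1\right].
\]

Because each model's residual likelihood is the likelihood of a different, \((n - p_M)\)-dimensional contrast vector, comparing these values across models means comparing likelihoods of different data: whenever \(\log(2\pi\hat{\sigma}^2_M) + 1 > 0\), the quantity above falls as \(p_M\) grows, even when the added covariates are pure noise.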
We first formalize RIC, derive its KL expression, and prove (Theorem 1) that the RIC value of the full model is never larger than that of any candidate model, so minimizing RIC always selects the full model. This result shows that RIC lacks the essential consistency property required of an information criterion.
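As an empirical companion to Theorem 1, the following sketch (our own illustration using standard linear-model facts, not the paper's derivation) simulates a model whose true size is 2 and tracks \(-2\,\ell_R(\hat{\sigma}^2_M)\) over nested candidates; without a dimension penalty the criterion keeps dropping as pure-noise covariates enter, bottoming out at the full model.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p_max = 100, 10

# First two covariates carry signal; the remaining eight are pure noise.
X = rng.standard_normal((n, p_max))
y = X[:, :2] @ np.array([2.0, -1.5]) + rng.standard_normal(n)

for p in range(1, p_max + 1):
    Xp = X[:, :p]
    beta_hat = np.linalg.lstsq(Xp, y, rcond=None)[0]   # OLS fit of first p covariates
    rss = np.sum((y - Xp @ beta_hat) ** 2)
    sigma2 = rss / (n - p)                             # REML variance estimate
    neg2_loglik = (n - p) * (np.log(2 * np.pi * sigma2) + 1)
    print(f"p={p:2d}  -2*residual loglik = {neg2_loglik:8.2f}")
```

On typical draws the printed values decrease nearly linearly in \(p\), by roughly \(\log(2\pi) + 1 \approx 2.84\) per added covariate once the two real signals are in the model, which is exactly the behavior the theorem describes.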
To remedy the flaw we propose a corrected residual information criterion (CRIC). Two possible remedies are considered: (i) replace the residual likelihood with the full likelihood, which abandons the “residual‑based” spirit of RIC; and (ii) retain the residual likelihood but add an explicit penalty that reflects the model's degrees of freedom. The latter leads to a criterion of the form

\[
\mathrm{CRIC}(M) \;=\; -2\,\ell_R(M) \;+\; c(n, p_M),
\]

where \(\ell_R(M)\) is the maximized residual log-likelihood of candidate model \(M\) and the penalty \(c(n, p_M)\) is increasing in the model dimension \(p_M\).
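The paper derives its own penalty; purely as a sketch of remedy (ii), the snippet below adds an assumed BIC-style penalty \(p \log n\) to the maximized residual log-likelihood (the function name and the penalty choice are ours, not the paper's exact correction) and re-runs the comparison, which now recovers a small model.

```python
import numpy as np

def corrected_residual_criterion(y, X):
    """-2 x maximized residual (REML) log-likelihood plus a dimension
    penalty; the p*log(n) penalty is an illustrative stand-in, not the
    paper's exact correction."""
    n, p = X.shape
    beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ beta_hat) ** 2)
    sigma2 = rss / (n - p)                       # REML variance estimate
    neg2_loglik = (n - p) * (np.log(2 * np.pi * sigma2) + 1)
    return neg2_loglik + p * np.log(n)           # explicit dimension penalty

rng = np.random.default_rng(0)
n, p_max = 100, 10
X = rng.standard_normal((n, p_max))
y = X[:, :2] @ np.array([2.0, -1.5]) + rng.standard_normal(n)  # true size 2

scores = [corrected_residual_criterion(y, X[:, :p]) for p in range(1, p_max + 1)]
print("selected model size:", int(np.argmin(scores)) + 1)     # typically 2
```

With a penalty that grows in \(p\), the roughly 2.84-per-covariate drop in the likelihood term is outweighed by \(\log n \approx 4.6\) per covariate, so the criterion turns back up past the true model size; any penalty with this property restores a finite minimizer, which is the structural point of the correction.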