Optimality of the Half-Order Exponent in the Turing-Good Identities for Bayes Factors

Bayes factors are widely computed by Monte Carlo, yet heavy-tailed sampling distributions can make numerical validation unreliable. The Turing--Good identities provide exact moment equalities for powers of a Bayes factor (a density ratio). When these…

Authors: Kensuke Okada

Kensuke Okada
Graduate School of Education, The University of Tokyo
February 24, 2026

Abstract

Bayes factors are widely computed by Monte Carlo, yet heavy-tailed sampling distributions can make numerical validation unreliable. The Turing–Good identities provide exact moment equalities for powers of a Bayes factor (a density ratio). When these identities are used as Good-check diagnostics, the power choice becomes a statistical design parameter. We develop a nonasymptotic variance theory for Monte Carlo evaluation of the identities and show that the half-order (square-root) power is uniquely minimax-stable: it equalizes variability across the two model orientations and is the only choice that guarantees finite second moments in a distribution-free worst-case sense over all mutually absolutely continuous model pairs. This yields a balanced two-sample half-order diagnostic that is symmetric in model labeling and has a uniform variance bound at fixed computational budget; in small-overlap regimes it is guaranteed to be no less efficient than the standard one-sided Turing check. Simulations for binomial Bayes factor workflows illustrate stable finite-sample behavior and sensitivity to simulator–evaluator mismatches. We further connect the half-order overlap viewpoint to stable primitives for normalizing-constant ratios and importance-sampling degeneracy summaries.

Keywords: marginal likelihood; Bayes factor; Hellinger affinity; Rényi divergence; Monte Carlo diagnostics; importance sampling; bridge sampling

1 Introduction

Bayes factors provide a canonical Bayesian measure of relative evidence between two statistical models.
Given rival hypotheses $H_1$ and $H_2$ with marginal likelihoods (prior predictives) $p_1(x)$ and $p_2(x)$, the Bayes factor $B(x) := p_1(x)/p_2(x)$ updates prior odds into posterior odds via multiplication (Kass & Raftery 1995). As Heck et al. (2023, p. 558) note, “The last 25 years have shown a steady increase in attention for the Bayes factor as a tool for hypothesis evaluation and model selection.” Bayes factors are routinely used as decision statistics (e.g., threshold rules that declare $H_1$ when $B(x)$ is large), which raises both computational and frequentist-calibration questions (Dickey 1971, Kass & Raftery 1995). Bayes factors are used not only for hypothesis tests but also to test alternative substantive theories against each other (Lee & Wagenmakers 2013). While asymptotic approximations such as BIC connect Bayes factors to log-likelihood differences in regular models (Schwarz 1978), modern applications often hinge on reliable finite-sample computations of marginal likelihoods.

Computing marginal likelihoods, and more generally estimating ratios of normalizing constants, is notoriously delicate. Many practical algorithms reduce these problems to Monte Carlo evaluation of expectations of a density ratio (the Radon–Nikodym derivative), using importance sampling, bridge/path sampling, and related free-energy estimators (Bennett 1976, Meng & Wong 1996, Shirts & Chodera 2008, Gronau et al. 2020). Such methods can be highly effective when the relevant distributions overlap substantially. However, under tail mismatch the ratio can develop extreme weights that dominate Monte Carlo averages, leading to severe variance inflation, highly skewed sampling distributions, and unstable standard-error assessments. Accordingly, there is a growing emphasis on workflow-level diagnostics and rigorous validation of Bayes factor implementations (Schad et al. 2021).
A remarkably general lens on the sampling behavior of density ratios is provided by the classic Turing–Good moment identity (Good 1985, Jacod & Shiryaev 2003):

$$\mathbb{E}_{H_2}\!\left[B(X)^{t}\right] = \mathbb{E}_{H_1}\!\left[B(X)^{t-1}\right].$$

Recently, Sekulovski et al. (2024) advocated leveraging selected instances of this identity, specifically $t = 1$ (the “Turing” identity $\mathbb{E}_{H_2}[B] = 1$) and $t = 2$, as practical Good checks for validating numerical Bayes factor computations.

When these exact population identities are repurposed as Monte Carlo diagnostics, the exponent $t$ becomes a critical statistical design parameter. The identities fix means, but numerical reliability is governed by second moments. Because $B(X)$ is often highly skewed and heavy-tailed, natural integer-order choices like the $t = 1$ or $t = 2$ checks can exhibit enormous or even infinite variance for perfectly valid, mutually absolutely continuous model pairs. This exposes a methodological paradox: the very regimes where validation is most needed (weak overlap and heavy tails) are exactly those where the diagnostic statistics themselves can become ill-posed in finite Monte Carlo runs.

This paper resolves the design paradox by studying exponent selection from an explicitly computation-aware perspective. We ask: which exponent yields a moment identity that can be verified stably, in a distribution-free sense? We show that the half-order exponent $t = \tfrac12$ is mathematically distinguished. Among the continuum of valid identities, the half-order is the unique exponent that equalizes the natural variances under $H_1$ and $H_2$, and it uniquely yields a distribution-free, finite second-moment guarantee for the per-draw diagnostic building blocks across all mutually absolutely continuous pairs.

Our main contributions are threefold:

1. Exact variance theory and unique minimax stability (Section 2).
We derive exact nonasymptotic variance expressions for Monte Carlo evaluation of the expectation from either side of the Turing–Good identity in terms of shifted overlaps. At half order, $B^{1/2}$ under $H_2$ and $B^{-1/2}$ under $H_1$ have equal finite variance. Conversely, for every $t \neq \tfrac12$ there exist mutually absolutely continuous pairs for which the worst-side variance diverges.

2. Robust Good checks at matched cost (Section 3). We propose a symmetric, two-sided half-order diagnostic that resolves the main bottlenecks of integer-moment Good checks. Under a fixed budget of Bayes factor evaluations, the balanced two-sample half-order discrepancy has a bounded variance, uniformly over mutually absolutely continuous pairs. We also provide an analytic matched-budget comparison showing that, in the small-overlap regime, the half-order two-sided check is guaranteed to be no less efficient (in variance) than the one-sided $t = 1$ Turing check. Simulation studies illustrate stable finite-sample behavior and sensitivity to simulator–evaluator mismatches.

3. Broader implications for ratio estimation and weight degeneracy (Section 4). We show that the same half-order viewpoint extends beyond Good checks to core computational tasks: a canonical geometric bridge identity for normalizing-constant ratios whose building blocks have distribution-free finite second moments, and stable overlap-guided summaries of importance-weight concentration that do not rely on fragile second moments.

Section 2 establishes the exact variance structure of the Turing–Good identities and proves the unique minimax stability of the half-order exponent. Section 3 develops the balanced two-sided half-order diagnostic, provides matched-budget variance comparisons, and presents simulation studies. Section 4 explores broader implications for normalizing-constant ratios and importance sampling diagnostics.
Section 5 concludes with a discussion.

2 Turing–Good Identities, Variance Structure, and Half-Order Optimality

2.1 Setup and basic properties

Let $(\mathcal{X}, \mathcal{A}, \mu)$ be a measure space with $\mu$ a $\sigma$-finite dominating measure. We consider two competing models (or hypotheses) $H_1$ and $H_2$ with marginal likelihood (prior predictive) densities $p_1$ and $p_2$ with respect to $\mu$. Let $P_j$ be the induced probability measures, $P_j(A) := \int_A p_j \, d\mu$. Throughout we assume mutual absolute continuity, $P_1 \ll P_2$ and $P_2 \ll P_1$; equivalently,

$$p_1(x) = 0 \iff p_2(x) = 0 \quad \text{for } \mu\text{-a.e. } x \in \mathcal{X}, \tag{1}$$

so that the Bayes factor is well-defined $P_2$-a.s. (and $P_1$-a.s.):

$$B(x) := \frac{p_1(x)}{p_2(x)} \quad \text{for } p_2(x) > 0,$$

with an arbitrary value (e.g., 1) on $\{p_2(x) = 0\}$, which is $P_1$- and $P_2$-null. For readability, we write $\int f(x)\,dx$ as shorthand for $\int f \, d\mu$.

For $t \in \mathbb{R}$, define

$$I(t) := \int p_1(x)^{t}\, p_2(x)^{1-t} \, dx, \tag{2}$$

the Hellinger integral of order $t$ (Jacod & Shiryaev 2003, p. 228). We denote the effective domain (which depends on the pair $(p_1, p_2)$) by

$$\mathcal{D} = \mathcal{D}(p_1, p_2) := \{ t \in \mathbb{R} : I(t) < \infty \}.$$

Unless stated otherwise, whenever we write $\mathbb{E}_{H_2}[B^{t}]$ or $\mathbb{E}_{H_1}[B^{t-1}]$ we implicitly assume $t \in \mathcal{D}$. Note that $t \in \mathcal{D}$ does not in general imply $2t \in \mathcal{D}$ or $2t - 1 \in \mathcal{D}$; these additional conditions govern the finiteness of the second moments (and hence whether the variances in Lemma 2.4 are finite).

For $t \in (0, \infty) \cap \mathcal{D} \setminus \{1\}$, the quantity

$$D_t(p_1 \,\|\, p_2) := \frac{1}{t-1} \log I(t)$$

is the order-$t$ Rényi divergence from $p_1$ to $p_2$. At $t = 1$, interpret by continuity:

$$\lim_{t \to 1} \frac{1}{t-1} \log I(t) = D_{\mathrm{KL}}(p_1 \,\|\, p_2).$$

We write $P_{H_j}$ and $\mathbb{E}_{H_j}$ for probability/expectation under the law with density $p_j$ (equivalently, $P_{p_j}$ and $\mathbb{E}_{p_j}$).

Theorem 2.1 (Turing–Good moment identity; Good 1985). For any $t \in \mathcal{D}$,

$$\mathbb{E}_{H_1}\!\left[B^{t-1}\right] = \mathbb{E}_{H_2}\!\left[B^{t}\right] = I(t).$$

Proof.
By definition of $B$,

$$\mathbb{E}_{H_1}\!\left[B^{t-1}\right] = \int \left(\frac{p_1}{p_2}\right)^{t-1} p_1 \, dx = \int p_1^{t}\, p_2^{1-t} \, dx = I(t),$$

and similarly

$$\mathbb{E}_{H_2}\!\left[B^{t}\right] = \int \left(\frac{p_1}{p_2}\right)^{t} p_2 \, dx = \int p_1^{t}\, p_2^{1-t} \, dx = I(t).$$

Corollary 2.2 (Turing's theorem). At $t = 1$ in Theorem 2.1, $\mathbb{E}_{H_2}[B] = 1$.

Lemma 2.3 (Basic bounds and equality characterization).
(i) $I(0) = I(1) = 1$.
(ii) For every $t \in [0, 1]$, $I(t) \le 1$ (in particular, $[0, 1] \subset \mathcal{D}$).
(iii) For every $t \notin [0, 1]$ with $t \in \mathcal{D}$, $I(t) \ge 1$.
If $t \notin \{0, 1\}$, then equality in (ii) or (iii) holds if and only if $p_1(x) = p_2(x)$ $\mu$-a.e.

Proof. (i) follows from $I(0) = \int p_2 \, dx = 1$ and $I(1) = \int p_1 \, dx = 1$. For (ii), if $0 < t < 1$ then the weighted AM–GM inequality yields $p_1(x)^{t} p_2(x)^{1-t} \le t\, p_1(x) + (1 - t)\, p_2(x)$ pointwise; integrating both sides gives $I(t) \le 1$. For (iii), let $f_t(u) = u^{t}$ on $(0, \infty)$. For $t < 0$ or $t > 1$, $f_t$ is strictly convex, so Jensen's inequality gives

$$I(t) = \mathbb{E}_{H_2}[B^{t}] \ge \{\mathbb{E}_{H_2}[B]\}^{t} = 1,$$

where we used Corollary 2.2 in the last equality. If $t \notin \{0, 1\}$, equality in Jensen holds if and only if $B$ is $H_2$-a.s. constant (for $0 < t < 1$ the same argument applies with $f_t$ strictly concave and the inequality reversed). Since $\mathbb{E}_{H_2}[B] = 1$, this forces $B \equiv 1$ $H_2$-a.s., and hence $p_1 = p_2$ $\mu$-a.e.

Remark (Domain and finiteness on $[0, 1]$). Lemma 2.3 (ii) implies that $[0, 1] \subset \mathcal{D}$ and that $I(t) \in [0, 1]$ for all $t \in [0, 1]$. In particular, the half-order overlap $\rho = I(\tfrac12)$ is always well-defined and finite. Under mutual absolute continuity (1), we also have $\rho > 0$ (and $\rho = 1$ if and only if $p_1 = p_2$ $\mu$-a.e.).

A quantity that will play a central role is the half-order overlap

$$\rho := I(\tfrac12) = \int \sqrt{p_1(x)\, p_2(x)} \, dx \in (0, 1], \tag{3}$$

the Hellinger affinity (Bhattacharyya coefficient) between the two marginal likelihoods.

2.2 Variance identities and a CGF viewpoint

The Turing–Good moment identity fixes the means of the transformed Bayes factors $B^{t-1}$ under $H_1$ and $B^{t}$ under $H_2$.
For exponent selection, the next natural object is their dispersion.

Convention (Second moments and extended variances). We allow variances to take values in $[0, \infty]$, interpreting $\operatorname{Var}(\cdot) = +\infty$ when the corresponding second moment is infinite; equivalently, $\operatorname{Var}_{H_2}(B^{t}) < \infty$ exactly when $2t \in \mathcal{D}$ and $\operatorname{Var}_{H_1}(B^{t-1}) < \infty$ exactly when $2t - 1 \in \mathcal{D}$.

Lemma 2.4 (Variance identities). For any $t \in \mathcal{D}$,

$$\operatorname{Var}_{H_1}\!\left(B^{t-1}\right) = I(2t - 1) - I(t)^2, \qquad \operatorname{Var}_{H_2}\!\left(B^{t}\right) = I(2t) - I(t)^2, \tag{4}$$

where the right-hand sides are interpreted in $[0, \infty]$ according to the preceding convention.

Proof. Fix $t \in \mathcal{D}$, so $I(t) < \infty$ and hence $\mathbb{E}_{H_2}[B^{t}] = \mathbb{E}_{H_1}[B^{t-1}] = I(t)$. For the $H_2$ side, using $\operatorname{Var}(X) = \mathbb{E}[X^2] - \{\mathbb{E}[X]\}^2$ with the convention that $\operatorname{Var}(X) = +\infty$ when $\mathbb{E}[X^2] = +\infty$,

$$\operatorname{Var}_{H_2}(B^{t}) = \mathbb{E}_{H_2}[B^{2t}] - \{\mathbb{E}_{H_2}[B^{t}]\}^2.$$

Moreover,

$$\mathbb{E}_{H_2}[B^{2t}] = \int \left(\frac{p_1}{p_2}\right)^{2t} p_2 \, d\mu = \int p_1^{2t}\, p_2^{1-2t} \, d\mu = I(2t) \in [0, \infty].$$

Therefore $\operatorname{Var}_{H_2}(B^{t}) = I(2t) - I(t)^2$ in $[0, \infty]$. The $H_1$ side is analogous:

$$\mathbb{E}_{H_1}[B^{2(t-1)}] = \int \left(\frac{p_1}{p_2}\right)^{2(t-1)} p_1 \, d\mu = \int p_1^{2t-1}\, p_2^{2-2t} \, d\mu = I(2t - 1),$$

hence $\operatorname{Var}_{H_1}(B^{t-1}) = I(2t - 1) - I(t)^2$.

Corollary 2.5 (Half-order equalization). At $t = \tfrac12$, the two variances coincide and are always finite:

$$\operatorname{Var}_{H_1}\!\left(B^{-1/2}\right) = \operatorname{Var}_{H_2}\!\left(B^{1/2}\right) = 1 - \rho^2.$$

Proof. Apply Lemma 2.4 at $t = \tfrac12$ and use Lemma 2.3 (i), i.e., $I(0) = I(1) = 1$, together with (3).

Corollary 2.5 already hints at the special role of $t = \tfrac12$: the relevant second moments are anchored at $I(0)$ and $I(1)$, which are identically 1 for all pairs $(p_1, p_2)$, whereas for generic $t$ one must control $I(2t)$ and $I(2t - 1)$, which may fail to be finite. To formalize a minimax statement, we use a cumulant generating function viewpoint.

Lemma 2.6 (CGF of $\log B$ and log-convexity of $I$).
Let $Y := \log B$ and $\phi(t) := \log I(t)$. For $t \in \mathcal{D}$, $I(t) = \mathbb{E}_{H_2}[e^{tY}]$, hence $\phi$ is the cumulant generating function of $Y$ under $H_2$. In particular, $I$ is log-convex on $\mathcal{D}$ (equivalently, $\phi$ is convex on $\mathcal{D}$), and $\mathcal{D}$ is a convex set. Moreover, if $p_1$ and $p_2$ are not $\mu$-a.e. equal, then $\phi$ is strictly convex on $\operatorname{int}(\mathcal{D})$.

Proof. The identity $I(t) = \mathbb{E}_{H_2}[e^{tY}]$ is immediate from Theorem 2.1 and the definition of $Y$. For log-convexity, let $t_1, t_2 \in \mathcal{D}$ and $\lambda \in (0, 1)$. By Hölder's inequality,

$$I\big(\lambda t_1 + (1 - \lambda) t_2\big) = \mathbb{E}_{H_2}\!\left[e^{(\lambda t_1 + (1 - \lambda) t_2) Y}\right] = \mathbb{E}_{H_2}\!\left[e^{\lambda t_1 Y} e^{(1 - \lambda) t_2 Y}\right] \le I(t_1)^{\lambda}\, I(t_2)^{1 - \lambda}.$$

Taking logarithms yields convexity of $\phi$, and in particular the left-hand side is finite, so $\mathcal{D}$ is convex. If $p_1$ and $p_2$ are not $\mu$-a.e. equal, then $Y$ is nondegenerate under $H_2$, and strictness of Hölder (for $t_1 \neq t_2$ in $\operatorname{int}(\mathcal{D})$) implies strict convexity of $\phi$ on $\operatorname{int}(\mathcal{D})$.

2.3 A minimax principle for choosing the exponent

For $t \in \mathcal{D}$, define

$$R_1(t) := \operatorname{Var}_{H_1}(B^{t-1}) \in [0, \infty], \qquad R_2(t) := \operatorname{Var}_{H_2}(B^{t}) \in [0, \infty].$$

Since $I(t) < \infty$ for $t \in \mathcal{D}$, Lemma 2.4 together with the convention on extended variances implies that $R_2(t) < \infty$ if and only if $2t \in \mathcal{D}$, and $R_1(t) < \infty$ if and only if $2t - 1 \in \mathcal{D}$. We define the worst-side risk

$$R(t) := \max\{R_1(t), R_2(t)\} \in [0, \infty].$$

Notation (Worst-case comparisons). Define the class of mutually absolutely continuous density pairs

$$\mathcal{P}_{ac} := \big\{ (p_1, p_2) : p_1, p_2 \text{ are } \mu\text{-densities on } (\mathcal{X}, \mathcal{A}) \text{ satisfying (1)} \big\}.$$

All “worst-case” suprema below (e.g., $\sup_{(p_1, p_2) \in \mathcal{P}_{ac}}$) are taken over this class.

We now show that $t = \tfrac12$ is the unique minimax choice for $R(t)$ and, moreover, the only exponent whose worst-case risk is uniformly finite over mutually absolutely continuous pairs $(p_1, p_2)$.

Theorem 2.7 (Half-order is minimax and uniquely worst-case stable). Assume that $p_1$ and $p_2$ are not $\mu$-a.e. equal.
(i) (Pairwise minimaxity and uniqueness.) For every $t \in \mathcal{D}$,

$$R(t) \ge 1 - \rho^2,$$

with equality if and only if $t = \tfrac12$. In particular, $t = \tfrac12$ uniquely minimizes $R(t)$ over $t \in \mathcal{D}$, and $R(\tfrac12) = 1 - \rho^2$.

(ii) (Worst-case divergence away from half-order.) For every $t \neq \tfrac12$, there exists a mutually absolutely continuous pair $(p_1, p_2)$ for which $I(t) < \infty$ but $R(t) = +\infty$. In particular, for every fixed $t \neq \tfrac12$,

$$\sup\big\{ R_{p_1, p_2}(t) : (p_1, p_2) \in \mathcal{P}_{ac},\ I(t) < \infty \big\} = +\infty,$$

whereas for every $(p_1, p_2) \in \mathcal{P}_{ac}$,

$$R_{p_1, p_2}(\tfrac12) = 1 - \rho(p_1, p_2)^2 < \infty.$$

Remark 2.8 (Degenerate equal-model case). If $p_1 = p_2$ $\mu$-a.e., then $B \equiv 1$ and $I(t) = 1$ for all $t$, so $R_1(t) = R_2(t) = 0$ and exponent selection is immaterial. The assumption $p_1 \neq p_2$ in Theorem 2.7 is used only to obtain strict inequalities and uniqueness.

Proof of Theorem 2.7. Write $\phi(t) = \log I(t)$ as in Lemma 2.6. For $t \ge \tfrac12$ (with $2t \in \mathcal{D}$) define

$$g_+(t) := \phi(2t) - 2\phi(t),$$

and for $t \le \tfrac12$ (with $2t - 1 \in \mathcal{D}$) define

$$g_-(t) := \phi(2t - 1) - 2\phi(t).$$

Since $\phi$ is convex, its right-derivative $\phi'_+(t)$ exists on $\operatorname{int}(\mathcal{D})$ and is nondecreasing. Thus for $t \ge \tfrac12$ (whenever $t, 2t \in \operatorname{int}(\mathcal{D})$),

$$g'_+(t) = 2\phi'_+(2t) - 2\phi'_+(t) \ge 0.$$

Similarly, for $t \le \tfrac12$ (whenever $t, 2t - 1 \in \operatorname{int}(\mathcal{D})$),

$$g'_-(t) = 2\phi'_+(2t - 1) - 2\phi'_+(t) \le 0.$$

Therefore $g_+$ is nondecreasing on $\{t \in \mathcal{D} : t \ge \tfrac12,\ 2t \in \mathcal{D}\}$ and $g_-$ is nonincreasing on $\{t \in \mathcal{D} : t \le \tfrac12,\ 2t - 1 \in \mathcal{D}\}$. Exponentiating, for all $t$ such that the displayed quantities are finite,

$$\frac{I(2t)}{I(t)^2} = e^{g_+(t)} \ge e^{g_+(1/2)} = \frac{I(1)}{I(\tfrac12)^2} = \rho^{-2} \qquad (t \ge \tfrac12), \tag{5}$$

$$\frac{I(2t - 1)}{I(t)^2} = e^{g_-(t)} \ge e^{g_-(1/2)} = \frac{I(0)}{I(\tfrac12)^2} = \rho^{-2} \qquad (t \le \tfrac12). \tag{6}$$

Proof of (i). We split by the location of $t$ relative to $\tfrac12$.

Case $t > \tfrac12$.
If $I(2t) = \infty$, then $\mathbb{E}_{H_2}[B^{2t}] = I(2t) = \infty$, hence $\operatorname{Var}_{H_2}(B^{t}) = \infty$ and therefore $R(t) = \infty \ge 1 - \rho^2$. Otherwise $I(2t) < \infty$, and (5) yields $I(t)^2 \le \rho^2 I(2t)$. Thus,

$$R_2(t) = I(2t) - I(t)^2 \ge I(2t)\,(1 - \rho^2).$$

Moreover, since $2t > 1$, Lemma 2.3 (iii) gives $I(2t) > 1$ for $p_1 \neq p_2$, so $R_2(t) > 1 - \rho^2$.

Case $t < \tfrac12$. If $I(2t - 1) = \infty$, then $\mathbb{E}_{H_1}[B^{2(t-1)}] = I(2t - 1) = \infty$, hence $\operatorname{Var}_{H_1}(B^{t-1}) = \infty$ and therefore $R(t) = \infty \ge 1 - \rho^2$. Otherwise $I(2t - 1) < \infty$, and (6) yields $I(t)^2 \le \rho^2 I(2t - 1)$. Therefore,

$$R_1(t) = I(2t - 1) - I(t)^2 \ge I(2t - 1)\,(1 - \rho^2).$$

Since $2t - 1 < 0$, Lemma 2.3 (iii) gives $I(2t - 1) > 1$ for $p_1 \neq p_2$, so $R_1(t) > 1 - \rho^2$.

Case $t = \tfrac12$. By Corollary 2.5, $R(\tfrac12) = 1 - \rho^2$ and both side-specific variances coincide.

Combining the three cases proves $R(t) \ge 1 - \rho^2$ for all $t \in \mathcal{D}$, with equality only at $t = \tfrac12$.

Proof of (ii). Fix $t \neq \tfrac12$. It suffices to exhibit (for each such $t$) a single mutually absolutely continuous pair $(p_1, p_2)$ for which $I(t) < \infty$ but either $I(2t) = \infty$ (when $t > \tfrac12$) or $I(2t - 1) = \infty$ (when $t < \tfrac12$), because then Lemma 2.4 gives $R(t) = \infty$. Work on $((0, 1), \mathcal{B}, \mu)$ with $\mu$ the Lebesgue measure and take $p_2(x) \equiv 1$.

• If $t > \tfrac12$, let $\gamma := 1/(2t) \in (0, 1)$ and set $p_1(x) = (1 - \gamma)\, x^{-\gamma}\, \mathbf{1}_{(0,1)}(x)$. Then

$$I(s) = \int_0^1 p_1(x)^{s} \, dx = (1 - \gamma)^{s} \int_0^1 x^{-\gamma s} \, dx = \begin{cases} \dfrac{(1 - \gamma)^{s}}{1 - \gamma s}, & \gamma s < 1, \\[1ex] +\infty, & \gamma s \ge 1. \end{cases}$$

Since $\gamma t = \tfrac12 < 1$ while $\gamma (2t) = 1$, we have $I(t) < \infty$ but $I(2t) = \infty$, so $R_2(t) = \infty$.

• If $t < \tfrac12$, let $\gamma := 1/(1 - 2t) > 0$ and set $p_1(x) = (\gamma + 1)\, x^{\gamma}\, \mathbf{1}_{(0,1)}(x)$. Then

$$I(s) = \int_0^1 p_1(x)^{s} \, dx = (\gamma + 1)^{s} \int_0^1 x^{\gamma s} \, dx = \begin{cases} \dfrac{(\gamma + 1)^{s}}{\gamma s + 1}, & \gamma s > -1, \\[1ex] +\infty, & \gamma s \le -1. \end{cases}$$
Here $\gamma t = t/(1 - 2t) > -1$, so $I(t) < \infty$, but $\gamma (2t - 1) = -1$, so $I(2t - 1) = \infty$ and hence $R_1(t) = \infty$.

In both cases, the constructed $(p_1, p_2)$ are strictly positive on $(0, 1)$ and therefore mutually absolutely continuous, completing the proof.

3 Good checks: two-sided half-order diagnostics and simulation studies

This section develops the implications of Section 2 for Good checks of Bayes factor computations (Sekulovski et al. 2024). We first recall why moment-identity checks can be fragile in practice, due to the skewness and heavy tails of the Bayes factor under either hypothesis. We then formalize a two-sided diagnostic based on independently estimating both sides of the Turing–Good identity, establishing that the half-order choice is uniquely minimax-stable in a two-sample sense and yields a favorable matched-budget variance comparison. Finally, we provide numerical illustrations that highlight the finite-sample stability of the half-order check and its ability to detect implementation and prior mismatches.

3.1 Good checks for Bayes factor computations: motivation and bottlenecks

The immediate motivation for this work is the Good check proposed by Sekulovski et al. (2024), a practical diagnostic that leverages the Turing–Good identities to validate numerical Bayes factor computations. Good checks are attractive because they are broadly model-agnostic: they apply to any pair of marginal likelihoods $(p_1, p_2)$ defined on a common measure space, provided one can (i) simulate from the corresponding prior predictives and (ii) evaluate the Bayes factor $B = p_1/p_2$ for each simulated dataset. At the same time, the diagnostic inherits a fundamental difficulty from Bayes factors themselves, namely the skewness and heavy-tail behavior of $B = p_1/p_2$ under either model. We summarize
W e summarize 13 k ey b ottlenec ks of the original Goo d-chec k design and use them to motiv ate the half-order viewp oin t dev elop ed in Section 2 . Existing b ottlenec k: Go o d c hec ks are vulnerable to the tails of the Ba y es factor distribution. The T uring–Go o d identities are exact p opulation equalities (e.g., Corollary 2.2 and Theorem 2.1 ), but Goo d c hecks must ev aluate them numerically , t ypically via Monte Carlo: one sim ulates syn thetic datasets under a designated generating model, computes Ba yes factors using the implementation under scrutin y , and then chec ks whether the relev an t empirical av erages match the theoretical v alues ( Sekulo vski et al. 2024 ). The practical difficulty is that the Bay es factor distribution under the generating mo del is often highly asymmetric with hea vy tails. As a result, Mon te Carlo a verages can b e dominated b y r ar e extr eme events : even though such even ts hav e tiny probability , they can induce astronomically large Bay es factor v alues, and hence con tribute non-negligibly (or even decisiv ely) to the mean. A simple illustration already app ears in the canonical binomial point-n ull example. When H 2 : θ = 1 / 2 is true and one ev aluates a Bay es factor fav oring H 1 , ev ents suc h as y = 0 or y = 1 hav e probabilities 2 − n and n 2 − n , resp ectiv ely; for n = 50 these are ab out 8 . 9 × 10 − 16 and 4 . 4 × 10 − 14 . Suc h ev ents are practically never observ ed in finite Mon te Carlo runs, yet the Ba yes factor v alues they produce can b e extremely large. Consequen tly , the empirical mean ma y fail to approximate its theoretical v alue even when the implemen tation is correct, leading to “false alarms” or inconclusive diagnostics ( Sekulo vski et al. 2024 ). Existing b ottleneck: in teger-momen t c hec ks amplify tail sensitivit y , and existing practice requires choosing a “true” (or “more complex”) mo del. 
The original Good check emphasizes integer-moment instances, most notably the $t = 1$ (Turing) and $t = 2$ cases. The $t = 2$ check effectively brings in a higher-order moment (equivalently, a squared Bayes factor term) and therefore magnifies tail sensitivity: in principle, the larger the moment order, the more severely rare extremes influence the Monte Carlo estimate. This creates two practical complications emphasized by Sekulovski et al. (2024). First, numerical stability becomes fragile in exactly the settings where diagnostics are most needed (weak overlap between $p_1$ and $p_2$, large sample size, sharp marginal likelihoods). Second, to mitigate this instability, one is often advised to generate synthetic data from the “more complex” model when computing the tail-sensitive terms. However, in many realistic comparisons it is not obvious which model should be regarded as more complex (e.g., multiple nonlinear models of similar dimension and flexibility), and the diagnostic becomes contingent on a subjective design choice.

Design question and resolution: choosing an exponent with two-sided, distribution-free stability. The Turing–Good identity holds for a continuum of exponents. However, Monte Carlo reliability is governed by second moments, and the variance identities in Lemma 2.4 show that the two side-specific risks depend on the shifted overlaps $I(2t)$ and $I(2t - 1)$. Section 2 proves that $t = \tfrac12$ is the unique exponent that (i) equalizes the two variances and (ii) guarantees a finite worst-side variance uniformly over all mutually absolutely continuous pairs $(p_1, p_2)$ (Theorem 2.7). The next subsection turns this minimax principle into a concrete two-sided diagnostic and compares it, at matched Monte Carlo cost, to the one-sided $t = 1$ check emphasized by Sekulovski et al. (2024).
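As a concrete illustration of how the variance identities in Lemma 2.4 separate the exponents, the following sketch evaluates $I(t)$ exactly for the binomial point-null pair of the simulation study ($H_2 : \theta = 1/2$ and $H_1 : \theta \sim \mathrm{Beta}(1, 1)$, so that $p_1$ is uniform on $\{0, \dots, n\}$) and returns the side-specific risks $R_1(t)$ and $R_2(t)$. The function names `hellinger_integral` and `side_risks` and the log-space summation are illustrative assumptions, not part of the paper. Because the sample space is finite, every $I(t)$ is finite here, so the example shows the size of the integer-order variances rather than outright divergence.

```python
import math


def log_p1(y, n):
    """Log prior predictive under H1: theta ~ Beta(1, 1) gives a uniform law on {0, ..., n}."""
    return -math.log(n + 1)


def log_p2(y, n):
    """Log prior predictive under H2: Binomial(n, 1/2)."""
    log_binom = math.lgamma(n + 1) - math.lgamma(y + 1) - math.lgamma(n - y + 1)
    return log_binom - n * math.log(2.0)


def hellinger_integral(t, n):
    """I(t) = sum_y p1(y)^t * p2(y)^(1 - t), summed in log space for stability."""
    logs = [t * log_p1(y, n) + (1.0 - t) * log_p2(y, n) for y in range(n + 1)]
    top = max(logs)
    return math.exp(top) * sum(math.exp(v - top) for v in logs)


def side_risks(t, n):
    """Return (R1, R2) = (Var_H1(B^(t-1)), Var_H2(B^t)) using Lemma 2.4."""
    i_t = hellinger_integral(t, n)
    r1 = hellinger_integral(2.0 * t - 1.0, n) - i_t ** 2
    r2 = hellinger_integral(2.0 * t, n) - i_t ** 2
    return r1, r2


if __name__ == "__main__":
    n = 50
    rho = hellinger_integral(0.5, n)
    print(f"rho = {rho:.4f}, 1 - rho^2 = {1 - rho ** 2:.4f}")
    for t in (0.5, 1.0, 2.0):
        r1, r2 = side_risks(t, n)
        print(f"t = {t}: R1 = {r1:.4g}, R2 = {r2:.4g}")
```

For $n = 50$, the forward Turing risk $R_2(1) = I(2) - 1$ exceeds $10^{11}$, whereas both half-order risks equal $1 - \rho^2 < 1$.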
3.2 A two-sided check and a cost-matched comparison

Theorem 2.1 implies that, for every $t \in \mathcal{D}$,

$$\mathbb{E}_{H_2}[B^{t}] - \mathbb{E}_{H_1}[B^{t-1}] = 0.$$

A natural way to empirically assess this identity is to estimate the two expectations separately from independent samples and to compare the corresponding sample means. The next lemma shows that the half-order choice is again minimax, now for the variance of the two-sided difference.

Lemma 3.1 (Unique minimax for the two-sided check). Let $X_1 \sim H_1$ and $X_2 \sim H_2$ be independent and define

$$\Delta_t := B(X_2)^{t} - B(X_1)^{t-1}.$$

Then, at $t = \tfrac12$,

$$\operatorname{Var}(\Delta_{1/2}) = 2(1 - \rho^2) < \infty,$$

whereas for every $t \neq \tfrac12$,

$$\sup_{(p_1, p_2) \in \mathcal{P}_{ac}} \operatorname{Var}(\Delta_t) = +\infty.$$

Proof. By independence and Lemma 2.4,

$$\operatorname{Var}(\Delta_t) = \operatorname{Var}_{H_2}(B^{t}) + \operatorname{Var}_{H_1}(B^{t-1}) = \{I(2t) - I(t)^2\} + \{I(2t - 1) - I(t)^2\}.$$

At $t = \tfrac12$ this becomes $\{I(1) - I(\tfrac12)^2\} + \{I(0) - I(\tfrac12)^2\} = 2(1 - \rho^2)$. For $t \neq \tfrac12$, Theorem 2.7 (ii) provides pairs $(p_1, p_2)$ for which either $\operatorname{Var}_{H_2}(B^{t}) = \infty$ (when $t > \tfrac12$) or $\operatorname{Var}_{H_1}(B^{t-1}) = \infty$ (when $t < \tfrac12$), forcing $\operatorname{Var}(\Delta_t) = \infty$ for those pairs.

Finally, we compare the (balanced) two-sample half-order check to the one-sample Turing check advocated by Sekulovski et al. (2024), under a fixed Monte Carlo budget. We measure Monte Carlo cost by the number $N$ of Bayes factor evaluations (equivalently, draws at which $B = p_1/p_2$ is computed). Consider:

(a) One-sample Turing check ($t = 1$). Draw $X_1, \dots, X_N \overset{\text{iid}}{\sim} H_2$ and compute $\bar{B}_N = N^{-1} \sum_{i=1}^{N} B(X_i)$.

(b) Two-sample half-order check ($t = \tfrac12$). Draw $X_{11}, \dots, X_{1 n_1} \overset{\text{iid}}{\sim} H_1$ and $X_{21}, \dots, X_{2 n_2} \overset{\text{iid}}{\sim} H_2$ independently, with $n_1 + n_2 = N$, and compute $\bar{B}^{-1/2}_{1, n_1} = n_1^{-1} \sum_{i=1}^{n_1} B(X_{1i})^{-1/2}$ and $\bar{B}^{1/2}_{2, n_2} = n_2^{-1} \sum_{i=1}^{n_2} B(X_{2i})^{1/2}$.

For the one-sample check, the natural target is $\bar{B}_N - 1$.
For the two-sample check, the natural target is $\bar{B}^{1/2}_{2, n_2} - \bar{B}^{-1/2}_{1, n_1}$.

Convention on the balanced split. When we write the balanced allocation as $n_1 = n_2 = N/2$, we tacitly assume that $N$ is even. If $N$ is odd, one may take the nearest-integer split $(n_1, n_2) = (\lfloor N/2 \rfloor, \lceil N/2 \rceil)$; this changes the displayed constants and thresholds only by a factor $1 + O(N^{-2})$.

Theorem 3.2 (Worst-case variance bound and conditional matched-cost dominance). Let $V_{(1)} := \operatorname{Var}(\bar{B}_N - 1)$ and let $V_{(2)}(n_1, n_2) := \operatorname{Var}\big(\bar{B}^{1/2}_{2, n_2} - \bar{B}^{-1/2}_{1, n_1}\big)$, with $n_1 + n_2 = N$. For the balanced split, define $V^{\mathrm{bal}}_{(2)} := V_{(2)}(N/2, N/2)$. Then:

(i) (Worst-case boundedness.) For every $(p_1, p_2) \in \mathcal{P}_{ac}$, under the balanced two-sample choice $n_1 = n_2 = N/2$ (with $N$ even),

$$V^{\mathrm{bal}}_{(2)} = V_{(2)}(N/2, N/2) = \frac{4(1 - \rho^2)}{N}, \qquad \text{hence} \qquad \sup_{(p_1, p_2) \in \mathcal{P}_{ac}} V^{\mathrm{bal}}_{(2)} \le \frac{4}{N} < \infty.$$

(ii) (Conditional dominance.) Under the balanced split (with $N$ even), whenever $\rho \le 1/2$,

$$V^{\mathrm{bal}}_{(2)} \le V_{(1)}.$$

Proof. For the one-sample check, independence gives

$$V_{(1)} = \frac{\operatorname{Var}_{H_2}(B)}{N} = \frac{I(2) - I(1)^2}{N} = \frac{I(2) - 1}{N},$$

where we used Lemma 2.4 at $t = 1$ and Lemma 2.3 (i). For the two-sample half-order check, independence and Corollary 2.5 yield

$$V_{(2)}(n_1, n_2) = \frac{1}{n_2} \operatorname{Var}_{H_2}(B^{1/2}) + \frac{1}{n_1} \operatorname{Var}_{H_1}(B^{-1/2}) = \left(\frac{1}{n_1} + \frac{1}{n_2}\right)(1 - \rho^2).$$

For fixed $n_1 + n_2 = N$, the AM–GM inequality implies that $n_1 n_2$ is maximized at $n_1 = n_2 = N/2$, hence the balanced choice minimizes $V_{(2)}(n_1, n_2)$ and gives $V^{\mathrm{bal}}_{(2)} = 4(1 - \rho^2)/N$.

(i) Since $\rho \in (0, 1]$, we have $1 - \rho^2 \le 1$, so $\sup_{(p_1, p_2)} V^{\mathrm{bal}}_{(2)} \le 4/N$. On the other hand, by Theorem 2.7 (ii), there exist pairs for which $I(2) = \infty$, which implies $V_{(1)} = (I(2) - 1)/N = \infty$.

(ii) By (5) at $t = 1$, $I(2) \ge \rho^{-2}$ (since $I(1) = 1$), hence

$$V_{(1)} = \frac{I(2) - 1}{N} \ge \frac{\rho^{-2} - 1}{N} = \frac{1 - \rho^2}{N \rho^2}.$$
Under the balanced split $n_1 = n_2 = N/2$, we have $V^{\mathrm{bal}}_{(2)} = 4(1 - \rho^2)/N$. Therefore $V^{\mathrm{bal}}_{(2)} \le V_{(1)}$ whenever

$$\frac{4(1 - \rho^2)}{N} \le \frac{1 - \rho^2}{N \rho^2},$$

i.e., whenever $4 \le \rho^{-2}$, equivalently $\rho \le 1/2$.

Remark (When $\rho > \tfrac12$). When the two models are very similar (high overlap), $\rho$ can exceed $1/2$. In this regime, the ordering between $V_{(1)}$ and $V^{\mathrm{bal}}_{(2)}$ is not determined by $\rho$ alone: under the balanced split,

$$V^{\mathrm{bal}}_{(2)} \le V_{(1)} \iff I(2) \ge 1 + 4(1 - \rho^2).$$

Since $I(2) = \int p_1(x)^2 / p_2(x) \, dx = 1 + \chi^2(p_1 \,\|\, p_2)$, this condition is equivalent to $\chi^2(p_1 \,\|\, p_2) \ge 4(1 - \rho^2)$. Convexity alone guarantees only $I(2) \ge \rho^{-2}$, so either ordering is possible when $\rho > \tfrac12$. Nevertheless, the worst-case comparison remains favorable to the half-order two-sample check: $V^{\mathrm{bal}}_{(2)}$ is always finite and uniformly bounded by $4/N$, whereas $V_{(1)}$ can be arbitrarily large.

3.3 Practical implementation and simulation studies

The half-order theory suggests a simple default: estimate the overlap $\rho = I(\tfrac12)$ from both generating models and check agreement. This yields a symmetric diagnostic that does not require deciding which model is “true” or “more complex” and, by Theorem 2.7, guarantees bounded per-draw variance on both sides. We summarize the resulting workflow and then report simulation studies.

A half-order Good check. Let $B(x) = p_1(x)/p_2(x)$ be the Bayes factor computed by the numerical procedure under scrutiny. A half-order Good check compares the two sides of Theorem 2.1 at $t = \tfrac12$:

$$\mathbb{E}_{H_2}\!\left[B^{1/2}\right] = \mathbb{E}_{H_1}\!\left[B^{-1/2}\right] = \rho.$$

Algorithm 1 summarizes a one-box recipe for implementing this two-sided diagnostic.

Algorithm 1: Two-sided half-order Good check.
Input: prior-predictive simulators under $H_1$ and $H_2$; a Bayes factor evaluator $B(x) = p_1(x)/p_2(x)$.
Output: $(\Delta, \widehat{\mathrm{se}}(\Delta), \hat{\rho})$.
Step 1: Choose Monte Carlo budgets $m_1, m_2$ (default $m_1 = m_2$) and a tolerance level $\varepsilon$.
Step 2: Draw $X_{2i} \overset{\text{i.i.d.}}{\sim} H_2$ and compute $U_i := B(X_{2i})^{1/2}$ for $i = 1, \dots, m_2$.
Step 3: Draw $X_{1i} \overset{\text{i.i.d.}}{\sim} H_1$ and compute $V_i := B(X_{1i})^{-1/2}$ for $i = 1, \dots, m_1$.
Step 4: Compute
$$\hat{\rho}_2 := \frac{1}{m_2} \sum_{i=1}^{m_2} U_i, \qquad \hat{\rho}_1 := \frac{1}{m_1} \sum_{i=1}^{m_1} V_i, \qquad \Delta := \hat{\rho}_2 - \hat{\rho}_1, \qquad \hat{\rho} := \frac{\hat{\rho}_1 + \hat{\rho}_2}{2}.$$
Step 5: Check whether $\Delta$ is close enough to zero. If an objective threshold is wanted, one option is to estimate $\widehat{\mathrm{se}}(\Delta)$ and calibrate $\varepsilon$ using the rules in Section 3.5; persistent $|\Delta| > \varepsilon$ flags a simulator–evaluator mismatch.

Under correct Bayes factor computation, both estimators target the same quantity $\rho$, so the discrepancy $\Delta$ should be close to 0 up to Monte Carlo error. Because each summand has bounded variance at $t = \tfrac12$ (Theorem 2.7), this check is intrinsically less sensitive to rare, extreme Bayes factor values than the integer-moment choices emphasized in the original Good check.

Symmetry: no need to decide which model is “true” (or “more complex”). A further practical advantage of the half-order choice is symmetry between the two model orientations. Writing $B = p_1/p_2$ (so $B^{-1} = p_2/p_1$), the half-order identity is

$$\mathbb{E}_{H_2}\!\left[B^{1/2}\right] = \mathbb{E}_{H_1}\!\left[B^{-1/2}\right] = \rho.$$

Equivalently, one may swap the model labels (i.e., replace $B$ by $B^{-1}$) without changing the diagnostic. Thus the check need not privilege one model as “true” or “more complex”: whichever direction is easier to simulate or implement, the same theoretical target and the same variance guarantee apply.

Matched-budget comparison: when overlap is small, the half-order two-sided check can dominate the one-sided Turing check. A common practical constraint is a fixed budget of $N$ Bayes factor evaluations. Under such a budget, the one-sided Turing check of Sekulovski et al. targets $t = 1$ with variance proportional to $I(2) - 1 = \mathbb{E}_{H_2}[B^2] - 1$, which may be arbitrarily large or even infinite.
By contrast, the balanced two-sided half-order design has variance proportional to $1 - \rho^2 \le 1$ and is uniformly bounded in the worst case; see Theorem 3.2 for a formal matched-cost comparison and dominance conditions. Hence, even when one is primarily interested in checking Corollary 2.2, the half-order two-sided check provides a conservative default with minimax-stable behavior.

3.4 Simulation studies

We use a binomial point-null example (adapted from Sekulovski et al. (2024)) to compare finite-run Monte Carlo behavior across three Good-check choices, $t = \tfrac12, 1, 2$, and to illustrate sensitivity to simulator–evaluator mismatch. Throughout, $B(y) := p_1(y)/p_2(y)$ denotes the Bayes factor in favor of $H_1$ over $H_2$. For each configuration and each $n \in \{10, 50, 100\}$, we conducted $R = 10{,}000$ independent Good-check runs. Each run uses $m = 2{,}000$ prior-predictive draws from $H_1$ and $m = 2{,}000$ prior-predictive draws from $H_2$, and we record Monte Carlo differences for the half-order identity and for the integer-order ($t = 1, 2$) identities. Tables report Mean (SD) over the $R$ runs. Figures visualize the finite-$m$ behavior for $n = 50$ using fan charts (mean and central 50%/90% bands across runs).

3.4.1 Simulation study 1A: Binomial example (correct specification)

Model:
\[ Y \mid \theta \sim \mathrm{Binomial}(n, \theta), \qquad H_2: \theta = \tfrac12, \qquad H_1: \theta \sim \mathrm{Beta}(1, 1). \]
Bayes factors are evaluated under the same (correct) specification. Table 1 summarizes Monte Carlo differences for the half-order discrepancy
\[ \hat\Delta_{1/2} := \hat E_{H_2}[B^{1/2}] - \hat E_{H_1}[B^{-1/2}] \]
and for the $t = 1, 2$ checks (both forward and reverse orientations). Table 1 shows that the half-order two-sided discrepancy remains tightly centered near zero with small run-to-run variability across all $n$.
In contrast, the forward integer-order quantities involving $\hat E_{H_2}[B]$ and especially $\hat E_{H_2}[B^2]$ can be extremely unstable in finite Monte Carlo runs because they are dominated by exceedingly rare outcomes under $H_2$. The reverse-oriented integer checks are markedly more stable on this scale. Figure 1 complements the table by displaying the run-to-run distribution of running discrepancies as a function of $m$ for $n = 50$ (half-order discrepancy, reverse Turing, and reverse Good).

[Figure 1 panels: (A) $t = 1/2$, vertical axis $\Delta_{1/2}$; (B) $t = 1$, vertical axis $E_{H_1}(B^{-1}) - 1$; (C) $t = 2$, vertical axis $E_{H_1}(B^{-2}) - E_{H_2}(B^{-1})$. Horizontal axis: $m$ (data sets), 0 to 2000; vertical range $-0.4$ to $0.4$.]

Figure 1: Simulation study 1A (binomial example, correctly specified; $n = 50$): fan charts of running Monte Carlo discrepancies as a function of $m$ (number of simulated datasets per hypothesis) across $R = 10{,}000$ independent runs. Panel (A) shows $\hat\Delta_{1/2}(m) = \hat E_{H_2,\le m}[B^{1/2}] - \hat E_{H_1,\le m}[B^{-1/2}]$. Panel (B) shows the reverse Turing discrepancy $\hat E_{H_1,\le m}[B^{-1}] - 1$. Panel (C) shows the reverse Good discrepancy $\hat E_{H_1,\le m}[B^{-2}] - \hat E_{H_2,\le m}[B^{-1}]$. Solid curves are means; shaded bands are central 50% (dark) and 90% (light) intervals; the horizontal dotted line indicates $0$ (the exact identity).

| $n$ | $t=\tfrac12$: $\hat\Delta_{1/2}$ | $t=1$: $\hat E_{H_2}[B] - 1$ | $t=1$: $\hat E_{H_1}[B^{-1}] - 1$ | $t=2$: $\hat E_{H_2}[B^2] - \hat E_{H_1}[B]$ | $t=2$: $\hat E_{H_1}[B^{-2}] - \hat E_{H_2}[B^{-1}]$ |
|---|---|---|---|---|---|
| 10 | 0.000 (0.018) | −0.001 (0.095) | 0.000 (0.022) | −0.074 (8.508) | 0.000 (0.059) |
| 50 | 0.000 (0.023) | −0.209 (7.278) | 0.000 (0.039) | −8.84e+11 (9.62e+10) | −0.001 (0.198) |
| 100 | 0.000 (0.023) | −0.441 (3.103) | 0.000 (0.048) | −2.51e+26 (3.91e+25) | 0.000 (0.340) |

Table 1: Simulation study 1A (binomial example, correctly specified): Monte Carlo differences.
Entries are Mean (SD) over $R = 10000$ repetitions, each using $m = 2000$ draws per hypothesis.

3.4.2 Simulation study 1B: Binomial example (simulator–evaluator mismatch)

To emulate a mild misspecification, we keep the Bayes factor evaluator fixed at the intended model $H_1: \theta \sim \mathrm{Beta}(1, 1)$, but generate synthetic data under $H_1$ using a slightly different prior:
\[ Y \mid \theta \sim \mathrm{Binomial}(n, \theta), \qquad H_2: \theta = \tfrac12, \qquad \text{simulator under } H_1: \theta \sim \mathrm{Beta}(1.2, 1.2). \]
This simulator–evaluator mismatch violates the exact Turing–Good identities, and should therefore induce systematic, persistent discrepancies.

Table 2 reports Mean (SD) differences across $R = 10{,}000$ runs. The half-order discrepancy $\hat\Delta_{1/2}$ is shifted away from zero with comparatively small variability, indicating clear detectability at moderate Monte Carlo budgets. As in the correctly specified case, forward integer-order quantities under $H_2$ can be highly unstable. Figure 2 visualizes the same phenomenon for $n = 50$ via fan charts of running discrepancies. The vertical blue line marks the first $m$ (on the plotting grid) at which the central 90% band no longer contains $0$, providing a simple visual "detection time" summary.

[Figure 2 panels: (A) $t = 1/2$, vertical axis $\Delta_{1/2}$; (B) $t = 1$, vertical axis $E_{H_1}(B^{-1}) - 1$; (C) $t = 2$, vertical axis $E_{H_1}(B^{-2}) - E_{H_2}(B^{-1})$. Horizontal axis: $m$ (data sets), 0 to 2000; vertical range $-0.3$ to $0.3$.]

Figure 2: Simulation study 1B (binomial example; simulator–evaluator mismatch; $n = 50$): fan charts of running Monte Carlo discrepancies across $R = 10{,}000$ runs. Panels (A)–(C) match Figure 1. The vertical blue line indicates the first $m$ at which the central 90% band excludes $0$ (on the plotted $m$ grid).

| $n$ | $t=\tfrac12$: $\hat\Delta_{1/2}$ | $t=1$: $\hat E_{H_2}[B] - 1$ | $t=1$: $\hat E_{H_1}[B^{-1}] - 1$ | $t=2$: $\hat E_{H_2}[B^2] - \hat E_{H_1}[B]$ | $t=2$: $\hat E_{H_1}[B^{-2}] - \hat E_{H_2}[B^{-1}]$ |
|---|---|---|---|---|---|
| 10 | −0.051 (0.017) | 0.000 (0.094) | 0.079 (0.022) | 3.213 (8.465) | 0.171 (0.059) |
| 50 | −0.061 (0.025) | −0.109 (10.188) | 0.108 (0.041) | −5.43e+11 (7.62e+10) | 0.447 (0.210) |
| 100 | −0.054 (0.054) | 4.329 (474.266) | 0.112 (0.050) | −1.35e+26 (2.86e+25) | 0.644 (0.358) |

Table 2: Simulation study 1B (binomial example, simulator–evaluator mismatch): Monte Carlo differences. Entries are Mean (SD) over $R = 10000$ repetitions, each using $m = 2000$ draws per hypothesis.

Takeaway. Across both workflows, the empirical behavior mirrors the theoretical message of Section 2: Monte Carlo reliability of Good checks is controlled by shifted overlaps such as $I(2t)$ and can deteriorate rapidly away from half order, often in a strongly asymmetric way across the two orientations. The two-sided half-order check remains stable across model separations (as reflected by $\rho$) and retains sensitivity to simulator–evaluator mismatches, including mild prior perturbations, without requiring a judgement about which model should be treated as "true" or "more complex."

3.5 Implications for practice and design

(1) Default recommendation: a balanced two-sided half-order check. For Monte Carlo verification of the Turing–Good identity (Theorem 2.1), the half-order exponent $t = \tfrac12$ is uniquely stable in second moments: it equalizes the two side-specific variances at $1 - \rho^2$ (Corollary 2.5) and uniquely minimizes the worst-side variance risk $R(t)$ (Theorem 2.7). Operationally, the natural two-sided estimator compares $\bar B^{1/2}_{2,n_2}$ and $\bar B^{-1/2}_{1,n_1}$ using independent draws from $H_2$ and $H_1$, respectively. For a fixed total budget of $N = n_1 + n_2$ Bayes factor evaluations, Theorem 3.2 shows that the variance
\[ V_{(2)}(n_1, n_2) = \left(\frac{1}{n_1} + \frac{1}{n_2}\right)(1 - \rho^2) \]
is minimized by the balanced split $n_1 = n_2 = N/2$ (see the convention on the balanced split preceding Theorem 3.2), giving $V_{(2)} = 4(1 - \rho^2)/N$. In particular,
\[ \sup_{(p_1, p_2) \in \mathcal{P}_{\mathrm{ac}}} V^{\mathrm{bal}}_{(2)} \le 4/N. \]
(2) Small-overlap regime ($\rho \le \tfrac12$): two-sided half-order dominates the one-sided Turing check. Under the same Monte Carlo budget $N$, the one-sided Turing check based on $\bar B_N$ has variance $V_{(1)} = (I(2) - 1)/N$ (Theorem 3.2). Theorem 3.2(ii) shows that, under the balanced split, $V_{(2)} \le V_{(1)}$ whenever $\rho \le \tfrac12$. Equivalently, the relative efficiency
\[ \mathrm{RE} := \frac{V_{(1)}}{V_{(2)}} = \frac{I(2) - 1}{4(1 - \rho^2)} \]
satisfies the universal lower bound $\mathrm{RE} \ge 1/(4\rho^2)$ (using $I(2) \ge \rho^{-2}$ from (5) at $t = 1$). Thus, at $\rho = \tfrac12$ the procedures are (at least) equally efficient, while at $\rho = 0.25$ the half-order two-sided check is guaranteed to be at least $4\times$ as efficient at matched cost.

(3) Reporting and interpreting the overlap $\rho$. The overlap $\rho = \int \sqrt{p_1 p_2}\,dx$ is the Bhattacharyya coefficient (Hellinger affinity) between the two marginal likelihoods. It is the only model-dependent quantity governing the bounded two-sided half-order variance $V_{(2)}$. For communication and planning, it is therefore useful to report either $\rho$ itself or $1 - \rho^2$ (which equals the common half-order variance under either model; Corollary 2.5). Moreover, once $\bar B^{1/2}_{2,n_2}$ and $\bar B^{-1/2}_{1,n_1}$ agree within Monte Carlo error, either can be reported as an estimate of $\rho$.

(4) Cost-aware allocation and conservative thresholds. If the unit costs of generating a draw and evaluating $B$ differ under $H_1$ and $H_2$ (say $c_1$ and $c_2$), one can minimize $V_{(2)}(n_1, n_2)$ under a cost constraint $c_1 n_1 + c_2 n_2 = C$. A Lagrange-multiplier calculation yields the optimal ratio
\[ n_1 : n_2 \;\propto\; \frac{1}{\sqrt{c_1}} : \frac{1}{\sqrt{c_2}}. \]
When $\rho$ is unknown, the worst-case bound $V_{(2)} \le 4/N$ (balanced split) implies a simple distribution-free Chebyshev guarantee:
\[ P\!\left( \left| \bar B^{1/2}_{2,n_2} - \bar B^{-1/2}_{1,n_1} \right| \ge \varepsilon \right) \le \frac{V_{(2)}}{\varepsilon^2} \le \frac{4}{N\varepsilon^2}. \]
This can be used to choose conservative tolerance levels for the diagnostic before any pilot estimate of $\rho$ is available.

(5) Standard error. Let $\Delta := \hat\rho_2 - \hat\rho_1$ be the observed discrepancy in the two-sided half-order check (Algorithm 1; Section 3.3). Under correct Bayes factor computation, $E[\Delta] = 0$ and, by Theorem 3.2,
\[ \mathrm{Var}(\Delta) = (1 - \rho^2)\left(\frac{1}{m_1} + \frac{1}{m_2}\right). \]
In practice, one can compute the sample variances
\[ s_2^2 := \frac{1}{m_2 - 1}\sum_{i=1}^{m_2}\left(B(X_{2i})^{1/2} - \hat\rho_2\right)^2, \qquad s_1^2 := \frac{1}{m_1 - 1}\sum_{i=1}^{m_1}\left(B(X_{1i})^{-1/2} - \hat\rho_1\right)^2, \]
and the estimated standard error
\[ \widehat{\mathrm{se}}(\Delta) := \sqrt{\frac{s_2^2}{m_2} + \frac{s_1^2}{m_1}}. \]
Then the studentized statistic $T := \Delta / \widehat{\mathrm{se}}(\Delta)$ is approximately standard normal for moderate $(m_1, m_2)$, suggesting the tolerance
\[ \varepsilon_{\mathrm{CLT}}(\alpha) := z_{1-\alpha/2}\,\widehat{\mathrm{se}}(\Delta), \]
where $z_{1-\alpha/2}$ is the $(1 - \alpha/2)$ quantile of the standard normal. A conservative alternative that uses only the distribution-free bound $1 - \rho^2 \le 1$ is to bound the true standard error:
\[ \mathrm{se}(\Delta) := \sqrt{\frac{\mathrm{Var}_{H_2}(B^{1/2})}{m_2} + \frac{\mathrm{Var}_{H_1}(B^{-1/2})}{m_1}} = \sqrt{1 - \rho^2}\,\sqrt{\frac{1}{m_1} + \frac{1}{m_2}} \le \sqrt{\frac{1}{m_1} + \frac{1}{m_2}}. \]
Define $\mathrm{se}_{\mathrm{wc}}(\Delta) := \sqrt{1/m_1 + 1/m_2}$ and use $\varepsilon^{\mathrm{wc}}_{\mathrm{CLT}}(\alpha) := z_{1-\alpha/2}\,\mathrm{se}_{\mathrm{wc}}(\Delta)$.

(6) When a one-sided check is unavoidable. If simulation is feasible from only one model, then a two-sided check is impossible and one must rely on a one-sided identity: $t = 1$ under $H_2$ (Corollary 2.2) or, symmetrically, $t = 0$ under $H_1$ (since $I(0) = 1$ and $E_{H_1}[B^{-1}] = 1$ by Theorem 2.1). Theorem 2.7(ii) cautions that such one-sided checks can have unbounded variance for some mutually absolutely continuous pairs, so in practice they should be accompanied by tail diagnostics for $B$ (or $\log B$), sufficiently large Monte Carlo budgets, and, whenever feasible, a return to the two-sided half-order design.
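The workflow of Algorithm 1 together with the standard-error rules in item (5) can be sketched in a few lines for the binomial point-null example of Section 3.4. This is a minimal illustration under stated assumptions, not the code used for the reported simulations: the function names (`bayes_factor`, `draw_binomial`, `half_order_good_check`), the seed, and the use of Python's standard library are choices made here for illustration. The sketch relies on the fact that a Beta(1, 1) prior gives a uniform prior predictive on $\{0, \dots, n\}$.

```python
import math
import random
from statistics import NormalDist, mean, variance


def bayes_factor(y: int, n: int) -> float:
    """B(y) = p1(y) / p2(y) for H1: theta ~ Beta(1, 1) versus H2: theta = 1/2."""
    p1 = 1.0 / (n + 1)                # uniform prior => uniform prior predictive
    p2 = math.comb(n, y) * 0.5 ** n   # Binomial(n, 1/2) mass at y
    return p1 / p2


def draw_binomial(n: int, theta: float, rng: random.Random) -> int:
    """One Binomial(n, theta) draw as a sum of Bernoulli trials."""
    return sum(rng.random() < theta for _ in range(n))


def half_order_good_check(n, m1, m2, alpha=0.05, seed=0):
    """Steps 2-5 of Algorithm 1 plus the tolerances of item (5) (illustrative)."""
    rng = random.Random(seed)
    # Step 2: X ~ H2, U_i = B(X)^{1/2}.
    u = [math.sqrt(bayes_factor(draw_binomial(n, 0.5, rng), n)) for _ in range(m2)]
    # Step 3: X ~ H1 (theta from the prior, then data), V_i = B(X)^{-1/2}.
    v = [1.0 / math.sqrt(bayes_factor(draw_binomial(n, rng.random(), rng), n))
         for _ in range(m1)]
    # Step 4: side-specific estimates of rho and their discrepancy.
    rho2_hat, rho1_hat = mean(u), mean(v)
    delta = rho2_hat - rho1_hat
    # Item (5): estimated and worst-case standard errors, CLT-based tolerances.
    se_hat = math.sqrt(variance(u) / m2 + variance(v) / m1)
    se_wc = math.sqrt(1.0 / m1 + 1.0 / m2)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return {
        "delta": delta,
        "rho_hat": 0.5 * (rho1_hat + rho2_hat),
        "se_hat": se_hat,
        "se_wc": se_wc,
        "eps_clt": z * se_hat,
        "eps_wc": z * se_wc,
    }


if __name__ == "__main__":
    result = half_order_good_check(n=50, m1=2000, m2=2000)
    flagged = abs(result["delta"]) > result["eps_clt"]
    print(result, "mismatch flagged:", flagged)
```

To mimic the mismatch of study 1B, one would replace `rng.random()` in Step 3 with `rng.betavariate(1.2, 1.2)` while leaving `bayes_factor` unchanged; a single run is only one draw of $\Delta$, so persistent exceedance across repeated runs is the relevant signal.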
4 Broader implications of the half-order exponent

Section 2 identifies the half-order exponent $t = \tfrac12$ as the unique choice that equalizes and uniformly bounds the Monte Carlo variances on the $H_1$ and $H_2$ sides (Theorem 2.7). Although our motivating application was Good checks for Bayes factor implementations (Sekulovski et al. 2024), the underlying structure is more general: many core problems in Bayesian computation reduce to estimating expectations of powers of a density ratio, and their numerical stability is governed by the same overlap family $I(t) = \int p_1^{t} p_2^{1-t}\,d\mu$.

We begin with the most central computational instance: ratios of normalizing constants. This framing subsumes marginal likelihood ratios, Bayes factors, and (more broadly) free-energy differences. We show that the half-order midpoint $\sqrt{p_1 p_2}$ yields a canonical "geometric bridge" for ratio estimation, together with a stably estimable overlap $\rho = I(\tfrac12)$ that directly controls the relative Monte Carlo error. We then connect the same half-order structure to further settings (testing-theoretic bounds, distribution shift, and robust evidence measures) in the subsequent subsections.

4.1 Normalizing-constant ratios: the geometric bridge and overlap-based design

Setup: ratios of normalizing constants as the computational core. Let $\tilde p_1$ and $\tilde p_2$ be nonnegative integrable functions on $(\mathcal{X}, \mathcal{A}, \mu)$ with unknown normalizing constants $Z_j = \int \tilde p_j\,d\mu \in (0, \infty)$ and normalized densities $p_j = \tilde p_j / Z_j$. The fundamental target is the ratio
\[ r := \frac{Z_1}{Z_2}, \]
which includes Bayes factors as a special case (e.g., $Z_j$ as marginal likelihood under model $j$). In practice, $Z_j$ is typically intractable even when simulation from $p_j$ is feasible (e.g., by MCMC), so computation hinges on identities that express $r$ as an expectation under one or both of $p_1, p_2$.

Moment fragility of one-sided ratio estimators.
Define the unnormalized ratio
\[ w(x) := \frac{\tilde p_1(x)}{\tilde p_2(x)}. \]
A baseline identity is the one-sided importance-sampling representation
\[ \frac{Z_1}{Z_2} = E_{p_2}[w(X)], \]
so $\hat r_{\mathrm{IS}} = \frac{1}{n}\sum_{i=1}^{n} w(X_i)$ with $X_i \sim p_2$ is unbiased. However, its reliability is governed by $\mathrm{Var}_{p_2}(w)$ (equivalently $E_{p_2}[w^2]$), which can be enormous or infinite under mild tail mismatch or limited overlap. This is a core reason why bridge sampling and related stabilized estimators are widely used (Meng & Wong 1996, Gronau et al. 2017).

A robust design desideratum. General bridge-sampling identities introduce a bridge function $h$ and use samples from both $p_1$ and $p_2$ to stabilize ratio estimation (Meng & Wong 1996, Gronau et al. 2017); related optimality ideas also appear in free-energy computation (Bennett 1976, Shirts & Chodera 2008). Yet in the regimes that motivate bridge/path sampling (small overlap; heavy tails; high dimension), many candidate transformations still deteriorate sharply, and the choice of tuning criteria based on higher moments of density ratios can become ill-posed. This motivates two practical questions: (i) which ratio transformation provides a distribution-free stability baseline? (ii) which overlap quantity can be estimated reliably enough to guide reference choice and bridging design?

Half order as the uniquely worst-case-stable midpoint. Section 2 answers (i)–(ii) in a unified manner. Applied to the normalized density ratio $B = p_1/p_2$, the Turing–Good identity (Theorem 2.1) gives
\[ E_{H_2}[B^{t}] = E_{H_1}[B^{t-1}] = I(t) \]
whenever these expectations are finite. Crucially, Theorem 2.7 shows that $t = \tfrac12$ is the unique exponent that equalizes and uniformly bounds the worst-side variance across all mutually absolutely continuous pairs. Translating back to normalizing-constant ratios yields a canonical "geometric bridge".
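On a finite sample space the identity and the variance comparison above can be checked exactly, without Monte Carlo. The sketch below does this for the binomial pair of Section 3.4 with $n = 50$; it uses only the consequences $E_{H_2}[B^{2t}] = I(2t)$ and $E_{H_1}[B^{2t-2}] = I(2t-1)$ of the identity, so that the per-draw variances are $I(2t) - I(t)^2$ and $I(2t-1) - I(t)^2$. The helper names `overlap` and `side_variances` are hypothetical and introduced only for this illustration.

```python
import math


def overlap(p1, p2, t):
    """I(t) = sum_x p1(x)^t * p2(x)^(1 - t) for two pmfs with common finite support."""
    return sum(a ** t * b ** (1.0 - t) for a, b in zip(p1, p2))


def side_variances(p1, p2, t):
    """Per-draw variances of B^t under H2 and of B^(t-1) under H1."""
    i_t = overlap(p1, p2, t)
    var_h2 = overlap(p1, p2, 2.0 * t) - i_t ** 2        # E_H2[B^{2t}] - I(t)^2
    var_h1 = overlap(p1, p2, 2.0 * t - 1.0) - i_t ** 2  # E_H1[B^{2t-2}] - I(t)^2
    return var_h2, var_h1


n = 50
p1 = [1.0 / (n + 1)] * (n + 1)                            # H1: theta ~ Beta(1, 1)
p2 = [math.comb(n, y) * 0.5 ** n for y in range(n + 1)]   # H2: theta = 1/2

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0):
        var_h2, var_h1 = side_variances(p1, p2, t)
        print(f"t={t}: I(t)={overlap(p1, p2, t):.4g}, "
              f"Var_H2={var_h2:.4g}, Var_H1={var_h1:.4g}")
```

At $t = \tfrac12$ the two variances coincide at $1 - \rho^2$, whereas at $t = 1$ the $H_2$-side variance $I(2) - 1$ is many orders of magnitude larger, in line with the instability of the forward integer-order columns in Tables 1 and 2.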
Proposition 4.1 (Half-order bridge identity for normalizing-constant ratios). In the above setting, assume $p_1$ and $p_2$ are mutually absolutely continuous. Define $w(x) := \tilde p_1(x)/\tilde p_2(x)$, $r := Z_1/Z_2$, and the half-order overlap
\[ \rho := \int \sqrt{p_1(x)\,p_2(x)}\,d\mu \in (0, 1]. \]
Then
\[ E_{p_2}\!\left[w(X)^{1/2}\right] = r^{1/2}\rho, \qquad E_{p_1}\!\left[w(X)^{-1/2}\right] = r^{-1/2}\rho, \]
and hence the ratio admits the half-order bridge identity
\[ \frac{Z_1}{Z_2} = \frac{E_{p_2}\!\left[w(X)^{1/2}\right]}{E_{p_1}\!\left[w(X)^{-1/2}\right]}. \tag{7} \]

Proof. By definition, $E_{p_2}[w^{1/2}] = Z_2^{-1}\int \tilde p_2\,(\tilde p_1/\tilde p_2)^{1/2}\,d\mu = Z_2^{-1}\int \sqrt{\tilde p_1 \tilde p_2}\,d\mu$. Since $\sqrt{\tilde p_1 \tilde p_2} = \sqrt{Z_1 Z_2}\,\sqrt{p_1 p_2}$, this equals $\sqrt{Z_1/Z_2}\int \sqrt{p_1 p_2}\,d\mu = r^{1/2}\rho$. The $p_1$ identity is analogous, and taking the ratio yields (7).

Proposition 4.2 (Second-moment stability and an overlap-controlled variance formula). Under the conditions of Proposition 4.1,
\[ \mathrm{Var}_{p_2}\!\left(w(X)^{1/2}\right) = r\,(1 - \rho^2), \qquad \mathrm{Var}_{p_1}\!\left(w(X)^{-1/2}\right) = r^{-1}(1 - \rho^2). \]
In particular, the half-order building blocks have finite variance for all mutually absolutely continuous pairs. Moreover, their squared coefficients of variation coincide:
\[ \frac{\mathrm{Var}_{p_2}(w^{1/2})}{E_{p_2}[w^{1/2}]^2} = \frac{\mathrm{Var}_{p_1}(w^{-1/2})}{E_{p_1}[w^{-1/2}]^2} = \rho^{-2} - 1. \]
With independent samples $X_{21}, \dots, X_{2m_2} \overset{\text{iid}}{\sim} p_2$ and $X_{11}, \dots, X_{1m_1} \overset{\text{iid}}{\sim} p_1$, define
\[ \hat r_{1/2} := \frac{\hat a_2}{\hat a_1}, \qquad \hat a_2 := \frac{1}{m_2}\sum_{i=1}^{m_2} w(X_{2i})^{1/2}, \qquad \hat a_1 := \frac{1}{m_1}\sum_{i=1}^{m_1} w(X_{1i})^{-1/2}. \]
Then, as $m_1, m_2 \to \infty$,
\[ \mathrm{Var}(\hat r_{1/2}) = r^2\,\frac{1 - \rho^2}{\rho^2}\left(\frac{1}{m_1} + \frac{1}{m_2}\right) + o\!\left(\frac{1}{m_1} + \frac{1}{m_2}\right). \]

Proof. Write $B = p_1/p_2$ so that $w = rB$. Then $w^{1/2} = r^{1/2}B^{1/2}$ and $w^{-1/2} = r^{-1/2}B^{-1/2}$. Under $p_2$, $E[B^{1/2}] = \rho$ and $\mathrm{Var}(B^{1/2}) = 1 - \rho^2$ by Corollary 2.5, so $\mathrm{Var}_{p_2}(w^{1/2}) = r\,\mathrm{Var}_{p_2}(B^{1/2}) = r(1 - \rho^2)$; the $p_1$ side is identical.
The coefficient-of-variation identity follows by dividing by the squared means from Proposition 4.1. Finally, the variance expansion for $\hat r_{1/2}$ follows from the CLT for $(\hat a_2, \hat a_1)$ and a first-order delta method for $(a_2, a_1) \mapsto a_2/a_1$.

Remark (Overlap estimation at essentially no additional cost). Proposition 4.1 implies the product identity
\[ E_{p_2}[w^{1/2}]\;E_{p_1}[w^{-1/2}] = \rho^2. \]
Therefore, if $\hat a_2$ and $\hat a_1$ denote the sample means of $w^{1/2}$ under $p_2$ and $w^{-1/2}$ under $p_1$ as in Proposition 4.2, then
\[ \hat\rho^{2} := \hat a_2 \hat a_1, \qquad \hat\rho := (\hat a_2 \hat a_1)^{1/2}, \]
is available alongside the ratio estimator $\hat r_{1/2} = \hat a_2/\hat a_1$ at essentially no additional cost. Because each half-order summand has finite variance for all mutually absolutely continuous pairs, $\hat\rho$ provides a stable pilot diagnostic of overlap that can guide reference choice and the refinement of intermediate bridging schedules (Section 4.2).

Remark (Relation to bridge sampling and what is distinctive here). Identity (7) is a geometric special case of bridge sampling (Meng & Wong 1996, Gronau et al. 2017). The point emphasized here is not the existence of a bridge identity per se, but that $t = \tfrac12$ inherits the distribution-free worst-case second-moment guarantee from Section 2: regardless of tail behavior (as long as mutual absolute continuity holds), the half-order summands admit finite variance and their relative variability is governed by $\rho$ alone. This provides a conservative default (and a reliable pilot quantity) even when higher-moment criteria are unstable.

Design implication for marginal likelihood computation. In marginal likelihood estimation one often compares $\tilde p(\theta) = p(y \mid \theta)\,\pi(\theta)$ to a tractable reference $\tilde g(\theta)$ with known normalizing constant, reducing the problem to estimating $Z_{\tilde p}/Z_{\tilde g}$ via bridge/path sampling (Meng & Wong 1996, Gronau et al. 2017, 2020).
Propositions 4.1–4.2 yield a practical principle: use the geometric/half-order bridge as a robust baseline when overlap or tails are uncertain, and use the half-order overlap $\rho$ (estimable with bounded variance under either model) as a stable tuning target when constructing references or intermediate bridging schedules. Concretely, when a difficult ratio is decomposed into a product of easier ratios, a natural operational goal is to ensure that the pairwise overlaps $\rho$ between consecutive states are not too small. This overlap-guided viewpoint is pursued further for IS weight diagnostics in Section 4.2.

Simulation study 3: Half-order bridge vs. one-sided importance sampling under tail mismatch. To complement Propositions 4.1–4.2, we give a minimal toy family in which either one-sided normalizing-constant estimator can be theoretically ill-posed (infinite variance) depending on the direction of tail mismatch, while the half-order bridge estimator remains stable in both directions.

Model pair and the target ratio. On $(0, 1)$, let $\tilde p_2(x) \equiv 1$ so that $Z_2 = \int_0^1 \tilde p_2(x)\,dx = 1$ and $p_2(x) \equiv 1$. Let $p_1$ be the $\mathrm{Beta}(a, 1)$ density, $p_1(x) = a\,x^{a-1}$, and set $\tilde p_1(x) = r\,p_1(x)$ so that $Z_1 = \int_0^1 \tilde p_1(x)\,dx = r$. Then the normalizing-constant ratio is $Z_1/Z_2 = r$ and the unnormalized ratio is
\[ w(x) := \frac{\tilde p_1(x)}{\tilde p_2(x)} = r\,p_1(x) = r\,a\,x^{a-1}. \]
(For the Monte Carlo experiments below we set $r = 1$ without loss of generality, since we report relative error.)

Estimators and cost matching.
We compare the half-order bridge estimator
\[ \hat r_{1/2} := \frac{\hat a_2}{\hat a_1}, \qquad \hat a_2 := \frac{1}{n_2}\sum_{i=1}^{n_2} w(X_{2i})^{1/2}, \qquad \hat a_1 := \frac{1}{n_1}\sum_{j=1}^{n_1} w(X_{1j})^{-1/2}, \]
with independent samples $X_{2i} \overset{\text{iid}}{\sim} p_2$ and $X_{1j} \overset{\text{iid}}{\sim} p_1$, against the appropriate one-sided estimator in each stress test:
\[ \hat r_F := \frac{1}{N}\sum_{i=1}^{N} w(X_{2i}), \qquad X_{2i} \overset{\text{iid}}{\sim} p_2 \quad (\text{forward one-sided IS; } t = 1), \]
and
\[ \hat r_R := \left(\frac{1}{N}\sum_{i=1}^{N} w(X_{1i})^{-1}\right)^{-1}, \qquad X_{1i} \overset{\text{iid}}{\sim} p_1 \quad (\text{reverse one-sided IS; based on } t = 0). \]
To make comparisons cost-matched, we fix the total number of ratio evaluations to $N_{\mathrm{total}} = 2000$. For a one-sided estimator we take $N = 2000$ draws from its sampling distribution. For the half-order bridge we split $(n_1, n_2) = (1000, 1000)$.

Why this family is diagnostic: forward and reverse breakdown. The forward estimator $\hat r_F$ is unbiased ($E_{p_2}[w] = r$) but its variance can be infinite:
\[ E_{p_2}[w^2] = r^2\int_0^1 p_1(x)^2\,dx = r^2 a^2\int_0^1 x^{2a-2}\,dx = \infty \quad \text{whenever } a \le \tfrac12. \]
Conversely, the inverse of the reverse estimator, $\hat r_R^{-1}$, is unbiased for $r^{-1}$ (since $E_{p_1}[w^{-1}] = r^{-1}$), but it can be ill-posed because
\[ E_{p_1}[w^{-2}] = r^{-2}\int_0^1 p_1(x)^{-1}\,dx = \frac{r^{-2}}{a}\int_0^1 x^{1-a}\,dx = \infty \quad \text{whenever } a \ge 2. \]
Thus, one-sided ratio estimation can fail on either side, depending on the direction of tail mismatch.

Half-order prediction via overlap. In this $\mathrm{Beta}(a, 1)$ vs. $\mathrm{Unif}(0, 1)$ family, the half-order overlap is available in closed form:
\[ \rho(a) = \int_0^1 \sqrt{p_1(x)\,p_2(x)}\,dx = \int_0^1 \sqrt{a}\,x^{(a-1)/2}\,dx = \frac{2\sqrt{a}}{a + 1}, \qquad \rho(a)^2 = \frac{4a}{(a + 1)^2}. \]
Proposition 4.2 yields the asymptotic relative standard deviation
\[ \mathrm{RSD}(\hat r_{1/2}) \approx \sqrt{\frac{1 - \rho^2}{\rho^2}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} = \sqrt{\left(\rho^{-2} - 1\right)\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}. \]
With $n_1 = n_2 = 1000$, this predicts $\mathrm{RSD}(\hat r_{1/2}) \approx 1.58\%$ for $a = \tfrac12$ and $\approx 2.58\%$ for $a = 3$.

Takeaway.
This toy family isolates the practical content of the half-order theory for normalizing-constant ratios: one-sided ratio estimators can be dominated by rare extreme weights in either direction of tail mismatch, making CLT-based error assessment ill-founded, whereas the geometric/half-order bridge provides a conservative default whose Monte Carlo building blocks admit distribution-free second-moment control and whose relative error is governed by the stably estimable overlap $\rho$.

[Figure 3 panels: (A)–(B) $a = 0.5$, $\rho^2 = 0.889$, predicted RSD $= 1.58\%$, forward IS ($t = 1$) vs. half-order bridge ($t = 1/2$); (C)–(D) $a = 3$, $\rho^2 = 0.750$, predicted RSD $= 2.58\%$, reverse IS ($1/w$) vs. half-order bridge ($t = 1/2$). Axes: $\log_{10}(\hat r / r)$ and density.]

Figure 3: Toy normalizing-constant ratio estimation under tail mismatch in the $\mathrm{Beta}(a, 1)$ vs. $\mathrm{Unif}(0, 1)$ family (Simulation study 3). We report $\log_{10}(\hat r / r)$ over $R = 2000$ independent repetitions at matched computational cost $N_{\mathrm{total}} = 2000$. Top row ($a = \tfrac12$): forward breakdown. Panels (A)–(B) compare the forward one-sided estimator $\hat r_F = \frac{1}{N}\sum_{i=1}^{N} w(X_{2i})$ ($X_{2i} \sim p_2$; $N = 2000$) to the half-order bridge $\hat r_{1/2}$ ($n_1 = n_2 = 1000$). Because $E_{p_2}[w^2] = \infty$ at $a = \tfrac12$, the forward estimator exhibits a pronounced right tail and many outliers. Bottom row ($a = 3$): reverse breakdown. Panels (C)–(D) compare the reverse one-sided estimator $\hat r_R = \{\frac{1}{N}\sum_{i=1}^{N} w(X_{1i})^{-1}\}^{-1}$ ($X_{1i} \sim p_1$; $N = 2000$) to the same half-order bridge estimator ($n_1 = n_2 = 1000$). Because $E_{p_1}[w^{-2}] = \infty$ for $a \ge 2$, the reverse estimator develops a heavy left tail (extreme underestimation on the log scale).
In all panels, the half-order bridge remains tightly concentrated around $0$. Panel headers report the closed-form overlap $\rho^2 = 4a/(a + 1)^2$ and the corresponding asymptotic RSD prediction from Proposition 4.2.

4.2 Importance sampling view: degeneracy diagnostics without fragile second moments

Importance sampling as ratio estimation and moment fragility. Importance sampling (IS) approximates expectations under a target density $p_1$ by reweighting draws from a proposal density $p_2$. Its numerical behavior is controlled by the density ratio
\[ W(x) := \frac{p_1(x)}{p_2(x)}, \]
which is the same ratio object that appears in the Turing–Good identities of Section 2. A key practical issue is that many widely used diagnostics and tuning criteria for IS (e.g., squared-weight concentration summaries, coefficients of variation, and other moment-based rules) are implicitly functions of second or higher moments of $W$ (or power transforms $W^t$). These moments can be extremely unstable or even infinite under mild tail mismatch, which makes such criteria ill-posed precisely in the regimes where they are most needed.

Section 2 implies a simple but consequential principle: among power transforms of the ratio, the half-order point $t = \tfrac12$ is the unique choice that admits a distribution-free worst-side second-moment guarantee. This motivates using the half-order overlap
\[ \rho = \int \sqrt{p_1(x)\,p_2(x)}\,d\mu \]
as a conservative, stably estimable summary of weight concentration, and as an anchor for proposal and bridging design.

Proposition 4.3 (Half-order overlap and coefficient of variation as a stable degeneracy summary). Let $p_1$ and $p_2$ be mutually absolutely continuous and define the IS weight $W(X) := p_1(X)/p_2(X)$ together with its half-order transform
\[ U(X) := W(X)^{1/2} = \sqrt{\frac{p_1(X)}{p_2(X)}}, \qquad \rho := \int \sqrt{p_1(x)\,p_2(x)}\,d\mu. \]
Then, under $p_2$,
\[ E_{p_2}[U] = \rho, \qquad \mathrm{Var}_{p_2}(U) = 1 - \rho^2 \le 1, \]
and symmetrically, under $p_1$, $E_{p_1}[W^{-1/2}] = \rho$ and $\mathrm{Var}_{p_1}(W^{-1/2}) = 1 - \rho^2$. Consequently, the half-order squared coefficient of variation is
\[ \mathrm{CV}^2_{1/2} := \frac{\mathrm{Var}_{p_2}(U)}{E_{p_2}[U]^2} = \frac{1 - \rho^2}{\rho^2} = \rho^{-2} - 1. \]
In particular, $\rho^2 = 1/(1 + \mathrm{CV}^2_{1/2})$ provides an interpretable overlap index in $(0, 1]$ that remains well-defined and stably estimable under either model.

Proof. The mean identities are the half-order instance of the Turing–Good identity (Theorem 2.1) applied to $B = W$; the variance statements are Corollary 2.5. The CV identity follows by algebra.

Remark (Estimating $\rho$ and $\rho^2$ from one or two sides). Proposition 4.3 directly yields bounded-variance estimators of $\rho$. If $X_1, \dots, X_m \overset{\text{iid}}{\sim} p_2$ and the normalized ratio $W = p_1/p_2$ is available, then
\[ \hat\rho^{(2)} := \frac{1}{m}\sum_{i=1}^{m}\sqrt{W(X_i)} \quad \text{satisfies} \quad \mathrm{Var}(\hat\rho^{(2)}) = \frac{1 - \rho^2}{m} \le \frac{1}{m}. \]
Likewise, if $Y_1, \dots, Y_m \overset{\text{iid}}{\sim} p_1$, then
\[ \hat\rho^{(1)} := \frac{1}{m}\sum_{i=1}^{m} W(Y_i)^{-1/2} \quad \text{satisfies} \quad \mathrm{Var}(\hat\rho^{(1)}) = \frac{1 - \rho^2}{m} \le \frac{1}{m}. \]
In many Bayesian applications $p_1$ is only known up to a normalizing constant, so one instead works with an unnormalized ratio $w = \tilde p_1/\tilde p_2 = rW$. In that case, the half-order building blocks from Section 4.1 still allow stable overlap estimation: by Proposition 4.1, $E_{p_2}[w^{1/2}] = r^{1/2}\rho$ and $E_{p_1}[w^{-1/2}] = r^{-1/2}\rho$, hence
\[ \rho^2 = E_{p_2}\!\left[w^{1/2}\right] E_{p_1}\!\left[w^{-1/2}\right]. \tag{8} \]
Accordingly, with samples from both sides one may estimate $\rho^2$ by the product of the corresponding sample means. In particular, if $\hat a_2$ and $\hat a_1$ denote the sample means of $w^{1/2}$ under $p_2$ and $w^{-1/2}$ under $p_1$ (as in Proposition 4.2), then
\[ \hat\rho^{2} := \hat a_2 \hat a_1, \qquad \hat\rho := (\hat a_2 \hat a_1)^{1/2}, \]
is available alongside $\hat r_{1/2} = \hat a_2/\hat a_1$ at essentially no additional cost (see also the overlap remark in Section 4.1).
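The two-sided estimators in the remark above, the ratio $\hat r_{1/2} = \hat a_2/\hat a_1$ from (7) and the overlap $\hat\rho^2 = \hat a_2 \hat a_1$ from (8), share the same two sample means. The following sketch illustrates this on the $\mathrm{Beta}(a, 1)$ vs. $\mathrm{Unif}(0, 1)$ toy family of Section 4.1. The function names, the value $r = 2.5$, the sample size and the seed are assumptions made here for illustration only; draws from $\mathrm{Beta}(a, 1)$ are generated by inverting its CDF $x^a$.

```python
import math
import random
from statistics import mean


def half_order_estimates(w, draws_p2, draws_p1):
    """Return (r_hat, rho2_hat) built from the half-order sample means."""
    a2_hat = mean(math.sqrt(w(x)) for x in draws_p2)        # targets r^{1/2} * rho
    a1_hat = mean(1.0 / math.sqrt(w(x)) for x in draws_p1)  # targets r^{-1/2} * rho
    return a2_hat / a1_hat, a2_hat * a1_hat


def toy_example(a=3.0, r=2.5, m=20000, seed=0):
    """Beta(a, 1) vs. Unif(0, 1) with unnormalized ratio w(x) = r * a * x^(a - 1)."""
    rng = random.Random(seed)

    def w(x):
        return r * a * x ** (a - 1.0)

    # 1 - rng.random() lies in (0, 1], which avoids evaluating w at x = 0.
    draws_p2 = [1.0 - rng.random() for _ in range(m)]                  # Unif(0, 1)
    draws_p1 = [(1.0 - rng.random()) ** (1.0 / a) for _ in range(m)]   # Beta(a, 1)
    r_hat, rho2_hat = half_order_estimates(w, draws_p2, draws_p1)
    rho2_exact = 4.0 * a / (a + 1.0) ** 2
    return r_hat, rho2_hat, rho2_exact


if __name__ == "__main__":
    r_hat, rho2_hat, rho2_exact = toy_example()
    print(f"r_hat={r_hat:.4f}  rho2_hat={rho2_hat:.4f}  rho2_exact={rho2_exact:.4f}")
```

Note that neither estimator requires knowledge of $r$: the unknown factor $r^{1/2}$ cancels in the product and is isolated in the quotient.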
Proposition 4.4 (Uniqueness: power-moment degeneracy diagnostics can be ill-posed away from $t = \tfrac12$). Fix $t \ne \tfrac12$ and consider the power-weight transform $W^t$. There exist mutually absolutely continuous pairs $(p_1, p_2)$ for which $\mathrm{Var}_{p_2}(W^t) = \infty$ (if $t > \tfrac12$) or $\mathrm{Var}_{p_1}(W^{t-1}) = \infty$ (if $t < \tfrac12$). In particular, diagnostics and tuning rules that require finite second moments of $W^t$ can be ill-posed away from the half-order point even when $p_1$ and $p_2$ are valid IS pairs.

Proof. Apply Theorem 2.7(ii) to the Bayes factor $B = W$.

Practical recipe: overlap-guided diagnostics and design. The above results suggest using the half-order overlap as a conservative anchor in regimes where tail mismatch is uncertain.

1. Report a half-order overlap estimate. Compute $\hat\rho^{(2)}$ from proposal draws (or $\hat\rho^{(1)}$ from target draws) when $W$ is available, and otherwise use the two-sided product estimator implied by (8) when working with unnormalized ratios. Because the corresponding summands have bounded variance, this diagnostic remains well-posed under worst-case tail mismatch.

2. Interpretation. Small $\hat\rho$ (equivalently, large $\widehat{\mathrm{CV}}_{1/2}$) indicates severe weight concentration and warns against reliance on tail-sensitive, higher-moment diagnostics.

3. Design of proposals and intermediate schedules. When a difficult reweighting problem is decomposed into a product of easier steps (tempering/bridging/annealing), use pairwise half-order overlap as a tuning target: estimate overlaps between consecutive states, and refine the grid (or adjust proposals) until adjacent overlaps are not excessively small. This directly aligns design with an overlap quantity that remains estimable in worst-case tail regimes.

Simulation study: weight concentration away from $t = \tfrac12$ and stability anchored by the half-order overlap.
We close this subsection with a simple numerical illustration of Propositions 4.3–4.4. The goal is to visualize two points: (i) power-weight diagnostics away from $t = \tfrac12$ can become ill-posed when the relevant second moments do not exist, and (ii) the half-order overlap $\rho$ provides a stable, interpretable benchmark for weight concentration that remains well-defined under worst-case tail mismatch.

Model pair. We use the one-parameter family on $(0, 1)$ that also appears in the proof of Theorem 2.7(ii): let $p_2(x) = 1$ (uniform) and $p_1(x) = a\,x^{a-1}$ (a $\mathrm{Beta}(a, 1)$ density), so that the importance weight is
\[ W(x) = \frac{p_1(x)}{p_2(x)} = a\,x^{a-1}. \]
In this family, the half-order overlap admits a closed form,
\[ \rho = \int_0^1 \sqrt{p_1(x)\,p_2(x)}\,dx = \frac{2\sqrt{a}}{a + 1}, \qquad \rho^2 = \frac{4a}{(a + 1)^2}. \]
Proposition 4.3 implies that half-order summaries based on $\sqrt{W}$ (or $1/\sqrt{W}$) remain stable and that $\rho^2$ plays the role of a deterministic overlap benchmark.

Weight transforms and concentration summaries. For a set of $N$ draws $\{X_i\}_{i=1}^{N}$ from a designated generating distribution, we compute unnormalized weights $w_i$ and normalize them as $\tilde w_i = w_i / \sum_{j=1}^{N} w_j$. We report three complementary summaries of concentration:
\[ C(p) := \sum_{i=1}^{\lceil pN \rceil} \tilde w_{(i)}, \quad p \in (0, 1], \qquad \kappa_N := \frac{1}{N\sum_{i=1}^{N} \tilde w_i^2}, \qquad S_{0.01} := \sum_{i=1}^{\lceil 0.01 N \rceil} \tilde w_{(i)}, \]
where $\tilde w_{(1)} \ge \dots \ge \tilde w_{(N)}$ are the normalized weights in decreasing order. Here $C(p)$ is a Lorenz-type concentration curve (uniform weights correspond to $C(p) = p$), $\kappa_N$ is a normalized reciprocal squared-weight concentration index (often reported in practice as $\mathrm{ESS}/N$), and $S_{0.01}$ measures the cumulative mass carried by the top 1% of weights.

Two stress tests away from $t = \tfrac12$. Figure 4 summarizes the behavior over $R = 500$ independent replicates at fixed $N = 2000$.
In the proposal-side stress test (top row) we draw $X_i \sim p_2$ and compare the naive weights $w = W$ ($t = 1$) to half-order weights $w = W^{1/2}$. With the boundary choice $a = 1/2$, we have $\mathrm{Var}_{p_2}(W) = +\infty$ (Proposition 4.4), and the resulting normalized weights exhibit extreme concentration. In the target-side stress test (bottom row) we draw $X_i \sim p_1$ and compare a representative exponent below half order (here $w = W^{t-1}$ with $t = 1/4$) to the half-order reverse weights $w = W^{-1/2}$. With the boundary choice $a = 3$, we have $\mathrm{Var}_{p_1}(W^{t-1}) = +\infty$ for this exponent, again producing severe concentration.

Half-order stabilization and overlap benchmarks. In both stress tests, the half-order transforms yield mild concentration: the Lorenz curves lie close to the uniform baseline, the top-share $S_{0.01}$ remains near $0.01$, and the squared-weight index $\kappa_N$ concentrates near the deterministic benchmark $\rho^2 = 4a/(a + 1)^2$. This illustrates the design message of Section 4.2: diagnostics and tuning rules anchored at $t = \tfrac12$ (equivalently, based on $\sqrt{W}$ or $1/\sqrt{W}$) remain well-posed under worst-case tail mismatch, whereas power-weight rules away from $t = \tfrac12$ can become fragile or ill-defined even in mutually absolutely continuous settings.
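The two boundary choices can be checked by hand from $W(x) = a x^{a-1}$. The following worked computation is a verification sketch added for the reader, using only the model pair defined above; the limit statement for $\kappa_N$ assumes i.i.d. draws and the law of large numbers.

```latex
\begin{align*}
% Proposal side (X ~ p_2, t = 1): second moment of W
\mathbb{E}_{p_2}\!\left[W^{2}\right]
  &= \int_0^1 a^{2} x^{2a-2}\,dx < \infty
  \iff a > \tfrac12,
  &&\text{so } a = \tfrac12 \text{ gives } \tfrac14\int_0^1 x^{-1}\,dx = \infty. \\
% Target side (X ~ p_1, t = 1/4): second moment of W^{t-1} = W^{-3/4}
\mathbb{E}_{p_1}\!\left[W^{-3/2}\right]
  &= \int_0^1 a x^{a-1}\,\bigl(a x^{a-1}\bigr)^{-3/2}\,dx
   = a^{-1/2}\int_0^1 x^{-(a-1)/2}\,dx < \infty
  \iff a < 3,
  &&\text{so } a = 3 \text{ gives } \tfrac{1}{\sqrt{3}}\int_0^1 x^{-1}\,dx = \infty. \\
% Half-order weights: second moments are always finite
\mathbb{E}_{p_2}\!\left[(W^{1/2})^{2}\right]
  &= \mathbb{E}_{p_1}\!\left[(W^{-1/2})^{2}\right] = 1,
  \qquad
  \mathbb{E}_{p_2}\!\left[W^{1/2}\right]
   = \mathbb{E}_{p_1}\!\left[W^{-1/2}\right] = \rho. \\
% Large-N limit of the squared-weight index for half-order weights
\kappa_N
  &= \frac{\bigl(\tfrac1N\sum_i w_i\bigr)^{2}}{\tfrac1N\sum_i w_i^{2}}
  \;\longrightarrow\; \frac{\rho^{2}}{1} = \rho^{2}
   = \frac{4a}{(a+1)^{2}}
  &&\bigl(0.889 \text{ at } a = \tfrac12,\; 0.750 \text{ at } a = 3\bigr).
\end{align*}
```

Both stress-test settings therefore sit exactly on the boundary where the relevant second moment first fails to exist, while the half-order second moment equals 1 for every $a > 0$.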
[Figure 4 graphic not reproduced. Panels: (A) H2-side weights $B^t$, $a = 0.50$, $N = 2000$: cumulative weight share against top fraction $k/N$ (log scale) for $t = 1$ and $t = 1/2$; (B) H2-side ESS$/N$ by $t$, theory at $t = 1/2$: $\rho^2 = 0.889$; (C) H2-side top 1% weight share by $t$, uniform baseline $0.01$; (D) H1-side weights $B^{t-1}$, $a = 3.0$, $N = 2000$: cumulative weight share against top fraction $k/N$ (log scale) for $t = 0.25$ and $t = 1/2$; (E) H1-side ESS$/N$ by $t$, theory at $t = 1/2$: $\rho^2 = 0.750$; (F) H1-side top 1% weight share by $t$, uniform baseline $0.01$.]

Figure 4: Importance-sampling weight concentration away from $t = \tfrac12$ and stability anchored by the half-order overlap in the Beta$(a, 1)$ vs. Unif$(0, 1)$ family ($R = 500$ replicates, $N = 2000$). Top row (proposal-side): $X_i \sim p_2$ with $a = 1/2$, comparing weights $w = W^t$ for $t \in \{1, \tfrac12\}$. Bottom row (target-side): $X_i \sim p_1$ with $a = 3$, comparing reverse weights $w = W^{t-1}$ for $t \in \{\tfrac14, \tfrac12\}$. Left panels show Lorenz-type concentration curves $C(p) = \sum_{i=1}^{\lceil pN \rceil} \tilde{w}_{(i)}$ (median and 10/90% envelopes across replicates) on a log-$p$ scale; the diagonal $C(p) = p$ corresponds to uniform weights. Middle panels show boxplots of the normalized reciprocal squared-weight concentration $\kappa_N := 1/(N \sum_{i=1}^N \tilde{w}_i^2)$ (commonly reported as ESS$/N$), with dotted reference lines at the half-order overlap benchmark $\rho^2 = 4a/(a + 1)^2$. Right panels show the cumulative weight share carried by the largest 1% of weights ($S_{0.01}$); the dotted line is the uniform baseline $0.01$. In both stress tests, the half-order weights exhibit mild concentration and $\kappa_N$ close to $\rho^2$, whereas the tail-sensitive exponents yield substantial concentration and degraded squared-weight summaries.
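The stress tests summarized in Figure 4 can be reproduced in outline with a few lines of NumPy. This is a minimal sketch, not the code behind the figure: the function names (`concentration_summaries`, `stress_test`, `rho_squared`), the seed, the reduced replicate count, and the inverse-CDF sampler $X = U^{1/a}$ for Beta$(a, 1)$ are all illustrative assumptions.

```python
import numpy as np


def concentration_summaries(w):
    """Return (kappa_N, S_0.01) for a vector of unnormalized weights w."""
    w_tilde = w / w.sum()                      # normalized weights
    n = w_tilde.size
    kappa = 1.0 / (n * np.sum(w_tilde ** 2))   # squared-weight index, ESS / N
    k = int(np.ceil(0.01 * n))                 # number of weights in the top 1%
    top_share = np.sort(w_tilde)[::-1][:k].sum()
    return kappa, top_share


def stress_test(a, side, t, n=2000, reps=200, seed=0):
    """Median (kappa_N, S_0.01) over replicates for W(x) = a * x**(a - 1).

    side="proposal": X ~ p2 = Unif(0, 1), weights W**t.
    side="target":   X ~ p1 = Beta(a, 1), weights W**(t - 1).
    """
    rng = np.random.default_rng(seed)
    results = np.empty((reps, 2))
    for r in range(reps):
        u = 1.0 - rng.random(n)                # uniform on (0, 1], avoids x = 0
        if side == "proposal":
            x, power = u, t
        elif side == "target":
            x, power = u ** (1.0 / a), t - 1.0  # inverse CDF of Beta(a, 1)
        else:
            raise ValueError("side must be 'proposal' or 'target'")
        w = (a * x ** (a - 1.0)) ** power
        results[r] = concentration_summaries(w)
    return np.median(results, axis=0)


def rho_squared(a):
    """Closed-form benchmark rho^2 = 4a / (a + 1)^2."""
    return 4.0 * a / (a + 1.0) ** 2


if __name__ == "__main__":
    settings = [("top row", 0.5, "proposal", 1.0),
                ("bottom row", 3.0, "target", 0.25)]
    for label, a, side, t_tail in settings:
        for t in (t_tail, 0.5):
            kappa, top = stress_test(a, side, t)
            print(f"{label}: t={t:<4} ESS/N={kappa:.3f} "
                  f"top-1% share={top:.3f} rho^2={rho_squared(a):.3f}")
```

Medians are reported rather than means because, at the boundary values of $a$, the tail-sensitive weights have infinite variance and replicate averages of $\kappa_N$ are themselves unstable.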
5 Discussion

This paper revisits the classical Turing–Good identities through a deliberately computation-aware lens. While the identities themselves are exact population equalities that hold for a continuum of exponents, their usefulness as numerical checks and as building blocks for Monte Carlo estimators is governed by higher-order behavior of the Bayes factor. In particular, in heavy-tailed regimes with weak overlap, which is precisely where algorithmic validation is most needed, the integer-order moments traditionally used for diagnostics can have enormous or even infinite variance, so that the check itself becomes ill-posed. Our main message is that the half-order point $t = 1/2$ is not merely a convenient fractional choice but a structurally singled-out exponent: it is the unique choice that yields two-sided, distribution-free stability in second moments.

Implications for Good checks in Bayesian workflows. A recurring challenge in Bayes factor computation is that correctness is difficult to validate from a single realized dataset: the Bayes factor is a ratio of marginal likelihoods, and numerical approximations can fail silently when tails are mismatched or when the implementation inadvertently differs from the intended prior specification. Good checks (Sekulovski et al. 2024) address this by comparing empirical Monte Carlo moments against exact identities that hold under prior predictive simulation. Our results refine the design principle for such checks. The half-order viewpoint yields two practical advantages. First, symmetry: the two-sided half-order check does not require a judgement about which model is "true" or "more complex." Because $(p_1/p_2)^{-1/2} = (p_2/p_1)^{1/2}$ pointwise, the same theoretical target $\rho$ is approached from either generating model, and the diagnostic remains well-posed whichever direction is computationally convenient.
Second, worst-case stability: each summand in the check has variance at most 1, and the balanced two-sample discrepancy has variance $4(1 - \rho^2)/N \le 4/N$ under a fixed budget of $N$ Bayes factor evaluations (Theorem 3.2). Moreover, in the small-overlap regime $\rho \le 1/2$, the balanced two-sample half-order check is guaranteed to be no less efficient than the one-sided $t = 1$ Turing check at matched cost (Theorem 3.2(ii)). This is qualitatively different from one-sided integer-moment checks, which can have arbitrarily large or infinite variance for valid (mutually absolutely continuous) model pairs.

From a workflow perspective, this suggests a conservative default: when one can simulate from both prior predictives, start with a balanced two-sample half-order check and report (i) the discrepancy $\hat{\Delta} = \hat{\rho}_2 - \hat{\rho}_1$ together with a studentized standard error and (ii) the pooled overlap estimate $\hat{\rho} = (\hat{\rho}_1 + \hat{\rho}_2)/2$ as an interpretable measure of model overlap. Large systematic discrepancies indicate implementation, prior, or simulation mismatches, while a small $\hat{\rho}$ signals an intrinsically hard Bayes factor problem in which tail sensitivity and weight degeneracy are expected.

The appearance of $\rho$ is not incidental: it is simultaneously (i) the half-order Hellinger overlap that always exists and lies in $(0, 1]$ and (ii) the only model-dependent quantity governing the bounded half-order variance. Thus, $\rho$ plays a dual role in practice. As a diagnostic target, it is a quantity that can be estimated from either side with uniformly controlled variance. As a difficulty index, it quantifies how strongly separated the marginal likelihoods are. When $\rho$ is small, one should expect heavy-tailed Bayes factors and unstable one-sided estimators based on raw weights; this is precisely the regime in which the simulation studies show dramatic failures of integer-moment checks.
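The balanced two-sample report described above (discrepancy, studentized standard error, pooled overlap) can be sketched as follows. This is a hedged illustration rather than the paper's implementation: `balanced_half_order_check`, `toy_draws`, and `toy_bf` are hypothetical names, and the Beta$(a, 1)$ vs. Unif$(0, 1)$ pair stands in for a real workflow in which each Bayes factor would be computed on a dataset simulated from a prior predictive. The "mismatch" case evaluates the Bayes factor with the wrong $a$, mimicking a simulator–evaluator disagreement.

```python
import numpy as np


def balanced_half_order_check(bf_from_m1, bf_from_m2):
    """Balanced two-sample half-order check.

    bf_from_mk holds Bayes factors B = p1 / p2 evaluated on draws
    simulated from model k (k = 1, 2), ideally N / 2 draws per side.
    """
    s1 = np.asarray(bf_from_m1, dtype=float) ** -0.5   # E_{p1}[B^{-1/2}] = rho
    s2 = np.asarray(bf_from_m2, dtype=float) ** 0.5    # E_{p2}[B^{1/2}]  = rho
    rho1, rho2 = s1.mean(), s2.mean()
    delta = rho2 - rho1                                # discrepancy, target 0
    se = np.sqrt(s1.var(ddof=1) / s1.size + s2.var(ddof=1) / s2.size)
    return {"rho1": rho1, "rho2": rho2,
            "rho_pooled": 0.5 * (rho1 + rho2),
            "delta": delta, "se": se, "z": delta / se}


def toy_draws(a, n, rng):
    """Toy simulators: x1 ~ p1 = Beta(a, 1), x2 ~ p2 = Unif(0, 1)."""
    u1 = 1.0 - rng.random(n)          # uniform on (0, 1]
    u2 = 1.0 - rng.random(n)
    return u1 ** (1.0 / a), u2


def toy_bf(x, a):
    """Toy evaluator: B(x) = p1(x) / p2(x) = a * x**(a - 1)."""
    return a * x ** (a - 1.0)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x1, x2 = toy_draws(3.0, 50_000, rng)

    # Evaluator matches the simulator (a = 3 on both sides).
    matched = balanced_half_order_check(toy_bf(x1, 3.0), toy_bf(x2, 3.0))
    # Evaluator uses a = 2 while the data were simulated with a = 3.
    mismatched = balanced_half_order_check(toy_bf(x1, 2.0), toy_bf(x2, 2.0))

    for name, res in [("matched", matched), ("mismatched", mismatched)]:
        print(f"{name}: pooled rho={res['rho_pooled']:.4f} "
              f"delta={res['delta']:+.4f} se={res['se']:.4f} z={res['z']:+.1f}")
```

In the matched case the studentized discrepancy should be consistent with zero and the pooled estimate should be close to $\rho = 2\sqrt{a}/(a + 1)$; in the mismatched case the two one-sided estimates target different quantities, so the discrepancy grows with the sample size.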
More broadly, expressing algorithmic stability in terms of $\rho$ invites a useful shift in how we communicate Bayes factor computations. Rather than reporting only the Bayes factor and an algorithm-specific Monte Carlo standard error, it can be informative to also report an overlap statistic (e.g., $\hat{\rho}$ or its complement $1 - \hat{\rho}^2$) that is robustly estimable and that directly signals potential weight degeneracy. This complements established recommendations for principled Bayesian workflows that emphasize diagnostics and validation alongside inference (e.g., Schad et al. 2021).

Connections to ratio estimation and Monte Carlo design. Although we motivated the analysis via Good checks, the underlying structure is about density ratios and normalizing-constant estimation. Section 4 highlights that the same half-order midpoint $\sqrt{p_1 p_2}$ naturally produces stable building blocks for ratio estimation (Propositions 4.1 and 4.2). From the perspective of bridge sampling and free-energy methods (Bennett 1976; Meng & Wong 1996; Shirts & Chodera 2008; Gronau et al. 2017), the half-order identity may be viewed as a geometric "baseline bridge" whose components admit a distribution-free second-moment guarantee under mutual absolute continuity. Similarly, Section 4.2 frames importance-sampling degeneracy diagnostics in terms of the same overlap family: diagnostics based on $\mathbb{E}[W^2]$ can be ill-posed in legitimate importance-sampling problems, whereas the half-order transform yields an always-finite and interpretable overlap/degeneracy index (Proposition 4.3). Together, these connections suggest that half-order overlap can serve as a common currency across Bayes factor computation, ratio estimation, and proposal/bridge design.

Limitations and directions for future work. Our minimax stability statement is deliberately worst-case and focuses on second moments. There are several natural extensions.
First, many practical Bayes factor computations rely on MCMC rather than i.i.d. draws from $p_1$ and $p_2$. While the half-order per-draw variance bound remains relevant, dependence inflates Monte Carlo error via autocorrelation, and a full analysis should incorporate effective sample sizes and Markov-chain CLTs. Developing half-order Good-check calibration rules that remain reliable under MCMC dependence is an important practical direction. Second, although half order guarantees finite variance uniformly, it does not guarantee small variance: when $\rho$ is extremely small, $1 - \rho^2 \approx 1$ and the two-sided check may still require substantial Monte Carlo effort to achieve tight tolerances. In such regimes, the overlap estimate itself is useful as a warning signal that additional algorithmic measures are needed (e.g., bridging with intermediate distributions). A constructive design theory that uses estimated half-order overlaps to adaptively place bridges (for example, choosing consecutive distributions to maintain a roughly constant pairwise $\rho$) would operationalize the ideas in Section 4. Third, our analysis assumes mutual absolute continuity so that the Bayes factor is well-defined almost everywhere. In some applied settings (e.g., models with boundary constraints, discrete–continuous mixtures, or hard truncations), this assumption may fail or hold only approximately. Extending half-order diagnostics to such settings (for instance via truncation, regularization, or partial-overlap decompositions) would broaden applicability.

Concluding perspective. The Turing–Good identities provide a rare bridge between exact measure-theoretic relationships and practical computational diagnostics.
The present work shows that this bridge becomes particularly sturdy at the half-order point: $t = 1/2$ is the unique exponent for which the identity can be checked in a symmetric, two-sided, worst-case stable manner, and the associated overlap $\rho$ emerges as a robust summary of both model similarity and Monte Carlo difficulty. We hope that framing Bayes factor validation and ratio-estimation design around half-order overlap will help make Bayes factor workflows more reliable, transparent, and reproducible in the regimes where they are most challenging.

References

Bennett, C. H. (1976), 'Efficient estimation of free energy differences from Monte Carlo data', Journal of Computational Physics 22(2), 245–268.

Dickey, J. M. (1971), 'The weighted likelihood ratio, linear hypotheses on normal location parameters', The Annals of Mathematical Statistics 42(1), 204–223. URL: https://projecteuclid.org/journals/annals-of-mathematical-statistics/volume-42/issue-1/The-Weighted-Likelihood-Ratio-Linear-Hypotheses-on-Normal-Location-Parameters/10.1214/aoms/1177693507.full

Good, I. J. (1985), Weight of evidence: A brief survey, in J. M. Bernardo, M. H. DeGroot, D. V. Lindley & A. F. M. Smith, eds, 'Bayesian Statistics 2', North-Holland, Amsterdam, pp. 249–270. Proceedings of the Second Valencia International Meeting (Sept 6–10, 1983).

Gronau, Q. F., Sarafoglou, A., Matzke, D., Ly, A., Boehm, U., Marsman, M., Leslie, D. S., Forster, J. J., Wagenmakers, E.-J. & Steingroever, H. (2017), 'A tutorial on bridge sampling', Journal of Mathematical Psychology 81, 80–97.

Gronau, Q. F., Singmann, H. & Wagenmakers, E.-J. (2020), 'bridgesampling: An R package for estimating normalizing constants', Journal of Statistical Software 92(10), 1–29.

Heck, D. W., Boehm, U., Böing-Messing, F., Bürkner, P.-C., Derks, K., Dienes, Z., Fu, Q., Gu, X., Karimova, D., Kiers, H. A. L., Klugkist, I., Kuiper, R. M., Lee, M. D., Leenders, R., Leplaa, H. J., Linde, M., Ly, A., Meijerink-Bosman, M., Moerbeek, M., Mulder, J., Palfi, B., Schönbrodt, F. D., Tendeiro, J. N., van den Bergh, D., Van Lissa, C. J., van Ravenzwaaij, D., Vanpaemel, W., Wagenmakers, E.-J., Williams, D. R., Zondervan-Zwijnenburg, M. & Hoijtink, H. (2023), 'A review of applications of the Bayes factor in psychological research', Psychological Methods 28(3), 558–579. Epub 2022-03-17.

Jacod, J. & Shiryaev, A. (2003), Limit theorems for stochastic processes, Springer-Verlag.

Kass, R. E. & Raftery, A. E. (1995), 'Bayes factors', Journal of the American Statistical Association 90(430), 773–795.

Lee, M. D. & Wagenmakers, E.-J. (2013), Bayesian Cognitive Modeling: A Practical Course, Cambridge University Press, Cambridge, UK. First published 2013. Hardback ISBN: 978-1-107-01845-7; Paperback ISBN: 978-1-107-60357-8.

Meng, X.-L. & Wong, W. H. (1996), 'Simulating ratios of normalizing constants via a simple identity: A theoretical exploration', Statistica Sinica 6, 831–860. URL: http://www3.stat.sinica.edu.tw/statistica/j6n4/j6n43/j6n43.htm

Schad, D. J., Betancourt, M. & Vasishth, S. (2021), 'Toward a principled Bayesian workflow in cognitive science', Psychological Methods 26(1), 103–126.

Schwarz, G. (1978), 'Estimating the dimension of a model', The Annals of Statistics pp. 461–464.

Sekulovski, N., Marsman, M. & Wagenmakers, E.-J. (2024), 'A Good check on the Bayes factor', Behavior Research Methods 56(8), 8552–8566.

Shirts, M. R. & Chodera, J. D. (2008), 'Statistically optimal analysis of samples from multiple equilibrium states', The Journal of Chemical Physics 129(12), 124105.
