Induced replication and the assessment of models

We study the assessment of semiparametric and other highly-parametrised models from the perspective of foundational principles of parametric statistical inference. In doing so, we highlight the possibility of avoiding the usual semiparametric conside…

Authors: Heather Battey, Nancy Reid

H. S. Battey∗ and N. Reid†

March 31, 2026

Abstract

We study the assessment of semiparametric and other highly-parametrised models from the perspective of foundational principles of parametric statistical inference. In doing so, we highlight the possibility of avoiding the usual semiparametric considerations, which typically require estimation of nuisance components through kernel smoothing or basis expansion, with the associated difficulties of tuning-parameter choice that blur the distinction between estimation and model assessment. A key aspect is the availability of preliminary manoeuvres that induce an internal replication of known form under the postulated model. This can be cast as a generalised version of the Fisherian sufficiency/co-sufficiency separation, replacing out-of-sample prediction error as a criterion for semiparametric model assessment by a type of within-sample prediction error. Framed in this light are new methodological contributions in multiple example settings, including model assessment for the proportional hazards model, for a time-dependent Poisson process with semiparametric intensity function, and for matched-pair and two-group examples. Also subsumed within the framework is a post-reduction inference approach to the construction of confidence sets of sparse regression models. Numerical work confirms recovery of nominal error rates under the postulated model and high sensitivity to departures in the direction of semiparametric alternatives. We conclude by emphasising open challenges and unifying perspectives.

Some key words: co-sufficiency; confidence sets of models; exchangeability; model adequacy; post-selection inference; post-reduction inference.
∗ Department of Mathematics, Imperial College London
† Department of Statistical Sciences, University of Toronto

1 Introduction

Two widely used approaches to assessing regression models, semiparametric or otherwise, are to substitute unknown regression parameters by estimates and check residuals for any anomalous behaviour, or to assess the predictive performance of the fitted model. The visually compelling but sometimes informal approaches based on residuals, through their connection to co-sufficiency in parametric models, can be viewed as a way of evaluating predictive accuracy within sample.

If a statistic $S = s(Y)$ constructed from observations $Y = (Y_1, \ldots, Y_n)$ is sufficient for a scalar or vector parameter $\theta$, then it contains all the information in the data relevant for inference on $\theta$, potentially leaving information remaining in $Y$ for assessment of the model, an observation due to Fisher (1950). Specifically, if the observed value $y^{o}$ is extreme when calibrated against the conditional distribution of $Y$ given the observed value $S = s$, then this provides evidence against the model (Barndorff-Nielsen and Cox, 1994, p.29). We call this residual information after conditioning co-sufficient, in line with the terminology used in previous work.

The idea, although conceptually appealing, typically does not translate to a convenient statistical procedure, as the conditional distribution is multidimensional, often constrained to a manifold of inexplicit form, and usually does not have a convenient analytic expression. Simulation to calculate extremal regions is not straightforward either, as simple bootstrap draws from the data violate the constraint $S = s$. Progress was made by Engen and Lillegård (1997); Lindqvist et al. (2003); Lindqvist and Taraldsen (2005); Lockhart et al.
(2007), who appear to have been the first to attempt to operationalise co-sufficient sampling in any level of generality, and Barber and Janson (2022), who substantially advanced the methodology and associated theory. See also Battey (2024) for some relevant geometric perspectives and Battey, Rasines & Tang (2025) for an analytic approach designed for a particular setting, outlined in §3.3.

The purpose of the present paper is to explore other inferential structures, some of which are generalisations of the sufficiency/co-sufficiency separation, with a view to opening new directions for research on model assessment in semiparametric and highly parametrised models.

If a model contains a genuinely infinite-dimensional component, there is no hope for either first-order inference or model assessment from $n$ observations. In practice, however, semiparametric models are not genuinely infinite dimensional. Instead, smoothness assumptions implicitly or explicitly constrain the effective parameter space to dimension smaller than $n$, often achieved by truncating an expansion in basis functions, by regularisation, or via kernel smoothing. The associated tuning-parameter selection for such estimators is typically based on cross-validation, blurring the distinction between estimation and model assessment. The present work is an attempt to reconcile inference for semiparametric and other highly-parametrised models with the Fisherian parametric foundations by illustrating the feasibility and merits of an approach to model assessment that evades estimation of the high- or infinite-dimensional nuisance component.
The following simple example exposes two important points: that the need for smoothness or other assumptions on infinite-dimensional nuisance parameters can sometimes be removed, and that model assessment can be performed without estimating nuisance parameters, and without an additional sample to assess out-of-sample prediction error.

Example 1.1. Let $(Y_{j1}, Y_{j0})$ be outcomes on treated and untreated units in the $j$th of $m$ matched pairs of individuals, e.g. identical twins. Because treatment is randomised, these outcomes are independent, but inference on the treatment parameter $\psi$, and assessment of the model, is superficially challenging as every pair contributes a nuisance parameter $\gamma_j$, so that the pairs are not identically distributed, and the model has $m + 1$ unknown parameters. This formulation of the matched pair problem, although highly parametrised, makes fewer assumptions than a standard semiparametric model in covariates $x \in \mathbb{R}^p$, which would model the nuisance parameters as $\gamma_j = h(x_j)$ for some unknown function $h : \mathbb{R}^p \to \Gamma$, where the parameter space $\Gamma$ is usually $\mathbb{R}$ or $\mathbb{R}^+$.

Suppose now that $(Y_{j1}, Y_{j0})$ are modelled as independent and exponentially distributed with rates $\gamma_j\psi$, $\gamma_j$ respectively. If the true distribution belongs to the model and has true parameter value $\psi^*$, then $Y_{j0} + Y_{j1}\psi^* =: S_j(\psi^*)$ is sufficient for $\gamma_j$ and has density function

$$f_{S_j(\psi^*)}(s) = \gamma_j^2\, s \exp(-\gamma_j s), \tag{1}$$

i.e., $S_j(\psi^*)$ is gamma distributed with shape parameter 2 and rate parameter $\gamma_j$. The conditional density of $Y_{j1}$ at $y_{j1}$, given $S_j(\psi^*) = s_j(\psi^*)$, is $\psi^*/s_j(\psi^*)$ uniformly in $y_{j1}$, showing that $Y_{j1}$ is conditionally uniformly distributed between 0 and $s_j(\psi^*)/\psi^*$. Equivalently $U_j(\psi^*) := Y_{j1}\psi^*/s_j(\psi^*)$ has a standard uniform distribution for all $j = 1, \ldots, m$.
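The uniformity claim in Example 1.1 can be checked by simulation. The following is a minimal sketch rather than the authors' code: the function names, the range from which the nuisance rates $\gamma_j$ are drawn, the sample size and the use of a Kolmogorov-Smirnov distance are all illustrative assumptions. It computes $U_j(\psi_0)$ once at the generating value and once at a value far from it.

```python
import random

def simulate_pairs(m, psi, rng):
    """Draw m pairs (y1, y0): exponential with rates gamma_j * psi and gamma_j."""
    pairs = []
    for _ in range(m):
        gamma = rng.uniform(0.5, 5.0)  # hypothetical nuisance rates; any positive values work
        pairs.append((rng.expovariate(gamma * psi), rng.expovariate(gamma)))
    return pairs

def u_values(pairs, psi0):
    """U_j(psi0) = psi0 * Y_j1 / (Y_j0 + psi0 * Y_j1)."""
    return [psi0 * y1 / (y0 + psi0 * y1) for y1, y0 in pairs]

def ks_distance(u):
    """Largest gap between the empirical distribution of u and the Uniform(0, 1) one."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u))

rng = random.Random(1)
pairs = simulate_pairs(2000, psi=2.0, rng=rng)
d_true = ks_distance(u_values(pairs, 2.0))   # psi0 equal to the generating value
d_wrong = ks_distance(u_values(pairs, 0.5))  # psi0 far from the generating value
print(d_true, d_wrong)
```

The distance at the generating value should be of the order $m^{-1/2}$ whatever the values of $\gamma_j$, while at the wrong value it stays bounded away from zero.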
The above conclusions are invalidated if $\psi^*$ is replaced by any other value $\psi_0$, or if the model is wrong, suggesting a route to assessment of model adequacy via the empirical behaviour of $U_1(\psi_0), \ldots, U_m(\psi_0)$ for a postulated value $\psi_0$, which are the idealised analogues of residuals. Since some postulated values $\psi_0$ may produce a distribution for $U_1(\psi_0), \ldots, U_m(\psi_0)$ that is hard to distinguish from a standard uniform sample even when the model is violated, it is sensible for sufficiently large $m$ to replace $\psi_0$ by the value returned by a consistent estimator $\hat\psi$. In this example, the transformed random variables $Z_j = Y_{j1}/Y_{j0}$ are, under the model, identically distributed with density function

$$f_Z(z; \psi^*) = \frac{\psi^*}{(1 + \psi^* z)^2}. \tag{2}$$

Since (2) is free of all the pair-specific nuisance parameters, inference on $\psi^*$ can be constructed from a likelihood function based on (2), the resulting maximum-likelihood estimator being consistent as $m \to \infty$. The approach of replacing $\psi^*$ by $\hat\psi$ in $U_1(\psi^*), \ldots, U_m(\psi^*)$ is formally justified through an application of Proposition 2.1 in §2.3.

Example 1.1 replaces the usual semiparametric approach of assessing prediction error after regularised estimation by an in-sample assessment based on the structure of the postulated model, avoiding, without sample splitting, any post-selection model inference issues that would arise from using the same sample to choose tuning parameters and to assess the model.

Section 2 extracts the most important structure from Example 1.1, aiming to probe the foundations of model inference for semiparametric and other highly parametrised formulations. A requirement of the framework is that preliminary operations achieve internal replication, or exchangeability, of known form when the model is correctly specified, and fail to do so otherwise.
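As an illustration of likelihood inference based on (2), the sketch below maximises $\sum_j \log\{\psi/(1 + \psi z_j)^2\}$ numerically. It is an assumption-laden sketch, not the authors' implementation: the function name `mle_psi`, the bisection bracket and the simulated nuisance rates are hypothetical. It uses the fact that $\psi$ times the score equals $m - 2\sum_j \psi z_j/(1 + \psi z_j)$, which is decreasing in $\psi$, so the maximiser is the unique value at which the average of $U_j(\psi)$ equals one half.

```python
import math
import random

def mle_psi(z, lo=1e-8, hi=1e8, tol=1e-10):
    """Maximise sum_j log{psi / (1 + psi z_j)^2} over psi by bisection on log(psi)."""
    m = len(z)

    def psi_times_score(psi):
        # psi * d(loglik)/d(psi) = m - 2 * sum psi z_j / (1 + psi z_j), decreasing in psi
        return m - 2.0 * sum(psi * zj / (1.0 + psi * zj) for zj in z)

    a, b = math.log(lo), math.log(hi)  # assumed bracket for log(psi)
    while b - a > tol:
        mid = 0.5 * (a + b)
        if psi_times_score(math.exp(mid)) > 0.0:
            a = mid
        else:
            b = mid
    return math.exp(0.5 * (a + b))

# Ratios Z_j = Y_j1 / Y_j0 from exponential pairs with generating value psi = 2.
rng = random.Random(2)
z = []
for _ in range(4000):
    gamma = rng.uniform(0.5, 5.0)  # hypothetical nuisance rates, never estimated
    z.append(rng.expovariate(2.0 * gamma) / rng.expovariate(gamma))

psi_hat = mle_psi(z)
mean_u = sum(psi_hat * zj / (1.0 + psi_hat * zj) for zj in z) / len(z)
print(psi_hat, mean_u)
```

The estimate depends on the data only through the ratios, so the pair-specific rates play no role in the fit.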
The underlying logic is that of proof by contradiction, familiar from classical statistical testing.

Exact internal replication may not always be achievable, and it remains an open question whether a mechanism can always be found to approximate it. The present paper makes a more modest contribution, showing through a multitude of challenging examples the diversity of ways in which internal replication can be induced.

2 Broad formulation

2.1 Joint assessment of a model and its interest parameters

Suppose that there is structure in the postulated model ensuring the existence of new independent random variables $U_j(\psi_0)$, $j = 1, \ldots, m$, for every candidate value of the interest parameter $\psi_0$, such that $U_j(\psi_0)$ follows a standard uniform distribution if and only if the true distribution with parameter value $\psi^*$ belongs to the postulated model and $\psi_0 = \psi^*$. Standard uniformity is a convenient convention and is equivalent to the existence of any set of $m$ $\psi_0$-dependent random variables whose distribution is known under the postulated model at $\psi_0 = \psi^*$.

The special case where $m = 1$ and $U_1(\psi_0) = U_1$ does not depend on $\psi_0$ can be viewed as a framing of the classical approach to parametric model assessment based on co-sufficiency. The precise operationalisation of that idea in particular contexts has received little attention, but in cases where operationalisation has been attempted, the approach appears to have little or no power without modification (Barber and Janson, 2022; Battey, Rasines & Tang, 2025). When an extension to $m > 1$ is available, which is achieved by inducing internal replication, power to detect an erroneous model at a given $\psi_0 = \psi^*$ is in principle achievable provided that $U_j(\psi_0)$ is not uniformly distributed when the postulated model is wrong.
The following initial discussion considers model adequacy in terms of a confidence set for the parameter $\psi$, before introducing in §2.3 a more powerful approach mirroring that in Example 1.1. A simple way to construct an $\alpha$-level confidence set for the interest parameter $\psi$ under correct specification of the model is to use Fisher's method for combining $p$-values (Fisher, 1932), commonly used in meta-analysis. The resulting confidence set is

$$C(\alpha) := \Bigl\{ \psi_0 \in \Psi : \min\Bigl\{ G_{2m}\Bigl(-2\sum \log U_j(\psi_0)\Bigr),\; 1 - G_{2m}\Bigl(-2\sum \log U_j(\psi_0)\Bigr) \Bigr\} < \alpha \Bigr\}, \tag{3}$$

where $G_{2m}$ is the distribution function of a $\chi^2$ random variable with $2m$ degrees of freedom. The use of Fisher's statistic in (3) is convenient for analytic calculations of power because the expectation and variance of $R = -\log U$ simplify, by integration by parts, to

$$E(R) = \int_0^1 \frac{F_U(u)}{u}\,du, \qquad V(R) = -2\int_0^1 \log(u)\,\frac{F_U(u)}{u}\,du - \Bigl(\int_0^1 \frac{F_U(u)}{u}\,du\Bigr)^2, \tag{4}$$

where $F_U$ is the distribution function of $U$. Both moments increase as the density function $f_U$ of $U$ concentrates near zero, and the mean exceeds half the variance, and hence departs from the null $\chi^2_2$ behaviour, when $f_U$ departs from uniformity asymmetrically towards zero. When departures instead occur towards 1, the mean and variance are typically both too small to be compatible with the null distribution, although in this case sensitivity can be increased by replacing $U_j(\psi_0)$ by $1 - U_j(\psi_0)$ in (3). Thus, if the majority of departures from uniformity are in the same direction, $C(\alpha)$ is empty with high probability for sufficiently large $m$ when the postulated model is misspecified. More irregular departures from uniformity may be better detected by alternative combination rules (see e.g. Birnbaum, 1954; Heard and Rubin-Delanchy, 2018).
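A grid-based sketch of a Fisher-combination confidence set of the kind described by (3) is given below. Several choices are assumptions made for illustration and are not taken from the paper: the function names, the grid of candidate values, the doubling of the smaller tail probability, and the rule of retaining $\psi_0$ when that two-sided tail probability is at least $\alpha$ (the usual test-inversion convention, which should be checked against the threshold printed in (3)). The $\chi^2_{2m}$ tail is computed from the exact identity $\mathrm{pr}(\chi^2_{2m} > x) = \mathrm{pr}\{\mathrm{Poisson}(x/2) \le m - 1\}$, which avoids any external library. The demonstration reuses the matched-pair model of Example 1.1.

```python
import math
import random

def chi2_sf_even(x, m):
    """P(chi-square with 2m d.f. > x) = P(Poisson(x/2) <= m - 1), summed in log space."""
    if x <= 0.0:
        return 1.0
    half = 0.5 * x
    total = sum(math.exp(k * math.log(half) - math.lgamma(k + 1) - half)
                for k in range(m))
    return min(1.0, total)

def fisher_p(u):
    """Two-sided tail probability of Fisher's statistic -2 * sum(log u_j)."""
    upper = chi2_sf_even(-2.0 * sum(math.log(ui) for ui in u), len(u))
    return 2.0 * min(upper, 1.0 - upper)

def confidence_set(u_of_psi, grid, alpha):
    """Grid values psi0 not rejected at level alpha (assumed test-inversion convention)."""
    return [psi0 for psi0 in grid if fisher_p(u_of_psi(psi0)) >= alpha]

# Demonstration: exponential matched pairs with generating value psi = 2.
rng = random.Random(3)
pairs = []
for _ in range(200):
    gamma = rng.uniform(0.5, 5.0)  # hypothetical nuisance rates
    pairs.append((rng.expovariate(2.0 * gamma), rng.expovariate(gamma)))

def u_of_psi(psi0):
    return [psi0 * y1 / (y0 + psi0 * y1) for y1, y0 in pairs]

grid = [math.exp(-3.0 + 0.05 * i) for i in range(141)]  # psi0 from about 0.05 to 55
kept = confidence_set(u_of_psi, grid, alpha=0.05)
print(min(kept), max(kept))
```

Under a misspecified model the same routine can return an empty list, which is the collapse of $C(\alpha)$ discussed in the text.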
2.2 Two parallel analyses

In subsequent sections, we present the main conceptual development of the paper via a body of examples illustrating different ways through which internal replication can be achieved under the postulated model, and through which contradictions are forced when the model is violated. Before turning to those constructions, we analyse the behaviour of the resulting confidence set $C(\alpha)$, assuming the existence of random variables $U_j(\psi_0)$. The aim is not to advocate Fisher's method, whose properties are well understood under i.i.d. alternatives, but to supply qualitative insight through a reasonably general analysis, showing how systematic departures from uniformity translate to collapse of $C(\alpha)$ as $m \to \infty$. Calculations at this level of generality are inevitably idealised.

When the model is erroneous or if it contains the true distribution but $\psi_0 \neq \psi^*$, then $U_j(\psi_0)$, $j = 1, \ldots, m$, are by definition not standard uniform, and for the example cases we have in mind, are not identically distributed either. Since $U_j(\psi_0)$ is necessarily supported on $[0, 1]$, insight can be obtained by approximating its density function using the parametric family

$$(1 - \vartheta_j)\, u^{-\vartheta_j}, \qquad (0 < u < 1,\; 0 < \vartheta_j < 1), \tag{5}$$

so that the null distribution is recovered in the limit as $\vartheta_j \to 0$. The model (5) is deliberately oversimplified in that it specifies the directions of departure from uniformity to be in the same direction for all $j$. Under (5), the $k$th moment of $R_j := -\log U_j(\psi_0)$ has the convenient form

$$E(R_j^k) = (1 - \vartheta_j)\int_0^1 (-\log u)^k\, u^{-\vartheta_j}\,du = \frac{1}{(1 - \vartheta_j)^k}\int_0^\infty q^k e^{-q}\,dq = \frac{\Gamma(k + 1)}{(1 - \vartheta_j)^k}, \qquad k \in \{1, 2, \ldots\}.$$
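The moment formula under (5) can be checked by Monte Carlo. This is a sketch under stated assumptions: the value $\vartheta = 0.3$, the number of draws and the function name `draw_r` are arbitrary illustrative choices. Sampling uses the inverse of the distribution function $u^{1 - \vartheta}$ implied by (5).

```python
import math
import random

def draw_r(theta, rng):
    """R = -log U where U has density (1 - theta) * u**(-theta) on (0, 1)."""
    v = 1.0 - rng.random()           # uniform on (0, 1], avoids log(0)
    u = v ** (1.0 / (1.0 - theta))   # inverse of the distribution function u**(1 - theta)
    return -math.log(u)

rng = random.Random(4)
theta = 0.3  # illustrative departure from uniformity
draws = [draw_r(theta, rng) for _ in range(200000)]

moments = {}
for k in (1, 2):
    empirical = sum(r ** k for r in draws) / len(draws)
    exact = math.gamma(k + 1) / (1.0 - theta) ** k  # Gamma(k + 1) / (1 - theta)^k
    moments[k] = (empirical, exact)
    print(k, empirical, exact)
```

Equivalently, $R_j$ is exponentially distributed with rate $1 - \vartheta_j$ under (5), which is why its mean and standard deviation both exceed the null value of one.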
The mean and variance of $R := 2\sum_j R_j$ are

$$E(R) = 2\sum_{j=1}^m (1 - \vartheta_j)^{-1}, \qquad V(R) = 4\sum_{j=1}^m (1 - \vartheta_j)^{-2}, \tag{6}$$

recovering the $\chi^2_{2m}$ mean and variance as $\vartheta_j \to 0$ for all $j$ and showing that, provided that the $\vartheta_j$ are not all zero, $R$ has a larger mean and variance than a $\chi^2_{2m}$ random variable, with the discrepancies increasing as $m \to \infty$. On letting $k_\alpha$ be the $1 - \alpha$ quantile of the $\chi^2_{2m}$ distribution and $t_{\max} = 1 - \vartheta_{\max}$, Markov's inequality in the form presented in Appendix A.2 shows that $\mathrm{pr}(R \geq k_\alpha) \geq 1 - \inf$ 0 0 , where $\eta_j = (\gamma_j + \Delta^*)/\gamma_j\psi_0$. This is monotonically increasing in $\eta_j > 0$ and crosses $E(R_j) = 1$ at $\eta_j = 1$, which corresponds to $\Delta^* > \gamma_j(\psi_0 - 1)$. Thus, sensitivity in the right tail tends to increase with $\Delta^*$, and values $\psi_0 \leq 1$ are more easily detected as erroneous than $\psi_0 > 1$ when $\Delta^* > 0$, confirming intuition. The criterion $\Delta^* > \gamma_j(\psi_0 - 1)$ does, however, suggest that there are larger values of postulated $\psi_0$ that make the erroneous postulated model difficult to refute on the basis of the aggregation criterion (3), in spite of violation of standard uniformity; this is supported by numerical checks. It seems necessary in this example, therefore, to use the version described in §2.3, echoing the conclusion from the previous example.

Appendix A.6.2 presents a local asymptotic expansion of the maximum likelihood solution in a neighbourhood of the point of intersection $\Delta^* = 0$ of the two models showing, as expected, that there is little sensitivity to departures from the postulated multiplicative model in the direction of the additive model at small $\Delta^*$.

The purpose of Examples 4.3 and 4.4 is to provide insight into the behaviour by way of some special cases where intuition can be recovered from direct calculation.
These perturbative cases are ones in which the postulated model is so close to the true one that there is little harm in using it as a basis for inference. More substantial violations of modelling assumptions are probed by simulation in §5.1.

4.3 Extension to unbalanced strata

The experimental setting of §4 is a special case of a more general two-group problem arising in observational settings. Specifically, observations on treated and untreated individuals are stratified into groups that are as similar as possible. This typically leads to unbalanced strata, having a different number of individuals in the treated and untreated groups. Inference is based on the sufficient statistics $S_{j1}$ and $S_{j0}$ within treatment groups and strata, which can be treated in an analogous way to the paired observations of §4, as the strata sizes $r_{j1}$ and $r_{j0}$ are known.

Let $(Y_{ij1})_{i=1}^{r_{j1}}$ and $(Y_{ij0})_{i=1}^{r_{j0}}$ be observations within the $j$th stratum for treated and untreated individuals respectively. As a first example, if the observations $(Y_{ij1})_{i=1}^{r_{j1}}$ and $(Y_{ij0})_{i=1}^{r_{j0}}$ are normally distributed with means $\gamma_j + \psi^*$ and $\gamma_j$ and variance $\tau$, the likelihood contribution to the $j$th stratum depends on the data only through $S_{j1} = \sum_{i=1}^{r_{j1}} Y_{ij1}/r_{j1}$ and $S_{j0} = \sum_{i=1}^{r_{j0}} Y_{ij0}/r_{j0}$. The difference $Z_j = \sum_{i=1}^{r_{j1}} Y_{ij1}/r_{j1} - \sum_{i=1}^{r_{j0}} Y_{ij0}/r_{j0}$ is normally distributed with mean $\psi^*$ and variance $\tau/r_{j1} + \tau/r_{j0} = \tau(r_{j0} + r_{j1})/(r_{j0} r_{j1})$, which can be handled as discussed in §2.3 with $\tau$ also estimated. This is an example of Structure 3, but the same answer is achieved via a conditioning argument based on Structure 1.

If the individual observations are Poisson distributed counts of rates $\gamma_j\psi^*$ and $\gamma_j$ (Cox and Wong, 2010), the sufficient statistics $S_{j1}$ and $S_{j0}$ are sums of these counts, Poisson distributed of rates $r_{j1}\gamma_j\psi^*$ and $r_{j0}\gamma_j$.
The distribution of $S_{j1}$ or $S_{j0}$ conditional on $S_{j1} + S_{j0}$ eliminates $\gamma_j$ in the usual way. This is an example of Structure 1. The third example is more interesting as it relies on Structure 4 to eliminate the nuisance parameters.

Example 4.5. If the originating variables are exponentially distributed of rates $\gamma_j\psi^*$ and $\gamma_j$, the sufficient statistics $S_{j1}$ and $S_{j0}$ are gamma-distributed sums of shape and rate parameters $(r_{j1}, \gamma_j\psi^*)$ and $(r_{j0}, \gamma_j)$ respectively. Then

$$Z_j(\psi^*) := \frac{r_{j0}\,\psi^* S_{j1}}{r_{j1}\, S_{j0}}$$

has the $F$ distribution with parameters $2r_{j1}$ and $2r_{j0}$. The strategy of §2.3 based on Proposition 4.1 then applies directly.

5 Numerical performance

5.1 Matched pairs

Data were generated from a joint model for matched pairs $(Y_{j0}, Y_{j1})_{j=1}^m$ in which an additive treatment effect $\Delta^*$ operates on the rate scale for Weibull distributed outcomes with baseline rate parameters $(\gamma_j)_{j=1}^m$ and shape parameter $\varsigma$, where $(\gamma_j)_{j=1}^m$ were generated from a standard uniform distribution. The assumed model is exponential with a multiplicative rate parameter as in Example 1.1. This notional parameter was estimated by maximum likelihood based on the density function of the ratios $Z_j = Y_{j1}/Y_{j0}$ which, from (18), is $\psi/(1 + \psi z)^2$. Fisher's statistic $-2\sum_{j=1}^m \log U_j(\hat\psi)$ was then constructed based on (18) with $\psi^*$ replaced by $\hat\psi$ and $z$ replaced by $Z_j$; Table 1 reports the results. At $(\Delta^*, \varsigma) = (0, 1)$, the true model intersects with the postulated model with null treatment effect, captured by value $\psi = 1$. Thus, the proportion of rejected tests of size $\alpha$ would be $\alpha$ by construction if $\psi$ were perfectly estimated. Table 1 reports a beneficial under-rejection rate in that case.
Elsewhere in the parameter space, power to detect departures from the model increases rapidly to 1 with increasing $m$ except when $\varsigma = 1$, where the true Weibull distribution collapses to the exponential and therefore is closer to the point at which the true and postulated models intersect. Both the true distribution and the postulated one are in this example constructed from a parameter space whose dimension diverges at the same rate as the sample size $m$, so both models are semiparametric.

We also assessed the approach based on (3) without estimation of the notional parameter $\psi$, computing the proportion of confidence sets that were empty. If the parameter space of permissible values is constrained to some reasonable interval such as $[0, 4]$, then such an approach tends to be more powerful for the sample sizes reported, but a larger range, say $[0, 10]$, requires a much larger sample size in order for the confidence set to be empty with high probability.

Table 2 reverses the roles of the two types of models, taking the baseline distribution of $Y_{j0}$ to be the same as before but obtaining the distribution of $Y_{j1}$ through multiplication of the rate parameter $\gamma_j$ by a value $\psi^*$. The postulated model, by contrast, is the exponential additive rates model of Example 4.2. We see high power to detect the erroneous model, and asymptotically recover the nominal level of the test at the point $(\psi^*, \varsigma) = (1, 1)$ at which the true and postulated models coincide.

5.2 Time-dependent Poisson process

Simulation of an inhomogeneous Poisson process can be performed most simply using the argument of Cox and Lewis (1966, pp. 27–28) that on the transformed time scale $\tau_i(t) = \int_0^t \lambda_i(u)\,du$ the process is Poisson of unit constant rate. It follows that the sequence $T_{i1}, \ldots$
$, T_{im_i}$ of ordered event times for individual $i$ satisfy $\tau_i(T_{ik_i}) = \sum_{j=1}^{k_i} E_{ij}$, where $E_{ij}$ are independent unit exponential random variables. For $\lambda_i(t) = e^{\gamma_i + \beta t}$, we have $\tau_i(t) = e^{\gamma_i}(e^{\beta t} - 1)/\beta$, and the event times are distributed as

$$T_{ik_i} \overset{d}{=} \tau_i^{-1}\Bigl(\sum_{j=1}^{k_i} E_{ij}\Bigr) = \log\Bigl\{1 + (\beta/e^{\gamma_i})\sum_{j=1}^{k_i} E_{ij}\Bigr\}\Big/\beta, \tag{25}$$

the limit as $\beta \to 0$ being $T_{ik_i} = e^{-\gamma_i}\sum E_{ij}$ as expected. To check that the nominal error rates are recovered under the postulated model, we used this generating process with $\gamma_i$ drawn from a standard normal distribution, generating unit exponential random variables to use in (25) until their sum exceeded the endpoint of the transformed timescale $\tau_i(t_0)$ with $t_0 = 5$.

Since we wish to assess cases where the model is misspecified, we also simulated using the power-law intensity function $\lambda_i(t) = e^{\gamma_i} t^{\rho}$ with $\rho \in \{-0.5, 0, 0.5, 1\}$. We generated

$$T_{ik_i} \overset{d}{=} \tau_i^{-1}\Bigl(\sum_{j=1}^{k_i} E_{ij}\Bigr) = \Bigl\{\frac{(\rho + 1)\sum_{j=1}^{k_i} E_{ij}}{e^{\gamma_i}}\Bigr\}^{1/(\rho + 1)} \tag{26}$$

Direction of                Square root of number of pairs m
sensitivity   (∆∗, ς)     5      8      11     14     17     20
left          (0, 0.5)    0.657  0.977  1      1      1      1
right         (0, 0.5)    0.640  0.983  1      1      1      1
left          (0, 1)      0.002  0.001  0      0      0      0
right         (0, 1)      0.001  0.001  0      0.001  0      0
left          (0, 2)      0      0.221  0.994  1      1      1
right         (0, 2)      0      0.232  0.997  1      1      1
left          (1, 0.5)    0.717  0.982  1      1      1      1
right         (1, 0.5)    0.685  0.990  1      1      1      1
left          (1, 1)      0.008  0.015  0.020  0.042  0.075  0.134
right         (1, 1)      0.002  0.003  0      0.005  0.007  0.010
left          (1, 2)      0      0      0.105  0.420  0.774  0.951
right         (1, 2)      0      0.005  0.647  0.992  1      1
left          (2, 0.5)    0.731  0.983  1      1      1      1
right         (2, 0.5)    0.696  0.992  1      1      1      1
left          (2, 1)      0.010  0.025  0.030  0.073  0.134  0.236
right         (2, 1)      0.002  0.003  0.002  0.006  0.012  0.023
left          (2, 2)      0      0      0.041  0.215  0.555  0.825
right         (2, 2)      0      0.004  0.482  0.979  1      1

Table 1: Weibull generating distribution with additive treatment parameter $\Delta^*$ and shape $\varsigma$.
The assumed model is exponential with multiplicative treatment parameter $\psi$. Proportion of 1000 simulation runs in which the 5% test based on (8) rejects the postulated model. The point of model intersection $(\Delta^*, \varsigma) = (0, 1)$ is highlighted.

Direction of                 Square root of number of pairs m
sensitivity   (ψ∗, ς)      5      8      11     14     17     20
left          (0.5, 0.5)   0.442  0.589  0.763  0.859  0.926  0.964
right         (0.5, 0.5)   0.846  0.995  1      1      1      1
left          (0.5, 1)     0.231  0.623  0.907  0.991  0.998  1
right         (0.5, 1)     0.239  0.669  0.942  0.998  1      1
left          (0.5, 2)     0.875  1      1      1      1      1
right         (0.5, 2)     0.002  0.035  0.182  0.536  0.817  0.939
left          (1, 0.5)     0.708  0.925  0.990  1      1      1
right         (1, 0.5)     0.565  0.881  0.984  1      1      1
left          (1, 1)       0.079  0.068  0.082  0.059  0.070  0.058
right         (1, 1)       0.061  0.075  0.073  0.060  0.065  0.054
left          (1, 2)       0.046  0.399  0.826  0.985  0.999  1
right         (1, 2)       0.073  0.450  0.894  0.991  1      1
left          (2, 0.5)     0.896  0.999  1      1      1      1
right         (2, 0.5)     0.329  0.466  0.606  0.740  0.873  0.936
left          (2, 1)       0.253  0.626  0.925  0.993  1      1
right         (2, 1)       0.252  0.620  0.922  0.991  1      1
left          (2, 2)       0.011  0.042  0.192  0.515  0.763  0.901
right         (2, 2)       0.889  1      1      1      1      1

Table 2: Weibull generating distribution with multiplicative treatment parameter $\psi^*$ and shape $\varsigma$. The assumed model is exponential with additive treatment parameter $\Delta$. Proportion of 1000 simulation runs in which the 5% test based on (8) rejects the postulated model. The point of model intersection $(\psi^*, \varsigma) = (1, 1)$ is highlighted.
Direction of                         Square root of number of individuals n
sensitivity   β   estimated/true   3     4     5     6     7     8
left          0   true             0.05  0.08  0.02  0.07  0.05  0.06
right         0   true             0.05  0.07  0.04  0.05  0.02  0.05
left          0   estimated        0.03  0.04  0     0.03  0.02  0.05
right         0   estimated        0.03  0.03  0.02  0.03  0.01  0.03
left          1   true             0.02  0.07  0.06  0.04  0.07  0.09
right         1   true             0.03  0.04  0.03  0.03  0.05  0.05
left          1   estimated        0.01  0.01  0     0     0     0
right         1   estimated        0     0     0.01  0     0     0
left          2   true             0.06  0.03  0.06  0.05  0.06  0.05
right         2   true             0.06  0.04  0.04  0.07  0.03  0.04
left          2   estimated        0     0     0     0     0     0
right         2   estimated        0.01  0     0     0.01  0     0

Table 3: The generating process is the inhomogeneous Poisson process with intensity function $\lambda_i(t) = e^{\gamma_i + \beta t}$ as postulated. Proportion of 200 simulation runs in which the 5% test based on $F_{S_i \mid M_i}$ and (8) rejects the true model.

until the corresponding sum of unit exponentials exceeded $\tau_i(t_0) = e^{\gamma_i} t_0^{\rho + 1}/(\rho + 1)$. At $\rho = \beta = 0$ the true and postulated models intersect, but since the likelihood function (10) is indeterminate there, we do not necessarily expect to recover the nominal level.

From $n$ individuals we estimated $\beta$ in the postulated model by maximising the conditional log-likelihood function

$$\ell(\beta) = \sum_{i=1}^n m_i \log\Bigl(\frac{\beta}{e^{\beta t_0} - 1}\Bigr) + \beta\sum_{i=1}^n\sum_{j=1}^{m_i} t_{ij} \tag{27}$$

based on (10). The model was subsequently assessed using the statistic $-2\sum_{i=1}^n \log U_i(\hat\beta)$, where $U_i(\hat\beta) = F_{S_i \mid M_i}(S_i \mid m_i; \hat\beta)$ and the conditional distribution was approximated either through Monte Carlo sampling using 1000 replicates under the postulated model if $m_i$ was below 40, or by a normal approximation to the distribution of the sum under the postulated model if $m_i$ exceeded 40. An analogous statistic was computed with $\hat\beta$ replaced by its true value. The results are reported in Table 3 for data generated under the postulated model and in Table 4 for data generated under the power-law model.
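The generating mechanism (25) and the maximisation of (27) can be sketched as follows. This is an illustrative sketch rather than the authors' code: the function names, the bisection bracket $[-20, 20]$ for $\beta$, the choice of 25 individuals and the generating value $\beta = 1$ are assumptions. The estimator solves the score equation of (27), namely $\sum_i m_i\{1/\beta - t_0/(1 - e^{-\beta t_0})\} + \sum_{i,j} t_{ij} = 0$, whose left-hand side is decreasing in $\beta$ and tends to $\sum_{i,j} t_{ij} - t_0\sum_i m_i/2$ as $\beta \to 0$.

```python
import math
import random

def simulate_events(gamma, beta, t0, rng):
    """Event times on [0, t0] for intensity exp(gamma + beta * t), beta != 0, following (25)."""
    tau_end = math.exp(gamma) * math.expm1(beta * t0) / beta  # transformed endpoint tau(t0)
    times, s = [], rng.expovariate(1.0)
    while s <= tau_end:
        times.append(math.log1p(beta * s / math.exp(gamma)) / beta)
        s += rng.expovariate(1.0)  # next arrival of the unit-rate process
    return times

def score(beta, n_events, t_sum, t0):
    """Derivative of the conditional log-likelihood (27) with respect to beta."""
    if abs(beta) < 1e-8:
        return t_sum - 0.5 * n_events * t0  # limiting value as beta -> 0
    return n_events * (1.0 / beta + t0 / math.expm1(-beta * t0)) + t_sum

def conditional_mle(event_times, t0, lo=-20.0, hi=20.0, tol=1e-10):
    """Root of the decreasing score function by bisection on an assumed bracket."""
    n_events = sum(len(t) for t in event_times)
    t_sum = sum(sum(t) for t in event_times)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid, n_events, t_sum, t0) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = random.Random(5)
t0, beta_true = 5.0, 1.0
event_times = [simulate_events(rng.gauss(0.0, 1.0), beta_true, t0, rng)
               for _ in range(25)]
beta_hat = conditional_mle(event_times, t0)
print(beta_hat)
```

The individual-specific $\gamma_i$ enter only the simulation; the estimator uses the event times and counts alone, reflecting the conditioning on $m_i$ that removes the nuisance parameters.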
Table 3 shows that the procedure rarely rejects the true model when the conditional maximum likelihood estimate is used in the construction of each $U_i(\hat\beta)$. When the true value of $\beta$ is used, it attains a rejection rate close to the nominal level, the discrepancy being primarily due to Monte Carlo sampling error in the approximation to $F_{S_i \mid M_i}(S_i \mid m_i; \hat\beta)$. Table 4 shows high power to detect departures from the postulated model, even at values of $\rho$ very close to the point of intersection $\rho = 0$.

Direction of         Square root of number of individuals n
sensitivity   ρ     3     4     5     6     7     8
left          0     0.03  0.04  0     0.03  0.02  0.05
right         0     0.03  0.03  0.02  0.03  0.01  0.03
left          0.1   0.18  0.34  0.42  0.61  0.73  0.92
right         0.1   0.11  0.17  0.23  0.33  0.47  0.51
left          0.2   0.72  0.88  0.98  1     1     1
right         0.2   0.66  0.81  0.95  0.99  1     1

Table 4: The generating process is the inhomogeneous Poisson process with intensity function $\lambda_i(t) = e^{\gamma_i + \rho\log t}$; the postulated model has intensity function $\lambda_i(t) = e^{\gamma_i + \beta t}$. Proportion of 200 simulation runs in which the 5% test based on $F_{S_i \mid M_i}$ and (8) rejects the postulated model. The point of model intersection $\rho = 0$ is highlighted.

5.3 Confidence sets of regression models

This section presents numerical performance of the approach discussed in §3.3. The original procedure outlined in Battey, Rasines & Tang (2025) was intended for the situation in which the full set of variables contemplated is inordinately large, necessitating some preliminary reduction by variable screening prior to assessment of low-dimensional subsets of variables. Since, in the present paper, we are illustrating a particular point, we simplify the problem by assuming that the number of starting variables is 15, so that no preliminary reduction is needed.
If that were actually the case, there would be no advantage to using anything other than a likelihood-ratio test of every low-dimensional model against the comprehensive model, but it is reassuring to see coverage probabilities for the confidence sets based on $U_j = F_Z(Z_j)$ from (14). These are reported as a function of the number of synthetic replicates $k$ and the sample size $n$. We also report the simulation-average size of the resulting confidence set of models, i.e. the number of false models that are included in the confidence set on average over simulation runs.

The experiment was conducted as follows. In each of 500 simulations, the $n$ rows of the $n \times d$ covariate matrix $X$ were drawn from a mean-zero normal distribution with correlation 0.9 between $s + a$ of the $d$ variables, and correlation zero elsewhere, where $d = 15$, $s = 5$ is the number of signal variables and $a = 3$ is the number of noise variables that are correlated with signal variables. Associated with $X$ is a vector $\theta$ of regression coefficients with entries 1 in the positions corresponding to the signal variables, and zeros elsewhere. The outcome vector was constructed as $Y = X\theta + \varepsilon$, where the entries of $\varepsilon$ were taken as standard normally distributed.

For every possible model of size $d_0 \leq 5$, the corresponding columns $X_0$ of $X$ were extracted. For this postulated model we constructed a basis for the orthogonal projection onto the null space of $X_0$ by taking the $n - d_0$ eigenvectors of $I - X_0(X_0^T X_0)^{-1} X_0^T$ corresponding to the unit eigenvalues. Let $V_0$ denote this $n \times (n - d_0)$ matrix of eigenvectors. From the vector of outcomes $Y$, $k \in \{4, 8, 12, 16\}$ synthetic replicates $\tilde Y_1, \ldots, \tilde Y_k$ were generated according to equation (4.3) of Battey, Rasines & Tang (2025) and the corresponding co-sufficient replicates were constructed as $\tilde Q_j = V_0^T \tilde Y_j / \|V_0^T \tilde Y_j\|_2$. The statistics $Z_1, \ldots$
$, Z_m$ were then computed as the $m = k(k - 1)/2$ inner products $\langle \tilde Q_j, \tilde Q_i \rangle$, $j \neq i$.

Direction of                                     Ê(# false models) /
sensitivity   (n, k)      Coverage   Ê|M\S|     # models tested
left          (100, 4)    0.944      613        0.124
right         (100, 4)    0.954      603        0.122
left          (100, 8)    0.954      508        0.103
right         (100, 8)    0.962      504        0.102
left          (100, 12)   0.964      477        0.097
right         (100, 12)   0.956      476        0.096
left          (100, 16)   0.956      462        0.093
right         (100, 16)   0.968      460        0.093
left          (200, 4)    0.950      270        0.055
right         (200, 4)    0.954      250        0.051
left          (200, 8)    0.956      213        0.043
right         (200, 8)    0.958      206        0.041
left          (200, 12)   0.958      197        0.040
right         (200, 12)   0.958      192        0.039
left          (200, 16)   0.956      190        0.038
right         (200, 16)   0.950      186        0.038

Table 5: Simulated coverage probability and expected size of the confidence sets of models from 500 simulations. These were constructed from (8) based on the distribution (14) of the cosine angles between projected synthetic replicates under each postulated sparse model.

When the postulated model is true, these follow the distribution (14) and $U_j = F_Z(Z_j)$ has a standard uniform distribution. The rejection regions for the model were thus defined in the usual way via (8). Table 5 reports the resulting coverage probabilities of the 0.95 nominal confidence sets and the average number of false models in the set from 500 simulations.

6 Discussion and open problems

Assessment of a statistical model for its compatibility with the data is inevitably a discrete process unless competing models are nested, or can be artificially nested in an encompassing parametric model in such a way that the model space can be continuously traversed through variation of a parameter. By contrast, the simplest approaches to inference on the parameters of a given statistical model often involve maximisation of a log-likelihood or other objective function, and therefore typically invoke a notion of continuity on the parameter space.
Perhaps for this reason, certain valuable inferential structures, such as interest-dependent co-sufficiency, have been overlooked in the literature, yet emerge naturally in the context of highly parametrised models with a low-dimensional interest parameter.

This work provides some new perspectives on the assessment of semiparametric and highly parametrised models, illustrating the possibility and value, in settings where the model admits this, of circumventing estimation of the infinite- or high-dimensional component. The broad idea is to use the postulated model to achieve internal replication of known form if and only if the postulated model holds to an adequate order of approximation. The examples illustrate various ways in which the requisite internal replication can be achieved.

There are broad principles underpinning all of the examples presented, extracted in §2; however, the exact manner in which the internal replication is induced appears to be problem-specific. This raises the question of whether any general approaches to the inducement of internal replication might be formulated. We close by pointing to some underdeveloped ideas in this direction, for which a general resolution would constitute a major step forward.

When internal replication is not available by design, as it might be in a longitudinal data set, or through exact elimination of nuisance parameters, as is sometimes feasible in settings like §3.2 or §4, an important open question concerns the possibility of eliminating nuisance parameters approximately. A notion of approximate exchangeability was formulated by Barber and Janson (2022) for a particular context, and this notion seems broadly appropriate across the range of settings considered here.
An alternative might instead seek to collapse nuisance parameters into a low-dimensional summary that is both estimable and relatively insensitive to local perturbations under the postulated model.

Battey, Cox and Lee (2024) was a first attempt at a constructive approach to finding transformations of observable random variables whose distribution is free or approximately free of nuisance parameters. In the context of matched-pair examples, they framed known examples in which nuisance parameters are straightforwardly eliminated in terms of integro-differential equations; these could be converted to standard types of partial differential equations and solved by established methods. The purpose of that work was not to solve easy examples via an unnecessarily complicated method, but to show how the results could be obtained through an application of general theory, in the hope that this might generalise. Such generalisation is a difficult open challenge. While Battey, Cox and Lee (2024) sought nuisance-eliminating transformations by solving differential equations analytically, an alternative might be to seek those transformations numerically, in the vein of Box & Cox (1964).

Inducement of internal replication under a postulated model can be viewed as a form of inducement of population-level sparsity. Battey (2023) framed four examples from this perspective that sought to achieve, through traversal of data-transformation space or of parametrisation space, a population-level sparsity that was not present in the initial formulation. The work cited in the previous paragraph is one example; another is parameter orthogonalisation (Cox and Reid, 1987). It is possible in view of this, although not obvious, that reparametrisation-based assessments of model adequacy might be available: if the model is violated, the reparametrisation fails to induce a population-level sparsity.
A Proofs and derivations

A.1 Proof of Proposition 2.1

Proof. Convergence in distribution of $U_j(\hat\psi)$ to $U_j(\psi_0^*)$ is metricised by both the Prohorov and bounded Lipschitz metrics, the latter defined for any two probability measures $P$ and $Q$ as

$$\beta(P, Q) := \sup\Bigl\{ \Bigl| \int f \, d(P - Q) \Bigr| : \|f\|_{BL} \le M \Bigr\},$$

for any finite $M$, where $\|f\|_{BL} := \|f\|_L + \|f\|_\infty$ and $\|f\|_L := \sup_{x \neq y} |f(x) - f(y)| / d(x, y)$, where $d$ is a metric. By Dudley (2002, Theorem 11.3.3), $\beta(P, Q) \to 0$ if and only if $P \to Q$. Proposition 2.1 is thus established by showing that $E f(U_j(\hat\psi)) \to E f(U_j(\psi_0^*))$ for all bounded functions $f$ with bounded Lipschitz norm.

Let $A$ be the event $\{|\hat\psi - \psi_0^*| \le \varepsilon\}$. For an arbitrary $\eta > 0$, let $\varepsilon = \eta / (2\|f\|_L)$ and let $n_0$ be a sample size such that $\mathrm{pr}(|\hat\psi - \psi_0^*| > \varepsilon) < \eta / (4\|f\|_\infty)$ for $n > n_0$. Thus, for $n$ exceeding $n_0$,

$$\begin{aligned}
|E f(U_j(\hat\psi)) - E f(U_j(\psi_0^*))| &\le \bigl| E\bigl[ \{ f(U_j(\hat\psi)) - f(U_j(\psi_0^*)) \} \mathbb{1}(A) \bigr] \bigr| + \bigl| E\bigl[ \{ f(U_j(\hat\psi)) - f(U_j(\psi_0^*)) \} \mathbb{1}(A^c) \bigr] \bigr| \\
&\le \|f\|_L \, E\bigl\{ |\hat\psi - \psi_0^*| \, \mathbb{1}(\{ |\hat\psi - \psi_0^*| \le \varepsilon \}) \bigr\} + 2\|f\|_\infty \, \mathrm{pr}(|\hat\psi - \psi_0^*| > \varepsilon) \\
&\le \eta/2 + \eta/2 = \eta.
\end{aligned}$$

Proposition 2.1 follows by the arbitrariness of $\eta$.

A.2 Derivation of equation (7)

This is a standard argument but some care is needed with minus signs. Consider the moment generating function of $R$:

$$M_R(s) = E(e^{sR}) = \prod_{j=1}^{m} \frac{\vartheta_j - 1}{1 + 2s - \vartheta_j}, \qquad (s > \vartheta_j - 1 \;\; \forall j). \tag{28}$$

Existence for $s > \vartheta_j - 1$ for all $j = 1, \ldots, m$ implies existence for all $s > \vartheta_{\max} - 1$. Thus, on letting $k_\alpha$ be the $1 - \alpha$ quantile of the $\chi^2_{2m}$ distribution, Markov's inequality implies

$$\mathrm{pr}(R \ge k_\alpha) = 1 - \mathrm{pr}(-R \ge -k_\alpha) \ge 1 - e^{t k_\alpha} E(e^{-tR}), \qquad t > 0.$$
The expectation on the right-hand side is the moment generating function with $s = -t$; the permissible range is thus $s_{\min} < s < 0$, where $s_{\min} = \vartheta_{\max} - 1$, so that the tightest bound is, from (28),

$$\mathrm{pr}(R \ge k_\alpha) \ge 1 - \inf_{s_{\min}} \cdots$$

[…] $\beta$,

$$\begin{aligned}
f_{S_i}(s \mid m_i; \beta) &= \frac{\beta^{m_i}}{(e^{\beta t_0} - 1)^{m_i}} \, \frac{1}{2\pi i} \int_{\tau - i\infty}^{\tau + i\infty} e^{zs} \, \frac{(e^{(\beta - z) t_0} - 1)^{m_i}}{(\beta - z)^{m_i}} \, dz \\
&= \frac{\beta^{m_i}}{(e^{\beta t_0} - 1)^{m_i}} \sum_{v=0}^{m_i} \binom{m_i}{v} (-1)^v e^{v \beta t_0} \, \frac{1}{2\pi i} \int_{\tau - i\infty}^{\tau + i\infty} \frac{e^{sz} e^{-v z t_0}}{(z - \beta)^{m_i}} \, dz
\end{aligned}$$

by the binomial formula. Let $k^*(z) = e^{z(s - v t_0)} / (z - \beta)^{m_i}$. Its contour integral is the residue at $z = \beta$ which, for a pole of order $m_i$, is

$$\begin{aligned}
\mathrm{Res}(k^*(z), \beta) &= \frac{1}{(m_i - 1)!} \lim_{z \to \beta} \frac{d^{m_i - 1}}{dz^{m_i - 1}} \bigl\{ (z - \beta)^{m_i} k^*(z) \bigr\} \\
&= \frac{1}{(m_i - 1)!} \lim_{z \to \beta} \frac{d^{m_i - 1}}{dz^{m_i - 1}} e^{z(s - v t_0)} \\
&= \frac{(s - v t_0)^{m_i - 1}}{(m_i - 1)!} \, e^{\beta(s - v t_0)}.
\end{aligned}$$

Thus,

$$\begin{aligned}
f_{S_i}(s \mid m_i; \beta) &= \frac{\beta^{m_i} e^{\beta s}}{(e^{\beta t_0} - 1)^{m_i} \, \Gamma(m_i)} \sum_{v=0}^{m_i} \binom{m_i}{v} (-1)^v (s - v t_0)^{m_i - 1} \, \mathbb{1}\{ s > v t_0 \} \\
&= \frac{\beta^{m_i} e^{\beta s}}{(e^{\beta t_0} - 1)^{m_i} \, \Gamma(m_i)} \sum_{v=0}^{\lfloor s/t_0 \rfloor} \binom{m_i}{v} (-1)^v (s - v t_0)^{m_i - 1}, \qquad s < m_i t_0,
\end{aligned} \tag{29}$$

and zero otherwise. Integration gives

$$F_{S_i \mid M_i}(s \mid m_i; \beta) = \frac{\beta^{m_i}}{(e^{\beta t_0} - 1)^{m_i} \, \Gamma(m_i)} \sum_{v=0}^{\lfloor s/t_0 \rfloor} \binom{m_i}{v} (-1)^v e^{\beta v t_0} \int_0^{s - v t_0} e^{\beta w} w^{m_i - 1} \, dw.$$

This is equation (12).

A.4 Proof of Proposition 4.1

Proof. For notational convenience, introduce $V^{(k)}$ for the, perhaps notional, random variable relevant for exploiting a postulated model with Structure $k$. This is $Z$ and $Z(\psi_0)$ for $k = 3$ and $k = 4$ respectively and is the notional random variable $Y_\bullet(S)$ and $Y_\bullet(S(\psi_0))$ for $k = 1$ and $k = 2$.
These are not functions of $S$ in the conventional sense, and are notional in the sense that they typically cannot be expressed in terms of the original random variables, but once $S = s$ or $S(\psi_0) = s(\psi_0)$ is observed, $Y_\bullet(S)$ and $Y_\bullet(S(\psi_0))$ have the conditional distribution of $Y_\bullet$ given $S = s$ or $S(\psi_0) = s(\psi_0)$.

Temporarily dropping superscripts, let $F_V(v; \psi_0)$ be the distribution function of $V$ under the assumption that the postulated model is true at $\psi_0$. Let $G_V^*(v)$ be the true distribution of $V$. Thus, if $G_V^*$ belongs to the family $\mathcal{F}_V := \{ F_V(\cdot\,; \psi), \psi \in \Psi \}$, then there exists a $\psi^*$ such that $G_V^*(v) = F_V(v; \psi^*)$. One direction is by direct calculation: if $\psi_0 = \psi^*$, then the distribution of $F_V(V; \psi_0)$ is uniform. For the converse direction, suppose for a contradiction that $F_V(V; \psi_0)$ is uniformly distributed but that $G_V^*(v) \neq F_V(v; \psi_0)$. By uniformity,

$$u = \mathrm{pr}\bigl( F_V(V; \psi_0) \le u \bigr) \tag{30}$$

for all $u \in [0, 1]$. First suppose that $G_V^* \in \mathcal{F}_V$. Since $F_V(v; \psi)$ is bijective in $v$ for any fixed $\psi$, there exists a $u \in [0, 1]$ such that $F_V^{-1}(u; \psi_0) \neq F_V^{-1}(u; \psi_0')$ for any $\psi_0' \neq \psi_0$. Thus, for any such value of $u$,

$$F_V\bigl( F_V^{-1}(u; \psi_0); \psi^* \bigr) \neq F_V\bigl( F_V^{-1}(u; \psi^*); \psi^* \bigr),$$

where the right hand side is equal to $u$ by definition. It follows that $F_V\bigl( F_V^{-1}(u; \psi_0); \psi^* \bigr) \neq u$, which contradicts (30).

Suppose now that uniformity is achieved and $G_V^* \notin \mathcal{F}_V$. Then $G_V^*(F_V^{-1}(u; \psi_0))$ replaces $F_V\bigl( F_V^{-1}(u; \psi_0); \psi^* \bigr)$ in the argument of the previous paragraph. Equality for all $u$ can only be achieved at the points $\psi_0$ where $G_V^*(v) = F_V(v; \psi_0)$. But at such points $G_V^*$ belongs to $\mathcal{F}_V$, a contradiction.
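The two directions of this argument can be illustrated numerically. The sketch below is an illustration under assumed ingredients and is not part of the proof: it takes the hypothetical one-parameter family $F_V(v; \psi) = 1 - e^{-\psi v}$ (an exponential distribution, chosen only for convenience), simulates $V$ at an assumed true value $\psi^* = 2$, and measures the Kolmogorov distance between the empirical distribution of $F_V(V; \psi_0)$ and the standard uniform distribution. The function names, the sample size and the seed are assumptions made for the example.

```python
import math
import random

def pit_values(sample, psi0):
    """Probability integral transform F_V(v; psi0) for the assumed
    exponential family F_V(v; psi) = 1 - exp(-psi * v)."""
    return [1.0 - math.exp(-psi0 * v) for v in sample]

def ks_distance_to_uniform(u):
    """Kolmogorov distance between the empirical distribution of u
    and the standard uniform distribution."""
    u = sorted(u)
    n = len(u)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(u))

random.seed(1)
psi_star = 2.0                                   # assumed true parameter value
sample = [random.expovariate(psi_star) for _ in range(20000)]

d_true = ks_distance_to_uniform(pit_values(sample, psi_star))   # psi0 = psi*
d_wrong = ks_distance_to_uniform(pit_values(sample, 1.0))       # psi0 != psi*
```

At $\psi_0 = \psi^*$ the distance `d_true` is of the order $n^{-1/2}$, reflecting sampling error only. At $\psi_0 = 1$ the transformed variable has distribution function $1 - (1 - u)^2$ under the assumed truth, so the population distance is $\sup_u u(1 - u) = 1/4$ and `d_wrong` stays near that value however large the sample.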
The remaining question is whether there are distributions over $(Y_0, Y_1)$ that induce the same distribution $G_V^*(v) = F_V(v; \psi_0)$ over $V$ that would hold if the postulated model for $(Y_0, Y_1)$ were true at $\psi_0$. The corresponding equivalence classes are most easily expressed in terms of density functions $g^* = g^*_{Y_0, Y_1}$, inducing a density function $g_V^*$ on $V$. These are the sets $\mathcal{E}^{(k)}(\psi_0)$ from Proposition 4.1, made explicit in Appendix A.5.

A.5 Equivalence classes for Proposition 4.1

Proposition 4.1 refers to equivalence classes of density functions $g$ over pairs $(Y_0, Y_1)$, within which the approach to achieving internal replication based on Structures 1–4 has no power to reject the postulated model. In other words, if the true density function violates the postulated model in the direction of any model belonging to the corresponding equivalence class, then this will not be detected at any sample size. As in the statement of Proposition 4.1, we drop the pair-index subscripts. With $|J|$ the absolute Jacobian determinant corresponding to the transformation from $(Y_0, Y_1)$ to $(S, Y_1)$ for $k = 1$, to $(S(\psi_0), Y_1)$ for $k = 2$, to $(Z, Y_1)$ for $k = 3$ and to $(Z(\psi_0), Y_1)$ for $k = 4$, the equivalence classes are:

$$\begin{aligned}
\mathcal{E}^{(1)}(\psi_0) &= \Bigl\{ g : \frac{g\bigl( y_0(s, y_1), y_1 \bigr) |J|}{\int_{\mathcal{Y}_1(s)} g\bigl( y_0(s, y_1), y_1 \bigr) |J| \, dy_1} = f_{Y_1 \mid S}(y_1 \mid s; \psi_0) \Bigr\}, \\
\mathcal{E}^{(2)}(\psi_0) &= \Bigl\{ g : \frac{g\bigl( y_0(s(\psi_0), y_1), y_1 \bigr) |J|}{\int_{\mathcal{Y}_1(s(\psi_0))} g\bigl( y_0(s(\psi_0), y_1), y_1 \bigr) |J| \, dy_1} = f_{Y_1 \mid S(\psi_0)}\bigl( y_1 \mid s(\psi_0) \bigr) \Bigr\}, \\
\mathcal{E}^{(3)}(\psi_0) &= \Bigl\{ g : \int_{\mathcal{Y}_1} g\bigl( y_0(z, y_1), y_1 \bigr) |J| \, dy_1 = f_Z(z; \psi_0) \Bigr\}, \\
\mathcal{E}^{(4)}(\psi_0) &= \Bigl\{ g : \int_{\mathcal{Y}_1} g\bigl( y_0(z(\psi_0), y_1), y_1 \bigr) |J| \, dy_1 = f_{Z(\psi_0)}\bigl( z(\psi_0) \bigr) \Bigr\}.
\end{aligned}$$

The above equivalence relations are in terms of generic expressions and usually do not specify the simplest route to calculation in particular cases.
For instance, the distribution of $Z_j$ from Example 4.1 can typically be calculated directly without explicit preliminary transformation from $(Y_0, Y_1)$ to $(Z, Y_1)$.

A.6 Derivations for §4.2

A.6.1 Local consistency under the Weibull model

Maximisation of the log-likelihood function based on (18) when the true distribution is given by (20) produces a maximum likelihood estimator $\hat\psi \to_p \psi_0^*$, where $\psi_0^*$ is the value of $\psi$ that maximises the expected log-likelihood function

$$\psi_0^* = \operatorname*{argmax}_{\psi \in \mathbb{R}^+} \Bigl\{ \log \psi - 2 \int_0^\infty \log(1 + \psi z) \, \frac{\psi^* \varsigma z^{\varsigma - 1} \, dz}{(1 + \psi^* z^{\varsigma})^2} \Bigr\}. \tag{31}$$

This solution satisfies

$$\frac{1}{\psi_0^*} = 2 \psi^* \varsigma \int_0^\infty \frac{z^{\varsigma} \, dz}{(1 + \psi^* z^{\varsigma})^2 (1 + \psi_0^* z)}. \tag{32}$$

Since we are interested in the behaviour near the point of intersection of the two models, $\varsigma = 1$, write $\varsigma = 1 + \varepsilon$ and expand the integral for small $\varepsilon$, giving

$$\frac{1}{\psi_0^*} = 2 \psi^* (1 + \varepsilon) \Bigl\{ \int_0^\infty \frac{z \, dz}{(1 + \psi^* z)^2 (1 + \psi_0^* z)} + O(\varepsilon) \Bigr\}.$$

On expanding the integral in partial fractions it follows that, to first order in $\varepsilon$, $\psi_0^*$ solves

$$\frac{\psi^*}{\psi_0^*} = 2 \psi^* (1 + \varepsilon) \Bigl\{ \frac{\psi^* \log(\psi^* / \psi_0^*) + \psi_0^* - \psi^*}{(\psi^* - \psi_0^*)^2} \Bigr\}.$$

Equivalently, $x = \psi^* / \psi_0^*$ solves

$$(x - 1)^2 = 2 (1 + \varepsilon) \, x \, (\log x + 1/x - 1). \tag{33}$$

The left- and right-hand sides are both convex with a unique minimum at $x = 1$ for any $\varepsilon$. Thus, to first order in $\varepsilon$, the maximum likelihood solution $\psi_0^*$ under the postulated model is equal to the true value $\psi^*$.

A.6.2 Local asymptotic analysis for the additive exponential model

The limiting maximum likelihood solution is in this case more complicated in view of the nuisance parameters, but by the law of large numbers the average of the log-likelihood contributions converges to the average of the expected log-likelihood contributions, thus

$$\psi_0^* = \lim_{m \to \infty} \operatorname*{argmax}_{\psi \in \mathbb{R}^+} \Bigl\{ \log \psi - \frac{2}{m} \sum_{j=1}^{m} \gamma_j (\gamma_j + \Delta^*) \int_0^\infty \frac{\log(1 + \psi z) \, dz}{(\gamma_j + (\gamma_j + \Delta^*) z)^2} \Bigr\}.$$
Differentiation shows that $\psi_0^*$ solves

$$\frac{1}{\psi_0^*} = \lim_{m \to \infty} \frac{2}{m} \sum_{j=1}^{m} \gamma_j (\gamma_j + \Delta^*) \int_0^\infty \frac{z \, dz}{(1 + \psi_0^* z)(\gamma_j + (\gamma_j + \Delta^*) z)^2}.$$

Analogously to Example 4.3 we can expand the integrand around $\Delta^* = 0$, which corresponds to the point at which the two models intersect, giving, to first order,

$$\frac{1}{\psi_0^*} = \Bigl\{ \frac{\psi_0^* - 1 - \log \psi_0^*}{(\psi_0^* - 1)^2} + O(\Delta^*) \Bigr\} \lim_{m \to \infty} \frac{2}{m} \sum_{j=1}^{m} \frac{\gamma_j + \Delta^*}{\gamma_j}.$$

To the same order, this is of almost identical form to (33) with the term $1 + \varepsilon$ replaced by the limiting average $a$ of $(\gamma_j + \Delta^*) / \gamma_j$. The first-order solution under the notional asymptotic regime $\Delta^* \to 0$ is thus the logical one, $\psi_0^* = 1$ for all $a$, with $\psi^* = 1$ and $\Delta^* = 0$ corresponding to the null treatment effect and the point at which the two models intersect. Thus, intuitively, there is little sensitivity to departures from the postulated multiplicative model in the direction of the additive model at small $\Delta^*$.

References

Barber, R. and Janson, L. (2022). Testing goodness of fit and conditional independence with approximate cosufficient subsampling. Ann. Statist., 50, 2514–2544.

Barndorff-Nielsen, O. E. and Cox, D. R. (1994). Inference and Asymptotics. Chapman and Hall, London.

Battey, H. S. (2023). Inducement of population-level sparsity. Canad. J. Statist., 51, 760–768 (Festschrift for Nancy Reid).

Battey, H. S. (2024). Maximal co-ancillarity and maximal co-sufficiency. Information Geometry, 7, 355–369.

Battey, H. S., Cox, D. R. and Lee, S. H. (2024). On partial likelihood and the construction of factorisable transformations. Information Geometry, 7, 9–28.

Battey, H. S. and Cox, D. R. (2018). Large numbers of explanatory variables: a probabilistic assessment. Proc. Roy. Soc. Lond. A: Math. Phys. Sci., 474, 20170631.

Battey, H. S., Rasines, D. G. and Tang, Y. (2025). Post-reduction inference for confidence sets of models.

Birnbaum, A. (1954). Combining independent tests of significance. J. Amer. Statist. Assoc., 49, 559–574.

Box, G. E. P. and Cox, D. R. (1964). An analysis of transformations (with discussion). J. R. Statist. Soc. B, 26, 211–252.

Cox, D. R. (1955). Some statistical methods connected with series of events (with discussion). J. R. Statist. Soc. B, 17, 129–164.

Cox, D. R. (1972). Regression models and life-tables (with discussion). J. R. Statist. Soc. B, 34, 187–220.

Cox, D. R. (1975). Partial likelihood. Biometrika, 62, 269–276.

Cox, D. R. and Lewis, P. A. W. (1966). The Statistical Analysis of Series of Events. Springer, London.

Cox, D. R. and Oakes, D. (1984). The Analysis of Survival Data. Chapman and Hall, London.

Cox, D. R. and Reid, N. (1987). Parameter orthogonality and approximate conditional inference (with discussion). J. R. Statist. Soc. B, 49, 1–39.

Dharamshi, A., Neufeld, A., Gao, L. L., Bien, J. and Witten, D. (2026). Decomposing Gaussians with unknown covariance. Biometrika, 113, article number asaf057.

Dudley, R. M. (2002). Real Analysis and Probability. Cambridge University Press, New York.

Engen, S. and Lillegård, M. (1997). Stochastic simulations conditioned on sufficient statistics. Biometrika, 84, 235–240.

Fisher, R. A. (1915). Frequency distribution of the values of the correlation coefficient in samples from an indefinitely large population. Biometrika, 10, 507–521.

Fisher, R. A. (1932). Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh.

Fisher, R. A. (1950). The significance of deviations from expectation in a Poisson series. Biometrics, 6, 17–24.

Heard, N. A. and Rubin-Delanchy, P. (2018). Choosing between methods of combining p-values. Biometrika, 105, 239–246.

Lindqvist, B. H., Taraldsen, G., Lillegård, M. and Engen, S. (2003). A counterexample to a claim about stochastic simulations. Biometrika, 90, 489–490.

Lindqvist, B. H. and Taraldsen, G. (2005). Monte Carlo conditioning on a sufficient statistic. Biometrika, 92, 451–464.

Lockhart, R. A., O'Reilly, F. J. and Stephens, M. A. (2007). Use of the Gibbs sampler to obtain conditional tests, with applications. Biometrika, 94, 992–998.

Rasines, D. G. and Young, G. A. (2023). Splitting strategies for post-selection inference. Biometrika, 110, 597–614.

Wong, W. (1986). Theory of partial likelihood. Ann. Statist., 14, 88–123.
