The Geometry of Heterogeneous Extremes: Optimal Transport and Entropic Design

I. Sebastian Buhai†

SOFI at Stockholm University
Instituto de Economia at UC Chile
NIPE at University of Minho

Version of March 22, 2026. Latest version.

Abstract

Extreme outcomes depend not only on shock tails but also on heterogeneity in how many opportunities agents get to sample. In the mixed-Poisson search framework, a randomly drawn agent's normalized maximum converges to $H_{\gamma,F}(x) = P_0(v_\gamma(x))$, a Laplace-transform mixture of a classical extreme-value law, with $F$ the mean-one distribution of opportunity intensities. Treating $F$ as primitive, we study the operator $F \mapsto H_{\gamma,F}$. That operator representation organizes the paper: convex-order comparisons, the homogeneous benchmark, and pointwise cdf bounds are Laplace-analytic consequences. Its main quantitative payoff is geometric. Via the canonical coupling representation $Z_{\gamma,F} = w_\gamma(E/X)$, optimal transport on transformed types yields explicit Wasserstein stability for the entire induced law, integrated control of the corresponding quantile schedule, canonical ambient interpolation paths, and an explicit renormalization bridge back to the mean-one economic slice. We also give a second-order expansion that separates extreme value theory (EVT) approximation error from the heterogeneity kernel. As a complementary contribution, we study a Kullback-Leibler regularized design problem that chooses $F$ subject to a mean constraint relative to a baseline. For objectives linear in $F$, including expected utility of normalized extremes under the canonical representation, the solution is the corresponding exponential tilt, with the heterogeneous-EVT kernel supplying the score.
A stylized labor market network application interprets $F$ as the cross-sectional distribution of access to job opportunities, shows how the adapted geometry controls counterfactual movements in the full top wage distribution, and renders explicit that the operational robustness claims are conditional on the maintained metric: the main linear theorem lives in the adapted distance $d_{\gamma,p}$, while the bridge from raw space Wasserstein error can be only Hölder in the economically relevant Fréchet regime. We also distinguish ambient transport geometry from mean-preserving economic counterfactuals and separate finite horizon robustness from asymptotic approximation.

JEL codes: C46; C61; D83; D85; J31; J64.

Keywords: Extreme value theory; heterogeneous search; optimal transport; Wasserstein distance; entropic design; labor market networks.

∗ This paper grew out of many long conversations about how extremes arise in economics and why they matter. I am grateful to several colleagues in economics, and especially to a few in adjacent disciplines, who convinced me that the argument needed more structure to carry conviction. I also gratefully acknowledge Refine (https://www.refine.ink/) for assistance in checking proofs in a preliminary draft. All remaining errors, extreme or otherwise, are mine.

† Email: sebastian.buhai@sofi.su.se. Full coordinates at https://www.sebastianbuhai.com.

1 Introduction

1.1 Motivation and overall contribution

Extreme outcomes at the agent or unit level often shape broader inequality and efficiency patterns. Examples include a worker's best wage offer, a firm's best sourcing or productivity opportunity, and the best match generated through a network. In a broad class of models, the realized payoff is the maximum of a set of opportunities. The size of that set can depend, inter alia, on search, attention, or network position.
Standard extreme value theory (EVT) studies maxima of i.i.d. sequences under deterministic sample sizes. Economic environments often feature heterogeneous opportunity sets: different agents sample different numbers of opportunities, and the draw count is itself random. A leading economic formulation is the mixed Poisson framework of Mangin (2026), under which the distribution of a randomly drawn agent's normalized maximum admits a Laplace-transform representation. If $X \sim F$ indexes an agent's draw intensity and $P_0(z) = E[\exp(-zX)]$ denotes the Laplace transform of $F$, then the distribution of normalized maxima converges to

$$H_{\gamma,F}(x) = P_0\big(v_\gamma(x)\big), \qquad v_\gamma(x) = -\log H_\gamma(x),$$

where $H_\gamma$ is the classical extreme-value limit associated with the underlying opportunity distribution. This Laplace-mixture form is the analytical backbone of the paper.

Our contribution is to take the heterogeneity distribution itself as the primitive object and to analyze the induced operator $F \mapsto H_{\gamma,F}$ on the space of type distributions. The novelty claim is deliberately sharp. Once the Laplace-mixture form is established, convex-order comparisons, benchmark comparisons with the homogeneous economy, and several pointwise statements follow from standard one-dimensional Laplace and Wasserstein arguments. Those results are useful, but they are not the paper's main quantitative novelty.

The operator representation has a geometric payoff. Once the canonical coupling representation $Z_{\gamma,F} = w_\gamma(E/X)$ is available, Theorem 1 turns optimal transport on transformed types into explicit Wasserstein control of the entire induced law of extremes.
Corollary 2 packages the same geometry into canonical ambient interpolation paths, Proposition 10 gives an explicit bridge back to the mean-one economic slice, Corollary 3 translates the same bound into control of the entire quantile schedule, and Theorem 2 separates second-order EVT approximation error from the heterogeneity kernel. These are the results that deliver robustness to misspecification or estimation error in $F$ and quantitative comparative statics for whole laws, rather than only pointwise cdf comparisons. The main linear statement is formulated in the adapted metric $d_{\gamma,p}$; the bridge back to raw space Wasserstein error can be weaker and is made explicit later.

We also study a complementary normative problem on the same type space. There, the object is not a full two-marginal transport problem, and its solution does not rely on the Wasserstein geometry developed for the positive results. Instead, the planner chooses a new heterogeneity distribution $F$ subject to a mean constraint and a Kullback-Leibler (KL) penalty relative to a baseline $F_0$. This is a one-marginal entropy projection. The Gibbs form of the solution is standard in that class of problems; our contribution is to show how the same Laplace kernel that defines $F \mapsto H_{\gamma,F}$ generates the score functions entering the planner's first-order conditions for cdf criteria and for expected utility under the canonical stochastic representation. Being explicit about that division of labor sharpens both the mathematics and the interpretation.

Throughout, the formal object is the cross-sectional law of a randomly drawn agent's normalized maximum under a common offer distribution. Aggregate maxima across agents, as well as settings in which heterogeneity also shifts offer quality or induces dependence between opportunity access and offer values, lie outside the present reduced-form scope.

1.2 Main results

The paper contributes four sets of results.
A structural decomposition of heterogeneous extremes. We formalize the operator view $T : F \mapsto H_{\gamma,F}$ and separate what is purely (Laplace-)analytic from what is genuinely geometric. Order comparisons and several pointwise inequalities follow from the convexity of the Laplace kernel. By contrast, the quantitative comparison of entire laws requires an adapted metric on the type space. This distinction matters for both interpretation and application.

Wasserstein geometry and quantitative robustness. The paper's main new theorem is Theorem 1: under the regime-specific moment conditions intrinsic to extreme-value theory, the map $F \mapsto H_{\gamma,F}$ is Lipschitz from the ambient transformed type space to the space of extreme laws equipped with Wasserstein distance. The linear bound is therefore stated in the adapted metric $d_{\gamma,p}$, which is itself an ordinary Wasserstein distance after a monotone transform of types. The same construction also furnishes canonical constant-speed geodesics in that ambient geometry, summarized in Corollary 2. When the transform is nonlinear, those adapted geodesics need not preserve the economic normalization $E[X] = 1$ at intermediate times; Proposition 10 therefore provides an explicit quantitative bridge from the ambient path back to the mean-one economic slice. The same whole law control also yields integrated bounds for the full quantile schedule of normalized extremes, not only fixed threshold cdf comparisons. Operationally, this means that one either controls $d_{\gamma,p}$ directly or uses a separate bridge from raw space error, which can be only Hölder in Fréchet settings, unless extra support restrictions make the transform Lipschitz.¹

A complementary entropy-regularized design problem on the same type space.
For objectives that are linear in $F$, including expected utility of normalized extremes under the canonical stochastic representation, we solve the planner's problem with a KL penalty and a mean constraint. The optimizer is the corresponding exponential tilt of the baseline. This problem is best understood as a one-marginal entropy projection. Its tight connection to the positive analysis is kernel-based rather than transport-based: the score functions entering the tilt are generated by the same heterogeneous-EVT primitives that determine the operator $F \mapsto H_{\gamma,F}$ and its directional derivative.

Application to labor market networks. We apply the framework to a reduced-form labor market environment in which network position governs offer arrival while the offer distribution remains common across workers. The opportunity distribution induced by degree, weighted degree, or other validated proxies for access to search translates into a heterogeneous extreme-value law for a randomly drawn worker's normalized top wage. The positive results quantify how network inequality changes that cross-sectional right tail distribution, give a concrete whole law counterfactual for the full schedule of top wage quantiles, and make explicit the conditions under which measurement error in network heterogeneity propagates into tail prediction error.

¹ In plain terms, the theorem controls not just one tail probability but the whole distribution of normalized extremes, and hence the whole quantile curve, in an integrated sense. In applications, the most direct route is to measure heterogeneity in the adapted metric $d_{\gamma,p}$. If one instead starts from error in the original type space, an additional comparison is needed. In Fréchet settings that comparison may be weaker than linear unless support restrictions keep the relevant transform Lipschitz, for instance by keeping the support away from zero.
1.3 Empirical interface and identification of heterogeneity

The empirical content of the framework is promising, though one should be careful about what is and is not identified. If a cross section of offer counts is observed at a fixed horizon, then the data identify the count distribution and its pgf. Under the mixed-Poisson structure, the pgf identified from the data pins down the Laplace transform $P_0$ on the interval $z \in [0, \theta]$. Because $P_0$ is a Laplace transform, knowledge of that transform on any open interval identifies the underlying mixing distribution $F$ at the population level. The real difficulty is statistical rather than logical: analytic continuation and Laplace inversion are severely ill-posed, so richer variation, e.g., multiple horizons, repeated exposures, or parametric/semi-parametric structure, remains highly valuable for stable estimation rather than for point identification per se.

A second empirical route uses direct proxies for opportunity intensity, such as normalized degree, referral exposure, centrality, or estimated arrival rates. That route identifies an empirical approximation to $F$ after one specifies a modeling map from the observed proxy into the latent intensity entering the mixed-Poisson law. Likewise, network data by themselves do not identify the adapted metric $d_{\gamma,p}$; that metric becomes operational once the researcher has committed to a tail index $\gamma$ and to a maintained measurement model linking the observed proxy to the latent type entering the relevant stability theorem.

The same caution applies to inference from extremes. Using the heterogeneous extreme-value limit to infer $F$ from tail behavior requires the tail-limit ingredients, i.e. the index $\gamma$, the normalizations $(a_\theta, b_\theta)$, and, for second-order refinements, the corresponding second-order objects, either to be known or to be estimated in a first step.
The full parent law $G$ need not be known for the EVT approximation itself. Accordingly, our paper emphasizes robustness and error propagation more than stable nonparametric recovery. The main operational message is therefore conditional: once the analyst has an estimate $\hat{F}$ and justified control of the adapted metric $d_{\gamma,p}$ around $\hat{F}$, the stability results convert uncertainty about heterogeneity into explicit uncertainty about tail predictions. If one starts instead from raw space Wasserstein error, Proposition 11 supplies the bridge. In particular, when $0 < \gamma < 1$ and the assumptions of Theorem 1 hold with $p\gamma < 1$, that bridge is only Hölder of order $\gamma$, so Fréchet-side error propagation can be materially slower than linear.

1.4 Related literature

Heterogeneous extreme value theory. We build on Mangin (2026); an earlier circulation is Becker and Mangin (2023). On the probability side, the mixed-Poisson law $H_{\gamma,F}$ sits in the classical lineage on maxima under random sample size or random indexing (Barndorff-Nielsen, 1964; Galambos, 1973; Silvestrov and Teugels, 1998), with related max-geometric and max-semistable notions in, e.g., Rachev and Resnick (1991) and Megyesi (2002). Our departure from that literature is to treat the heterogeneity distribution itself as the primitive object and to study the induced operator $F \mapsto H_{\gamma,F}$ through order, metric, and variational tools. Adjacent econometric work by Einmahl and He (2023) studies extreme-value estimation under heterogeneous marginals rather than heterogeneity in the intensity of opportunity arrival. For classical EVT background, see de Haan and Ferreira (2006) and Resnick (2008); for complementary peaks-over-threshold foundations, see Balkema and de Haan (1974) and Pickands (1975).

Optimal transport, coupling, and entropy projection.
Our positive results use optimal transport as a geometry on distributions (Villani, 2009; Galichon, 2016) and are close in spirit to the coupling approach to extremes in Bobbia et al. (2021); complementary Wasserstein-type control of Fréchet approximation via Stein methods is developed in Mansanarez et al. (2025). On the normative side, our planner problem is a one-marginal entropy projection with a linear moment constraint. The duality and exponential-tilt characterization connect to the Gibbs variational principle (Donsker and Varadhan, 1975) and, more loosely, to modern Schrödinger and entropic optimal transport treatments (Cuturi, 2013; Léonard, 2014; Peyré and Cuturi, 2019; Ghosal et al., 2022; Nutz, 2022). We therefore use optimal transport and entropy regularization in related but distinct ways in the positive and normative parts of the paper.

Labor market networks. Our application draws on the literature on job information and referrals in networks, including Granovetter (1973), Montgomery (1991), Topa (2001), Calvó-Armengol and Jackson (2004), and Ioannides and Loury (2004), as well as more recent work on referral inequality and segregation such as Buhai and van der Leij (2023) and Bolte et al. (2024). Relative to that literature, we emphasize a tail-sensitive reduced form: how the distribution of opportunities maps into the distribution of worker-level extremes, how robust that mapping is to perturbations in heterogeneity, and how policy might reshape the opportunity distribution when the right tail is economically consequential.

1.5 Roadmap and regime map

Structure of the paper. Section 2 introduces the heterogeneous extreme-value environment. Section 3 develops order properties, misallocation indices, and geodesic structure on the type space. Section 4 contains the main metric stability results, finite horizon robustness bounds, and the second-order expansion.
Section 5 studies the complementary entropy-regularized design problem. Section 6 presents the labor market network application. Section 7 concludes. Appendix A contains proofs and Appendix B records technical tools.

Before moving on to the main sections, we include Table 1 here in the Introduction as a pragmatic overview of the paper's regime-specific scope. A recurring theme is that lower-support restrictions are needed only when logarithmic, negative-power, or inverse-power transforms of $X$ enter the argument. Another is that the adapted geodesics of Section 4 live in an ambient transformed type space; when $s_\gamma$ is nonlinear, they need not preserve the mean-one normalization at intermediate times. Proposition 10 quantifies the discrepancy between the ambient path and its mean-one renormalization, while the raw space geodesics and pointwise comparisons in Section 3 remain available when exact mean preservation is required throughout.

Table 1: Scope of the main analytical results across extreme-value regimes

| Result | Regime | Additional conditions |
|---|---|---|
| Proposition 1, Proposition 2, Corollary 4, Proposition 13, Theorem 2 | all $\gamma$ | unified baseline Assumptions 1–2; no positive lower support bound is imposed |
| Proposition 4 and Lemma 3 | Fréchet / $\gamma > 0$ | product representation specific to the Fréchet case; no support bound away from zero is required there |
| Theorem 1 | all $\gamma$ | $d_{\gamma,p}(F_1, F_2) < \infty$ is the primitive domain condition; equivalently, one needs finite transformed $p$th moments, namely of $X^{\gamma p}$ for $\gamma > 0$, $\lvert \log X \rvert^{p}$ for $\gamma = 0$, and $X^{-\lvert\gamma\rvert p}$ for $\gamma < 0$ |
| Proposition 11 | $\gamma \in (0, 1]$ or $\gamma \le 0$ | for $\gamma \in (0, 1]$, no lower support bound is needed; for $\gamma \le 0$, support on $[a, \infty)$ is imposed explicitly |
| Section 5.5 power and inverse-power tilts | all regimes in principle | inverse-power scores require baseline support bounded away from zero; positive-power scores require the baseline tail to be light enough for the exponential tilt to be integrable |

2 Environment: heterogeneous extreme value primitives

2.1 Baseline extreme value theory

Let $\{Y_j\}_{j \ge 1}$ be i.i.d. real valued random variables with common cdf $G$. Let $M_n := \max_{1 \le j \le n} Y_j$ denote the sample maximum. We impose a standard maximum domain of attraction condition. Equivalently, after affine normalization, the law of $M_n$ converges to a generalized extreme value distribution.

Assumption 1 (Domain of attraction). There exist sequences $a_n > 0$ and $b_n \in \mathbb{R}$ and a parameter $\gamma \in \mathbb{R}$ such that

$$\Pr\left( \frac{M_n - b_n}{a_n} \le x \right) \to H_\gamma(x) \quad \text{as } n \to \infty,$$

for all continuity points $x$ of $H_\gamma$.

We use the canonical generalized extreme value family

$$H_\gamma(x) = \begin{cases} \exp\left\{ -(1 + \gamma x)^{-1/\gamma} \right\}, & \gamma \ne 0,\ 1 + \gamma x > 0, \\ \exp(-e^{-x}), & \gamma = 0, \end{cases}$$

with the conventions $H_\gamma(x) = 0$ for $\gamma > 0$ and $x \le -1/\gamma$, and $H_\gamma(x) = 1$ for $\gamma < 0$ and $x \ge -1/\gamma$. Define the tail transform

$$v_\gamma(x) := -\log H_\gamma(x),$$

so that $v_\gamma(x) = (1 + \gamma x)^{-1/\gamma}$ for $\gamma \ne 0$ on $\{1 + \gamma x > 0\}$ and $v_0(x) = e^{-x}$ on $\mathbb{R}$.

2.2 Mixed Poisson heterogeneity in draw counts

Each agent draws a random number $N(\theta)$ of i.i.d. opportunities, where $\theta > 0$ scales the average arrival rate. Following Mangin (2026), the draw count is mixed Poisson.

Assumption 2 (Mixed Poisson heterogeneity). Let $X \sim F$ on $[0, \infty)$ with $E[X] = 1$ and $\Pr(X = 0) = 0$. Conditional on $X$, the draw count is Poisson:

$$N(\theta) \mid X \sim \mathrm{Poisson}(\theta X), \qquad \theta > 0.$$

The sequence $\{Y_j\}_{j \ge 1}$ is independent of $(X, N(\theta))$.

Remark 1 (Why we exclude an atom at zero). The restriction $\Pr(X = 0) = 0$ is mainly a unified domain convention rather than a substantive claim that near-zero opportunity types are unimportant. Mass arbitrarily close to zero is allowed throughout the baseline model.
The reason to exclude an atom exactly at zero is that, for $\gamma \le 0$, it produces an atom at $-\infty$ in the normalized limit law, while several metric constructions below use $\log x$ or $x^{\gamma}$ and are therefore naturally formulated on $(0, \infty)$. When a positive lower support bound is genuinely needed, we impose it explicitly in the corresponding proposition.

The distribution of $N(\theta)$ is mixed Poisson with mean $E[N(\theta)] = \theta$. Let

$$P_0(z) := E\left[ e^{-zX} \right], \qquad z \ge 0,$$

denote the Laplace transform of $F$. The next lemma summarizes the key generating function identity that will be used repeatedly.

Lemma 1 (Probability generating function of mixed Poisson). Under Assumption 2, for any $y \in [0, 1]$ and any $\theta > 0$,

$$E\left[ y^{N(\theta)} \right] = P_0\big(\theta(1 - y)\big).$$

Remark 2 (Search technology versus type heterogeneity). Assumption 2 admits two complementary interpretations. One may take the mixed Poisson law of $N(\theta)$ as the primitive "search technology", in which case $F$ is an equivalent representation. Alternatively, one may take $F$ as the primitive distribution of types $X$, where higher $X$ implies a higher mean draw count $\theta X$. The Laplace transform $P_0$ summarizes the search technology and is the only object from $F$ that enters the distribution of extremes. The mixed Poisson specification is analytically valuable precisely because the conditional distribution function of the maximum takes the exponential form

$$\Pr(M_\theta \le x \mid X) = \exp\{ -\theta X (1 - G(x)) \},$$

so heterogeneity enters through a Laplace transform rather than through a more complicated count generating function. Its complement gives the corresponding conditional tail probability. This special role of Poisson search is also emphasized in Becker and Mangin (2023), which contrasts the Poisson benchmark with more general count technologies.
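The identity in Lemma 1 can be checked numerically in a special case. The sketch below is illustrative only and rests on an assumption that is not part of the paper: the mixing law $F$ is taken to be Gamma with shape $k$ and scale $1/k$ (so $E[X] = 1$), for which $P_0(z) = (1 + z/k)^{-k}$ and the mixed Poisson count is negative binomial. The function names and parameter values are hypothetical.

```python
import math

def laplace_gamma(z, k):
    """P0(z) = E[exp(-z X)] for X ~ Gamma(shape=k, scale=1/k), so E[X] = 1 (assumed F)."""
    return (1.0 + z / k) ** (-k)

def mixed_poisson_pmf(n, theta, k):
    """Pr(N(theta) = n) when N | X ~ Poisson(theta X) and X ~ Gamma(k, 1/k).

    Integrating out X gives a negative binomial pmf with success probability
    p = k / (k + theta); the pmf is evaluated in logs for numerical stability.
    """
    p = k / (k + theta)
    log_pmf = (math.lgamma(n + k) - math.lgamma(k) - math.lgamma(n + 1)
               + k * math.log(p) + n * math.log1p(-p))
    return math.exp(log_pmf)

def pgf_by_summation(y, theta, k, n_max=2000):
    """E[y^N(theta)] computed directly from the pmf, truncated at n_max."""
    return sum((y ** n) * mixed_poisson_pmf(n, theta, k) for n in range(n_max + 1))

# Hypothetical parameter values chosen only for illustration.
theta, k, y = 5.0, 2.0, 0.7
lhs = pgf_by_summation(y, theta, k)          # left-hand side of Lemma 1
rhs = laplace_gamma(theta * (1.0 - y), k)    # right-hand side of Lemma 1
print(f"pgf by summation: {lhs:.6f}   P0(theta(1-y)): {rhs:.6f}")
```

The two numbers agree up to truncation and rounding error. The Gamma choice is made only because both sides are then available in closed form; any other mean-one $F$ with a computable Laplace transform could be substituted.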
2.3 Heterogeneous maxima and their distribution

Given $N(\theta)$ draws, the realized outcome is the maximum

$$M_\theta := \sup_{1 \le j \le N(\theta)} Y_j,$$

with the convention $\sup \emptyset = -\infty$. For any $x \in \mathbb{R}$,

$$\Pr(M_\theta \le x \mid X) = E\left[ G(x)^{N(\theta)} \mid X \right] = \exp\big( -\theta X (1 - G(x)) \big),$$

and hence, by using Lemma 1,

$$\Pr(M_\theta \le x) = P_0\big( \theta (1 - G(x)) \big). \tag{1}$$

2.4 Heterogeneous extreme value limit law

We now combine the domain of attraction condition with the mixed Poisson structure. Define $a_\theta := a_{\lfloor \theta \rfloor}$ and $b_\theta := b_{\lfloor \theta \rfloor}$ for $\theta \ge 1$. This step extension is convenient for the first-order limit theorem below. In the second-order subsection later on, we revert to continuous normalizing functions, as is standard in extreme value theory, to avoid discretization artifacts.

Proposition 1 (Heterogeneous extreme value law). Suppose Assumptions 1 and 2 hold. Then for all continuity points $x$ of $H_{\gamma,F}$,

$$\Pr\left( \frac{M_\theta - b_\theta}{a_\theta} \le x \right) \to H_{\gamma,F}(x) := P_0\big( v_\gamma(x) \big) \quad \text{as } \theta \to \infty.$$

Moreover, if $F = \delta_1$, then $P_0(z) = e^{-z}$ and $H_{\gamma,F} = H_\gamma$.

Remark 3 (Formal scope of the limit law). The object $H_{\gamma,F}$ is the limit distribution of the normalized maximum for a randomly drawn agent under a common offer distribution $G$. It is therefore a cross-sectional agent-level object. The paper does not analyze the economy-wide maximum $\max_i M_{i,\theta}$, nor settings in which network position also changes the distribution of offer quality or induces dependence between $X$ and the $Y_{ij}$. Those extensions are potentially important, but they require additional structure beyond the present mixed-Poisson layer.

Remark 4 (Connection to random-sample-size maxima). Proposition 1 is a mixed-Poisson instance of the classical literature on maxima with random sample size or random indexing; see, e.g., Barndorff-Nielsen (1964), Galambos (1973), and Silvestrov and Teugels (1998).
In that literature, random indexing changes the form of the limit law while often preserving the underlying extreme-value type. The value added here is not another domain-of-attraction result, but the decision to treat the mean-one heterogeneity distribution $F$ and its Laplace transform $P_0$ as primitives for the order, metric, and design analysis developed below.

Comment. The representation (1) reduces the asymptotics of $M_\theta$ to the behavior of $\theta(1 - G(b_\theta + a_\theta x))$. Assumption 1 implies that $G(b_\theta + a_\theta x)^{\lfloor \theta \rfloor} \to H_\gamma(x) = e^{-v_\gamma(x)}$ and thus $\theta(1 - G(b_\theta + a_\theta x)) \to v_\gamma(x)$ at continuity points. The formal argument is given in Appendix A. Our focus on block maxima is natural in search environments where the economic object is the best offer over a horizon. Complementary peaks-over-threshold formulations are classical in extreme value theory (see, e.g., Pickands (1975) and Balkema and de Haan (1974)).

2.5 A canonical product representation in the Fréchet case

A particularly useful representation emerges under Fréchet normalization, which is natural for heavy-tailed primitives. Suppose $\gamma > 0$ and work with the standard Fréchet cdf

$$H^{\mathrm{Fr}}_\gamma(z) = \exp\left( -z^{-1/\gamma} \right), \qquad z > 0.$$

If $Z_\gamma$ is a Fréchet random variable with cdf $H^{\mathrm{Fr}}_\gamma$ and $Z_\gamma$ is independent of $X$, then

$$\Pr(X^{\gamma} Z_\gamma \le z) = E\left[ \exp\left( -X z^{-1/\gamma} \right) \right] = P_0\left( z^{-1/\gamma} \right), \qquad z > 0.$$

Thus the heterogeneous Fréchet limit admits the product representation

$$Z \overset{d}{=} X^{\gamma} Z_\gamma, \tag{2}$$

where $Z$ has cdf $z \mapsto P_0(z^{-1/\gamma})$ on $(0, \infty)$. Equivalently, the canonically normalized GEV variable $Z^{\mathrm{GEV}} := (Z - 1)/\gamma$ has cdf $H_{\gamma,F}$. We shall use (2) to construct couplings and to obtain sharp Wasserstein bounds in Section 4.

2.6 Extreme outcome functionals

The paper studies tail-sensitive functionals of the distribution of extremes. Let $\zeta : \mathbb{R} \to \mathbb{R}$ be a measurable payoff function. For finite $\theta$, define

$$\Phi_\theta(F) := E[\zeta(M_\theta)].$$
For normalized extremes, define

$$Z_\theta := \frac{M_\theta - b_\theta}{a_\theta}, \qquad \Psi_\theta(F) := E[\psi(Z_\theta)]$$

for measurable $\psi$ such that the expectation is well defined. Section 4 establishes pointwise finite horizon stability for the cdf of $Z_\theta$, Wasserstein stability for the limit law $F \mapsto \mathcal{L}(Z)$, and derives Lipschitz-type bounds for classes of functionals $\psi$ that are natural for tail analysis. Section 5 uses these objects to define and solve an entropy-regularized design problem for $F$.

3 Geometry of heterogeneous extreme value laws

This section develops structural and geometric properties of the heterogeneous extreme value operator introduced in Section 2. Throughout, $F$ denotes a probability measure on $[0, \infty)$ with $E_F[X] = 1$ and $\Pr_F(X = 0) = 0$, and we write

$$P_0(z) := E_F\left[ e^{-zX} \right] = \int_0^\infty e^{-zx} \, F(dx), \qquad z \ge 0,$$

for the Laplace transform of $F$. For a fixed tail index $\gamma \in \mathbb{R}$ and tail transform $v_\gamma(x) := -\log H_\gamma(x)$ (defined on $\{x : H_\gamma(x) \in (0, 1)\}$), the heterogeneous extreme value law is

$$H_{\gamma,F}(x) = P_0\big( v_\gamma(x) \big).$$

We view the mapping $T : F \mapsto H_{\gamma,F}$ as an operator from a space of type distributions to a space of extreme value laws.

The section has two roles. Subsection 3.1 records order and benchmark consequences of the Laplace-mixture structure. Those results are useful for interpretation and for the application, but they are not the core geometric innovation. The later subsections then isolate the coupling and variational ingredients needed for the genuinely geometric stability results in Section 4 and, more indirectly, for the complementary design problem in Section 5.

3.1 Order comparisons and extreme misallocation

A central economic question is how dispersion in opportunities changes extreme outcomes. In our setting, dispersion enters through the distribution $F$ of draw intensities $X$. A natural formalization is convex order, which coincides with mean-preserving spreads.

Definition 1 (Convex order).
Let $F_1, F_2$ be probability measures on $[0, \infty)$ with finite first moment and common mean. We write $F_2 \succeq_{cx} F_1$ if

$$\int \varphi(x) \, F_2(dx) \ge \int \varphi(x) \, F_1(dx)$$

for every convex function $\varphi : [0, \infty) \to \mathbb{R}$ for which both expectations are finite.

When $F_2 \succeq_{cx} F_1$, $F_2$ is a mean preserving spread of $F_1$ in the sense of Rothschild and Stiglitz (1970). Convex order has an immediate implication for Laplace transforms because $x \mapsto e^{-zx}$ is convex on $[0, \infty)$ for every $z \ge 0$.

Proposition 2 (Convex order implies Laplace order). Let $F_1, F_2$ be probability measures on $[0, \infty)$ with $E[X] = 1$. If $F_2 \succeq_{cx} F_1$, then for every $z \ge 0$,

$$P_0^{(2)}(z) \ge P_0^{(1)}(z).$$

Consequently, for every $x$ such that $v_\gamma(x) < \infty$,

$$H_{\gamma,F_2}(x) \ge H_{\gamma,F_1}(x).$$

Equivalently, if $Z_i$ has cdf $H_{\gamma,F_i}$, then $\Pr(Z_2 > t) \le \Pr(Z_1 > t)$ for all $t \in \mathbb{R}$.

Corollary 1 (Heterogeneity lowers extremes relative to the homogeneous benchmark). Let $F$ be a probability measure on $[0, \infty)$ with $E[X] = 1$. Then for every $z \ge 0$,

$$P_0(z) \ge e^{-z},$$

with strict inequality for all $z > 0$ whenever $F \ne \delta_1$. Consequently, for every $x$ such that $v_\gamma(x) < \infty$,

$$H_{\gamma,F}(x) \ge H_\gamma(x),$$

so the heterogeneous extreme value limit is stochastically smaller than the homogeneous limit.

Remark 5 (Tail index versus level effects). Order comparisons such as Proposition 2 operate at the level of the entire distribution function. They are consistent with the tail index invariance emphasized by Mangin (2026): heterogeneity in the mixed Poisson draw counts changes the level of extreme outcomes but does not change the limiting tail index $\gamma$. Under the mean-one normalization, one can actually say more: since $0 \le (1 - e^{-zX})/z \le X$ for $z > 0$ and $E[X] = 1$, dominated convergence gives $P_0(z) = 1 - z + o(z)$ as $z \downarrow 0$. Consequently,

$$1 - H_{\gamma,F}(x) = 1 - P_0\big( v_\gamma(x) \big) \sim v_\gamma(x) \sim 1 - H_\gamma(x)$$

whenever $v_\gamma(x) \downarrow 0$.
Thus heterogeneity changes distributional levels and high quantiles, but it does not change the first-order ultra-tail mass.

The previous results motivate quantitative measures of how far $F$ is from the egalitarian benchmark $\delta_1$.

Definition 2 (Extreme misallocation indices). Fix the benchmark $F^{eq} := \delta_1$.

1. For $p \ge 1$ and $F \in \mathcal{P}_p([0, \infty))$, define the metric misallocation index $\mathcal{M}_p(F) := W_p(F, \delta_1)$.
2. Further define the Laplace misallocation curve $\Delta_F(z) := P_0(z) - e^{-z}$, $z \ge 0$.

Remark 6 (Interpretation). $\mathcal{M}_p(F)$ measures opportunity dispersion in a transportation geometry and will be directly compatible with the stability bounds in Section 4. Because the benchmark is a point mass, the first two indices are especially transparent:

$$\mathcal{M}_1(F) = E[\,|X - 1|\,] \quad \text{and, under } E[X] = 1, \quad \mathcal{M}_2(F) = \sqrt{E[(X - 1)^2]} = \sqrt{\mathrm{Var}(X)}.$$

The Laplace curve $z \mapsto \Delta_F(z)$ measures the induced distortion in extreme value laws at the Laplace scale. By Corollary 1, $\Delta_F(z) \ge 0$ for all $z \ge 0$, and $\Delta_F \equiv 0$ if and only if $F = \delta_1$. In particular, these indices quantify how far the opportunity distribution is from the homogeneous benchmark before Section 4 translates that gap into explicit bounds on the induced law of extremes.

3.2 Wasserstein geometry on the type space

Now let $\mathcal{P}_p([0, \infty))$ denote the set of probability measures on $[0, \infty)$ with finite $p$th moment. For $\mu, \nu \in \mathcal{P}_p([0, \infty))$, the Wasserstein $p$ distance is

$$W_p(\mu, \nu) := \left( \inf_{\pi \in \Gamma(\mu, \nu)} \int_{[0, \infty)^2} |x - y|^p \, \pi(dx, dy) \right)^{1/p},$$

where $\Gamma(\mu, \nu)$ is the set of couplings with marginals $\mu$ and $\nu$. We refer to Villani (2009) and Galichon (2016) for background. In one dimension, the geometry is particularly explicit.

Lemma 2 (Quantile representation and canonical geodesics in one dimension). Let $\mu, \nu \in \mathcal{P}_p([0, \infty))$ and let $Q_\mu, Q_\nu$ denote their quantile functions. Then

$$W_p^p(\mu, \nu) = \int_0^1 |Q_\mu(u) - Q_\nu(u)|^p \, du.$$
Moreover, the path $(\mu_t)_{t \in [0,1]}$ defined by
$$Q_{\mu_t}(u) = (1-t)\, Q_\mu(u) + t\, Q_\nu(u), \qquad u \in (0,1),$$
is a constant-speed $W_p$ geodesic from $\mu$ to $\nu$. For $p > 1$ it is the unique geodesic, while for $p = 1$ it is the canonical monotone geodesic and uniqueness need not hold.

This one-dimensional characterization has a direct implication for convexity of expectations along the canonical monotone geodesic.

Proposition 3 (Convexity of expectations along type geodesics). Let $\mu, \nu \in \mathcal{P}_1([0,\infty))$ and let $(\mu_t)_{t \in [0,1]}$ be the canonical monotone Wasserstein geodesic from $\mu$ to $\nu$. Let $\varphi : [0,\infty) \to \mathbb{R}$ be convex and such that $\int |\varphi| \, d\mu_t < \infty$ for all $t \in [0,1]$. Then $t \mapsto \int \varphi(x)\, \mu_t(dx)$ is convex on $[0,1]$.

3.3 The extreme value map as a pushforward and coupling transform

The map $T$ admits a useful representation in cases where the limit law can be written as a measurable image of $(X, \Xi)$ for an auxiliary variable $\Xi$ independent of $X$. This representation is the basis for the coupling constructions in Section 4 and for the variational arguments in Section 5.

Proposition 4 (Pushforward representations in Fréchet and Gumbel cases).
1. Fréchet case ($\gamma > 0$). Let $Z_\gamma$ have the standard Fréchet law on $(0,\infty)$. Its cdf is $z \mapsto \exp(-z^{-1/\gamma})$, and $Z_\gamma$ is independent of $X \sim F$. Then the heterogeneous Fréchet limit $Z$ satisfies
$$Z \overset{d}{=} X^\gamma Z_\gamma,$$
and its cdf is $z \mapsto P_0(z^{-1/\gamma})$ on $(0,\infty)$. Equivalently, the canonically normalized variable $Z_{\mathrm{GEV}} := (Z - 1)/\gamma$ is distributed according to $H_{\gamma,F}$.
2. Gumbel case ($\gamma = 0$). Let $E$ be exponential with mean one, independent of $X \sim F$. Define $Z := \log(X/E)$. Then $Z$ has cdf $x \mapsto P_0(e^{-x})$, hence $Z$ is distributed according to $H_{0,F}$.

A key implication is that many extreme outcome functionals are linear in the mixing distribution $F$ once the auxiliary variable is integrated out.
Lemma 3 (Kernel representation of expectations in the Fréchet case). Assume $\gamma > 0$ and let $Z = X^\gamma Z_\gamma$ be as in Proposition 4. Let $\psi : (0,\infty) \to \mathbb{R}$ be measurable and assume $E[|\psi(Z)|] < \infty$. Define the kernel
$$\kappa_\psi(x) := E[\psi(x^\gamma Z_\gamma)], \qquad x > 0.$$
Then
$$E[\psi(Z)] = \int_0^\infty \kappa_\psi(x)\, F(dx).$$

Lemma 3 is conceptually important for design: whenever the planner objective can be written as $E[\psi(Z)]$ with $Z$ in product form, the dependence on $F$ is through a linear functional with kernel $\kappa_\psi$.

3.4 Differentiating the map $F \mapsto H_{\gamma,F}$

Because $H_{\gamma,F}(x)$ is an integral of a fixed kernel against $F$, the operator $T$ has an explicit variational derivative. Let $\mathcal{M}([0,\infty))$ denote the space of finite signed measures on $[0,\infty)$.

Proposition 5 (Gâteaux derivative of the extreme value operator). Fix $\gamma$ and consider the map $T : F \mapsto H_{\gamma,F}$. Let $\nu \in \mathcal{M}([0,\infty))$ satisfy $\nu([0,\infty)) = 0$ and $\int_0^\infty u\, \nu(du) = 0$, and let $F_\varepsilon := F + \varepsilon \nu$ be a perturbation for $\varepsilon$ in a neighborhood of 0 such that $F_\varepsilon$ remains a probability measure. Then for every $x$ with $v_\gamma(x) < \infty$,
$$\frac{d}{d\varepsilon} H_{\gamma,F_\varepsilon}(x) \Big|_{\varepsilon = 0} = \int_0^\infty e^{-v_\gamma(x) u}\, \nu(du).$$

Remark 7 (Why the mean constraint is stated here). The derivative formula itself is algebraically valid without the restriction $\int u\, \nu(du) = 0$. We impose it in the proposition because the economic domain studied in the paper fixes $E[X] = 1$, so admissible perturbations should remain on that slice.

Remark 8 (Link to entropic design). Proposition 5 identifies the kernel that governs marginal changes in the distribution of extremes when the type distribution is perturbed. In Section 5, the same kernel reappears in the entropy-regularized design problem whenever the planner objective is expressed through cdf levels or their smooth linear combinations.
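As a minimal numerical sketch of Proposition 5 and Corollary 1, the following Python snippet evaluates $H_{\gamma,F}$ for a discrete type distribution and compares a finite-difference derivative with the kernel formula. Everything concrete here is an illustrative assumption: the three-point $F$, the direction $\nu$, the value $\gamma = 0.25$, and the explicit form $v_\gamma(x) = (1+\gamma x)^{-1/\gamma}$, which is the canonical GEV parametrization consistent with the inverse $w_\gamma$ used in Section 4.2. The function names are hypothetical.

```python
import numpy as np

def v_gamma(x, gamma):
    """Canonical GEV exponent v_gamma(x) = -log H_gamma(x) (assumed parametrization)."""
    if gamma == 0.0:
        return float(np.exp(-x))
    base = 1.0 + gamma * x
    if base <= 0.0:
        raise ValueError("x is outside the region where v_gamma is finite and positive")
    return float(base ** (-1.0 / gamma))

def H_mixture(x, atoms, weights, gamma):
    """H_{gamma,F}(x) = sum_i w_i * exp(-v_gamma(x) * x_i) for a discrete F."""
    return float(np.dot(weights, np.exp(-v_gamma(x, gamma) * atoms)))

# Illustrative mean-one type distribution: 0.4*0.5 + 0.4*1.0 + 0.2*2.0 = 1.
atoms = np.array([0.5, 1.0, 2.0])
F = np.array([0.4, 0.4, 0.2])

# Signed direction with zero total mass and zero mean (a mean-preserving spread).
nu = np.array([1.0, -1.5, 0.5])

gamma, x, eps = 0.25, 1.0, 0.1

# Kernel formula from Proposition 5.
analytic = float(np.dot(nu, np.exp(-v_gamma(x, gamma) * atoms)))

# Finite difference; exact up to rounding because F -> H_{gamma,F}(x) is linear.
finite_diff = (H_mixture(x, atoms, F + eps * nu, gamma) - H_mixture(x, atoms, F, gamma)) / eps

# Homogeneous benchmark H_gamma(x) = exp(-v_gamma(x)), as in Corollary 1.
homogeneous = float(np.exp(-v_gamma(x, gamma)))
heterogeneous = H_mixture(x, atoms, F, gamma)
```

Because the chosen direction spreads mass away from the center while keeping the mean fixed, the derivative is positive, in line with Proposition 2; the heterogeneous cdf also sits above the homogeneous one at the same threshold.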
3.5 Geodesics and convexity

We now combine the one-dimensional canonical Wasserstein geodesics with convexity of the Laplace kernel.

Proposition 6 (Convexity of heterogeneous extreme value laws along type geodesics). Let $F_0, F_1 \in \mathcal{P}_1([0,\infty))$ with $E[X] = 1$ and let $(F_t)_{t \in [0,1]}$ be the canonical monotone Wasserstein geodesic between them. Then for every $z \ge 0$, the map
$$t \mapsto P_0^{t}(z) := \int e^{-zx}\, F_t(dx)$$
is convex on $[0,1]$. Consequently, for every $x$ such that $v_\gamma(x) < \infty$, the map $t \mapsto H_{\gamma,F_t}(x) = P_0^{t}\big(v_\gamma(x)\big)$ is convex on $[0,1]$.

Proposition 6 provides a disciplined way to compare extremes along canonical paths in the space of heterogeneity distributions. It is also useful for identification and policy discussions: when $F$ is estimated, geodesic perturbations provide a structured local sensitivity analysis.

Remark 9 (Raw-space geodesics preserve the mean-one slice). If $F_0$ and $F_1$ both satisfy $\int x\, F_k(dx) = 1$, then the canonical monotone Wasserstein geodesic $(F_t)_{t \in [0,1]}$ in Proposition 6 also satisfies $\int x\, F_t(dx) = 1$ for every $t$. Indeed, in one dimension the mean equals the integral of the quantile function, and quantiles interpolate linearly along this path. This mean-preserving property is specific to the raw type geometry used in this section and will contrast with the adapted geodesics of Section 4.

We also record a moment functional that will recur in heavy-tailed applications.

Proposition 7 (Geodesic convexity of negative power moments). Fix $\rho > 0$. Let $F_0, F_1$ be supported on $[a,\infty)$ for some $a > 0$, and let $(F_t)_{t \in [0,1]}$ be their canonical monotone Wasserstein geodesic. Then the map
$$t \mapsto \int x^{-\rho}\, F_t(dx)$$
is convex on $[0,1]$.

Remark 10 (Where this enters later).
In Fréchet settings, many tail-sensitive functionals can be written in terms of power moments of $X$ through the product representation in Proposition 4 and the kernel representation in Lemma 3. The convexity statements above allow one to control these objects along structured perturbations of $F$.

4 Wasserstein stability and coupling bounds

This section contains the genuinely geometric layer of the paper. The Laplace-mixture representation already gives several pointwise and order-theoretic statements. What transport adds is different: by coupling transformed types with a common exponential shock, we obtain quantitative control of whole induced laws, canonical ambient interpolation paths, and robustness statements with explicit moduli. The key object remains the operator $T : F \mapsto H_{\gamma,F}$. The substantive content is not merely that one may rewrite the type space after a monotone transform, but that the operator representation and the canonical coupling make this transformed geometry propagate into law-level Wasserstein bounds for heterogeneous extremes. Complementary quantitative control of extreme-value approximation in Wasserstein-type distances has also recently been developed with Stein-type methods for Fréchet laws; see Mansanarez et al. (2025).

4.1 Wasserstein preliminaries and duality

We work on $\mathbb{R}$ equipped with the metric $d(x,y) = |x - y|$. For $p \ge 1$ and $\mu, \nu \in \mathcal{P}_p(\mathbb{R})$,
$$W_p(\mu,\nu) := \left( \inf_{\pi \in \Gamma(\mu,\nu)} \int_{\mathbb{R}^2} |x - y|^p \, \pi(dx, dy) \right)^{1/p},$$
where $\Gamma(\mu,\nu)$ is the set of couplings with marginals $(\mu,\nu)$. In one dimension, $W_p$ admits the quantile representation, and for $p = 1$ it admits the Kantorovich-Rubinstein dual formulation. We collect the statements used below in Appendix B.1.

4.2 A canonical coupling representation

Recall the canonical GEV parametrization from Section 2.1 and define
$$v_\gamma(x) := -\log H_\gamma(x).$$
Let $w_\gamma$ denote the inverse of $v_\gamma$ on its range.
For the canonical GEV family,
$$w_\gamma(t) = \begin{cases} \dfrac{t^{-\gamma} - 1}{\gamma}, & \gamma \ne 0, \\[4pt] -\log t, & \gamma = 0, \end{cases} \qquad t > 0.$$

Proposition 8 (Canonical representation of $H_{\gamma,F}$). Let $X \sim F$ and let $E \sim \mathrm{Exp}(1)$ be independent of $X$. Define
$$Z_{\gamma,F} := w_\gamma\!\left(\frac{E}{X}\right).$$
Then $Z_{\gamma,F}$ has cdf $H_{\gamma,F}$. Equivalently,
$$Z_{\gamma,F} = \begin{cases} \dfrac{X^\gamma E^{-\gamma} - 1}{\gamma}, & \gamma \ne 0, \\[4pt] \log X - \log E, & \gamma = 0, \end{cases}$$
and this random variable has cdf $H_{\gamma,F}$.

Proposition 8 reduces the construction of couplings for $H_{\gamma,F}$ to couplings on the type space. Given any coupling of $X_1 \sim F_1$ and $X_2 \sim F_2$, one obtains a coupling of the corresponding extreme laws by using a common exponential variable $E$.

4.3 Stability of the extreme value operator

The canonical representation depends on $X$ through the transform $x \mapsto x^\gamma$ when $\gamma \ne 0$, and through $x \mapsto \log x$ when $\gamma = 0$. This motivates metrics on the type space that are adapted to the index $\gamma$. For the geometric results in this section, the mean-one normalization from Assumption 2 is not needed. We therefore temporarily enlarge the domain from the normalized economic type space to the ambient space of probability measures on $(0,\infty)$ for which the relevant transformed $p$th moment is finite. That ambient domain is the natural one for the adapted metric and, crucially, the one on which the geodesic construction is closed.

Definition 3 (Type metrics adapted to $\gamma$). Fix $p \ge 1$ and $\gamma \in \mathbb{R}$. Let $s_\gamma : (0,\infty) \to \mathbb{R}$ be defined by
$$s_\gamma(x) := \begin{cases} x^\gamma, & \gamma \ne 0, \\ \log x, & \gamma = 0. \end{cases}$$
For $F_1, F_2$ on $(0,\infty)$ such that the pushforwards $F_i \circ s_\gamma^{-1}$ belong to $\mathcal{P}_p(\mathbb{R})$, define
$$d_{\gamma,p}(F_1, F_2) := W_p\big(F_1 \circ s_\gamma^{-1},\, F_2 \circ s_\gamma^{-1}\big).$$

Proposition 9 (Domain of the adapted metric and behavior near zero). Fix $p \ge 1$ and $\gamma \in \mathbb{R}$. For probability measures $F_1, F_2$ on $(0,\infty)$, the distance $d_{\gamma,p}(F_1, F_2)$ is finite if and only if
$$\int |s_\gamma(x)|^p \, F_i(dx) < \infty, \qquad i \in \{1, 2\}.$$
Equivalently:
1. if $\gamma > 0$, one needs $\int x^{\gamma p}\, F_i(dx) < \infty$ for $i = 1, 2$;
2.
if $\gamma = 0$, one needs $\int |\log x|^p \, F_i(dx) < \infty$ for $i = 1, 2$;
3. if $\gamma < 0$, one needs $\int x^{-|\gamma| p}\, F_i(dx) < \infty$ for $i = 1, 2$.
In particular, mass near zero is harmless for $\gamma > 0$ but can make $d_{\gamma,p}$ infinite for $\gamma \le 0$ even when an ordinary Wasserstein distance on the raw type space is finite.

Remark 11 (Low-access types and the limits of adapted geometry). The transport theorem therefore remains available for very small access types as long as the relevant transformed moments are finite. For $\gamma = 0$ this is a logarithmic integrability requirement; for $\gamma < 0$ it is an inverse-moment requirement of order $|\gamma| p$. If an application permits heavier mass near zero than those conditions allow, the adapted metric is no longer the right comparison tool. One should then revert to the Laplace-order statements, the raw-space mean-preserving geodesics of Section 3, or the finite-horizon and pointwise bounds in this section, none of which require $d_{\gamma,p} < \infty$.

We now state the main stability bound. The constants are explicit and reflect moment restrictions intrinsic to extreme-value tails.

Theorem 1 (Wasserstein stability of $F \mapsto H_{\gamma,F}$). Fix $\gamma \in \mathbb{R}$ and $p \ge 1$. Let $F_1, F_2$ be probability measures on $(0,\infty)$ such that $d_{\gamma,p}(F_1, F_2) < \infty$. If $\gamma > 0$, assume in addition that $p\gamma < 1$. If $\gamma \ne 0$, let $E \sim \mathrm{Exp}(1)$ and define $\Xi_\gamma := E^{-\gamma}$. Then
$$W_p(H_{\gamma,F_1}, H_{\gamma,F_2}) \le \frac{\big(E[|\Xi_\gamma|^p]\big)^{1/p}}{|\gamma|}\, d_{\gamma,p}(F_1, F_2).$$
If $\gamma = 0$, then
$$W_p(H_{0,F_1}, H_{0,F_2}) \le d_{0,p}(F_1, F_2).$$

Remark 12 (What the transport layer adds). The transform $s_\gamma$ by itself is only a reparameterization of types. The substantive content of Theorem 1 is that, under the canonical coupling $Z_{\gamma,F} = w_\gamma(E/X)$, optimal transport on $s_\gamma(X)$ yields an explicit Wasserstein bound on the entire induced extreme law, with constant determined by the tail index through $E[E^{-p\gamma}]$ in the Fréchet regime.
Corollary 2 then propagates this to canonical ambient paths in law space. These quantitative whole-law statements do not follow from the Laplace mixture alone.

Remark 13 (Moment restrictions are intrinsic). For $\gamma > 0$ one has $E[E^{-p\gamma}] = \Gamma(1 - p\gamma)$, so $E[|\Xi_\gamma|^p] < \infty$ holds if and only if $p\gamma < 1$. For $\gamma < 0$, by contrast, $\Xi_\gamma = E^{|\gamma|}$ has finite moments of every order. The heavy-tail restriction is therefore not an artifact of the coupling argument. It reflects that the Fréchet-type limit law has finite $p$th moment only when $p < 1/\gamma$, and therefore $W_p(H_{\gamma,F_1}, H_{\gamma,F_2})$ is only defined in the usual Wasserstein sense in that regime.

Theorem 1 is proved by applying the random Lipschitz contraction lemma (Lemma 9) to the representation in Proposition 8, using a coupling of $s_\gamma(X_1)$ and $s_\gamma(X_2)$ and a common exponential variable. The construction is in the same spirit as the coupling arguments emphasized by Bobbia et al. (2021), but here the comparison is between heterogeneous extreme laws generated by different opportunity distributions.

The theorem becomes most useful once the adapted metric is treated as a geometry in its own right. Because $d_{\gamma,p}$ is an ordinary Wasserstein distance after the transform $s_\gamma$, it automatically induces canonical constant-speed geodesics in the ambient transformed type space. Along those paths, the entire law of extremes moves in a Lipschitz way. This is the main result in the paper that specifically requires the Wasserstein geometry rather than only the Laplace form.

Corollary 2 (Geodesic control in the adapted type geometry). Fix $\gamma \in \mathbb{R}$ and $p \ge 1$. Let $F_0, F_1$ be two probability measures on $(0,\infty)$ for which the assumptions of Theorem 1 hold. Let $G_0 := (s_\gamma)_{\#} F_0$ and $G_1 := (s_\gamma)_{\#} F_1$, and let $(G_t)_{t \in [0,1]}$ be the canonical monotone constant-speed $W_p$ geodesic from $G_0$ to $G_1$. Define $F_t := (s_\gamma^{-1})_{\#} G_t$.
Then, for all $s, t \in [0,1]$,
$$d_{\gamma,p}(F_s, F_t) = |t - s|\, d_{\gamma,p}(F_0, F_1).$$
Moreover,
$$W_p\big(H_{\gamma,F_s}, H_{\gamma,F_t}\big) \le C_{\gamma,p}\, |t - s|\, d_{\gamma,p}(F_0, F_1),$$
where $C_{\gamma,p}$ denotes the constant from Theorem 1.

Remark 14 (Ambient geodesics versus the mean-one normalization). The path in Corollary 2 is a geodesic in the ambient transformed type space, not necessarily in the mean-one slice used in the economic model. Indeed, $Q_{G_t}(u)$ is affine in $t$ in transformed coordinates, but $\int s_\gamma^{-1}(Q_{G_t}(u))\, du$ is generally not constant unless $s_\gamma^{-1}$ is affine, which occurs only when $\gamma = 1$. Thus the adapted path should be read as a geometric comparison device, not as a literal mean-preserving policy path. If one instead wants exact mean preservation at every step, the raw-space geodesics of Section 3.5 remain available, but they need not inherit the constant-speed statement for $d_{\gamma,p}$.

Proposition 10 (Renormalization bridge back to the mean-one slice). Fix $\gamma \in \mathbb{R}$ and $p \ge 1$. Let $F$ be a probability measure on $(0,\infty)$ with finite mean $m(F) := \int x\, F(dx) \in (0,\infty)$ and finite transformed $p$th moment. Define the mean-one renormalization
$$\tilde{F} := \big(x \mapsto x/m(F)\big)_{\#} F.$$
Then
$$d_{\gamma,p}(F, \tilde{F}) = \begin{cases} \big|1 - m(F)^{-\gamma}\big| \left( \int x^{\gamma p}\, F(dx) \right)^{1/p}, & \gamma \ne 0, \\[4pt] |\log m(F)|, & \gamma = 0. \end{cases}$$
If, in addition, the assumptions of Theorem 1 hold for $F$ and $\tilde{F}$, then
$$W_p\big(H_{\gamma,F}, H_{\gamma,\tilde{F}}\big) \le C_{\gamma,p}\, d_{\gamma,p}(F, \tilde{F}),$$
where $C_{\gamma,p}$ is the modulus from Theorem 1. In particular, if $m_t := \int x\, F_t(dx)$ and $\tilde{F}_t := (x \mapsto x/m_t)_{\#} F_t$ along the ambient geodesic from Corollary 2, the same bound applies pointwise in $t$.

Corollary 3 (Quantile-schedule control of the induced extreme law). Under the assumptions of Theorem 1, let $Q_{\gamma,F_i} : (0,1) \to \mathbb{R}$ denote the quantile function of the law $H_{\gamma,F_i}$ for $i \in \{1, 2\}$.
Then
$$\left( \int_0^1 |Q_{\gamma,F_1}(u) - Q_{\gamma,F_2}(u)|^p \, du \right)^{1/p} = W_p(H_{\gamma,F_1}, H_{\gamma,F_2}) \le C_{\gamma,p}\, d_{\gamma,p}(F_1, F_2).$$
Along the adapted geodesic from Corollary 2, this yields
$$\| Q_{\gamma,F_s} - Q_{\gamma,F_t} \|_{L^p(0,1)} \le C_{\gamma,p}\, |t - s|\, d_{\gamma,p}(F_0, F_1).$$

An immediate benchmark implication is obtained by comparing $F$ to $\delta_1$. Whenever the conditions of Theorem 1 hold,
$$W_p(H_{\gamma,F}, H_\gamma) \le C_{\gamma,p}\, d_{\gamma,p}(F, \delta_1),$$
where $C_{\gamma,p}$ denotes the constant from Theorem 1. Combined with Proposition 11 in the regimes covered there, this translates metric misallocation on the type space into an explicit distortion bound for extreme outcomes relative to the homogeneous benchmark.

Corollary 4 (Pointwise stability of the cdf). Fix $\gamma \in \mathbb{R}$ and let $F_1, F_2 \in \mathcal{P}_1([0,\infty))$. Then for every $x$ in the interior of the support of $H_\gamma$,
$$|H_{\gamma,F_1}(x) - H_{\gamma,F_2}(x)| \le v_\gamma(x)\, W_1(F_1, F_2).$$

Remark 15 (Pointwise versus geometric bounds). Corollary 4 follows directly from the Laplace representation $H_{\gamma,F}(x) = \int e^{-v_\gamma(x) z}\, F(dz)$ and the fact that $z \mapsto e^{-v_\gamma(x) z}$ is $v_\gamma(x)$-Lipschitz on $[0,\infty)$. It is therefore not itself a geometric theorem. Theorem 1, Proposition 10, and Corollary 3 are the results that genuinely use the adapted Wasserstein geometry. Those statements control whole induced laws, the bridge back to the mean-one slice, and the full quantile schedule. They are not recovered from fixed-threshold cdf bounds without additional density information.

In many applications one also wants conditions under which $d_{\gamma,p}$ is controlled by a standard Wasserstein distance on the type space. We record sufficient conditions that isolate the role of behavior near 0.

Proposition 11 (Relating adapted type metrics to standard Wasserstein distances). Let $p \ge 1$ and let $F_1, F_2 \in \mathcal{P}_p([0,\infty))$.
Case 1: $0 < \gamma \le 1$. Then
$$d_{\gamma,p}(F_1, F_2) \le W_p(F_1, F_2)^{\gamma}.$$
Case 2: $\gamma < 0$.
Assume $F_1$ and $F_2$ are supported on $[a,\infty)$ for some $a > 0$. Then
$$d_{\gamma,p}(F_1, F_2) \le |\gamma|\, a^{\gamma - 1}\, W_p(F_1, F_2).$$
Case 3: $\gamma = 0$. Assume $F_1$ and $F_2$ are supported on $[a,\infty)$ for some $a > 0$. Then
$$d_{0,p}(F_1, F_2) \le a^{-1}\, W_p(F_1, F_2).$$

Remark 16 (Why we keep $d_{\gamma,p}$ in the main bound). In the positive-tail regime covered by Theorem 1, namely $0 < \gamma < 1/p$, the map $x \mapsto x^\gamma$ is not globally Lipschitz on $[0,\infty)$. For $0 < \gamma < 1$, the obstruction is behavior near zero. Consequently, a linear stability bound stated directly in terms of $W_p(F_1, F_2)$ would either require extra support restrictions or lose the natural linear geometry of the canonical coupling. The adapted metric $d_{\gamma,p}$ is therefore the correct primitive quantity in the main bound, while Proposition 11 records conditions under which it can be controlled by a standard Wasserstein distance.

4.4 Stability of extreme outcome functionals

The Wasserstein bounds above immediately imply stability of a large class of tail-sensitive functionals.

Proposition 12 (Lipschitz functionals of the limit law). Let $\psi : \mathbb{R} \to \mathbb{R}$ be $L$-Lipschitz. Let $Z_i \sim H_{\gamma,F_i}$ for $i \in \{1, 2\}$ and assume $W_1(H_{\gamma,F_1}, H_{\gamma,F_2}) < \infty$. Then
$$\big| E[\psi(Z_1)] - E[\psi(Z_2)] \big| \le L\, W_1(H_{\gamma,F_1}, H_{\gamma,F_2}).$$
Combining with Theorem 1 yields explicit bounds in terms of $d_{\gamma,p}(F_1, F_2)$ whenever the conditions of that theorem hold.

4.5 Finite horizon maxima and tail probability stability

Fix $\theta > 0$ and let $M_\theta$ be the maximum of $N_\theta$ i.i.d. draws from $G$, where $N_\theta \mid X \sim \mathrm{Poisson}(\theta X)$ as in Section 2. Conditional on $X$, one has
$$\Pr(M_\theta \le x \mid X) = \exp\big(-\theta X (1 - G(x))\big),$$
and therefore $\Pr(M_\theta \le x) = E[\exp(-\theta X (1 - G(x)))]$.

Proposition 13 (Pointwise stability for finite $\theta$). Let $F_1, F_2 \in \mathcal{P}_1([0,\infty))$. Then for every $x \in \mathbb{R}$,
$$\Big| \Pr\nolimits_{F_1}(M_\theta \le x) - \Pr\nolimits_{F_2}(M_\theta \le x) \Big| \le \theta\, (1 - G(x))\, W_1(F_1, F_2).$$
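The bound in Proposition 13 can be checked numerically on a small example. The sketch below is only illustrative: the two four-point type distributions, the horizon $\theta = 5$, and the standard exponential offer distribution $G$ are assumptions made for the example, and the helper names are hypothetical. $W_1$ is computed with the one-dimensional quantile formula of Lemma 2, which for two equally weighted samples of the same size reduces to the mean absolute gap between sorted atoms.

```python
import numpy as np

def w1_equal_weight(xs, ys):
    """W_1 between two laws with equally many, equally weighted atoms (sorted-atom formula)."""
    return float(np.mean(np.abs(np.sort(xs) - np.sort(ys))))

def offer_cdf(x):
    """Assumed offer distribution G: standard exponential."""
    return 1.0 - np.exp(-np.maximum(x, 0.0))

def finite_theta_cdf(x, types, theta):
    """Pr(M_theta <= x) = E[exp(-theta * X * (1 - G(x)))] for equally weighted types."""
    return float(np.mean(np.exp(-theta * types * (1.0 - offer_cdf(x)))))

# Two illustrative mean-one access distributions (each has atoms summing to 4).
types_1 = np.array([0.4, 0.8, 1.0, 1.8])
types_2 = np.array([0.2, 0.6, 1.2, 2.0])
theta = 5.0

w1 = w1_equal_weight(types_1, types_2)

grid = np.linspace(0.0, 8.0, 81)
# Left-hand side of Proposition 13: cdf gap at each threshold.
gaps = np.array([abs(finite_theta_cdf(x, types_1, theta) - finite_theta_cdf(x, types_2, theta))
                 for x in grid])
# Right-hand side: theta * (1 - G(x)) * W_1(F_1, F_2).
bounds = theta * (1.0 - offer_cdf(grid)) * w1
```

On this grid the gap never exceeds the bound, and the bound tightens as the threshold moves into the upper tail, where $1 - G(x)$ is small.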
Remark 17 (Finite-horizon versus asymptotic robustness). Proposition 13 is a finite-horizon robustness statement. It does not use the extreme-value limit. By contrast, Theorem 2 below is an asymptotic expansion around the limit law. Keeping these two layers separate is important in empirical work: finite-sample prediction error and EVT approximation error are different objects.

4.6 Rates under second-order tail conditions

We conclude with a second-order expansion for the prelimit cdf of normalized maxima. The proof is given in Appendix A. In this subsection, unlike the first-order limit theorem in Section 2.4, we work with continuous normalizing functions $a : (1,\infty) \to (0,\infty)$ and $b : (1,\infty) \to \mathbb{R}$. This is the standard continuous-parameter formulation in extreme value theory and avoids the artificial $O(1/\theta)$ oscillation created by the step extension $a_{\lfloor \theta \rfloor}, b_{\lfloor \theta \rfloor}$. Assumption 3 is a standard second-order regularity condition; see, for example, de Haan and Ferreira (2006, Chapters 2.10 and 9).

Assumption 3 (Second-order tail condition). In addition to Assumption 1, suppose there exist continuous normalizing functions $a : (1,\infty) \to (0,\infty)$ and $b : (1,\infty) \to \mathbb{R}$, a scalar function $A(\theta) \to 0$, and a locally bounded function $h_\gamma$ such that
$$\theta \big( 1 - G(b(\theta) + a(\theta) x) \big) = v_\gamma(x) + A(\theta)\, h_\gamma(x) + o\big(A(\theta)\big) \quad \text{as } \theta \to \infty,$$
locally uniformly in $x$ on the interior of the support of $H_\gamma$.

Theorem 2 (Second-order expansion for heterogeneous extremes). Suppose Assumptions 2 and 3 hold and let
$$Z_\theta := \frac{M_\theta - b(\theta)}{a(\theta)}.$$
Fix a compact set $K$ contained in the interior of the support of $H_\gamma$. Then there exists a remainder $r_{\theta,K}$ on $K$ such that
$$\sup_{x \in K} \frac{|r_{\theta,K}(x)|}{|A(\theta)|} \to 0 \quad \text{as } \theta \to \infty,$$
and, uniformly for $x \in K$,
$$\Pr(Z_\theta \le x) = H_{\gamma,F}(x) - A(\theta)\, h_\gamma(x)\, E\big[ X e^{-v_\gamma(x) X} \big] + r_{\theta,K}(x).$$

Remark 18 (Interpretation). The leading correction term is multiplicatively separable.
The function $h_\gamma$ captures the second-order tail behavior of $G$, while $x \mapsto E[X e^{-v_\gamma(x) X}]$ is the heterogeneity kernel inherited from the Laplace mixture. This separation is what allows one to distinguish EVT approximation error from heterogeneity-driven sensitivity. Converting the expansion into global $W_1$ rates requires additional tail-envelope conditions and is therefore left outside the present theorem.

5 A complementary entropy projection for heterogeneity

In this section we study a tractable normative problem on the same type space used in the positive analysis. A planner reallocates opportunities across agents to improve a tail-sensitive criterion while penalizing deviations from a baseline distribution by relative entropy. The object is deliberately modest: it is a one-marginal entropy projection, not a full two-marginal transport problem, and it does not rely on the Wasserstein geometry developed earlier. It is useful in our framework, but the connection should be described accurately: this section is complementary rather than foundational. The structural link to the positive analysis is that the same Laplace kernel that defines $F \mapsto H_{\gamma,F}$ also generates the marginal score functions that enter the planner's first-order conditions for cdf criteria and, through the stochastic representation below, for expected utility of normalized extremes.

5.1 A variational planner problem

Fix a baseline distribution $F_0$ on $(0,\infty)$ such that $E_{F_0}[X] = 1$. We interpret $F_0$ as the status quo distribution of effective search intensities, network degrees, or other shifters of arrival rates. A policy induces a new distribution $F$, and we write $X \sim F$. For $F \ll F_0$, define the relative entropy
$$D_{\mathrm{KL}}(F \,\|\, F_0) := \int \log\!\left( \frac{dF}{dF_0} \right) dF,$$
with the convention $D_{\mathrm{KL}}(F \,\|\, F_0) = +\infty$ if $F$ is not absolutely continuous with respect to $F_0$.

Let $H_{\gamma,F}$ denote the heterogeneous extreme-value law from Proposition 1.
We focus on tail-sensitive objectives of the form
$$U(F) = E\big[u(Z)\big], \qquad Z \sim H_{\gamma,F},$$
for a measurable function $u$ such that the expectation is well defined under the candidate distributions considered below. This class includes expected utility of normalized extremes and many smoothed tail criteria. We comment on high-quantile objectives at the end of the section.

The planner solves
$$\sup_{F :\, E_F[X] = 1} \Big\{ U(F) - \lambda\, D_{\mathrm{KL}}(F \,\|\, F_0) \Big\}, \qquad (3)$$
where $\lambda > 0$ governs the marginal cost of reshaping opportunities away from the baseline.

Remark 19 (Interpretation). The constraint $E_F[X] = 1$ normalizes the aggregate scale of opportunities. In a network interpretation it fixes total expected offer arrival in the population and allows the planner to reallocate the cross-sectional distribution. The entropy term penalizes concentrated reallocations and yields a strictly concave problem whenever $U$ is linear in $F$, which is the main case treated below.

Remark 20 (Normalized welfare and prelimit maxima). The criterion $U(F) = E[u(Z)]$ should be read as the asymptotic counterpart of prelimit preferences over maxima. Indeed, if $u$ is bounded and continuous, then Proposition 1 implies
$$E\left[ u\!\left( \frac{M_\theta - b_\theta}{a_\theta} \right) \right] \to U(F) \quad \text{as } \theta \to \infty.$$
Equivalently, the planner may be viewed as evaluating a sequence of prelimit utilities $u_\theta(m) := u((m - b_\theta)/a_\theta)$ that preserves the affine normalization used by extreme-value theory.

5.2 A stochastic representation and linearization of expected utility

The canonical representation from Section 4.2 immediately linearizes expected-utility objectives in the heterogeneity distribution. Let $w_\gamma := v_\gamma^{-1}$ denote the inverse map introduced there.

Lemma 4 (Linearization of expected utility). Let $X \sim F$ and let $E \sim \mathrm{Exp}(1)$ be independent. Define
$$Z := w_\gamma\!\left( \frac{E}{X} \right),$$
so that $Z \sim H_{\gamma,F}$ by Proposition 8.
If $u : \mathbb{R} \to \mathbb{R}$ is measurable and $E\big[|u(Z)|\big] < \infty$, then
$$U(F) = E\big[u(Z)\big] = \int \psi_u(x)\, F(dx), \qquad \psi_u(x) := E\left[ u\!\left( w_\gamma\!\left( \frac{E}{x} \right) \right) \right].$$
Proof. The distributional representation of $Z$ is Proposition 8. The formula for $U(F)$ then follows by iterated expectation.

Remark 21 (Fréchet specialization). When $\gamma > 0$, Proposition 8 reduces to the product representation from Section 2.5: if $\Xi_\gamma := E^{-\gamma}$, then $1 + \gamma Z = X^\gamma\, \Xi_\gamma$.

5.3 Duality and exponential tilting

We now solve (3) in the canonical case where the objective is linear in $F$. This case is both the baseline design problem and the relevant reduction for expected-utility objectives by Lemma 4. The exponential-tilt form itself is the standard entropy-projection conclusion. What is specific here is that the heterogeneous-EVT operator supplies the score $\psi$ and reduces the mean constraint to a one-dimensional dual variable. The duality argument is recorded in Appendix B.2; see also Donsker and Varadhan (1975), Léonard (2014), Ghosal et al. (2022), and Nutz (2022), for background on the broader Schrödinger and entropic-transport literature.

Assumption 4 (Linear objective, local integrability, and interiority). There exists a measurable function $\psi : (0,\infty) \to \mathbb{R}$ such that
$$U(F) = \int \psi(x)\, F(dx)$$
for all $F$ with $E_F[X] = 1$ and $D_{\mathrm{KL}}(F \,\|\, F_0) < \infty$. For $\eta \in \mathbb{R}$, define
$$Z(\eta) := \int \exp\!\left( \frac{\psi(x) + \eta x}{\lambda} \right) F_0(dx)$$
and, for $k \in \{1, 2\}$,
$$M_k(\eta) := \int x^k \exp\!\left( \frac{\psi(x) + \eta x}{\lambda} \right) F_0(dx).$$
Assume there exists a nonempty open interval $\mathcal{D} \subset \mathbb{R}$ such that, for all $\eta \in \mathcal{D}$,
$$Z(\eta) < \infty, \quad M_1(\eta) < \infty, \quad M_2(\eta) < \infty, \quad \int |\psi(x)| \exp\!\left( \frac{\psi(x) + \eta x}{\lambda} \right) F_0(dx) < \infty.$$
Define
$$m(\eta) := \frac{M_1(\eta)}{Z(\eta)}.$$
Assume moreover that there exist $\eta_-, \eta_+ \in \mathcal{D}$ with $m(\eta_-) < 1 < m(\eta_+)$.

Theorem 3 (Strong duality and exponential tilt). Suppose Assumption 4 holds. Then there exists a unique scalar $\eta^\star \in \mathcal{D}$ satisfying $m(\eta^\star) = 1$.
The design problem (3) admits a unique optimizer $F^\star$, and $F^\star$ is given by the exponential tilt
$$\frac{dF^\star}{dF_0}(x) = \frac{\exp\!\left( \dfrac{\psi(x) + \eta^\star x}{\lambda} \right)}{\displaystyle\int \exp\!\left( \dfrac{\psi(t) + \eta^\star t}{\lambda} \right) F_0(dt)}. \qquad (4)$$
Moreover, $E_{F^\star}[X] = 1$. The optimal value of (3) admits the dual representation
$$\sup_{F :\, E_F[X] = 1} \Big\{ U(F) - \lambda\, D_{\mathrm{KL}}(F \,\|\, F_0) \Big\} = \inf_{\eta \in \mathcal{D}} \left\{ \lambda \log \int \exp\!\left( \frac{\psi(x) + \eta x}{\lambda} \right) F_0(dx) - \eta \right\}. \qquad (5)$$

Comment. Equation (4) is the key structural conclusion of the section. Relative to $F_0$, the planner reweights types with higher score $\psi(x)$, with the magnitude of reweighting disciplined by $\lambda$. The scalar $\eta^\star$ is the Lagrange multiplier on the mean-one constraint and is pinned down by a single scalar equation $m(\eta) = 1$. With the sign convention in (5), the shadow value of relaxing the target mean upward is $-\eta^\star$. The novelty here is not the Gibbs form by itself, but the identification of the heterogeneous-EVT score $\psi$ and the explicit one-dimensional implementation of the mean constraint. The interiority assumption in Assumption 4 is the precise condition that guarantees existence of a finite dual optimizer. In many applications it is verified through a standard steepness argument for the tilted mean $m(\eta)$.

A complete proof of Theorem 3 is given in Appendix A. It applies the entropy-duality tools in Appendix B.2, but the proof is written directly for the one-dimensional mean constraint used here so that the existence of the dual optimizer is explicit rather than implicit.

5.4 First-order conditions and the link to the heterogeneous EVT kernel

The design problem inherits its structure from the heterogeneous extreme-value operator
$$H_{\gamma,F}(x) = \int e^{-v_\gamma(x) z}\, F(dz).$$
Section 3.4 records the corresponding directional derivative,
$$\delta H_{\gamma,F}(x) = \int e^{-v_\gamma(x) z}\, \delta F(dz),$$
which makes explicit that the Laplace kernel $z \mapsto e^{-v_\gamma(x) z}$ is the marginal channel through which perturbations of $F$ affect the distribution of extremes.
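To make the one-dimensional implementation in Theorem 3 concrete, here is a minimal Python sketch on a discrete baseline. It is an illustration under stated assumptions rather than the paper's procedure: the five-point baseline $F_0$, the penalty $\lambda = 0.5$, and the score $\psi(x) = e^{-0.5 x}$ (a Laplace-kernel score at a hypothetical level $0.5$) are all chosen for the example, and the function names are invented. On a finite support Assumption 4 holds with $\mathcal{D} = \mathbb{R}$, and since $m'(\eta)$ is the tilted variance divided by $\lambda$, the tilted mean is increasing and plain bisection suffices.

```python
import numpy as np

def tilted_weights(eta, atoms, f0, psi, lam):
    """Weights of the tilt dF/dF0 proportional to exp((psi(x) + eta*x)/lam), as in (4)."""
    log_w = np.log(f0) + (psi + eta * atoms) / lam
    log_w -= log_w.max()  # subtract the max to avoid overflow
    w = np.exp(log_w)
    return w / w.sum()

def tilted_mean(eta, atoms, f0, psi, lam):
    """m(eta): mean of X under the tilted distribution."""
    return float(np.dot(tilted_weights(eta, atoms, f0, psi, lam), atoms))

def solve_eta(atoms, f0, psi, lam, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection for m(eta) = 1; relies on m being increasing in eta."""
    if not (tilted_mean(lo, atoms, f0, psi, lam) < 1.0 < tilted_mean(hi, atoms, f0, psi, lam)):
        raise ValueError("bracket must satisfy m(lo) < 1 < m(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tilted_mean(mid, atoms, f0, psi, lam) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def dual_objective(eta, atoms, f0, psi, lam):
    """lam * log Z(eta) - eta, the bracketed term in (5), via log-sum-exp."""
    a = np.log(f0) + (psi + eta * atoms) / lam
    a_max = a.max()
    return float(lam * (a_max + np.log(np.exp(a - a_max).sum())) - eta)

# Illustrative mean-one baseline: 0.05 + 0.10 + 0.30 + 0.30 + 0.25 = 1.
atoms = np.array([0.25, 0.5, 1.0, 1.5, 2.5])
f0 = np.array([0.2, 0.2, 0.3, 0.2, 0.1])
lam = 0.5
psi = np.exp(-0.5 * atoms)  # hypothetical score, chosen only for illustration

eta_star = solve_eta(atoms, f0, psi, lam)
f_star = tilted_weights(eta_star, atoms, f0, psi, lam)

# Primal value U(F*) - lam * KL(F* || F0) and dual value at eta*.
primal = float(np.dot(f_star, psi) - lam * np.dot(f_star, np.log(f_star / f0)))
dual = dual_objective(eta_star, atoms, f0, psi, lam)
```

In this example the primal and dual values agree up to the bisection tolerance, and the tilted distribution keeps mean one, which is the content of (4) and (5) in the discrete case.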
In the linear class treated in Theorem 3, the score function $\psi$ is the marginal value of shifting mass toward type $x$. For expected-utility objectives $U(F) = E[u(Z)]$, Lemma 4 identifies this marginal value explicitly as $\psi_u(x) = E[u(w_\gamma(E/x))]$. This is the direct bridge to Section 4: the same stochastic representation that yields linearity for design also provides the canonical coupling device for stability.

Remark 22 (Direct cdf criteria use the EVT kernel literally). For any fixed threshold $y$, the criterion
$$U_y(F) := H_{\gamma,F}(y)$$
is already linear in $F$, with score $\psi_y(x) = e^{-v_\gamma(y) x}$. More generally, any weighted average of cdf levels of the form $\int H_{\gamma,F}(y)\, \nu(dy)$ has score $x \mapsto \int e^{-v_\gamma(y) x}\, \nu(dy)$ whenever the integral is finite. This is the cleanest sense in which the same kernel governs both the heterogeneous extreme-value operator and a nontrivial class of normative objectives.

5.5 Closed-form solutions for canonical tail objectives

Closed-form solutions are especially transparent when $\psi$ is a simple transform of $x$. This class is relevant economically because several extreme outcome statistics in heterogeneous EVT reduce to moments or inverse moments of $X$ under Fréchet-type normalization, as in Mangin (2026). The formulas below should be read together with their admissibility conditions: the exponential tilt is meaningful only when Assumption 4 is satisfied.

Power and inverse-power objectives. Let $\rho > 0$ and consider objectives of the form
$$U(F) = C \cdot E_F\big[X^{-\rho}\big] \quad \text{or} \quad U(F) = C \cdot E_F\big[X^{\rho}\big],$$
for a constant $C \in \mathbb{R}$. These correspond to $\psi(x) = C x^{-\rho}$ or $\psi(x) = C x^{\rho}$ in Assumption 4. When the relevant integrability conditions hold, Theorem 3 yields
$$\frac{dF^\star}{dF_0}(x) \propto \exp\!\left( \frac{C x^{-\rho} + \eta^\star x}{\lambda} \right) \quad \text{or} \quad \frac{dF^\star}{dF_0}(x) \propto \exp\!\left( \frac{C x^{\rho} + \eta^\star x}{\lambda} \right),$$
with $\eta^\star$ chosen so that $E_{F^\star}[X] = 1$.

Admissibility conditions.
For inverse-power scores $x \mapsto C x^{-\rho}$, a sufficient condition is that the baseline support be bounded away from zero, say $\mathrm{supp}(F_0) \subseteq [a,\infty)$ with $a > 0$; without such a condition, a positive coefficient $C > 0$ typically makes the normalizing integral diverge near zero. For positive-power scores $x \mapsto C x^{\rho}$, admissibility depends on the right tail of $F_0$. A sufficient condition is that $F_0$ be light enough so that
$$\int \exp\!\left( \frac{C x^{\rho} + \eta x}{\lambda} \right) F_0(dx) < \infty$$
for all $\eta$ in some open interval containing the optimizer. When $\rho > 1$ and $C > 0$, this excludes exponential or heavier baseline tails. These support and tail restrictions are not technical decoration; they are exactly what ensures the closed-form tilt is well defined.

Implementation. In applications, $\eta^\star$ is obtained by solving the scalar equation $m(\eta) = 1$. The value function is then obtained from the dual expression (5). The entire computation is one-dimensional once the score $\psi$ is known.

5.6 Interpretation: policy as reallocation of opportunities

The entropy penalty $D_{\mathrm{KL}}(F \,\|\, F_0)$ imposes a reduced-form cost of reallocating opportunities across types. In a labor market network interpretation, $F$ can for instance represent the distribution of effective meeting rates or centralities, and policy changes referral frictions, search subsidies, or platform rules that alter the induced $F$. The entropy term summarizes implementation frictions that render highly concentrated reallocations costly. In richer environments, one can interpret the exponential tilt as the reduced-form allocation induced by subsidy, tax, or referral policies that change the private return to search intensity. It is beyond the scope of this paper to study decentralized implementation, but we note that the one-dimensional form of (4) makes that question especially tractable.
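One way to see how the power scores of Section 5.5 arise from Lemma 4 is to take the identity utility $u(z) = z$ in the Fréchet case with $0 < \gamma < 1$. Since $E[E^{-\gamma}] = \Gamma(1-\gamma)$ for $E \sim \mathrm{Exp}(1)$, the score is $\psi_u(x) = (x^\gamma\, \Gamma(1-\gamma) - 1)/\gamma$, an affine function of $x^\gamma$. The Python sketch below compares this closed form with a Monte Carlo evaluation of $E[u(w_\gamma(E/x))]$. The choice $\gamma = 0.3$, the sample size, the seed, and the function names are assumptions made for illustration only.

```python
import math
import numpy as np

def w_gamma(t, gamma):
    """Inverse of v_gamma for the canonical GEV family."""
    if gamma == 0.0:
        return -np.log(t)
    return (t ** (-gamma) - 1.0) / gamma

def score_mc(x, gamma, u, n=400_000, seed=0):
    """Monte Carlo estimate of psi_u(x) = E[u(w_gamma(E/x))] with E ~ Exp(1)."""
    rng = np.random.default_rng(seed)
    e = rng.exponential(1.0, size=n)
    return float(np.mean(u(w_gamma(e / x, gamma))))

def score_identity_closed_form(x, gamma):
    """psi_u(x) for u(z) = z and 0 < gamma < 1: (x^gamma * Gamma(1 - gamma) - 1) / gamma."""
    return (x ** gamma * math.gamma(1.0 - gamma) - 1.0) / gamma

gamma = 0.3  # illustrative tail index; 2*gamma < 1 keeps the Monte Carlo variance finite
type_levels = [0.5, 1.0, 2.0]

estimates = [score_mc(x, gamma, lambda z: z) for x in type_levels]
exact = [score_identity_closed_form(x, gamma) for x in type_levels]
```

With this score, the planner objective is $U(F) = \big(\Gamma(1-\gamma)\, E_F[X^\gamma] - 1\big)/\gamma$, so up to an additive constant it falls in the positive-power class with $\rho = \gamma$ and $C = \Gamma(1-\gamma)/\gamma$.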
A natural extension, which we keep separate from the core analysis to preserve the one-dimensional tractability of (3), replaces the marginal penalty by a genuine entropic transport problem on a richer type space with cost $c(x_0, x)$. That extension would produce a pair of dual potentials on the source and target type spaces. The present section should therefore be read as an entropy projection on the target marginal, not as a full Schrödinger bridge.

Quantile objectives. If the objective is a high quantile, $U(F) = Q_{1-\alpha}(H_{\gamma,F})$, then $U$ is typically not linear in $F$. Two tractable approaches are to work with smoothed tail criteria or tail expectations that admit linear representations via Lemma 4, or to characterize optimizers through first-order conditions using the directional derivative of $F \mapsto H_{\gamma,F}$ from Section 3.4. We implement the linear class in the core text, leaving exact quantile design as a promising extension.

6 Application: labor market networks and the distribution of top wages

This section illustrates how heterogeneous extreme-value theory can be used as a reduced-form device in labor market settings where access to job opportunities is mediated by social networks. The key object is the cross-sectional distribution $F$ of effective offer-arrival intensities induced by heterogeneity in network position, referrals, and related opportunity channels. Once $F$ is specified or measured, the general results in Sections 3, 4, and 5 translate into three types of statements: (i) comparative statics for the right tail of wages as a function of network inequality, (ii) robustness bounds for tail predictions under measurement or estimation error in $F$, and (iii) an entropy-regularized policy problem that reallocates opportunities subject to implementation frictions.
While the broader labor network motivation was discussed in Section 1.4, our goal here is to translate the general results into a familiar reduced-form environment for tail wage outcomes. Throughout, the object remains the cross-sectional law of a randomly drawn worker's normalized maximum under a common offer distribution. We do not model equilibrium wage setting, endogenous search, or welfare. Accordingly, statements below about segregation, homophily, or policy should be read as claims about how those forces reshape the reduced-form access distribution $F$, not as standalone equilibrium or welfare conclusions.

6.1 A stylized network-based search environment

Types and offer arrivals. Consider a large population of workers indexed by $i$. Each worker has a network position summarized by a scalar type $X_i > 0$ capturing effective access to job opportunities. We interpret $X_i$ as a shifter of job offer arrival rates, generated by a referral network, local contacts, or platform-mediated matching. We impose the normalization $\mathbb{E}[X_i] = 1$, so the average scale of opportunities is fixed and the object of interest is the cross-sectional distribution of access. Fix a market thickness or horizon parameter $\theta > 0$. Conditional on $X_i$, the number of offers received by worker $i$ follows the mixed-Poisson specification from Section 2:
$$N_{i,\theta} \mid X_i \sim \mathrm{Poisson}(\theta X_i).$$
Let $F$ denote the cross-sectional distribution of $X_i$ in the population, so $X \sim F$ for a randomly drawn worker.

Offer values and realized wages. Let $\{Y_{ij}\}_{j \ge 1}$ be i.i.d. wage offers with common distribution $G$, independent of $(X_i, N_{i,\theta})$. Worker $i$ accepts the best offer over the horizon,
$$M_{i,\theta} := \sup_{1 \le j \le N_{i,\theta}} Y_{ij},$$
with the convention $\sup \emptyset = -\infty$. The economic content is that heterogeneity enters through access to opportunities, while the distribution of offer values is common across workers.
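As a sanity check on this setup, the sketch below computes $\Pr(M_{i,\theta} \le y)$ for a randomly drawn worker in two ways: by summing over the Poisson number of offers type by type, and by evaluating the Laplace transform of $F$ at $\theta(1 - G(y))$, which is the finite-horizon identity invoked in the proof of Proposition 1. The two-point type law, the horizon $\theta = 5$, and the value $G(y) = 0.9$ are illustrative assumptions, and the function names are hypothetical.

```python
import math

def cdf_max_by_counts(types, probs, theta, g, n_max=400):
    """Pr(M <= y) by direct summation over the number of offers.

    Conditional on X = x, N ~ Poisson(theta * x), and the maximum of N i.i.d.
    offers lies below y with probability g**N, where g = G(y). The no-offer
    event contributes g**0 = 1, matching the convention sup(empty) = -inf.
    """
    total = 0.0
    for x, p in zip(types, probs):
        mean = theta * x
        pmf = math.exp(-mean)          # Pr(N = 0 | X = x)
        inner = 0.0
        for n in range(n_max + 1):
            inner += pmf * g ** n
            pmf *= mean / (n + 1)      # Poisson recursion to Pr(N = n + 1)
        total += p * inner
    return total

def cdf_max_by_laplace(types, probs, theta, g):
    """Same probability via the Laplace transform P_0(z) = E[exp(-z X)]."""
    z = theta * (1.0 - g)
    return sum(p * math.exp(-z * x) for x, p in zip(types, probs))

# Illustrative inputs (assumptions): a mean-one two-point type law.
types, probs = [0.5, 3.0], [0.8, 0.2]
theta, g = 5.0, 0.9

direct = cdf_max_by_counts(types, probs, theta, g)
laplace = cdf_max_by_laplace(types, probs, theta, g)
print(direct, laplace)
```

The agreement of the two numbers reflects only the Poisson probability generating function; no extreme-value approximation is involved at this stage.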
From network primitives to $F$. The distribution $F$ can be linked to standard network objects. As a benchmark, suppose workers are nodes in an undirected contact network and each neighbor generates job leads at approximately independent random times. Conditional on degree $D_i$, the number of leads over a horizon is then well approximated by a Poisson random variable with mean proportional to $\theta D_i$. After normalization, one can set $X_i := D_i / \mathbb{E}[D]$, so that $\mathbb{E}[X_i] = 1$ and $F$ is the distribution of normalized degrees. With weighted links, one can take $X_i$ as normalized weighted degree. If the arrival mechanism aggregates information over longer paths, $X_i$ can represent a normalized centrality index. The analysis that follows uses $F$ as the reduced-form sufficient statistic for the cross-sectional opportunity structure.

Tail normalization and the heterogeneous extreme-value law. Assume $G$ is in the max domain of attraction of the generalized extreme-value law $H_\gamma$ with index $\gamma$. Under the normalization in Section 2.1, the distribution of normalized outcomes for a randomly drawn worker converges to the heterogeneous extreme-value law from Proposition 1:
$$H_{\gamma,F}(x) = \mathbb{E}\bigl[\exp\bigl(-X v_\gamma(x)\bigr)\bigr], \qquad X \sim F.$$
Thus, once $F$ is specified or estimated, the model delivers a tractable mapping from network heterogeneity into the cross-sectional distribution of workers' top wages in the sense of maxima over offers.

6.2 Positive implications: network heterogeneity and the right tail

Network inequality and tail losses under convex order. A natural notion of increased network inequality is a mean-preserving spread of $F$ holding $\mathbb{E}[X] = 1$ fixed, equivalently an increase in the convex order; see, e.g., Rothschild and Stiglitz (1970) and Shaked and Shanthikumar (2007). In the present setting, this order has a direct implication for extremes because, for each $z > 0$, the kernel $x \mapsto \exp(-zx)$ is convex and decreasing.
This is precisely the mechanism behind the Jensen and Laplace-ordering results in Section 3.1.

Corollary 5 (Network inequality and normalized top wages). Fix $\gamma$ and let $F_1, F_2$ satisfy $\mathbb{E}[X] = 1$. If $F_2$ is a mean-preserving spread of $F_1$, then
$$H_{\gamma,F_2}(x) \ge H_{\gamma,F_1}(x) \quad \text{for all } x,$$
so the corresponding limit distribution of normalized top wages under $F_2$ is first-order stochastically smaller than under $F_1$. Equivalently, for any increasing measurable $\varphi$ for which expectations are finite,
$$\mathbb{E}[\varphi(Z_2)] \le \mathbb{E}[\varphi(Z_1)], \qquad Z_k \sim H_{\gamma,F_k}.$$

Corollary 5 isolates a clean reduced-form message about the right tail. Holding fixed the mean scale of opportunities and the common offer distribution, redistributing access more unequally lowers the distribution of maxima for a randomly drawn worker because the Laplace kernel is convex in types. In this sense, homophily, segregation, or referral frictions matter here through the extent to which they reshape the opportunity distribution $F$. At the same time, one should not overread the result at the furthest tail. By Remark 5, the mean-one normalization implies $1 - H_{\gamma,F}(x) \sim v_\gamma(x) \sim 1 - H_\gamma(x)$ whenever $v_\gamma(x) \downarrow 0$, so greater network inequality changes distributional levels and high quantiles, but not the first-order ultra-tail mass. Translating that comparative static into welfare, equilibrium efficiency, or group incidence would require additional structure on search behavior, offer-quality heterogeneity, and market clearing.

A concrete rewiring counterfactual. Consider a referral-platform redesign, mentoring intervention, or local rewiring experiment that changes the empirical distribution of normalized degree from $F_0$ to $F_1$ while preserving the population mean. If $F_1$ is a mean-preserving contraction of $F_0$, Corollary 5 implies an everywhere improvement in the normalized top wage distribution for a randomly drawn worker.
If the change is not ordered by convex order, the adapted geometry below still provides a disciplined whole-law comparison.

A simple numerical illustration. Consider a baseline network distribution
$$F_0 = 0.8\,\delta_{0.5} + 0.2\,\delta_{3},$$
which can be read as a labor market with many peripheral workers and a smaller highly connected group, while preserving the normalization $\mathbb{E}[X] = 1$. At the centered threshold $x = 0$, one has $v_\gamma(0) = 1$ for every $\gamma$, so
$$H_{\gamma,F_0}(0) = P_0(1) = 0.8\,e^{-0.5} + 0.2\,e^{-3} \approx 0.495, \qquad H_\gamma(0) = e^{-1} \approx 0.368.$$
Thus the unequal-opportunity economy places about 12.7 percentage points more mass below the centered threshold, equivalently about 12.7 percentage points less mass above it, than the homogeneous benchmark. The corresponding misallocation indices are $M_1(F_0) = \mathbb{E}[|X - 1|] = 0.8$ and $M_2(F_0) = \sqrt{\operatorname{Var}(X)} = 1$. This simple example makes concrete how a small, highly connected minority can coexist with a materially weaker distribution of normalized top wages for a randomly drawn worker.

Geometric counterfactual paths. When two economies with type distributions $F_0$ and $F_1$ are not ordered by convex order, the adapted-Wasserstein geometry can still provide a disciplined comparison. Corollary 2 constructs a canonical interpolation path $(F_t)_{t \in [0,1]}$ in the ambient transformed type geometry such that the induced law of top wages moves at most linearly in path length. Corollary 3 turns the same result into an economically direct statement for the entire counterfactual schedule of top wage quantiles:
$$\| Q_{\gamma,F_s} - Q_{\gamma,F_t} \|_{L^p(0,1)} \le C_{\gamma,p}\, |t - s|\, d_{\gamma,p}(F_0, F_1).$$
Thus the adapted metric is not only a pointwise cdf device. It controls the whole wage distribution in quantile space, which is the natural object for many counterfactual exercises.
Because transformed-coordinate geodesics need not preserve $\mathbb{E}[X] = 1$ at intermediate times when $s_\gamma$ is nonlinear, this path should be read as a sensitivity device in ambient shape space, not as a literal mean-preserving policy path. If a counterfactual must keep aggregate opportunity scale fixed throughout, Proposition 10 quantifies the discrepancy between the ambient path and its mean-one renormalization, while the raw-space mean-preserving geodesics of Section 3 remain available when exact preservation is required throughout.

Robust tail prediction under measurement error in $F$. In practice, $F$ can be estimated from network data or inferred from proxies for the intensity of job contact. The Wasserstein stability results in Section 4 then provide a direct way to translate estimation error in $F$ into error bounds for predicted tail outcomes. The main linear bound is stated in the adapted metric $d_{\gamma,p}$. When the available statistical control is instead in raw-space Wasserstein distance, Proposition 11 provides the additional bridge, which can be weaker in the Fréchet regime.

Corollary 6 (Robustness of predicted tail distributions). Let $\widehat{F}$ be an estimator of $F$ with $\mathbb{E}[X] = 1$. Assume the conditions of Theorem 1 hold for some $p \ge 1$. Then the induced error on the predicted distribution of normalized top wages satisfies
$$W_p\bigl(H_{\gamma,\widehat{F}}, H_{\gamma,F}\bigr) \le C_{\gamma,p}\, d_{\gamma,p}\bigl(\widehat{F}, F\bigr),$$
for the modulus $C_{\gamma,p}$ characterized in Section 4. Under the support and tail-shape conditions of Proposition 11, the adapted distance on the right-hand side can in turn be controlled by a standard Wasserstein distance on the type space. Moreover, if $\psi$ is Lipschitz on the support of $H_{\gamma,F}$ and the relevant moments exist, then
$$\bigl| \mathbb{E}[\psi(Z_{\widehat{F}})] - \mathbb{E}[\psi(Z_F)] \bigr| \le \operatorname{Lip}(\psi)\, W_1\bigl(H_{\gamma,\widehat{F}}, H_{\gamma,F}\bigr),$$
where $Z_{\widehat{F}} \sim H_{\gamma,\widehat{F}}$ and $Z_F \sim H_{\gamma,F}$.
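The bound in Corollary 6 can be checked numerically in the Gumbel case $\gamma = 0$, where the proof of Theorem 1 gives modulus one and $d_{0,1}$ is the $W_1$ distance between log-types. The sketch below is a minimal illustration under stated assumptions: the two equal-weight, mean-one type laws are invented for the example, the function names are hypothetical, and $W_1$ between the induced laws is computed as the integral of the absolute cdf difference on a truncated grid.

```python
import math

def gumbel_mixture_cdf(x, types, probs):
    """H_{0,F}(x) = E[exp(-X * exp(-x))] for a discrete type law F."""
    z = math.exp(-x)
    return sum(p * math.exp(-z * t) for t, p in zip(types, probs))

def w1_between_cdfs(cdf_a, cdf_b, lo=-10.0, hi=30.0, steps=40000):
    """W_1 on the real line as the integral of |cdf_a - cdf_b| (trapezoid rule).

    The grid [lo, hi] is a truncation; both tails are negligible here.
    """
    dx = (hi - lo) / steps
    vals = [abs(cdf_a(lo + k * dx) - cdf_b(lo + k * dx)) for k in range(steps + 1)]
    return dx * (sum(vals) - 0.5 * (vals[0] + vals[-1]))

def adapted_distance_gumbel(types_a, types_b):
    """d_{0,1} for two equal-weight laws with the same number of atoms:
    W_1 between log-types, obtained by matching sorted atoms."""
    la = sorted(math.log(t) for t in types_a)
    lb = sorted(math.log(t) for t in types_b)
    return sum(abs(a - b) for a, b in zip(la, lb)) / len(la)

# Illustrative assumption: "true" and "estimated" type laws, both with mean one.
types_true, types_hat = [0.5, 1.5], [0.6, 1.4]
probs = [0.5, 0.5]

w1_laws = w1_between_cdfs(
    lambda x: gumbel_mixture_cdf(x, types_true, probs),
    lambda x: gumbel_mixture_cdf(x, types_hat, probs),
)
d_adapted = adapted_distance_gumbel(types_true, types_hat)
print(w1_laws, d_adapted)  # the first number should not exceed the second
```

In this example the induced-law error is well below the adapted distance, which is consistent with the corollary being an upper bound rather than an equality.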
Corollary 6 is the operational bridge between statistical or measurement uncertainty in network heterogeneity and uncertainty in predicted tail-wage outcomes, but only after the maintained comparison metric has been specified clearly. If the researcher controls $d_{\gamma,p}$ directly, the bound is linear. If the researcher begins from raw-space Wasserstein error, the relevant conclusion is obtained only after applying Proposition 11, and in Fréchet settings that bridge can be merely Hölder. For peripheral or intermittently inactive workers, the relevant issue is behavior near $X = 0$. Proposition 9 makes the boundary explicit: the adapted transport bounds remain available as long as the transformed moments required by $d_{\gamma,p}$ are finite. When those conditions fail, one should fall back on the pointwise or finite-horizon comparisons rather than on the adapted metric.

Group heterogeneity, segregation, and tail gaps. The mapping can also be applied group by group. If two groups $g \in \{A, B\}$ face different induced opportunity distributions $F_g$ due to segregation, homophily, or differential access to referrals, then the model implies group-specific limit laws $H_{\gamma,F_g}$. The geometric tools in Sections 3 and 4 then provide two complementary comparisons: order comparisons when $F_A$ dominates $F_B$ in convex order, and metric comparisons when $F_A$ and $F_B$ are close in Wasserstein distance but not ordered.

6.3 Normative implications: opportunity reallocation and policy design

Many interventions that affect labor market outcomes operate through changing access to opportunities, rather than directly changing offer values. Examples include referral programs, mentoring and placement initiatives, changes in search subsidies, and institutional designs that affect who meets whom. At an abstract level, such policies can all be represented as perturbations of the induced type distribution $F$.
Section 5 provides a tractable design problem in which a planner chooses $F$ to optimize a tail-sensitive objective while controlling deviations from a baseline $F_0$ via relative entropy. In the present setting, a canonical objective is expected utility of normalized top wages,
$$U(F) = \mathbb{E}[u(Z)], \qquad Z \sim H_{\gamma,F},$$
or a smoothed tail criterion that places higher marginal weight on large wage realizations. The planner problem takes the form
$$\sup_{F :\, \mathbb{E}_F[X] = 1} \bigl\{ U(F) - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) \bigr\},$$
where $\lambda$ governs the marginal cost of reshaping the cross-sectional distribution of access. When $U$ is linear in $F$, including the expected-utility class identified via the stochastic representation in Section 5.2, Theorem 3 yields a unique optimizer $F^\star$ given by an exponential tilt of $F_0$.

A discrete illustration with genuine degrees of freedom. To make the mean-one constraint non-vacuous, consider a baseline with three support points,
$$F_0 = \pi^0_1 \delta_{x_1} + \pi^0_2 \delta_{x_2} + \pi^0_3 \delta_{x_3}, \qquad 0 < x_1 < x_2 < x_3,$$
with $\sum_j \pi^0_j = 1$ and $\sum_j \pi^0_j x_j = 1$. The optimal policy keeps the same support and reweights the masses according to
$$\pi^\star_j \propto \pi^0_j \exp\!\left(\frac{\psi(x_j) + \eta^\star x_j}{\lambda}\right), \qquad j = 1, 2, 3,$$
where $\eta^\star$ is chosen so that $\sum_j \pi^\star_j x_j = 1$. Pairwise odds satisfy
$$\frac{\pi^\star_i / \pi^\star_j}{\pi^0_i / \pi^0_j} = \exp\!\left(\frac{\psi(x_i) - \psi(x_j) + \eta^\star (x_i - x_j)}{\lambda}\right).$$
Unlike the two-point case, the mean-one restriction does not pin down the feasible distribution uniquely, so the entropy penalty and the score produce a genuine tradeoff. This pairwise-odds equation is the discrete reduced-form analogue of the full optimal policy tradeoff in our network application: the tail score $\psi$, the implementation friction $\lambda$, and the shadow value $\eta^\star$ jointly determine how mass is reallocated across access types.
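The three-point reweighting above can be computed in a few lines. The sketch below is a minimal illustration, not the paper's implementation: the support points, baseline masses, friction $\lambda = 0.5$, and the inverse-power score $\psi(x) = -1/x$ are all assumptions chosen for the example, the function names are hypothetical, and the multiplier $\eta^\star$ is found by bisection on the tilted mean, which is nondecreasing in $\eta$.

```python
import math

def tilt(xs, pi0, psi, eta, lam):
    """Exponentially tilted masses, proportional to pi0_j * exp((psi(x_j) + eta * x_j) / lam)."""
    logw = [math.log(p) + (psi(x) + eta * x) / lam for x, p in zip(xs, pi0)]
    top = max(logw)                      # subtract the max for numerical stability
    w = [math.exp(v - top) for v in logw]
    total = sum(w)
    return [v / total for v in w]

def solve_eta(xs, pi0, psi, lam, lo=-100.0, hi=100.0, iters=200):
    """Bisection for the eta at which the tilted mean equals one."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        mean = sum(x * p for x, p in zip(xs, tilt(xs, pi0, psi, mid, lam)))
        if mean < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative assumptions: a mean-one baseline on three support points.
xs = [0.5, 1.0, 2.5]
pi0 = [0.45, 0.40, 0.15]
lam = 0.5

def psi(x):
    return -1.0 / x                      # hypothetical inverse-power score, C = -1, rho = 1

eta_star = solve_eta(xs, pi0, psi, lam)
pi_star = tilt(xs, pi0, psi, eta_star, lam)
mean_star = sum(x * p for x, p in zip(xs, pi_star))

# Pairwise-odds identity for the pair (i, j) = (3, 1).
odds_lhs = (pi_star[2] / pi_star[0]) / (pi0[2] / pi0[0])
odds_rhs = math.exp((psi(xs[2]) - psi(xs[0]) + eta_star * (xs[2] - xs[0])) / lam)
print(eta_star, pi_star, mean_star)
```

Bisection is used here only because it needs no derivative and no external library; any scalar root finder would serve.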
This result formalizes the idea that improving tail outcomes can require reallocating access across the population, but such reallocations are constrained by implementation frictions. It also complements richer normative analyses of network policy by isolating the tail-sensitive, reduced-form margin through which access is reweighted.

6.4 Microfoundations and discussion

The mixed-Poisson arrival structure can be grounded in standard network-based mechanisms of job search. For instance, if workers receive job leads from contacts and each contact generates opportunities at approximately independent random times, then conditional on network exposure the number of leads over a horizon is naturally approximated by a Poisson random variable with mean proportional to a network index. This is the reduced-form counterpart of the job search mechanisms discussed in the network-of-contacts literature reviewed in Section 1.4.

In modern settings, $X$ can be interpreted more broadly than degree. For instance, it can incorporate platform-mediated matching intensity, differential search effort, or institution-specific access, while still entering the analysis only through its distribution $F$ and the normalization $\mathbb{E}[X] = 1$. This interpretation is especially natural when referral processes, homophily, or segregation generate persistent differences in access to opportunities across workers or groups.

6.5 Empirical interface and identification caveats

The framework suggests a useful empirical interface, but the scope of identification should be delimited carefully.

Counts point-identify $F$ in population, but inversion is ill-posed. If one observes offer counts at a single horizon $\theta$, then the data identify the count distribution and its pgf. Under the mixed-Poisson structure, that data-identified pgf pins down the Laplace transform $P_0$ on the interval $z \in [0, \theta]$.
Because $P_0$ is the Laplace transform of a positive measure, its values on any open interval determine the entire transform and therefore the mixing law $F$. Population point identification is thus not the issue. Rather, the difficulty is statistical: recovering $F$ from one observed interval of the transform is a severely ill-posed inverse problem, so multiple horizons, repeated observations, or parametric/semi-parametric structure remain valuable for stable estimation and regularization.

Network proxies require a measurement model. Observed degree, weighted degree, referral exposure, or centrality can be informative proxies for opportunity intensity, but they do not mechanically equal the latent type $X$. Using them to estimate $F$ requires a maintained mapping from observed network statistics to the shifter of arrival rates entering the mixed-Poisson specification. For the same reason, network proxies alone do not identify the adapted metric $d_{\gamma,p}$ without a maintained choice of tail index $\gamma$ and a maintained measurement model for how the observed proxy maps into the latent type entering the extreme-value law.

Extreme-based inversion requires a first-step tail analysis. Inferring $F$ from extreme outcomes requires the tail-limit ingredients, namely the index $\gamma$, the normalization $(a_\theta, b_\theta)$, and, for second-order refinements, the corresponding second-order objects, either to be known or to be estimated separately. The full parent law $G$ need not be known for the EVT approximation itself. The asymptotic expansion in Theorem 2 is useful precisely because it separates second-order EVT error from the heterogeneity kernel. But it does not, on its own, solve the statistical recovery problem for $F$.

Error propagation is conditional on the sampling scheme and can be Hölder in Fréchet settings.
Once an empirical approximation $\widehat{F}$ is available, Corollary 6 converts metric error in $\widehat{F}$ into error in predicted tail laws. If $0 < \gamma < 1$ and the assumptions of Theorem 1 hold with $p\gamma < 1$, Proposition 11 yields
$$W_p\bigl(H_{\gamma,\widehat{F}}, H_{\gamma,F}\bigr) \le C_{\gamma,p}\, W_p(\widehat{F}, F)^{\gamma}.$$
Combined with Corollary 3, the same argument gives
$$\| Q_{\gamma,\widehat{F}} - Q_{\gamma,F} \|_{L^p(0,1)} \le C_{\gamma,p}\, W_p(\widehat{F}, F)^{\gamma}.$$
Thus the bridge from raw-space estimation error in $F$ to error in predicted extremes is generally only Hölder, not linear, in the economically important Fréchet regime. If $W_p(\widehat{F}, F) = O_P(r_n)$, the induced law error is only $O_P(r_n^{\gamma})$. Standard benchmark Wasserstein rates such as Fournier and Guillin (2015) therefore remain informative, but only after this nonlinear penalty is made explicit. Shared-network data, however, feature cross-sectional dependence, so those rates should be treated as benchmarks rather than as automatic guarantees. Additional graph-dependence assumptions would be needed for a full statistical theory.

Finite-horizon and asymptotic claims should be kept distinct. For prelimit predictions at a fixed horizon $\theta$, Proposition 13 is the relevant robustness statement. For asymptotic tail approximations, Theorem 1 and Theorem 2 are the relevant tools. Keeping those objects separate helps prevent finite-sample robustness, asymptotic approximation, and statistical identification from being conflated.

7 Discussion and conclusion

This paper studies extremes in environments where the number of opportunities is heterogeneous and random. Starting from the mixed-Poisson heterogeneous-agent extreme-value limits in Mangin (2026), we treat the heterogeneity distribution as the primitive object and analyze the induced map $F \mapsto H_{\gamma,F}$ with order, Wasserstein-geometric, and complementary entropy-projection tools.

7.1 Main takeaways

Four points structure the contribution.
First, heterogeneous extremes admit a compact Laplace-transform representation. Once the primitive offer distribution is in a classical domain of attraction, the effect of unequal access to opportunities is summarized by the mean-one distribution $F$ of draw intensities and its Laplace transform. This makes heterogeneity analytically tractable without collapsing it to a scalar index.

Second, the paper's main new quantitative theorem is geometric. Theorem 1 shows that, once the canonical coupling representation is in hand, perturbations in $F$ propagate Lipschitzly into Wasserstein perturbations of the entire induced law of extremes. Corollary 2 packages the same structure into canonical ambient interpolation paths, while Theorem 2 cleanly separates second-order EVT approximation error from the heterogeneity kernel. By contrast, convex-order comparisons and several pointwise inequalities are consequences of the Laplace form and are included because they make the operator economically usable.

Third, the labor market network application shows how these ingredients fit together in a familiar environment, but only in reduced form. Network position maps into opportunity intensity, network inequality maps into tail distortions for a randomly drawn worker's top wage, and the adapted geometry controls the whole counterfactual quantile schedule together with the renormalization error induced by returning to the mean-one slice. Turning those objects into equilibrium or welfare claims would require additional structure beyond the scope of the present paper.

Fourth, the same type space supports a complementary design problem. Under KL regularization and a mean constraint, the planner's problem is a one-marginal entropy projection with a unique exponential-tilt solution.
The connection to the positive analysis is structural rather than geometric: the same Laplace kernel governs both the heterogeneous extreme-value operator and the score functions of a broad class of linear objectives, including cdf-based criteria and expected utility of normalized extremes.

7.2 Extensions

Several extensions seem natural.

• Endogenous networks and strategic exposure. In applications where network position is chosen or co-determined with outcomes, one could embed the mixed-Poisson intensity into an equilibrium model of link formation or effort choice. The present stability and design results can then be used either as reduced-form comparative statics or as primitives inside a fixed-point argument.

• Dynamic designs and Schrödinger bridges. Our entropy-regularized formulation in Section 5 is static. A dynamic extension would allow the planner to steer heterogeneity over time subject to intertemporal entropic costs, which naturally leads to Schrödinger-bridge-type problems. That extension would be genuinely richer than the one-marginal projection studied here.

• Dynamic record processes and sequential search. Because each new offer can be viewed as a potential record, one natural extension is to study the full record process generated by heterogeneous arrival rates rather than only the terminal maximum. Classical references in this context include Arnold et al. (1998). Such an extension would connect the present geometry of extremes to questions about the timing of record improvements and threshold-crossing times.

• Beyond mixed-Poisson counts and beyond independence. The mixed-Poisson structure provides tractability through a Laplace transform. Other count models or dependence structures can be accommodated whenever one can represent the distribution of maxima by a tractable transform and obtain couplings that separate heterogeneity from idiosyncratic shocks.
The classical random-sample-size maxima literature provides the natural benchmark for that extension; see Barndorff-Nielsen (1964), Galambos (1973), or Silvestrov and Teugels (1998).

• Statistical identification and inference. The rate expansion in Section 4.6 and the stability results in Section 4 suggest a path toward inference for $F$ from tail observations or count data. A full treatment would have to combine second-order tail estimation for $G$, stable recovery or regularized inversion of $F$ from observed count distributions or extremes, and a dependence-aware theory for network data. The present paper isolates the operator-theoretic ingredients needed for that future step rather than claiming to complete it.

Conclusion. Extremes are economically salient precisely because they aggregate rare opportunities and heterogeneity in access. The framework developed here provides a tractable way to quantify how heterogeneity shapes agent-level extremes, how robust those implications are to perturbations in heterogeneity, and how a planner can reshape heterogeneity when the objective is tail sensitive and policy changes are entropy penalized. The transport results should be read on their natural geometric domain, Proposition 10 as the explicit link back to the mean-one economic slice, and the labor-network application as a disciplined reduced form rather than a complete equilibrium model.

A Proofs

A.1 Auxiliary limit for maxima

We repeatedly use the elementary implication that turns convergence of powers into a first-order tail approximation.

Lemma 5 (From powers to first-order tails). Let $\{u_n\}_{n \ge 1} \subset [0, 1)$ and suppose there exists $v \in [0, \infty)$ such that
$$(1 - u_n)^n \to e^{-v} \quad \text{as } n \to \infty.$$
Then $u_n \to 0$ and $n u_n \to v$.

Proof. Since $(1 - u_n)^n \to e^{-v} \in (0, 1]$, we must have $1 - u_n \to 1$, hence $u_n \to 0$. Write $(1 - u_n)^n = \exp\bigl(n \log(1 - u_n)\bigr)$. Taking logs yields $n \log(1 - u_n) \to -v$.
For $u \in [0, 1)$, the inequalities
$$-\frac{u}{1 - u} \le \log(1 - u) \le -u$$
hold. Applying them with $u = u_n$ and multiplying by $-n$ gives
$$\frac{n u_n}{1 - u_n} \ge -n \log(1 - u_n) \ge n u_n.$$
Letting $n \to \infty$ and using $u_n \to 0$ and $-n \log(1 - u_n) \to v$, we obtain $\limsup_{n \to \infty} n u_n \le v$ and $\liminf_{n \to \infty} n u_n \ge v$, so $n u_n \to v$.

A.2 Proof of Lemma 1

Proof. Fix $y \in [0, 1]$ and $\theta > 0$. By conditioning on $X$ and using the probability generating function of a Poisson random variable,
$$\mathbb{E}\bigl[y^{N(\theta)} \mid X\bigr] = \exp\bigl(-\theta X (1 - y)\bigr).$$
Taking expectations over $X$ yields
$$\mathbb{E}\bigl[y^{N(\theta)}\bigr] = \mathbb{E}\bigl[\exp\bigl(-\theta X (1 - y)\bigr)\bigr] = P_0(\theta(1 - y)),$$
which is the claimed identity.

A.3 Proof of Proposition 1

Proof. Fix $x \in \mathbb{R}$. Let $n = \lfloor \theta \rfloor$ and recall that $a_\theta = a_n$ and $b_\theta = b_n$ by definition. Set
$$x_\theta := b_\theta + a_\theta x = b_n + a_n x.$$

Step 1: reduce to a Laplace argument. By (1),
$$\Pr\!\left(\frac{M_\theta - b_\theta}{a_\theta} \le x\right) = \Pr(M_\theta \le x_\theta) = P_0\bigl(\theta(1 - G(x_\theta))\bigr).$$
Since $P_0$ is the Laplace transform of a nonnegative random variable, it is continuous on $[0, \infty)$.

Step 2: identify the limit of $\theta(1 - G(x_\theta))$. Assumption 1 implies
$$\Pr\!\left(\frac{M_n - b_n}{a_n} \le x\right) = \Pr(M_n \le b_n + a_n x) = G(b_n + a_n x)^n \to H_\gamma(x) = e^{-v_\gamma(x)}$$
at all continuity points $x$ of $H_\gamma$. Define $u_n(x) := 1 - G(b_n + a_n x) \in [0, 1)$. Then $G(b_n + a_n x)^n = (1 - u_n(x))^n$. If $H_\gamma(x) > 0$, equivalently $v_\gamma(x) < \infty$, Lemma 5 gives
$$n u_n(x) \to v_\gamma(x) \quad \text{and} \quad u_n(x) \to 0.$$
Hence
$$\bigl| \theta u_n(x) - n u_n(x) \bigr| \le |\theta - n|\, u_n(x) \le u_n(x) \to 0,$$
since $|\theta - n| < 1$. Therefore
$$\theta\bigl(1 - G(b_n + a_n x)\bigr) = \theta u_n(x) \to v_\gamma(x).$$
If instead $H_\gamma(x) = 0$, then $(1 - u_n(x))^n \to 0$. We claim that $n u_n(x) \to \infty$. If not, there would exist a subsequence $n_k$ and a constant $M < \infty$ such that $n_k u_{n_k}(x) \le M$. Then $u_{n_k}(x) \le M / n_k \to 0$. Passing to a further subsequence if needed, we may assume $n_k u_{n_k}(x) \to L$ for some finite $L \in [0, M]$.
Since $\log(1 - u) = -u + o(u)$ as $u \downarrow 0$, it follows that
$$n_k \log(1 - u_{n_k}(x)) \to -L,$$
and therefore
$$(1 - u_{n_k}(x))^{n_k} \to e^{-L} > 0,$$
contradicting $(1 - u_n(x))^n \to 0$. Thus $n u_n(x) \to \infty$. Since $\theta \ge n$, it follows that $\theta u_n(x) \ge n u_n(x) \to \infty$.

Step 3: pass to the limit through $P_0$. If $H_\gamma(x) > 0$, Step 2 and continuity of $P_0$ yield
$$P_0\bigl(\theta(1 - G(x_\theta))\bigr) \to P_0\bigl(v_\gamma(x)\bigr) = H_{\gamma,F}(x).$$
If $H_\gamma(x) = 0$, then $\theta(1 - G(x_\theta)) \to \infty$ by Step 2. Because $e^{-zX} \to \mathbf{1}\{X = 0\}$ as $z \to \infty$ and Assumption 2 imposes $\Pr(X = 0) = 0$, dominated convergence gives
$$P_0(z) = \mathbb{E}[e^{-zX}] \to 0 \quad \text{as } z \to \infty.$$
Hence
$$P_0\bigl(\theta(1 - G(x_\theta))\bigr) \to 0 = H_{\gamma,F}(x).$$
This proves the asserted convergence at all continuity points of $H_{\gamma,F}$.

Degenerate case. If $F = \delta_1$, then $X = 1$ almost surely, so $P_0(z) = \mathbb{E}[e^{-zX}] = e^{-z}$ and hence
$$H_{\gamma,F}(x) = P_0(v_\gamma(x)) = e^{-v_\gamma(x)} = H_\gamma(x).$$

A.4 Proofs for Section 3

Proof of Proposition 2. Let $X_1 \sim F_1$ and $X_2 \sim F_2$ with $\mathbb{E}[X_1] = \mathbb{E}[X_2] = 1$. If $F_2$ is a mean-preserving spread of $F_1$, then $X_2$ dominates $X_1$ in convex order. Equivalently,
$$\mathbb{E}[\varphi(X_2)] \ge \mathbb{E}[\varphi(X_1)] \quad \text{for every convex } \varphi \text{ for which both expectations are finite}.$$
Fix $z \ge 0$ and consider $\varphi_z(x) := e^{-zx}$. Then $\varphi_z$ is convex on $[0, \infty)$ (since $\varphi_z''(x) = z^2 e^{-zx} \ge 0$) and bounded by 1. Hence,
$$P_0^{(2)}(z) = \mathbb{E}[e^{-zX_2}] \ge \mathbb{E}[e^{-zX_1}] = P_0^{(1)}(z) \quad \text{for all } z \ge 0.$$
Composing with $z = v_\gamma(x) \ge 0$ yields
$$H_{\gamma,F_2}(x) = P_0^{(2)}\bigl(v_\gamma(x)\bigr) \ge P_0^{(1)}\bigl(v_\gamma(x)\bigr) = H_{\gamma,F_1}(x) \quad \text{for all } x.$$
Therefore, the distribution function under $F_2$ is everywhere larger, which is equivalent to first-order stochastic dominance in the direction that extremes are smaller under $F_2$.

Proof of Corollary 1. Apply Proposition 2 with $F_1 = \delta_1$ and $F_2 = F$.
Since every mean-one distribution on $[0, \infty)$ dominates $\delta_1$ in convex order, one obtains
$$P_0(z) \ge e^{-z} \quad \text{for all } z \ge 0,$$
and therefore $H_{\gamma,F}(x) \ge H_\gamma(x)$ for every $x$ with $v_\gamma(x) < \infty$. If $F \ne \delta_1$, then strict convexity of $x \mapsto e^{-zx}$ for every $z > 0$ and Jensen's inequality imply strict inequality for all $z > 0$.

Proof of Proposition 3. By Lemma 2, the canonical monotone geodesic $(\mu_t)_{t \in [0,1]}$ satisfies
$$Q_{\mu_t}(u) = (1 - t)\, Q_\mu(u) + t\, Q_\nu(u), \qquad u \in (0, 1).$$
Since $\varphi$ is convex,
$$\varphi\bigl(Q_{\mu_t}(u)\bigr) \le (1 - t)\, \varphi\bigl(Q_\mu(u)\bigr) + t\, \varphi\bigl(Q_\nu(u)\bigr) \quad \text{for every } u \in (0, 1).$$
Integrating over $u \in (0, 1)$ and using the quantile representation of integrals gives
$$\int \varphi \, d\mu_t \le (1 - t) \int \varphi \, d\mu + t \int \varphi \, d\nu.$$
Applying the same argument to any pair of times $s, t \in [0, 1]$ in place of $0, 1$ yields convexity of $t \mapsto \int \varphi \, d\mu_t$ on $[0, 1]$.

Proof of Proposition 4. For the Fréchet case, let $Z_\gamma$ be independent of $X \sim F$ and satisfy
$$\Pr(Z_\gamma \le z) = \exp(-z^{-1/\gamma}), \qquad z > 0.$$
Then, for $z > 0$,
$$\Pr(X^\gamma Z_\gamma \le z \mid X) = \Pr\bigl(Z_\gamma \le z X^{-\gamma} \mid X\bigr) = \exp\bigl(-X z^{-1/\gamma}\bigr).$$
Taking expectations over $X$ gives
$$\Pr(X^\gamma Z_\gamma \le z) = \mathbb{E}\bigl[e^{-X z^{-1/\gamma}}\bigr] = P_0(z^{-1/\gamma}),$$
which is the cdf of the heterogeneous Fréchet limit. If $Z_{\mathrm{GEV}} := (X^\gamma Z_\gamma - 1)/\gamma$, then
$$\Pr(Z_{\mathrm{GEV}} \le x) = \Pr(X^\gamma Z_\gamma \le 1 + \gamma x) = P_0\bigl((1 + \gamma x)^{-1/\gamma}\bigr) = H_{\gamma,F}(x).$$
For the Gumbel case, let $E \sim \mathrm{Exp}(1)$ be independent of $X \sim F$ and define $Z := \log(X/E)$. Then, for every $x \in \mathbb{R}$,
$$\Pr(Z \le x \mid X) = \Pr(E \ge X e^{-x} \mid X) = e^{-X e^{-x}}.$$
Taking expectations yields
$$\Pr(Z \le x) = \mathbb{E}[e^{-X e^{-x}}] = P_0(e^{-x}) = H_{0,F}(x).$$

Proof of Lemma 3. By Proposition 4, $Z = X^\gamma Z_\gamma$ with $Z_\gamma$ independent of $X$. Hence, by iterated expectation,
$$\mathbb{E}[\psi(Z)] = \mathbb{E}\bigl[\mathbb{E}[\psi(X^\gamma Z_\gamma) \mid X]\bigr] = \mathbb{E}[\kappa_\psi(X)] = \int_0^\infty \kappa_\psi(x)\, F(dx).$$

Proof of Proposition 5.
For every $x$ with $v_\gamma(x) < \infty$,
\[ H_{\gamma,F_\varepsilon}(x) = \int e^{-v_\gamma(x) u} \, F_\varepsilon(du) = \int e^{-v_\gamma(x) u} \, F(du) + \varepsilon \int e^{-v_\gamma(x) u} \, \nu(du). \]
Differentiating with respect to $\varepsilon$ at 0 yields the claim.

Proof of Proposition 6. Fix $z \ge 0$ and apply Proposition 3 with the convex function $\varphi_z(x) = e^{-zx}$. Then
\[ t \mapsto \int e^{-zx} \, F_t(dx) = P_0^{t}(z) \]
is convex on $[0,1]$. Setting $z = v_\gamma(x)$ gives the convexity of $t \mapsto H_{\gamma,F_t}(x)$.

Proof of Proposition 7. On $(0, \infty)$, the function $\varphi(x) = x^{-\rho}$ is convex for every $\rho > 0$. Since the geodesic is supported on $[a, \infty)$, the function is also bounded on the support of every $F_t$. Applying Proposition 3 with this $\varphi$ yields convexity of
\[ t \mapsto \int x^{-\rho} \, F_t(dx). \]

A.5 Proofs for Section 4

Proof of Proposition 8. Let $X \sim F$ and $E \sim \mathrm{Exp}(1)$ be independent, and define $Z_{\gamma,F} = w_\gamma(E/X)$. For any $x$ such that $v_\gamma(x) \in (0, \infty)$,
\[ \Pr(Z_{\gamma,F} \le x \mid X) = \Pr\bigl(w_\gamma(E/X) \le x \mid X\bigr) = \Pr\bigl(E/X \ge v_\gamma(x) \mid X\bigr), \]
since $w_\gamma$ is decreasing. Therefore,
\[ \Pr(Z_{\gamma,F} \le x \mid X) = \exp\bigl(-X v_\gamma(x)\bigr). \]
Taking expectations yields
\[ \Pr(Z_{\gamma,F} \le x) = E\bigl[e^{-X v_\gamma(x)}\bigr] = P_0\bigl(v_\gamma(x)\bigr) = H_{\gamma,F}(x). \]
The explicit formulas follow from $w_\gamma(t) = (t^{-\gamma} - 1)/\gamma$ for $\gamma \ne 0$ and $w_0(t) = -\log t$.

Proof of Theorem 1. Case $\gamma \ne 0$. Let $\mu_i := (s_\gamma)_{\#} F_i$ for $i \in \{1, 2\}$, so that $d_{\gamma,p}(F_1, F_2) = W_p(\mu_1, \mu_2)$. Let $(U_1, U_2)$ be an optimal coupling of $(\mu_1, \mu_2)$, and let $E \sim \mathrm{Exp}(1)$ be independent of $(U_1, U_2)$. Set $V := E^{-\gamma}$ and define
\[ \phi(u, v) := \frac{uv - 1}{\gamma}. \]
If $X_i := s_\gamma^{-1}(U_i)$, then $X_i \sim F_i$ and
\[ \phi(U_i, V) = \frac{X_i^\gamma E^{-\gamma} - 1}{\gamma} = w_\gamma(E/X_i). \]
By Proposition 8, $\mathcal{L}(\phi(U_i, V)) = H_{\gamma,F_i}$. For each fixed $v$, the map $u \mapsto \phi(u, v)$ is Lipschitz with constant $|v|/|\gamma|$. Hence Lemma 9 gives
\[ W_p(H_{\gamma,F_1}, H_{\gamma,F_2}) \le \frac{\bigl(E[|E^{-\gamma}|^p]\bigr)^{1/p}}{|\gamma|} \, W_p(\mu_1, \mu_2) = \frac{\bigl(E[|E^{-\gamma}|^p]\bigr)^{1/p}}{|\gamma|} \, d_{\gamma,p}(F_1, F_2). \]
This proves the bound for $\gamma \ne 0$.

Case $\gamma = 0$. Let $\mu_i := (\log)_{\#} F_i$ for $i \in \{1, 2\}$, so that $d_{0,p}(F_1, F_2) = W_p(\mu_1, \mu_2)$. Let $(U_1, U_2)$ be an optimal coupling of $(\mu_1, \mu_2)$, let $E \sim \mathrm{Exp}(1)$ be independent, and define
\[ \phi(u, v) := u - \log v. \]
If $X_i := e^{U_i}$, then $X_i \sim F_i$ and
\[ \phi(U_i, E) = \log X_i - \log E = w_0(E/X_i). \]
By Proposition 8, $\mathcal{L}(\phi(U_i, E)) = H_{0,F_i}$. For each fixed $v > 0$, the map $u \mapsto \phi(u, v)$ is 1-Lipschitz, so Lemma 9 yields
\[ W_p(H_{0,F_1}, H_{0,F_2}) \le W_p(\mu_1, \mu_2) = d_{0,p}(F_1, F_2). \]

Proof of Corollary 2. By construction,
\[ d_{\gamma,p}(F_s, F_t) = W_p(G_s, G_t). \]
Since $(G_t)_{t \in [0,1]}$ is the canonical monotone constant-speed $W_p$ geodesic from $G_0$ to $G_1$,
\[ W_p(G_s, G_t) = |t - s| \, W_p(G_0, G_1) = |t - s| \, d_{\gamma,p}(F_0, F_1). \]
Applying Theorem 1 to $F_s$ and $F_t$ yields the second claim.

Proof of Proposition 10. Let $m := m(F)$ and let $Y := s_\gamma(X)$ for $X \sim F$. If $\gamma \ne 0$, then
\[ s_\gamma\!\left(\frac{X}{m}\right) = \left(\frac{X}{m}\right)^{\gamma} = m^{-\gamma} X^\gamma = m^{-\gamma} Y. \]
Hence $(s_\gamma)_{\#} \tilde{F}$ is the law of $m^{-\gamma} Y$. Because quantiles scale linearly under multiplication by a positive constant,
\[ d_{\gamma,p}(F, \tilde{F})^p = W_p\bigl(\mathcal{L}(Y), \mathcal{L}(m^{-\gamma} Y)\bigr)^p = \int_0^1 \bigl| Q_Y(u) - m^{-\gamma} Q_Y(u) \bigr|^p \, du. \]
Therefore
\[ d_{\gamma,p}(F, \tilde{F}) = \bigl| 1 - m^{-\gamma} \bigr| \left( \int_0^1 |Q_Y(u)|^p \, du \right)^{1/p} = \bigl| 1 - m^{-\gamma} \bigr| \, \bigl(E[|Y|^p]\bigr)^{1/p}, \]
which is exactly the stated formula for $\gamma \ne 0$. If $\gamma = 0$, then $s_0(X/m) = \log X - \log m$, so $(s_0)_{\#} \tilde{F}$ is the translate of $(s_0)_{\#} F$ by $-\log m$. Again by the quantile representation,
\[ d_{0,p}(F, \tilde{F}) = W_p\bigl((\log)_{\#} F, \, (\tau_{-\log m})_{\#}((\log)_{\#} F)\bigr) = |\log m|, \]
where $\tau_c(y) := y + c$. The Wasserstein bound for the induced extreme laws is then immediate from Theorem 1.

Proof of Corollary 3. The equality with $W_p(H_{\gamma,F_1}, H_{\gamma,F_2})$ is the one-dimensional quantile representation of Wasserstein distance. The first inequality is Theorem 1.
The geodesic statement then follows from Corollary 2.

Proof of Corollary 4. Fix $x$ in the interior of the support of $H_\gamma$ and define
\[ f_x(z) := e^{-v_\gamma(x) z}, \quad z \ge 0. \]
Then $f_x$ is $v_\gamma(x)$-Lipschitz on $[0, \infty)$ because
\[ |f_x'(z)| = v_\gamma(x) \, e^{-v_\gamma(x) z} \le v_\gamma(x). \]
Since
\[ H_{\gamma,F_i}(x) = \int f_x(z) \, F_i(dz), \]
Corollary 7 implies
\[ |H_{\gamma,F_1}(x) - H_{\gamma,F_2}(x)| \le v_\gamma(x) \, W_1(F_1, F_2). \]

Proof of Proposition 9. By definition,
\[ d_{\gamma,p}(F_1, F_2) = W_p\bigl(F_1 \circ s_\gamma^{-1}, \, F_2 \circ s_\gamma^{-1}\bigr). \]
A Wasserstein distance on $\mathbb{R}$ is finite if and only if both marginals belong to $\mathcal{P}_p(\mathbb{R})$. Hence $d_{\gamma,p}(F_1, F_2) < \infty$ if and only if
\[ \int |y|^p \, \bigl(F_i \circ s_\gamma^{-1}\bigr)(dy) < \infty, \quad i \in \{1, 2\}. \]
By change of variables this is equivalent to
\[ \int |s_\gamma(x)|^p \, F_i(dx) < \infty, \quad i \in \{1, 2\}. \]
The three regime-specific statements follow by substituting $s_\gamma(x) = x^\gamma$ for $\gamma \ne 0$ and $s_0(x) = \log x$.

Proof of Proposition 11. Case 1: $0 < \gamma \le 1$. Let $(X, Y)$ be any coupling of $(F_1, F_2)$. Since $x \mapsto x^\gamma$ is $\gamma$-Hölder on $[0, \infty)$,
\[ |X^\gamma - Y^\gamma| \le |X - Y|^\gamma. \]
Therefore,
\[ E\bigl[|X^\gamma - Y^\gamma|^p\bigr] \le E\bigl[|X - Y|^{\gamma p}\bigr] \le \bigl(E[|X - Y|^p]\bigr)^{\gamma}, \]
where the last step is Jensen's inequality applied to the concave map $t \mapsto t^\gamma$. Taking the infimum over couplings and then $p$th roots yields
\[ d_{\gamma,p}(F_1, F_2) \le W_p(F_1, F_2)^{\gamma}. \]
Case 2: $\gamma < 0$. On $[a, \infty)$, the derivative of $x \mapsto x^\gamma$ satisfies
\[ \left| \frac{d}{dx} x^\gamma \right| = |\gamma| \, x^{\gamma - 1} \le |\gamma| \, a^{\gamma - 1}. \]
Hence $x \mapsto x^\gamma$ is $|\gamma| a^{\gamma - 1}$-Lipschitz on $[a, \infty)$. Lemma 8 gives
\[ d_{\gamma,p}(F_1, F_2) = W_p\bigl((x \mapsto x^\gamma)_{\#} F_1, \, (x \mapsto x^\gamma)_{\#} F_2\bigr) \le |\gamma| \, a^{\gamma - 1} \, W_p(F_1, F_2). \]
Case 3: $\gamma = 0$. On $[a, \infty)$, the derivative of $\log x$ is bounded by $a^{-1}$. Thus $\log$ is $a^{-1}$-Lipschitz there, and Lemma 8 yields
\[ d_{0,p}(F_1, F_2) = W_p\bigl((\log)_{\#} F_1, \, (\log)_{\#} F_2\bigr) \le a^{-1} \, W_p(F_1, F_2). \]

Proof of Proposition 12.
Apply Corollary 7 with $\mu = H_{\gamma,F_1}$, $\nu = H_{\gamma,F_2}$, and the $L$-Lipschitz function $\psi$.

Proof of Proposition 13. Fix $x \in \mathbb{R}$ and define
\[ f_x(z) := \exp\bigl(-\theta z (1 - G(x))\bigr), \quad z \ge 0. \]
Then
\[ |f_x'(z)| = \theta (1 - G(x)) \exp\bigl(-\theta z (1 - G(x))\bigr) \le \theta (1 - G(x)), \]
so $f_x$ is $\theta(1 - G(x))$-Lipschitz on $[0, \infty)$. Because
\[ \Pr\nolimits_F(M_\theta \le x) = \int f_x(z) \, F(dz), \]
Corollary 7 gives
\[ \bigl| \Pr\nolimits_{F_1}(M_\theta \le x) - \Pr\nolimits_{F_2}(M_\theta \le x) \bigr| \le \theta (1 - G(x)) \, W_1(F_1, F_2). \]

Proof of Theorem 2. Fix a compact set $K$ contained in the interior of the support of $H_\gamma$. By Assumption 3, there exists a remainder $\delta_\theta(x)$ such that
\[ \theta \bigl(1 - G(b(\theta) + a(\theta) x)\bigr) = v_\gamma(x) + A(\theta) h_\gamma(x) + \delta_\theta(x), \quad x \in K, \]
with
\[ \sup_{x \in K} \frac{|\delta_\theta(x)|}{|A(\theta)|} \to 0. \]
Write
\[ \Delta_\theta(x) := A(\theta) h_\gamma(x) + \delta_\theta(x). \]
Since $h_\gamma$ is locally bounded and $K$ is compact, there exists $B_K < \infty$ such that $|h_\gamma(x)| \le B_K$ on $K$. Hence
\[ \sup_{x \in K} |\Delta_\theta(x)| = O(|A(\theta)|). \]
By (1),
\[ \Pr(Z_\theta \le x) = P_0\bigl(v_\gamma(x) + \Delta_\theta(x)\bigr), \quad x \in K. \]
Set
\[ m_K := \inf_{x \in K} v_\gamma(x) > 0, \qquad M_K := \sup_{x \in K} v_\gamma(x) < \infty. \]
For $t \ge m_K/2$, differentiation under the integral sign gives
\[ P_0'(t) = -E\bigl[X e^{-tX}\bigr], \qquad P_0''(t) = E\bigl[X^2 e^{-tX}\bigr]. \]
Moreover, for every $t > 0$,
\[ X^2 e^{-tX} \le \sup_{y \ge 0} y^2 e^{-y} \, t^{-2} = 4 e^{-2} t^{-2}, \]
so
\[ \sup_{t \ge m_K/2} |P_0''(t)| \le \frac{16 e^{-2}}{m_K^2} < \infty. \]
Therefore $P_0$ has uniformly bounded second derivative on a neighborhood of $\{v_\gamma(x) : x \in K\}$. A second-order Taylor expansion yields, uniformly for $x \in K$,
\[ P_0\bigl(v_\gamma(x) + \Delta_\theta(x)\bigr) = P_0\bigl(v_\gamma(x)\bigr) + P_0'\bigl(v_\gamma(x)\bigr) \Delta_\theta(x) + R_\theta(x), \]
where
\[ \sup_{x \in K} |R_\theta(x)| \le C_K \sup_{x \in K} |\Delta_\theta(x)|^2 = O\bigl(A(\theta)^2\bigr) = o\bigl(A(\theta)\bigr). \]
Since $P_0(v_\gamma(x)) = H_{\gamma,F}(x)$ and $P_0'(v_\gamma(x)) = -E[X e^{-v_\gamma(x) X}]$, we obtain
\[ \Pr(Z_\theta \le x) = H_{\gamma,F}(x) - A(\theta) h_\gamma(x) \, E\bigl[X e^{-v_\gamma(x) X}\bigr] + r_{\theta,K}(x), \]
where
\[ r_{\theta,K}(x) := -\delta_\theta(x) \, E\bigl[X e^{-v_\gamma(x) X}\bigr] + R_\theta(x). \]
Because $X e^{-v_\gamma(x) X} \le e^{-1}/m_K$ uniformly on $K$, the first term is $o(A(\theta))$ uniformly on $K$; the second term is already $o(A(\theta))$ uniformly on $K$. Hence
\[ \sup_{x \in K} \frac{|r_{\theta,K}(x)|}{|A(\theta)|} \to 0, \]
which proves the theorem.

A.6 Proofs for Section 5

Proof of Theorem 3. We write the design problem (3) in the linear case as
\[ \sup_{F : E_F[X] = 1} \left\{ \int \psi(x) \, F(dx) - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) \right\}. \]
Throughout, we restrict attention to $F \ll F_0$, since otherwise $D_{\mathrm{KL}}(F \,\|\, F_0) = +\infty$.

Step 1: weak duality. Fix $\eta \in \mathcal{D}$. For any feasible $F$ with $E_F[X] = 1$,
\[ \int \psi \, dF - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) = \int (\psi + \eta x) \, dF - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) - \eta. \]
Applying Theorem 5 with $P = F_0$ and $f = (\psi + \eta x)/\lambda$ gives
\[ \int (\psi + \eta x) \, dF - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) \le \lambda \log Z(\eta). \]
Hence, for every feasible $F$ and every $\eta \in \mathcal{D}$,
\[ \int \psi \, dF - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) \le \lambda \log Z(\eta) - \eta. \]
Taking the supremum over feasible $F$ and then the infimum over $\eta \in \mathcal{D}$ yields the weak-duality bound
\[ \sup_{F : E_F[X] = 1} \left\{ \int \psi \, dF - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) \right\} \le \inf_{\eta \in \mathcal{D}} \bigl\{ \lambda \log Z(\eta) - \eta \bigr\}. \]

Step 2: the dual objective is smooth and has a unique minimizer. Define
\[ J(\eta) := \lambda \log Z(\eta) - \eta, \quad \eta \in \mathcal{D}. \]
Let $I = [a, b] \subset \mathcal{D}$ be compact. For every $\eta \in I$ and every $x > 0$,
\[ \exp\!\left(\frac{\psi(x) + \eta x}{\lambda}\right) \le \exp\!\left(\frac{\psi(x) + b x}{\lambda}\right), \]
and the latter is integrable by Assumption 4. Similarly,
\[ x \exp\!\left(\frac{\psi(x) + \eta x}{\lambda}\right) \le x \exp\!\left(\frac{\psi(x) + b x}{\lambda}\right) \quad \text{and} \quad x^2 \exp\!\left(\frac{\psi(x) + \eta x}{\lambda}\right) \le x^2 \exp\!\left(\frac{\psi(x) + b x}{\lambda}\right), \]
with both dominating functions integrable. Therefore dominated convergence implies that $Z$ is twice continuously differentiable on $\mathcal{D}$, with
\[ Z'(\eta) = \frac{1}{\lambda} M_1(\eta), \qquad Z''(\eta) = \frac{1}{\lambda^2} M_2(\eta). \]
Hence
\[ J'(\eta) = \frac{M_1(\eta)}{Z(\eta)} - 1 = m(\eta) - 1, \]
and
\[ J''(\eta) = \frac{1}{\lambda} \operatorname{Var}_{F_\eta}(X) \ge 0, \]
where $F_\eta$ is the exponential tilt defined by
\[ \frac{dF_\eta}{dF_0}(x) = \frac{\exp\!\left(\frac{\psi(x) + \eta x}{\lambda}\right)}{Z(\eta)}. \]
We claim that in fact $\operatorname{Var}_{F_\eta}(X) > 0$ for every $\eta \in \mathcal{D}$. Indeed, if $\operatorname{Var}_{F_\eta}(X) = 0$, then $F_\eta$ would be a Dirac mass. But the density $dF_\eta/dF_0$ is strictly positive on the support of $F_0$, so this can happen only if $F_0$ itself is a Dirac mass. In that case $m(\eta) \equiv 1$, contradicting the assumption that there exist $\eta_-, \eta_+ \in \mathcal{D}$ with $m(\eta_-) < 1 < m(\eta_+)$. Therefore $J''(\eta) > 0$ for all $\eta \in \mathcal{D}$, so $J$ is strictly convex on $\mathcal{D}$ and $J'$ is strictly increasing and continuous. By Assumption 4,
\[ J'(\eta_-) = m(\eta_-) - 1 < 0 < m(\eta_+) - 1 = J'(\eta_+). \]
The intermediate value theorem therefore yields some $\eta^\star \in (\eta_-, \eta_+)$ such that $J'(\eta^\star) = 0$, that is, $m(\eta^\star) = 1$. Because $J'$ is strictly increasing, this root is unique. Since $J$ is strictly convex, the same $\eta^\star$ is the unique minimizer of $J$ on $\mathcal{D}$.

Step 3: construct the primal optimizer and prove strong duality. Let $F^\star := F_{\eta^\star}$. By construction,
\[ E_{F^\star}[X] = m(\eta^\star) = 1, \]
so $F^\star$ is feasible. Because Assumption 4 also gives
\[ \int |\psi(x)| \exp\!\left(\frac{\psi(x) + \eta^\star x}{\lambda}\right) F_0(dx) < \infty, \]
and because $E_{F^\star}[X] = 1$ implies
\[ \int x \exp\!\left(\frac{\psi(x) + \eta^\star x}{\lambda}\right) F_0(dx) = Z(\eta^\star) < \infty, \]
one also has $\int |f| e^{f} \, dF_0 < \infty$ for $f = (\psi + \eta^\star x)/\lambda$. Therefore Theorem 5 attains equality at $F^\star$ for that function $f$. Hence
\[ \int (\psi + \eta^\star x) \, dF^\star - \lambda D_{\mathrm{KL}}(F^\star \,\|\, F_0) = \lambda \log Z(\eta^\star). \]
Using $E_{F^\star}[X] = 1$, we obtain
\[ \int \psi \, dF^\star - \lambda D_{\mathrm{KL}}(F^\star \,\|\, F_0) = \lambda \log Z(\eta^\star) - \eta^\star = J(\eta^\star). \]
Because $\eta^\star$ minimizes $J$, this value coincides with the weak-duality upper bound. Therefore strong duality holds, $F^\star$ is optimal, and (5) and (4) follow.

Step 4: uniqueness of the primal optimizer. The map $F \mapsto \int \psi \, dF$ is linear, and $F \mapsto D_{\mathrm{KL}}(F \,\|\, F_0)$ is strictly convex on $\{F \ll F_0\}$.
Hence
\[ F \mapsto \int \psi \, dF - \lambda D_{\mathrm{KL}}(F \,\|\, F_0) \]
is strictly concave on the convex feasible set $\{F : E_F[X] = 1, \ F \ll F_0\}$. A strictly concave objective has at most one maximizer, so the optimal $F^\star$ is unique.

B Technical tools

B.1 Wasserstein geometry and duality

We collect standard facts about Wasserstein distance that are used throughout the paper. References include Villani (2009) and Santambrogio (2015).

Let $(\mathcal{X}, d)$ be a Polish metric space. For $p \ge 1$, write $\mathcal{P}_p(\mathcal{X})$ for the set of Borel probability measures on $\mathcal{X}$ with finite $p$th moment.

Definition 4 (Wasserstein distance). For $\mu, \nu \in \mathcal{P}_p(\mathcal{X})$,
\[ W_p(\mu, \nu) := \left( \inf_{\pi \in \Gamma(\mu, \nu)} \int d(x, y)^p \, \pi(dx, dy) \right)^{1/p}, \]
where $\Gamma(\mu, \nu)$ denotes the set of couplings of $\mu$ and $\nu$.

Lemma 6 (Upper bound from any coupling). Let $p \ge 1$ and let $U, V$ be random variables in $\mathcal{X}$. If $\mathcal{L}(U) = \mu$ and $\mathcal{L}(V) = \nu$, then
\[ W_p(\mu, \nu) \le \bigl(E[d(U, V)^p]\bigr)^{1/p}. \]

Proof. Let $\pi := \mathcal{L}(U, V) \in \Gamma(\mu, \nu)$. By Definition 4,
\[ W_p^p(\mu, \nu) \le \int d(x, y)^p \, \pi(dx, dy) = E[d(U, V)^p]. \]

Lemma 7 (One-dimensional quantile representation). Let $p \ge 1$ and let $\mu, \nu \in \mathcal{P}_p(\mathbb{R})$, where $\mathbb{R}$ is equipped with the metric $d(x, y) = |x - y|$. Let $F_\mu, F_\nu$ denote the distribution functions and define the quantile functions
\[ Q_\mu(u) := \inf\{x \in \mathbb{R} : F_\mu(x) \ge u\}, \quad u \in (0, 1), \]
and similarly for $Q_\nu$. Then
\[ W_p^p(\mu, \nu) = \int_0^1 |Q_\mu(u) - Q_\nu(u)|^p \, du. \]

Theorem 4 (Kantorovich-Rubinstein duality for $W_1$). Let $\mu, \nu \in \mathcal{P}_1(\mathcal{X})$. Then
\[ W_1(\mu, \nu) = \sup\left\{ \int f \, d\mu - \int f \, d\nu \ : \ f : \mathcal{X} \to \mathbb{R}, \ \|f\|_{\mathrm{Lip}} \le 1 \right\}, \]
where $\|f\|_{\mathrm{Lip}} := \sup_{x \ne y} \frac{|f(x) - f(y)|}{d(x, y)}$.

Corollary 7 (Lipschitz test functions). Let $\mu, \nu \in \mathcal{P}_1(\mathcal{X})$ and let $f : \mathcal{X} \to \mathbb{R}$ be $L$-Lipschitz. Then
\[ \left| \int f \, d\mu - \int f \, d\nu \right| \le L \, W_1(\mu, \nu). \]

Proof. If $L = 0$, the claim is immediate.
Otherwise, $f/L$ has Lipschitz seminorm at most 1, and Theorem 4 yields
\[ \left| \int f \, d\mu - \int f \, d\nu \right| = L \left| \int (f/L) \, d\mu - \int (f/L) \, d\nu \right| \le L \, W_1(\mu, \nu). \]

Lemma 8 (Deterministic Lipschitz pushforward). Let $p \ge 1$, let $\mu, \nu \in \mathcal{P}_p(\mathcal{X})$, and let $g : \mathcal{X} \to \mathcal{Y}$ be $L$-Lipschitz between metric spaces $(\mathcal{X}, d_{\mathcal{X}})$ and $(\mathcal{Y}, d_{\mathcal{Y}})$. Then the pushforward measures $g_{\#}\mu$ and $g_{\#}\nu$ belong to $\mathcal{P}_p(\mathcal{Y})$ and satisfy
\[ W_p(g_{\#}\mu, g_{\#}\nu) \le L \, W_p(\mu, \nu). \]

Proof. Let $\pi \in \Gamma(\mu, \nu)$ and let $(U, V) \sim \pi$. Then $(g(U), g(V))$ is a coupling of $(g_{\#}\mu, g_{\#}\nu)$ and
\[ E\bigl[d_{\mathcal{Y}}(g(U), g(V))^p\bigr] \le L^p \, E\bigl[d_{\mathcal{X}}(U, V)^p\bigr]. \]
Taking the infimum over $\pi \in \Gamma(\mu, \nu)$ and then $p$th roots yields the result.

A central step in Section 4 uses a coupling where the mapping is Lipschitz conditional on a random environment. The following lemma isolates this idea.

Lemma 9 (Random Lipschitz contraction). Let $p \ge 1$. Let $U_1, U_2$ be real-valued random variables with laws $\mu_1, \mu_2 \in \mathcal{P}_p(\mathbb{R})$. Let $V$ be an auxiliary random variable taking values in a measurable space $\mathcal{V}$, independent of $(U_1, U_2)$. Let $\phi : \mathbb{R} \times \mathcal{V} \to \mathbb{R}$ be measurable and assume there exists a measurable function $L : \mathcal{V} \to [0, \infty)$ such that
\[ |\phi(u, v) - \phi(u', v)| \le L(v) \, |u - u'| \quad \text{for all } u, u' \in \mathbb{R} \text{ and all } v \in \mathcal{V}, \]
and $E[L(V)^p] < \infty$. Then
\[ W_p\bigl(\mathcal{L}(\phi(U_1, V)), \mathcal{L}(\phi(U_2, V))\bigr) \le \bigl(E[L(V)^p]\bigr)^{1/p} \, W_p(\mu_1, \mu_2). \]

Proof. Let $(\tilde{U}_1, \tilde{U}_2)$ be an optimal coupling of $\mu_1$ and $\mu_2$. Enlarge the probability space so that $V$ is independent of $(\tilde{U}_1, \tilde{U}_2)$ and has the same law as in the statement. Define $\tilde{Z}_i := \phi(\tilde{U}_i, V)$ for $i \in \{1, 2\}$, so that $\mathcal{L}(\tilde{Z}_i) = \mathcal{L}(\phi(U_i, V))$. By Lemma 6, the Lipschitz property of $\phi$, and the independence of $V$ from $(\tilde{U}_1, \tilde{U}_2)$,
\[ W_p^p\bigl(\mathcal{L}(\phi(U_1, V)), \mathcal{L}(\phi(U_2, V))\bigr) \le E\bigl[|\tilde{Z}_1 - \tilde{Z}_2|^p\bigr] \le E[L(V)^p] \, E\bigl[|\tilde{U}_1 - \tilde{U}_2|^p\bigr]. \]
Since $(\tilde{U}_1, \tilde{U}_2)$ is optimal, $E[|\tilde{U}_1 - \tilde{U}_2|^p] = W_p^p(\mu_1, \mu_2)$, and taking $p$th roots yields the claim.

Remark 23. Lemma 9 is applied in Section 4 with random transformations induced by the canonical coupling representation in Section 4.2. In particular, it is used with maps of the form $\phi(u, v) = (uv - 1)/\gamma$ when $\gamma \ne 0$ and $\phi(u, v) = u - \log v$ when $\gamma = 0$.

B.2 Entropic variational identities

Let $P$ be a probability measure on a measurable space $(\mathcal{X}, \mathcal{F})$. The basic variational identity below is the Donsker-Varadhan formula; see Donsker and Varadhan (1975). For $Q \ll P$, define the relative entropy
\[ D(Q \,\|\, P) := \int \log\!\left(\frac{dQ}{dP}\right) dQ, \]
and set $D(Q \,\|\, P) = +\infty$ if $Q$ is not absolutely continuous with respect to $P$.

Theorem 5 (Donsker-Varadhan variational formula). Let $f : \mathcal{X} \to \mathbb{R}$ be measurable and assume $\int e^{f} \, dP < \infty$. Then
\[ \log \int e^{f} \, dP = \sup_{Q \in \mathcal{P}(\mathcal{X})} \left\{ \int f \, dQ - D(Q \,\|\, P) \right\}. \]
Define the Gibbs tilt $Q^\star$ by
\[ \frac{dQ^\star}{dP}(x) = \frac{e^{f(x)}}{\int e^{f} \, dP}. \]
If, in addition, $\int |f| e^{f} \, dP < \infty$ (equivalently, $\int |f| \, dQ^\star < \infty$ and $D(Q^\star \,\|\, P) < \infty$), then the supremum is attained at $Q^\star$.

Corollary 8 (Entropy penalization in minimization form). Let $g : \mathcal{X} \to \mathbb{R}$ be measurable and assume $\int e^{-g} \, dP < \infty$. Then
\[ -\log \int e^{-g} \, dP = \inf_{Q \in \mathcal{P}(\mathcal{X})} \left\{ \int g \, dQ + D(Q \,\|\, P) \right\}. \]
Define the Gibbs tilt $Q^\star$ by
\[ \frac{dQ^\star}{dP}(x) = \frac{e^{-g(x)}}{\int e^{-g} \, dP}. \]
If, in addition, $\int |g| e^{-g} \, dP < \infty$, then the infimum is attained at $Q^\star$.

Lemma 10 (Entropy penalization with linear constraints). Let $P \in \mathcal{P}(\mathcal{X})$ and let $g_0, g_1, \dots, g_m : \mathcal{X} \to \mathbb{R}$ be measurable. Fix $\varepsilon > 0$ and constants $c_1, \dots, c_m \in \mathbb{R}$. Assume the feasible set
\[ \mathcal{Q} := \left\{ Q \in \mathcal{P}(\mathcal{X}) : \int g_k \, dQ = c_k, \ k = 1, \dots, m \right\} \]
is nonempty and contains some $Q_0 \ll P$ with $D(Q_0 \,\|\, P) < \infty$ and $\int |g_0| \, dQ_0 < \infty$. Let
\[ \mathcal{C} := \left\{ \left( \int g_1 \, dQ, \dots, \int g_m \, dQ \right) : Q \in \mathcal{P}(\mathcal{X}), \ Q \ll P, \ D(Q \,\|\, P) < \infty, \ \int |g_0| \, dQ < \infty \right\} \subseteq \mathbb{R}^m. \]
Assume the target moment vector $c := (c_1, \dots, c_m)$ belongs to the relative interior $\operatorname{ri}(\mathcal{C})$. Assume also that the value of the problem
\[ \inf_{Q \in \mathcal{Q}} \left\{ \int g_0 \, dQ + \varepsilon D(Q \,\|\, P) \right\} \]
is finite and attained at some $Q^\star$. Then there exist finite Lagrange multipliers $\theta_1, \dots, \theta_m \in \mathbb{R}$ such that $Q^\star$ admits the exponential-tilt representation
\[ \frac{dQ^\star}{dP}(x) = \frac{\exp\!\left(-\frac{1}{\varepsilon}\bigl(g_0(x) + \sum_{k=1}^m \theta_k g_k(x)\bigr)\right)}{\int \exp\!\left(-\frac{1}{\varepsilon}\bigl(g_0 + \sum_{k=1}^m \theta_k g_k\bigr)\right) dP}. \]

Remark 24. Lemma 10 is in fact a general constrained-entropy projection result. Section 5 uses a sharper one-dimensional argument tailored to the mean constraint $E_F[X] = 1$ so that the existence of a finite dual optimizer can be stated directly in terms of the tilted mean.

B.3 Background on entropic optimal transport

Let $\mu \in \mathcal{P}(\mathcal{X})$ and $\nu \in \mathcal{P}(\mathcal{Y})$ be Borel probability measures on Polish spaces and let $c : \mathcal{X} \times \mathcal{Y} \to \mathbb{R} \cup \{+\infty\}$ be a measurable cost. Let $\Gamma(\mu, \nu)$ denote the set of couplings with marginals $(\mu, \nu)$. For $\varepsilon > 0$, define the entropic optimal transport problem
\[ \inf_{\pi \in \Gamma(\mu, \nu)} \left\{ \int c \, d\pi + \varepsilon D(\pi \,\|\, \mu \otimes \nu) \right\}. \]
We use only structural consequences. The relevant ingredients are the duality and the Gibbs form of optimizers. See, e.g., Léonard (2014), Peyré and Cuturi (2019), Ghosal et al. (2022), and Nutz (2022); and see also Cuturi (2013) for the computational Sinkhorn formulation that made entropy regularization central in applied optimal transport.

Theorem 6 (Entropic optimal transport duality and Gibbs form). Assume the primal problem is finite for some $\varepsilon > 0$ and that an optimizer exists. Then there exist measurable functions $\varphi : \mathcal{X} \to \mathbb{R}$ and $\psi : \mathcal{Y} \to \mathbb{R}$ such that an optimal coupling $\pi^\star$ admits a density with respect to $\mu \otimes \nu$ of the form
\[ \frac{d\pi^\star}{d(\mu \otimes \nu)}(x, y) = \exp\!\left(\frac{\varphi(x) + \psi(y) - c(x, y)}{\varepsilon}\right), \]
with $(\varphi, \psi)$ chosen so that $\pi^\star \in \Gamma(\mu, \nu)$. Moreover, $(\varphi, \psi)$ solve a concave dual maximization problem obtained by Fenchel duality.

Remark 25.
Note that Theorem 6 is stated at a level sufficient for our purposes. Section 5 uses the same convex-duality logic in a reduced setting where the penalty is relative entropy with respect to a baseline $F_0$ and the decision variable is the marginal distribution $F$. In that sense, the planner problem can be read as a one-marginal entropy projection derived from the broader Schrödinger and entropic-transport framework.

References

Arnold, B. C., N. Balakrishnan, and H. N. Nagaraja (1998): Records, New York: Wiley.

Balkema, A. A. and L. de Haan (1974): “Residual Life Time at Great Age,” The Annals of Probability, 2, 792–804.

Barndorff-Nielsen, O. (1964): “On the Limit Distribution of the Maximum of a Random Number of Independent Random Variables,” Acta Mathematica Academiae Scientiarum Hungaricae, 15, 399–403.

Becker, L. and S. Mangin (2023): “The Effect of Search Frictions on Extreme Outcomes,” Working paper, originally circulated in 2023.

Bobbia, B., C. Dombry, and D. Varron (2021): “The Coupling Method in Extreme Value Theory,” Bernoulli, 27, 1824–1850.

Bolte, L., N. Immorlica, and M. O. Jackson (2024): “The Role of Referrals in Immobility, Inequality, and Inefficiency in Labor Markets,” Journal of Labor Economics, advance online publication / forthcoming in Journal of Labor Economics.

Buhai, I. S. and M. J. van der Leij (2023): “A Social Network Analysis of Occupational Segregation,” Journal of Economic Dynamics and Control, 147, 104593.

Calvó-Armengol, A. and M. O. Jackson (2004): “The Effects of Social Networks on Employment and Inequality,” American Economic Review, 94, 426–454.

Cuturi, M. (2013): “Sinkhorn Distances: Lightspeed Computation of Optimal Transport,” in Advances in Neural Information Processing Systems 26, 2292–2300.

de Haan, L. and A. Ferreira (2006): Extreme Value Theory: An Introduction, New York: Springer.

Donsker, M. D. and S. R. S. Varadhan (1975): “Asymptotic Evaluation of Certain Markov Process Expectations for Large Time, I,” Communications on Pure and Applied Mathematics, 28, 1–47.

Einmahl, J. H. J. and Y. He (2023): “Extreme Value Estimation for Heterogeneous Data,” Journal of Business & Economic Statistics, 41, 255–269.

Fournier, N. and A. Guillin (2015): “On the Rate of Convergence in Wasserstein Distance of the Empirical Measure,” Probability Theory and Related Fields, 162, 707–738.

Galambos, J. (1973): “The Distribution of the Maximum of a Random Number of Random Variables with Applications,” Journal of Applied Probability, 10, 122–129.

Galichon, A. (2016): Optimal Transport Methods in Economics, Princeton, NJ: Princeton University Press.

Ghosal, P., M. Nutz, and E. Bernton (2022): “Stability of Entropic Optimal Transport and Schrödinger Bridges,” Journal of Functional Analysis, 283, 109622.

Granovetter, M. S. (1973): “The Strength of Weak Ties,” American Journal of Sociology, 78, 1360–1380.

Ioannides, Y. M. and L. D. Loury (2004): “Job Information Networks, Neighborhood Effects, and Inequality,” Journal of Economic Literature, 42, 1056–1093.

Léonard, C. (2014): “A Survey of the Schrödinger Problem and Some of Its Connections with Optimal Transport,” Discrete & Continuous Dynamical Systems - A, 34, 1533–1574.

Mangin, S. (2026): “Extreme Value Theory with Heterogeneous Agents,” Econometrica, forthcoming.

Mansanarez, P., G. Poly, and Y. Swan (2025): “Stein’s Method for Fréchet Approximation: A Regularly Varying Functions Approach,” arXiv preprint arXiv:2510.14016.

Megyesi, Z. (2002): “Domains of Geometric Partial Attraction of Max-Semistable Laws: Structure, Merge and Almost Sure Limit Theorems,” Journal of Theoretical Probability, 15, 973–1005.

Montgomery, J. D. (1991): “Social Networks and Labor-Market Outcomes: Toward an Economic Analysis,” American Economic Review, 81, 1408–1418.

Nutz, M. (2022): “Introduction to Entropic Optimal Transport,” Lecture notes, available online.

Peyré, G. and M. Cuturi (2019): “Computational Optimal Transport,” Foundations and Trends in Machine Learning, 11, 355–607.

Pickands, J. (1975): “Statistical Inference Using Extreme Order Statistics,” The Annals of Statistics, 3, 119–131.

Rachev, S. T. and S. I. Resnick (1991): “Max-Geometric Infinite Divisibility and Stability,” Stochastic Models, 7, 191–218.

Resnick, S. I. (2008): Extreme Values, Regular Variation and Point Processes, New York: Springer, 2 ed.

Rothschild, M. and J. E. Stiglitz (1970): “Increasing Risk: I. A Definition,” Journal of Economic Theory, 2, 225–243.

Santambrogio, F. (2015): Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling, vol. 87 of Progress in Nonlinear Differential Equations and Their Applications, Basel: Birkhäuser.

Shaked, M. and J. G. Shanthikumar (2007): Stochastic Orders, Springer Series in Statistics, New York: Springer.

Silvestrov, D. and J. L. Teugels (1998): “Limit Theorems for Extremes with Random Sample Size,” Advances in Applied Probability, 30, 777–806.

Topa, G. (2001): “Social Interactions, Local Spillovers and Unemployment,” Review of Economic Studies, 68, 261–295.

Villani, C. (2009): Optimal Transport: Old and New, vol. 338 of Grundlehren der mathematischen Wissenschaften, Springer.
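
Numerical illustration of the canonical coupling. The coupling $Z_{\gamma,F} = w_\gamma(E/X)$ of Proposition 8 and the Gumbel-case bound of Theorem 1 lend themselves to a quick simulation check. The following is a minimal sketch rather than part of the formal argument. It assumes Gamma-distributed mean-one types with shape $k$ (an illustrative choice that does not come from the text), for which $P_0(z) = (1 + z/k)^{-k}$ in closed form, and it takes $v_\gamma(x) = (1+\gamma x)^{-1/\gamma}$, $v_0(x) = e^{-x}$ as in the proof of Proposition 4. The helper names `w_gamma`, `v_gamma`, and `w1_empirical` are hypothetical.

```python
import numpy as np

def w_gamma(t, gamma):
    """w_gamma(t) = (t**(-gamma) - 1)/gamma for gamma != 0, and -log(t) for gamma = 0."""
    t = np.asarray(t, dtype=float)
    return -np.log(t) if gamma == 0 else (t ** (-gamma) - 1.0) / gamma

def v_gamma(x, gamma):
    """v_gamma(x) = (1 + gamma*x)**(-1/gamma) where 1 + gamma*x > 0, and exp(-x) for gamma = 0."""
    x = np.asarray(x, dtype=float)
    return np.exp(-x) if gamma == 0 else (1.0 + gamma * x) ** (-1.0 / gamma)

def w1_empirical(a, b):
    """Empirical W_1 between equal-size samples via the quantile (sorted) coupling."""
    return float(np.mean(np.abs(np.sort(a) - np.sort(b))))

rng = np.random.default_rng(0)
n = 200_000

# --- Proposition 8: the law of w_gamma(E/X) should have cdf P_0(v_gamma(x)) ---
k = 2.0                                        # assumed Gamma shape; scale 1/k gives mean one
X = rng.gamma(shape=k, scale=1.0 / k, size=n)
E = rng.exponential(size=n)                    # Exp(1) shock, independent of X
gamma = 0.5
Z = w_gamma(E / X, gamma)

xs = np.array([-0.5, 0.0, 1.0, 3.0])           # evaluation points with 1 + gamma*x > 0
ecdf = np.array([(Z <= x).mean() for x in xs])
target = (1.0 + v_gamma(xs, gamma) / k) ** (-k)   # P_0(v_gamma(x)) for Gamma types
max_cdf_gap = float(np.max(np.abs(ecdf - target)))

# --- Theorem 1, gamma = 0: W_1(H_{0,F1}, H_{0,F2}) <= d_{0,1}(F1, F2) ---
X1 = np.sort(rng.gamma(shape=2.0, scale=0.5, size=n))   # sorted: comonotone pairing of types
X2 = np.sort(rng.gamma(shape=5.0, scale=0.2, size=n))
E_common = rng.exponential(size=n)             # one shock shared by both type samples
Z1 = w_gamma(E_common / X1, 0.0)               # equals log X1 - log E
Z2 = w_gamma(E_common / X2, 0.0)
lhs = w1_empirical(Z1, Z2)                     # sample analogue of W_1 between extreme laws
rhs = w1_empirical(np.log(X1), np.log(X2))     # sample analogue of d_{0,1}(F1, F2)

print(f"max |ecdf - P_0(v_gamma)| = {max_cdf_gap:.4f}")
print(f"W_1 of extremes = {lhs:.4f} <= adapted distance = {rhs:.4f}")
```

Pairing the sorted type samples with a common shock mirrors the construction in the proof of Theorem 1, so the sample inequality holds by the same argument as in the population; the cdf comparison, by contrast, is only accurate up to Monte Carlo error.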

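
Numerical illustration of the entropic design problem. The proof of Theorem 3 reduces the design problem to finding the root of $m(\eta) = 1$ and forming the exponential tilt $F_{\eta^\star}$. The sketch below works through that recipe on a finite grid, as an illustration under stated assumptions and not as the paper's implementation. The five-point baseline $F_0$, the penalty $\lambda = 0.5$, and the score $\psi(u) = 1 - e^{-v u}$ with $v = 1$ (the probability that the normalized extreme exceeds a threshold, which is linear in $F$) are illustrative choices that do not come from the text, as are the helper names `log_partition`, `tilt`, and `solve_eta`.

```python
import numpy as np

def log_partition(eta, x, f0, psi, lam):
    """log Z(eta) with Z(eta) = sum_j f0_j * exp((psi_j + eta*x_j)/lam), computed stably."""
    a = (psi + eta * x) / lam
    m = a.max()
    return float(m + np.log(np.sum(f0 * np.exp(a - m))))

def tilt(eta, x, f0, psi, lam):
    """Exponential tilt F_eta with dF_eta/dF_0 = exp((psi + eta*x)/lam) / Z(eta)."""
    a = (psi + eta * x) / lam
    return f0 * np.exp(a - log_partition(eta, x, f0, psi, lam))

def solve_eta(x, f0, psi, lam, lo=-50.0, hi=50.0, tol=1e-12):
    """Bisection on the tilted mean m(eta), which is strictly increasing in eta."""
    mean = lambda eta: float(tilt(eta, x, f0, psi, lam) @ x)
    if not (mean(lo) < 1.0 < mean(hi)):
        raise ValueError("bracket does not satisfy m(lo) < 1 < m(hi)")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed discrete baseline on five support points (its mean need not equal one).
x = np.array([0.25, 0.5, 1.0, 2.0, 4.0])
f0 = np.full(x.size, 1.0 / x.size)
lam = 0.5
v = 1.0
psi = 1.0 - np.exp(-v * x)          # illustrative linear objective: exceedance probability

eta_star = solve_eta(x, f0, psi, lam)
F_star = tilt(eta_star, x, f0, psi, lam)

# Primal value at F_star and dual value J(eta_star) = lam*log Z(eta_star) - eta_star.
kl = float(np.sum(F_star * np.log(F_star / f0)))
primal = float(F_star @ psi) - lam * kl
dual = lam * log_partition(eta_star, x, f0, psi, lam) - eta_star

print(f"eta* = {eta_star:.6f}, tilted mean = {F_star @ x:.6f}")
print(f"primal = {primal:.8f}, dual = {dual:.8f}")
```

Bisection is used here only because $J'(\eta) = m(\eta) - 1$ is strictly increasing, which is what Step 2 of the proof establishes; any one-dimensional root finder would serve equally well. Agreement of the primal and dual values is the finite-grid counterpart of the strong-duality statement in Step 3.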