On the $q$-integrability of $p$-Wasserstein barycenters

Camilla Brizzi and Lorenzo Portinale

Date: February 23, 2026.
2020 Mathematics Subject Classification. Primary 49Q20; Secondary 49J40, 49K21.
Key words and phrases. Wasserstein barycenter, optimal transport, multi-marginal optimal transport.

Abstract. We study the $L^q$-regularity of the density of barycenters of $N$ probability measures on $\mathbb{R}^d$ with respect to the $p$-Wasserstein metric ($1 < p < \infty$). According to a previous result by the first author and collaborators [BFR25], if one marginal is absolutely continuous, so is the $W_p$-barycenter. The next natural question is whether the $L^q$-regularity of the marginals is also preserved for any $q > 1$, as in the classical case ($p = 2$) of Agueh–Carlier [AC11], or for $W_p$-geodesics ($N = 2$). Here we prove that this is the case if one marginal belongs to $L^q$ and the supports of all the marginals satisfy suitable geometric assumptions. However, we show that, as soon as $N > 2$, it is possible to find examples of $W_p$-barycenters which are not $q$-integrable, even if one marginal is compactly supported and bounded, thus highlighting the role played by the geometry of the supports. Furthermore, we provide a general estimate of the $L^q$-norm, including a detailed study of the sources of singularities, and a characterization of the $W_p$-barycenters à la Agueh–Carlier in terms of the associated Kantorovich potentials. Finally, we explicitly compute the $W_p$-barycenters of measures obtained as push-forward of special affine transformations. In this case, regularity holds without any additional requirement on the supports.

Contents
1. Introduction
2. Preliminaries
3. $L^q$-regularity (and counterexamples) of the $p$-Wasserstein barycenter
4. A general estimate on the $L^q$-norm of $p$-Wasserstein barycenters
5. Optimal maps for $p$-Wasserstein distance and barycenters via affine transformations
Appendix A. Injectivity estimate and proof of Theorem 2.4
References

1. Introduction

Wasserstein barycenters, first defined in [AC11], are an important generalization of the classical notion of barycenters of points, as they can be seen as Fréchet means over the Wasserstein space of probability measures. The concept rapidly gained popularity as a valuable tool for meaningful geometric interpolation between probability measures and computation of representative summaries of input datasets. Thus, applications of Wasserstein barycenters span several domains such as data science, statistics, machine learning and image processing (see, among others, [RPDB11, BLGL15, BVFRT22, PZ20]). This has sparked growing interest across multiple mathematical communities, making research on this topic, from both theoretical and applied perspectives, highly active.

In this article, we study regularity and integrability properties of $p$-Wasserstein barycenters, which are the natural generalizations of the 2-Wasserstein barycenters introduced by Agueh and Carlier [AC11] to the $p$-Wasserstein distance for $1 < p < \infty$, and which have been extensively studied in [BFR25]. We refer to the work [CCE24] for the case of the so-called Wasserstein medians, corresponding to $p = 1$, which, due to the lack of strict convexity, presents additional difficulties. Further generalizations of the metric include the $h$-Wasserstein distance, with $h$ being a nonnegative strictly convex function (see [BFR26]), and variants of OT, such as unbalanced OT (see [FMS21, Buz25]) or entropic OT (see, for instance, [CD14, Chi25]).
Another active research direction, which goes beyond the scope of our paper, is the generalization of the 2-Wasserstein barycenters to more general spaces, such as Riemannian manifolds [KP17], Alexandrov spaces [Jia17], Radon spaces [Kro18], and abstract Wiener spaces [HLZ24].

The $p$-Wasserstein barycenters are weighted averages of probability measures in $\mathbb{R}^d$ with respect to the $p$-Wasserstein metric. The $p$-Wasserstein distance $W_p$ between two probability measures $\mu$ and $\nu$ with finite $p$th moments [footnote 1] is defined as
\[ W_p(\mu, \nu) := \min_{\eta \in \Pi(\mu,\nu)} \Big( \int_{\mathbb{R}^{2d}} |x - y|^p \, d\eta(x, y) \Big)^{\frac{1}{p}}, \]
where $\Pi(\mu, \nu)$ denotes the set of transport plans from $\mu$ to $\nu$, i.e.
\[ \Pi(\mu, \nu) := \big\{ \eta \in \mathcal{P}(\mathbb{R}^{2d}) : (\pi_1)_\# \eta = \mu, \ (\pi_2)_\# \eta = \nu \big\}. \]
Thus, given $N \ge 2$ probability measures $\mu_1, \dots, \mu_N \in \mathcal{P}_p(\mathbb{R}^d)$, which we refer to as marginals, and given weights $\lambda_1, \dots, \lambda_N > 0$ such that $\sum_{i=1}^N \lambda_i = 1$, the $p$-Wasserstein barycenter is the solution [footnote 2]
\[ W_p\mathrm{bar}\big((\mu_i, \lambda_i)_{i=1,\dots,N}\big) := \operatorname*{argmin}_{\nu \in \mathcal{P}_p(\mathbb{R}^d)} \sum_{i=1}^N \lambda_i W_p^p(\mu_i, \nu). \tag{$W_p$bar} \]
Notice that, in the case $N = 2$, the $p$-Wasserstein barycenter is nothing but the $p$-Wasserstein geodesic parametrized with a suitably rescaled time (see (3.1)). In this special case the $p$-Wasserstein geodesic is not only absolutely continuous under the assumption that the first marginal is absolutely continuous, but it also inherits finer regularity properties, such as $L^q$-regularity of the density, as discussed in [San15, Lemma 4.22]. A natural question is therefore whether this result also holds when $N \ge 3$. When $p = 2$, it has been shown in [AC11] that the $L^\infty$-regularity of the density of one marginal is preserved by the Wasserstein barycenter. Even if not explicitly written there, the same proof can easily be used to show that any $L^q$-regularity, with $q \ge 1$, is preserved. The case $p \neq 2$ is much more delicate.
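For two Dirac masses the definition ($W_p$bar) can be checked by hand, which gives a concrete feel for the rescaled time in the case $N = 2$. The sketch below is an illustration added here and is not taken from the paper. It assumes $\mu_1 = \delta_a$ and $\mu_2 = \delta_b$ on the real line, for which $W_p^p(\delta_a, \nu) = \int |a - y|^p \, d\nu(y)$, so that the barycenter is $\delta_z$ with $z$ minimizing $\lambda_1 |a - z|^p + \lambda_2 |b - z|^p$. Setting the derivative to zero gives $z = (1 - t) a + t b$ with $t = \lambda_2^{1/(p-1)} / (\lambda_1^{1/(p-1)} + \lambda_2^{1/(p-1)})$. The names `rescaled_time`, `two_point_cost` and `grid_argmin` are hypothetical helpers, and the grid search is only a crude sanity check of the closed form.

```python
def rescaled_time(lam1, lam2, p):
    """Return t in (0, 1) such that z = (1 - t) * a + t * b minimizes
    lam1 * |a - z|**p + lam2 * |b - z|**p, for p > 1 and positive weights."""
    e = 1.0 / (p - 1.0)
    return lam2 ** e / (lam1 ** e + lam2 ** e)


def two_point_cost(z, a, b, lam1, lam2, p):
    """Weighted p-cost of moving both Dirac masses to the point z."""
    return lam1 * abs(a - z) ** p + lam2 * abs(b - z) ** p


def grid_argmin(a, b, lam1, lam2, p, n=20001):
    """Brute-force minimizer of the cost on a uniform grid of [a, b] (a < b)."""
    grid = [a + (b - a) * k / (n - 1) for k in range(n)]
    return min(grid, key=lambda z: two_point_cost(z, a, b, lam1, lam2, p))


if __name__ == "__main__":
    a, b, lam1, lam2 = 0.0, 1.0, 0.25, 0.75
    for p in (1.5, 2.0, 3.0):
        t = rescaled_time(lam1, lam2, p)
        z_closed = (1.0 - t) * a + t * b
        z_grid = grid_argmin(a, b, lam1, lam2, p)
        print(f"p={p}: closed form z={z_closed:.4f}, grid z={z_grid:.4f}")
```

For $p = 2$ and $\lambda_1 + \lambda_2 = 1$ the time reduces to $t = \lambda_2$, the usual linear interpolation. For $p > 2$ the exponent $1/(p-1)$ is smaller than $1$ and $t$ moves towards $1/2$, while for $1 < p < 2$ it moves towards the heavier mass.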
The main difference with respect to the case $p = 2$ is that the Hessian of the $p$-cost function $|\cdot|^p$ is a constant multiple of the identity matrix when $p = 2$, while for any $p \neq 2$ it is not constant: it degenerates at $0$ for $p > 2$, whereas it is singular there for $1 < p < 2$. One of the key results in [BFR25] states that, if all the measures are absolutely continuous with respect to the Lebesgue measure $\mathcal{L}^d$, then $\nu_p := W_p\mathrm{bar}((\mu_i, \lambda_i)_{i=1,\dots,N})$ is in turn absolutely continuous (cf. [BFR25, Theorem 4.1]). Moreover, for $p \ge 2$, they show that it is sufficient to have only one absolutely continuous marginal, say $\mu_1$. Here, with simple considerations, we extend this result to every $1 < p < \infty$ (cf. Theorem 2.4 in the next section). In [BFR25], the problem of a possibly singular Hessian is overcome by partitioning the space and developing an ad hoc strategy to treat the singularities.

In contrast, when dealing with $L^q$ regularity for $q > 1$, it turns out that the bare assumption $\mu_1 \in L^q(\mathbb{R}^d)$ (or even $\mu_1 \in L^\infty(\mathbb{R}^d)$) does not suffice to guarantee $L^q$ integrability for $\nu_p$. In Example 3.3, we show that, even in simple cases, the $L^q$ regularity of the barycenter may fail if $q$ is bigger than a natural threshold which depends on $p$. In the aforementioned examples, it is evident that the lack of integrability may arise from the potential degeneracy/singularity of the Euclidean $p$-barycenter map when $p \neq 2$, due in turn to the previously discussed behavior of the Hessian of $|\cdot|^p$. Nonetheless, this potential source of singularity can be overcome with tailored assumptions on the supports of the reference measures. In particular, we assume compactness and some geometric properties which guarantee a positive distance between the supports of the marginals $(\mu_i)_{i=1,\dots,N}$ and the support of the barycenter. Under these assumptions, we show in Theorem 1.1 that, given $f_1, g_p \in L^1$ such that $\mu_1 = f_1 \mathcal{L}^d$ and $\nu_p = g_p \mathcal{L}^d$, if $f_1 \in L^q$, then there exists a constant $C > 0$ such that
\[ \|g_p\|_{L^q} \le \frac{C}{\lambda_1^{d(1-\alpha_p)}} \|f_1\|_{L^q}. \tag{1.1} \]
We refer to the next section for a more detailed discussion. This analysis is coherent with the result by M. Goldman and L. Koch in [GK25], where the partial $C^{1,\alpha}$ regularity for the optimal map of the $p$-Wasserstein distance is proved to hold away from fixed points, i.e. points $x$ such that $T(x) = x$.

However, we provide a class of examples where the Euclidean $p$-barycenter is uniformly bounded from below and never degenerates, and therefore the estimate (1.1) on the $L^q$-norm of the density of the barycenter holds without any assumption on the supports. This is the case of barycenters of marginals defined as affine transformations of one source measure, discussed in the final section of this paper. It would certainly be very interesting to provide a more complete overview of which settings ensure integrability and, most of all, to provide a counterexample (if any exists) exhibiting the nonintegrability of the $p$-Wasserstein barycenter with $p \neq 2$ when all the marginals are absolutely continuous, which remains for the time being out of reach of the current work.

Footnote 1. The set of probability measures with finite $p$th moments is $\mathcal{P}_p(\mathbb{R}^d) := \big\{ \mu \in \mathcal{P}(\mathbb{R}^d) : \int_{\mathbb{R}^d} |x|^p \, d\mu(x) < \infty \big\}$.

Footnote 2. Existence of a solution here can be inferred by a straightforward adaptation of [AC11, Proposition 2.3] or by using the multi-marginal formulation, see ($C_p$-MM) below.

1.1. Main results and strategy.
The first contribution of this paper is the $L^q$ integrability of $p$-Wasserstein barycenters for compactly supported measures with $\mu_1 \in L^q$ when the measures satisfy suitable geometric conditions, which are discussed below. Throughout this section, we shall always assume that $\mu_1 \ll \mathcal{L}^d$.

Integrability for distant supports. We consider the following set of assumptions:
• Compact supports: there exists $M > 0$ such that
\[ \operatorname{spt} \mu_i \subset B_{M/2} \quad \text{for every } i = 1, \dots, N. \tag{Cpt} \]
• Distant supports I: it holds
\[ D := \inf_{(x_1, \dots, x_N) \in \prod_{i=1}^N \operatorname{spt} \mu_i} \big| x_p(x_1, x_2, \dots, x_N) - x_p(x_2, \dots, x_N) \big| > 0. \tag{hp1} \]
• Distant supports II: it holds
\[ m := \inf_{i \in \{1, \dots, N\}} \ \inf_{(x_1, \dots, x_N) \in \prod_{i=1}^N \operatorname{spt} \mu_i} \big| x_i - x_p(x_1, x_2, \dots, x_N) \big| > 0. \tag{hp2} \]
Note that $D = D(\mu_1, \dots, \mu_N)$, and the same holds for $m$, but for simplicity we omit the dependence. In case of compact supports, (hp1) is equivalent (see the discussion at the beginning of Subsection 3.2.1) to
\[ \operatorname{dist}\Big( \operatorname{spt} \mu_1, \ x_p\Big( \prod_{i=2}^N \operatorname{spt} \mu_i \Big) \Big) > 0. \]
Thus, in case of compactly supported measures, one can immediately see that (hp1) is weaker than (hp2) (which is a requirement on every marginal, not only $\mu_1$), which is in turn equivalent to
\[ \operatorname{dist}\Big( \operatorname{spt} \mu_j, \ x_p\Big( \prod_{i \neq j} \operatorname{spt} \mu_i \Big) \Big) > 0 \quad \text{for every } j = 1, \dots, N. \]
The latter, stronger condition is needed to ensure $q$-integrability for the $p$-barycenter when $p \in (1, 2)$. Our first main result is that, under the aforementioned geometric assumptions, one can show integrability for the $p$-barycenter.

Theorem 1.1 (Integrability with distant support). Let $\mu_1, \dots, \mu_N$ satisfy (Cpt) and be such that $\mu_1 = f_1 \mathcal{L}^d$ with $f_1 \in L^q$. Then there exists $C = C(d, M) \in \mathbb{R}_+$ so that
• if $p \ge 2$ and (hp1) is satisfied, then $g_p \in L^q$ with
\[ \|g_p\|_{L^q} \le \frac{C}{\big( \lambda_1^{d(1-\alpha_p)} D^{d(p-2)} \big)^{q'}} \|f_1\|_{L^q}; \]
• if $p \in (1, 2)$ and (hp2) is satisfied, then $g_p \in L^q$ with
\[ \|g_p\|_{L^q} \le \frac{C}{\big( \lambda_1^{d(1-\alpha_p)} m^{d(2-p)} \big)^{q'}} \|f_1\|_{L^q}, \]
where $q'$ denotes the conjugate exponent of $q$.

The proof of this theorem can be found in Section 3.1, first for $p \ge 2$ (Subsection 3.2.1) and then for $p \in (1, 2)$ (Subsection 3.2.2).

Counterexamples to integrability. Our next contribution is to show the failure of $q$-integrability of the $p$-Wasserstein barycenters with $p \neq 2$, even for $f_1 \in L^\infty$ and compactly supported. This is in sharp contrast to the case $p = 2$, where, instead, $f_1 \in L^q$ ensures that also $g_p \in L^q$.

Theorem 1.2 (Nonintegrability examples). For every $p \neq 2$ and $N > 2$, there exist compactly supported measures $\mu_1, \dots, \mu_N$, with $\mu_1 = f_1 \mathcal{L}^d$ and $f_1 \in L^\infty(\mathbb{R}^d)$, so that $\nu_p = g_p \mathcal{L}^d$ with $g_p \in L^q$ if and only if $q < q_0$, where $q_0 = q_0(p) > 1$. In particular, $g_p \notin L^q$ for every $q \ge q_0$.

We provide below more details about the construction of these nonintegrable examples, but we shall first discuss the general strategy behind Theorem 1.1 and Theorem 1.2.

Strategy. We use the same approach as in [BFR25] and consider the equivalent multi-marginal optimal transport definition of the $p$-Wasserstein barycenter ($W_p$bar), where the cost function $c_p : \mathbb{R}^d \times \dots \times \mathbb{R}^d \to [0, \infty)$ is given by
\[ c_p(x_1, \dots, x_N) := \min_{z \in \mathbb{R}^d} \sum_{i=1}^N \lambda_i |x_i - z|^p = \sum_{i=1}^N \lambda_i |x_i - x_p(x_1, \dots, x_N)|^p, \tag{1.2} \]
where the classical $p$-barycenter
\[ x_p(x_1, \dots, x_N) := \operatorname*{argmin}_{z \in \mathbb{R}^d} \sum_{i=1}^N \lambda_i |x_i - z|^p \]
of $\mathbf{x} = (x_1, \dots, x_N)$ is the unique [footnote 3] minimizer in (1.2). Then the multi-marginal optimal transport (MMOT) problem reads
\[ C_{p\text{-MM}} := \min_{\gamma \in \Pi(\mu_1, \dots, \mu_N)} \int_{\mathbb{R}^{Nd}} c_p(x_1, \dots, x_N) \, d\gamma(x_1, \dots, x_N), \tag{$C_p$-MM} \]
where [footnote 4]
\[ \Pi(\mu_1, \dots, \mu_N) := \big\{ \gamma \in \mathcal{P}(\mathbb{R}^{Nd}) : (\pi_i)_\# \gamma = \mu_i \ \text{for all } i = 1, \dots, N \big\} \]
is the set of admissible transport plans between the marginals $\mu_1, \dots, \mu_N$. Note that $C_{p\text{-MM}} < +\infty$ whenever $\mu_1, \dots, \mu_N \in \mathcal{P}_p(\mathbb{R}^d)$. Moreover, the existence of optimizers follows from the standard direct method. The $p$-Wasserstein barycenter is then the probability measure on $\mathbb{R}^d$
\[ (x_p)_\# \gamma_p, \quad \text{where } \gamma_p \text{ is any optimizer of ($C_p$-MM)}. \tag{$W_p$bar-MM} \]
Note that $\nu_p = W_p\mathrm{bar}((\mu_i, \lambda_i)_{i=1,\dots,N}) = (x_p)_\# \gamma_p$ (see (2.1)).

In [BFR25, Theorem 1.2] (cf. also Theorem 2.5 below), it has also been shown that, when $\mu_1 \ll \mathcal{L}^d$, the unique optimal plan $\gamma_p$ for ($C_p$-MM) is of Monge type in the multi-marginal sense, that is, it is concentrated on the graph of a function defined on the support of $\mu_1$, or, equivalently,
\[ \gamma_p = (\mathrm{Id}, T_2, \dots, T_N)_\# \mu_1, \quad \text{where } \mu_i = (T_i)_\# \mu_1. \]
This generalizes the sparsity result by Gangbo and Święch in [GŚ98], which holds for $p = 2$. Moreover, this implies the existence of a map $\mathrm{bar}_p : \operatorname{spt} \mu_1 \to \operatorname{spt} \nu_p$ (see eq. (2.7)) such that $\nu_p = (\mathrm{bar}_p)_\# \mu_1$. By means of a standard change of variables formula, for injective and smooth enough $\mathrm{bar}_p$, the two densities are linked by the equation
\[ g_p \circ \mathrm{bar}_p = \frac{f_1}{|\det \nabla \mathrm{bar}_p|}. \]
The injectivity of $\mathrm{bar}_p$ directly relates to the interplay between the multi-marginal ($W_p$bar-MM) and the coupled-two-marginal ($W_p$bar) nature of the $p$-Wasserstein barycenter. We can indeed show (Corollary 2.10) that $\mathrm{bar}_p$ can be alternatively identified as the unique optimal map $S_1$ for $W_p(\mu_1, \nu_p)$, which, both measures being absolutely continuous, is invertible.

Footnote 3. Existence and uniqueness are a direct consequence of strict convexity and coercivity of the function $w \mapsto |w|^p$.

Footnote 4. We denote by $\pi_i$ the projection on the $i$th marginal space, $\pi_i : \mathbb{R}^{Nd} \to \mathbb{R}^d$, $\pi_i(\mathbf{x}) = x_i$.
The proof of this fact relies on a characterization of $p$-Wasserstein barycenters in terms of Kantorovich potentials (cf. Proposition 2.8), which extends to general $1 < p < \infty$ the characterization proved in [AC11] for $p = 2$, and which allows one to show (cf. Proposition 2.9) that
\[ x_p \circ (R_1, \dots, R_N) = \mathrm{id} \quad \nu_p\text{-a.e.}, \quad \text{where } R_i = T_i \circ S_1^{-1}. \]
The challenge is thus to understand the regularity of $\mathrm{bar}_p$ and $\mathrm{bar}_p^{-1}$ and to find a lower bound for $|\det \nabla \mathrm{bar}_p(x)|$.
• We first analyze the case where the first marginal $\mu_1$ is absolutely continuous and all the other marginals are single Diracs. In this case, $\mathrm{bar}_p^{-1}$ can be computed explicitly (see (2.17)). This allows for a detailed analysis of the potential sources of singularities and their order of magnitude (cf. Proposition 2.14 and Proposition 2.15).
• This analysis provides natural and sufficient assumptions on the supports of the measures (see (hp1) and (hp2)), which guarantee a uniform lower bound, independent of the choice of the points where the Diracs are concentrated. We then take inspiration from the proof of regularity of Wasserstein geodesics of [San15, Section 4.3]: we study the case of general marginals by approximating them with a sequence of finite sums of Diracs, which, thanks to standard stability of optimal transport, leads to (1.1) (see Proposition 3.1).
• Thanks to this analysis (see in particular (2.19) and (2.27)), we are able to exhibit the claimed counterexamples in Theorem 1.2. This shows that even in the very regular case of the first marginal being the uniform measure on a ball, integrability and regularity fail for certain values of $q$ (see Example 3.3 and Proposition 2.12).

Let us delve more into the proof of Theorem 1.2 and provide details about the construction of nonintegrable examples. To this purpose, we consider $\mu_2, \dots, \mu_N$ atomic measures with a single atom, i.e. there exists $\hat{x} := (x_2, \dots, x_N)$ such that $\mu_i = \delta_{x_i}$ for every $i = 2, \dots, N$. We fix an open, bounded set $B \subset \mathbb{R}^d$ of $\mathcal{L}^d$-measure $1$, and consider $\mu_1 = \mathbb{1}_B \mathcal{L}^d$. Note that $f_1 \in L^q$ for every $q \in [1, +\infty]$, and it is compactly supported. As we show in Example 3.3, under certain assumptions on $x_2, \dots, x_N$ which depend on whether $p \in (1, 2)$ or $p > 2$, one can prove the existence of a $q_0 = q_0(p)$ so that Theorem 1.2 holds true. Let us define $z_p := x_p(x_2, \dots, x_N)$, the Euclidean $p$-barycenter of the $N - 1$ points with the same weights $\lambda_2, \dots, \lambda_N$.
(1) For $p > 2$, we pick $\hat{x} \notin \operatorname{diag}(\mathbb{R}^{d(N-1)})$. Assume that $z_p \in B$. Then
\[ g_p \in L^q \iff q < \frac{1}{\alpha_p} = \frac{p-1}{p-2}. \]
(2) For $p \in (1, 2)$, assume that $x_i \neq z_p$ and $x_i \in B$. Then
\[ g_p \in L^q \iff q < \frac{1}{2-p}. \]

General estimate. The previous examples show that, in general, one cannot expect to obtain $q$-integrability without extra assumptions on the measures. Nevertheless, we can provide a general estimate by analyzing possible sources of singularities. Obviously, this does not generally ensure integrability, as the upper bound may blow up when the geometric assumptions are not in place, as expected. To this purpose, we need to fix some notation. For any point $\mathbf{x} = (x_1, \dots, x_N) \in \mathbb{R}^{Nd}$, we set
\[ H_i(\mathbf{x}) := \lambda_i \nabla^2_{x_i} \big( |x_i - z|^p \big) \Big|_{z = x_p} \quad \text{and} \tag{1.4} \]
\[ \Lambda_i(\mathbf{x}) := \text{minimum eigenvalue of the matrix } H_i(\mathbf{x}) \Big( \sum_{k=1}^N H_k(\mathbf{x}) \Big)^{-1} H_i(\mathbf{x}). \]
Moreover, we define the following partition of the space: for every $S \subset \{1, \dots, N\}$, set
\[ D_S := \big\{ \mathbf{x} = (x_1, \dots, x_N) \in \mathbb{R}^{Nd} : x_p(\mathbf{x}) = x_i \text{ for } i \in S, \text{ and } x_p(\mathbf{x}) \neq x_i \text{ for } i \notin S \big\}. \tag{1.5} \]
As $\mu_1$ is assumed to be the reference measure, we also define the family $\mathcal{F}_1$ of subsets of $\{1, \dots, N\}$ that contain $1$, i.e. $\mathcal{F}_1 := \{ S \subset \{1, \dots, N\} : 1 \in S \}$. Finally, for every $S \subset \{1, \dots, N\}$, we set
\[ D_S^1 := \pi_1(D_S \cap \operatorname{spt} \gamma_p). \tag{1.6} \]
Note that the sets $D_S^1$ form a partition of $\operatorname{spt} \mu_1$.

Proposition 1.3 (General $L^q$-norm estimate). For every $\mu_1, \dots, \mu_N$, let $\mu_1 = f_1 \mathcal{L}^d$ and $\nu_p = g_p \mathcal{L}^d$. Assume that $f_1 \in L^q(\mathbb{R}^d)$ for some $q > 1$. Then the $L^q$-norm of $g_p$ can be controlled by
\[ \|g_p\|_{L^q}^q \le \int_{\bigcup_{S \in \mathcal{F}_1} D_S^1} |f_1(x_1)|^q \, dx_1 + \frac{1}{2} \int_{\bigcup_{S \notin \mathcal{F}_1} D_S^1} \bigg( \frac{\max_{i \notin S} |H_i^1(x_1)|}{\min_{i \notin S} \Lambda_i^1(x_1)} \bigg)^{d(q-1)} |f_1(x_1)|^q \, dx_1. \]

The proof of this estimate is obtained with techniques similar to those of [BFR25]: partitioning the space and using a slightly improved version of [BFR26, Lemma 5.2], which is a local injectivity estimate, on the support of the optimal plan $\gamma_p$ for ($C_p$-MM), of the a priori highly non-injective map $x_p$. In this way one can bound $|\det \nabla \mathrm{bar}_p|$ from below with a function that depends on the geometry of the supports of the marginals. This bound does not require any assumption, but the lower bound may naturally degenerate in certain cases. Under the stronger assumption (hp2), one can show integrability and thus provide an alternative proof of Theorem 1.1 in this case.

Barycenters of affine transformations. In the final part of this work, we analyze the optimality of affine transformations for the $p$-Wasserstein distance, and we perform some explicit computations of $p$-Wasserstein barycenters between marginals which are obtained by pushing forward a reference measure through affine transformations with a particular rigid structure. First of all, for a given measure $\mu \in \mathcal{P}_p(\mathbb{R}^d)$ and $\nu = T_\# \mu$, with $T$ affine, we show that the map $T$ is optimal under strong restrictions on the structure of the associated matrix, in sharp contrast with the case $p = 2$, where any affine map with positive definite linear part is optimal, due to Brenier's theorem. For $p \neq 2$, let $A \in \mathcal{M}(d)$ be symmetric and $b \in \mathbb{R}^d$, and consider the map $T : \mathbb{R}^d \to \mathbb{R}^d$ given by $T(x) = Ax + b$, for $x \in \mathbb{R}^d$.
W e show in Theorem ( 5.1 ) that 𝑇 is optimal between 𝜇 and 𝜈 = 𝑇 ♯ 𝜇 ⇐ ⇒ 𝜎 ( 𝐴 ) ⊂ { 1 , 𝜁 } , for some 𝜁 ≥ 0 . Secondly , we apply and generalize this result to describe some explicit examples of 𝑝 - W asserstein barycenters. W e x 𝜇 ≪ L 𝑑 a reference probability measur e, and consider probability measures of the form 𝜇 𝑖 =  𝐴 𝑖 · + 𝑣 𝑖  # 𝜇 , in two cases: either 𝐴 𝑖 = Id or 𝑣 𝑖 = 𝑣 ∈ R 𝑑 for every 𝑖 = 1 , . . . , 𝑁 , and in the second case, • 𝐴 1 is invertible. • 𝐴 𝑖 are symmetric, positive semidenite, 𝑑 × 𝑑 matrices. • The spectrum of the matrices satises 𝜎 ( 𝐴 𝑖 ) ⊂ { 1 , 𝜁 𝑖 } for some 𝜁 𝑖 ∈ [ 0 , +∞) . • The matrices commute, i.e . [ 𝐴 𝑖 , 𝐴 𝑗 ] = 0 for every 𝑖 , 𝑗 = 1 , . . . , 𝑁 , and the eigenspaces associated to the eigenvalue 1 of every 𝐴 𝑗 ≠ Id coincide. ON THE 𝑞 -IN TEGRABILIT Y OF 𝑝 - W ASSERSTEIN BARY CEN TERS 7 Under such assumptions, we sho w in Proposition 5.2 that the 𝑝 - W asserstein barycenters is the push-forward of the reference measure with respect to the barycenter of the ane transformations, i.e. 𝜈 𝑝 =  𝐴 · + 𝑣  # 𝜇 , 𝐴 : = 𝑥 𝑝 ( 𝐴 1 , . . . , 𝐴 𝑁 ) and 𝑣 : = 𝑥 𝑝 ( 𝑣 1 , . . . , 𝑣 𝑁 ) , where with the barycenters of matrix we mean 𝑥 𝑝 ( 𝐴 1 , . . . , 𝐴 𝑁 ) : = argmin 𝐵 𝑁  𝑖 = 1 𝜆 𝑖   𝐴 𝑖 − 𝐵   𝑝 , | 𝐴 | : = Tr ( 𝐴 𝑇 𝐴 ) 1 2 =  𝑁  𝑖 , 𝑗 = 1 𝐴 2 𝑖 𝑗  1 2 , for every 𝐴 . Note that this is in line with case 𝑝 = 2 (cf. The or em 7.7 in [ Fri24 ]), alb eit in that setting it suces to assume commuting, symmetric, strictly positive associated matrices. It turns out that combinations of translations and linear transformations ar e generally not optimal. Thanks to the explicit formula and conse quent regularity of the map bar 𝑝 in these class of examples, we obtain integrability properties of the 𝑝 - W asserstein bar ycenter density without any additional geometric assumptions on the supports of the marginals. 2. 
Preliminaries As mentioned in the introduction, it is a known fact that the two denitions of 𝑝 - W asserstein barycenter , ( 𝑊 𝑝 bar ) and ( 𝑊 𝑝 bar-MM ), are equivalent (see for instance [ CE10 ] or [ BFR25 ]). If 𝐶 𝑝 -C2M ≔ min 𝜈 ∈ P 𝑝 ( R 𝑑 ) 𝑁  𝑖 = 1 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 ) , then 𝐶 𝑝 -MM = 𝐶 𝑝 -C2M and, given 𝜈 ∈ P 𝑝 ( R 𝑑 ) , 𝜈 = 𝑊 𝑝 bar ( ( 𝜇 𝑖 , 𝜆 𝑖 ) 𝑖 = 1 , .. .,𝑁 ) ⇐ ⇒ 𝜈 = 𝑥 𝑝 ♯ 𝛾 𝑝 , for 𝛾 𝑝 optimal for ( C 𝑝 − MM ) . (2.1) This strong relationship between multimarginal and coupled-two marginal minimizers implies a further optimality property of the optimal 𝛾 𝑝 , when considering its projection 𝛾 𝑖 : = ( 𝜋 𝑖 , 𝑥 𝑝 ) ♯ 𝛾 𝑝 on the product space given by the space of one marginal and the space of the barycenter . More precisely , 𝛾 𝑖 : = ( 𝜋 𝑖 , 𝑥 𝑝 ) ♯ 𝛾 𝑝 ∈ Π ( 𝜇 𝑖 , 𝜈 𝑝 ) is optimal for 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 𝑝 ) 5 , where 𝜈 𝑝 : = 𝑥 𝑝 ♯ 𝛾 𝑝 . (2.2) W e now state the Kantorovich duality sp ecically for our problem ( C 𝑝 − MM ) . The existence of op- timizers, known as optimal p otentials or Kantor ovich potentials, for a large class of multimarginal optimal transport problems, including this case, has been originally pro ved in [ Kel84 ]. Here we state the result as in [ BFR25 , Theorem 4.1], which gives mor e information about continuity and almost-everywhere dierentiability of the optimizers. For this purpose, we introduce the space Y 𝑝 ≔ ( 1 + | · | 𝑝 ) C 𝑏 ( R 𝑑 ) ≔  𝜑 ∈ C ( R 𝑑 ) : ( 1 + | · | 𝑝 ) − 1 𝜑 is bounded  of continuous functions of at most 𝑝 -growth and we dene the 𝜆, 𝑝 -conjugate (or 𝜆, 𝑝 -conjugate) of a function 𝜑 : R 𝑑 → R via 𝜑 𝜆,𝑝 ( 𝑥 ) ≔ inf 𝑧 ∈ R 𝑑  𝜆 | 𝑥 − 𝑧 | 𝑝 − 𝜑 ( 𝑧 )  for 𝑥 ∈ R 𝑑 . Theorem 2.1 (MMOT Duality) . Ther e holds 𝐶 𝑝 − MM = sup 𝜑 1 ,. . ., 𝜑 𝑁 ∈ A ( 𝑐 𝑝 ) 𝑁  𝑖 = 1  R 𝑑 𝜑 𝑖 𝑑 𝜇 𝑖 , (D- 𝑝 -bar) where A ( 𝑐 𝑝 ) ≔ { ( 𝜑 1 , . . . , 𝜑 𝑁 ) ∈ 𝐿 1 𝜇 1 ( R 𝑑 ) × · · · × 𝐿 1 𝜇 𝑁 ( R 𝑑 ) : 𝜑 1 ⊕ · · · ⊕ 𝜑 𝑁 ≤ 𝑐 𝑝 } . 8 C. BRIZZI AND L. PORTINALE Moreover , there exists a maximizer Φ = (  𝜑 1 , . . . 
,  𝜑 𝑁 ) ∈ B ( 𝑐 𝑝 ) , where B ( 𝑐 𝑝 ) ≔  ( 𝜑 1 , . . . , 𝜑 𝑁 ) ∈ A ( 𝑐 𝑝 ) : 𝜑 𝑖 = 𝜓 𝜆 𝑖 ,𝑝 𝑖 with 𝜓 𝑖 ∈ Y 𝑝 for every 𝑖 , and 𝑁  𝑖 = 1 𝜓 𝑖 = 0  . In particular ,  𝜑 𝑖 and ∇  𝜑 𝑖 are L 𝑑 -a.e. dierentiable 9 on the convex hull of the supp ort of 𝜇 𝑖 for every 𝑖 = 1 , . . . , 𝑁 . Proof. For the pr o of w e refer to Theorem 4.1 in [ BFR25 ]. As one can se e in that pr o of (second part of Step 2 ), each  𝜑 𝑖 is locally bounde d in the conve x hull of 𝜇 𝑖 , for every 𝑖 = 1 , . . . , 𝑁 . Being each  𝜑 𝑖 a 𝜆, 𝑝 -conjugate to some function, it is lo cally semiconcave 10 (see for instance Corollary C.5 in [ GM96 ]). This is enough to ensure L 𝑑 -a.e. dierentiability of  𝜑 𝑖 and ∇  𝜑 𝑖 on the convex hull of 𝜇 𝑖 for every 𝑖 = 1 , . . . , 𝑁 (see for instance Proposition C.6 in [ GM96 ]). ■ As one can e xpect from ( 2.2 ) , the potentials constructed in the previous theorem ar e optimal for the associated 2 -marginal problems. Corollary 2.2. For every 𝑖 = 1 , . . . , 𝑁 , let  𝜑 𝑖 as given in The orem 2.1 , and let  𝜓 𝑖 be such that  𝑁 𝑖 = 1  𝜓 𝑖 = 0 and  𝜑 𝑖 =  𝜓 𝜆 𝑖 ,𝑝 𝑖 . Then the functions (  𝜑 𝑖 ,  𝜓 𝑖 ) are optimal potentials for 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 𝑝 ) , where 𝜈 𝑝 = 𝑊 𝑝 bar ( ( 𝜇 𝑖 , 𝜆 𝑖 ) 𝑖 = 1 , .. .,𝑁 ) . Proof. The proof follo ws the same line as in [ BFR25 ], we report it here for b etter clarity . W e recall that by construction, for every 𝑖 = 1 , . . . , 𝑁 ,  𝜑 𝑖 is the 𝜆, 𝑝 -conjugate of 𝜓 𝑖 , and thus  𝜑 𝑖 ( 𝑥 𝑖 ) +  𝜓 𝑖 ( 𝑧 ) ≤ 𝜆 𝑖 | 𝑥 𝑖 − 𝑧 | 𝑝 for every ( 𝑥 𝑖 , 𝑧 ) ∈ R 2 𝑑 (2.3) and that  𝑁 𝑖 = 1  𝜓 𝑖 ( 𝑧 ) = 0 , for every 𝑧 ∈ R 𝑑 . If 𝛾 𝑝 is the optimal plan for ( C 𝑝 − MM ) ,  𝜑 1 , . . . ,  𝜑 𝑁 are the Kantorovich potentials in ( D- 𝑝 -bar ) , if and only if the nonnegative function 𝑐 𝑝 −  𝜑 1 ⊕ · · · ⊕  𝜑 𝑁 is equal 0 on spt 𝛾 𝑝 . It follows that for every x ∈ spt 𝛾 𝑝 , 𝑁  𝑖 = 1  𝜑 𝑖 ( 𝑥 𝑖 ) + 𝑁  𝑖 = 1  𝜓 𝑖 ( 𝑥 𝑝 ( x ) ) = 𝑁  𝑖 = 1  𝜑 𝑖 ( 𝑥 𝑖 ) = 𝑁  𝑖 = 1 𝜆 𝑖 | 𝑥 𝑖 − 𝑥 𝑝 ( x ) | 𝑝 . 
(2.4) Since the unique optimal plan 𝛾 𝑖 ∈ Π ( 𝜇 𝑖 , 𝜈 𝑝 ) for 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 𝑝 ) is given by 𝛾 𝑖 = ( 𝜋 𝑖 , 𝑥 𝑝 ) ♯ 𝛾 𝑝 (see ( 2.1 ) and ( 2.2 ) ), the pair ( 𝑥 𝑖 , 𝑧 ) ∈ spt 𝛾 𝑖 i 𝑧 = 𝑥 𝑝 ( x ) , where the points 𝑥 1 , . . . , 𝑥 𝑖 − 1 , 𝑥 𝑖 + 1 , . . . , 𝑥 𝑁 are such that x = ( 𝑥 1 , . . . , 𝑥 𝑁 ) ∈ spt 𝛾 𝑝 . Thus, thanks to ( 2.4 ) , inequality ( 2.3 ) is actually an equality on the support of 𝛾 𝑖 . ■ Remark 2.3. From the proof of The or em 2.1 in [ BFR25 ] one can see that every  𝜓 𝑖 is by construction the 𝜆, 𝑝 -conjugate of a function and continuous, for 𝑖 = 1 , . . . , 𝑁 − 1 . Then, by Corollary C.5 and Proposition C.6 in [ GM96 ],  𝜓 𝑖 is lo cally semiconcave and thus  𝜓 𝑖 and 𝐷  𝜓 𝑖 are L 𝑑 -a.e. dierentiable, for 𝑖 = 1 , . . . , 𝑁 − 1 . Dierentiability L 𝑑 -a.e. of  𝜓 𝑁 and ∇  𝜓 𝑁 follows by the fact that  𝜓 𝑁 ( 𝑧 ) = −  𝑁 − 1 𝑖 = 1  𝜓 𝑖 ( 𝑧 ) , for every 𝑧 . Theorem 2.4 b elow states the absolute continuity of the 𝑝 - W asserstein barycenter , with 1 < 𝑝 < ∞ , and it is therefore the starting point of our analysis. Notice that it is a slightly improved version of [ BFR25 , Theorem 1.4], where the assumption of all marginals being absolutely continuous has been weakened to only one marginal b eing absolutely continuous. Before stating that result, we recall that, giv en a function 𝑐 : R 𝑁 𝑑 → R , a set Γ ⊂ R 𝑁 𝑑 is said to be 𝑐 -monotone if for every x 1 = ( 𝑥 1 1 , . . . , 𝑥 1 𝑁 ) , x 2 = ( 𝑥 2 1 , . . . , 𝑥 2 𝑁 ) ∈ Γ we have 𝑐 ( 𝑥 1 1 , . . . , 𝑥 1 𝑁 ) + 𝑐 ( 𝑥 2 1 , . . . , 𝑥 2 𝑁 ) ≤ 𝑐 ( 𝑥 𝜎 1 ( 1 ) 1 , . . . , 𝑥 𝜎 𝑁 ( 1 ) 𝑁 ) + 𝑐 ( 𝑥 𝜎 1 ( 2 ) 1 , . . . , 𝑥 𝜎 𝑁 ( 2 ) 𝑁 ) , (2.5) 9 ∇  𝜑 𝑖 is L 𝑑 -a.e. dierentiable in the sense of Alexandro 10 W e say that a function 𝜙 is locally semiconcave if for every 𝑥 there exist an open neighborhood 𝑈 of 𝑥 , and a constant 𝐶 ≥ 0 such that 𝜙 ( 𝑦 ) − 𝐶 2 | 𝑦 | 2 is concave for every 𝑦 ∈ 𝑈 . 
ON THE 𝑞 -IN TEGRABILIT Y OF 𝑝 - W ASSERSTEIN BARY CEN TERS 9 for every 𝜎 𝑖 ∈ 𝑆 ( 2 ) , with 𝑆 ( 2 ) being the set of permutations of two elements. Moreover , the notion of 𝑐 -cyclical monotonicity can also b e dene d in the multi-marginal setting (see, for instance, Denition 2.2 in [ KP14 ]) and it implies 𝑐 -monotonicity . If 𝑐 is continuous, then the support of any optimal plan 𝛾 for the multi-marginal OT problem associated to 𝑐 , is 𝑐 -cyclically monotone, see [ KP14 , Proposition 2.3]. Therefore, if 𝑐 : R 𝑁 𝑑 → R is continuous and 𝛾 is optimal for min 𝛾 ∈ Π ( 𝜇 1 ,. . .,𝜇 𝑁 )  R 𝑁 𝑑 𝑐 ( 𝑥 1 , . . . , 𝑥 𝑁 ) d 𝛾 , then spt 𝛾 is 𝑐 -monotone. Theorem 2.4. Let 1 < 𝑝 < ∞ and 𝛾 𝑝 ∈ Π ( 𝜇 1 , . . . , 𝜇 𝑁 ) have 𝑐 𝑝 -monotone supp ort. Then under the condition that 𝜇 1 ≪ L 𝑑 there holds 11 𝑥 𝑝 ♯ 𝛾 𝑝 ≪ L 𝑑 . It follows that the 𝑝 - W asserstein bar ycenter 𝜈 𝑝 = 𝑊 𝑝 bar ( ( 𝜇 𝑖 , 𝜆 𝑖 ) 𝑖 = 1 , .. .,𝑁 ) of the measures 𝜇 1 , . . . , 𝜇 𝑁 with weights 𝜆 1 , . . . , 𝜆 𝑁 is absolutely continuous with respect to Lebesgue measure on R 𝑑 . Proof. For 𝑝 > 2 se e [ BFR26 ], for the extension to general 1 < 𝑝 < ∞ , see Appendix A . ■ As shown in [ BFR25 ], this regularity property has as a direct consequence a strong sparsity result for the the optimal plan. More precisely , 𝛾 𝑝 can be considered as a map dene d on the support of one of the marginals (cf. [ BFR25 , Theorem 1.2]). In Theorem 2.5 , the Monge property of the optimal plan holds under the w eaker condition 𝜇 1 ≪ L 𝑑 , for every 1 < 𝑝 < ∞ . However , taking into account Theorem 2.4 , the proof is the same as for [ BFR25 , Theorem 1.2], and we do not report it here. Notice that the choice of 𝜇 1 is arbitrary: if 𝜇 𝑖 ≪ L 𝑑 the result is equivalent, with spt 𝛾 𝑝 parametrized with respect to the 𝑖 th marginal. Theorem 2.5. For any 1 < 𝑝 < ∞ , if 𝜇 1 ≪ L 𝑑 , then there exists a unique optimal plan 𝛾 𝑝 for the problem ( C 𝑝 − MM ) , and there exist measurable maps 𝑇 𝑖 : spt 𝜇 1 → spt 𝜇 𝑖 , 𝑖 = 1 , . . . 
, 𝑁 , such that 𝛾 𝑝 = ( 𝑇 1 , 𝑇 2 , . . . , 𝑇 𝑁 ) ♯ 𝜇 1 , where 𝑇 𝑖 = 𝑅 𝑖 ◦ 𝑆 1 , with 𝑆 1 : spt 𝜇 1 → spt 𝜈 𝑝 and 𝑅 𝑖 : spt 𝜈 𝑝 → spt 𝜇 𝑖 the optimal maps for the 2 -marginals problem 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 𝑝 ) , for every 𝑖 = 1 , . . . , 𝑁 . In particular , 𝑆 1 = 𝑅 − 1 1 and 𝑇 1 = Id . Moreover , there exist a.e. dierentiable functions  𝜑 1 : spt 𝜇 1 → R and  𝜓 𝑖 : spt 𝜈 𝑝 → R such that 𝑆 1 = Id − ( 𝑝 𝜆 1 ) − 1 𝑝 − 1 | ∇  𝜑 1 | − 𝛼 𝑝 ∇  𝜑 1 and 𝑅 𝑖 = Id − ( 𝑝 𝜆 𝑖 ) − 1 𝑝 − 1 | ∇  𝜓 𝑖 | − 𝛼 𝑝 ∇  𝜓 𝑖 , (2.6) for any 𝑖 = 1 , . . . , 𝑁 , where 𝛼 𝑝 = 𝑝 − 2 𝑝 − 1 . Remark 2.6. Thanks to Theorem 2.5 , one denes the map bar 𝑝 : spt 𝜇 1 → spt 𝜈 𝑝 as bar 𝑝 ( 𝑥 1 ) : = 𝑥 𝑝 ◦ ( Id , 𝑇 2 , . . . , 𝑇 𝑁 ) ( 𝑥 1 ) , for 𝑥 1 ∈ spt 𝜇 1 . (2.7) W e notice thus 𝜈 𝑝 = bar 𝑝 ♯ 𝜇 1 . (2.8) 11 The choice of 𝜇 1 is arbitrary , one can choose any other marginal 𝜇 𝑖 to be absolutely continuous with respect to Lebesgue measure on R 𝑑 . 10 C. BRIZZI AND L. PORTINALE Remark 2.7. The fact that 𝑆 1 and 𝑅 𝑖 are optimal respectively for the 2 -marginals problems 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 1 , 𝜈 𝑝 ) and 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜈 𝑝 , 𝜇 𝑖 ) is a dir ect consequence of ( 2.2 ) . W e remark that the existence of 𝑅 𝑖 (see the classical result by Gangbo and McCann [ GM96 ]) is guaranteed by the fact that 𝜈 𝑝 ≪ L 𝑑 (Theorem 2.4 ). Notice that by dropping the assumption of absolute continuity on the marginals 𝜇 𝑖 , with 𝑖 ≥ 2 , the existence of the optimal map 𝑆 𝑖 : = 𝑅 − 1 𝑖 : spt 𝜇 𝑖 → spt 𝜈 𝑝 for the problem 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 𝑝 ) is not guaranteed. Finally , the functions  𝜑 1 and  𝜓 𝑖 of Theorem 2.5 are the same of the ones obtained in Theorem 2.1 . By Corollary 2.2 we know that each  𝜑 𝑖 ,  𝜓 𝑖 are Kantorovich potentials for the 2 -marginals problem 𝜆 𝑖 𝑊 𝑝 𝑝 ( 𝜇 𝑖 , 𝜈 𝑝 ) and the r epresentation of 𝑆 1 and 𝑅 𝑖 with these potentials is consistent with the standard 2 -marginal duality arguments (see for instance [ ABS24 , Remark 5.3] or [ Fri24 , Section 3.6]). 12 2.1. 
Characterization of $p$-Wasserstein barycenters. Proposition 2.8 and Proposition 2.9 below are natural extensions of the characterization of the $W_2$-barycenter discussed, respectively, in [AC11, Proposition 3.8, Remark 3.9]. In particular, (2.9) is the equivalent of (3.10) in [AC11].

Proposition 2.8. Assume that $\mu_1 \ll \mathcal{L}^d$ and that $\bar\nu \in \mathcal{P}_p(\mathbb{R}^d)$. Then the following conditions are equivalent:
(1) $\bar\nu = W_p\mathrm{bar}\bigl((\mu_i,\lambda_i)_{i=1,\dots,N}\bigr)$.
(2) there exist $\bigl((\varphi_i,\psi_i)\bigr)_{i=1,\dots,N}$ such that $(\varphi_i,\psi_i)$ are Kantorovich potentials for $W_p^p(\mu_i,\bar\nu)$ for every $i = 1,\dots,N$ and
\[
\sum_{i=1}^N \lambda_i \psi_i(z) = 0, \qquad \text{for every } z.
\]

Proof. The fact that (1) implies (2) follows directly from Theorem 2.1 and Corollary 2.2. Indeed, for each $i$, take $\bigl(\widetilde\varphi_i/\lambda_i,\ \widetilde\psi_i/\lambda_i\bigr)$, where $(\widetilde\varphi_i,\widetilde\psi_i)$ are the ones given by Theorem 2.1. We now prove that (2) implies (1). The admissibility property of the potentials implies that $\varphi_i(x_i) + \psi_i(z) \le |x_i - z|^p$ for every $x_i$, $z$, and thus, using $\sum_{i=1}^N \lambda_i\psi_i(z) = 0$,
\[
\sum_{i=1}^N \lambda_i \varphi_i(x_i) = \sum_{i=1}^N \lambda_i \bigl(\varphi_i(x_i) + \psi_i(z)\bigr) \le \sum_{i=1}^N \lambda_i |x_i - z|^p .
\]
By choosing $z = x_p(x_1,\dots,x_N)$ this yields
\[
\sum_{i=1}^N \lambda_i \varphi_i(x_i) \le \sum_{i=1}^N \lambda_i |x_i - x_p(x_1,\dots,x_N)|^p ,
\]
meaning that $\lambda_1\varphi_1,\dots,\lambda_N\varphi_N$ are admissible potentials for the $p$-Wasserstein barycenter multi-marginal cost $c_p(x_1,\dots,x_N) = \sum_{i=1}^N \lambda_i |x_i - x_p(x_1,\dots,x_N)|^p$ (see (1.2)). Therefore
\[
\min_{\gamma\in\Pi(\mu_1,\dots,\mu_N)} \int_{\mathbb{R}^{Nd}} c_p(x_1,\dots,x_N)\,\mathrm{d}\gamma(x_1,\dots,x_N)
\ \ge\ \sum_{i=1}^N \int \lambda_i\varphi_i(x_i)\,\mathrm{d}\mu_i(x_i)
= \sum_{i=1}^N \int \lambda_i\varphi_i(x_i)\,\mathrm{d}\mu_i(x_i) + \sum_{i=1}^N \int \lambda_i\psi_i(z)\,\mathrm{d}\bar\nu(z)
= \sum_{i=1}^N \lambda_i W_p^p(\mu_i,\bar\nu),
\]
where the last equality comes from (2). Since the left-hand side coincides with the minimal value of $\nu \mapsto \sum_{i=1}^N \lambda_i W_p^p(\mu_i,\nu)$, this shows that $\bar\nu$ is a minimizer, that is, (1) holds. ■

Proposition 2.9. Let $\mu_1 \ll \mathcal{L}^d$. If $\nu_p = W_p\mathrm{bar}\bigl((\mu_i,\lambda_i)_{i=1,\dots,N}\bigr)$, then
\[
x_p \circ \bigl(R_1,\dots,R_N\bigr) = \mathrm{id} \qquad \nu_p\text{-a.e.},
\tag{2.9}
\]
where $R_1,\dots,R_N$ are the maps given by Theorem 2.5.
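For orientation, identity (2.9) can be verified by hand in the quadratic case $p = 2$. The computation below is only a sketch: it assumes that the weights sum to one, uses the representation (2.6) of the maps $R_i$, the explicit form $x_2(x_1,\dots,x_N) = \sum_i \lambda_i x_i$ of the quadratic barycenter, and the normalization $\sum_i \widetilde\psi_i = 0$ of the potentials from Theorem 2.1 (which corresponds to condition (2) of Proposition 2.8 with $\psi_i = \widetilde\psi_i/\lambda_i$).

```latex
% Case p = 2: here \alpha_2 = 0 and (p\lambda_i)^{-1/(p-1)} = (2\lambda_i)^{-1},
% so that (2.6) reads
\[
R_i(z) = z - \frac{1}{2\lambda_i}\,\nabla\widetilde\psi_i(z), \qquad i = 1,\dots,N .
\]
% Inserting this into x_2(x_1,\dots,x_N) = \sum_i \lambda_i x_i (weights summing to one):
\[
x_2\bigl(R_1(z),\dots,R_N(z)\bigr)
  = \sum_{i=1}^N \lambda_i z - \frac{1}{2}\sum_{i=1}^N \nabla\widetilde\psi_i(z)
  = z - \frac{1}{2}\,\nabla\Bigl(\sum_{i=1}^N \widetilde\psi_i\Bigr)(z)
  = z .
\]
```

For general $p$ the barycenter map is no longer linear, which is why the argument has to pass through the Euler–Lagrange equation of $x_p$ instead.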
$^{12}$ Indeed, by duality theory one knows that the optimal transport map $T$ for a $2$-marginal OT problem is obtained, when possible, by inverting with respect to $y$ the equality $\nabla_x c(x,y) = \nabla\varphi(x)$, which holds on the support of the optimal plan $\gamma$. In our case the $2$-marginal cost is of the type $c(x,y) = \lambda |x-y|^p$.

Proof. By Theorem 2.5, we know that, for every $i = 1,\dots,N$,
\[
R_i(z) = z - (p\lambda_i)^{-\frac{1}{p-1}} |\nabla\widetilde\psi_i|^{-\alpha_p} \nabla\widetilde\psi_i(z), \qquad \text{for } \nu_p\text{-a.e. } z,
\tag{2.10}
\]
where the $\widetilde\psi_i$'s are the ones given by Theorem 2.1. Let $z \in \operatorname{spt}\nu_p$ be such that (2.10) holds. Then $x_p \circ (R_1,\dots,R_N)(z) = z$ if and only if $z$ is the (unique) solution of
\[
\operatorname*{argmin}_{y\in\mathbb{R}^d} \sum_{i=1}^N \lambda_i |R_i(z) - y|^p ,
\]
which in turn is equivalent to
\[
\sum_{i=1}^N \lambda_i |R_i(z) - z|^{p-2} (R_i(z) - z) = 0 .
\]
By (2.10), the above equation is equivalent to
\[
\sum_{i=1}^N \lambda_i \Bigl| (p\lambda_i)^{-\frac{1}{p-1}} |\nabla\widetilde\psi_i|^{-\alpha_p} \nabla\widetilde\psi_i(z) \Bigr|^{p-2} (p\lambda_i)^{-\frac{1}{p-1}} |\nabla\widetilde\psi_i|^{-\alpha_p} \nabla\widetilde\psi_i(z) = 0 .
\tag{2.11}
\]
Since $\alpha_p = \frac{p-2}{p-1}$ gives $(1-\alpha_p)(p-2) = \alpha_p$, each summand in (2.11) equals $p^{-1}\nabla\widetilde\psi_i(z)$. All in all, we conclude that, for $z \in \operatorname{spt}\nu_p$ such that (2.10) holds,
\[
x_p \circ \bigl(R_1,\dots,R_N\bigr)(z) = z
\iff \sum_{i=1}^N \nabla\widetilde\psi_i(z) = 0
\iff \nabla\Bigl(\sum_{i=1}^N \widetilde\psi_i(z)\Bigr) = 0 .
\]
Finally, the latter equation is satisfied since $\sum_{i=1}^N \widetilde\psi_i(z) = 0$ for every $z$ (see Theorem 2.1). ■

The following corollary follows directly from Proposition 2.9.

Corollary 2.10. Let $\mu_1 \ll \mathcal{L}^d$. Then $\mathrm{bar}_p = S_1$ $\mu_1$-a.e. In particular, $\mathrm{bar}_p$ is injective.

2.2. Euclidean barycenters: properties and regularity. The $p$-barycenter in $\mathbb{R}^d$ is the map $x_p : \mathbb{R}^{Nd} \to \mathbb{R}^d$ defined by
\[
x_p(\mathbf{x}) = \operatorname*{argmin}_{z\in\mathbb{R}^d} \sum_{i=1}^N \lambda_i |x_i - z|^p, \qquad \mathbf{x} = (x_1,\dots,x_N) \in \mathbb{R}^{Nd} .
\]
In this section we discuss some properties of this map which will be used throughout the paper. Recall that $\alpha_p = \frac{p-2}{p-1} = 1 - \frac{1}{p-1}$.
In particular, $\alpha_p \in (0,1)$ for $p > 2$, $\alpha_2 = 0$, and $\alpha_p < 0$ for $p \in (1,2)$.

Remark 2.11 (Properties of $x_p$). Notice that
1. $x_p(\mathbf{x})$ is in the convex hull of the points $x_1,\dots,x_N$.
2. the map $x_p$ is locally Lipschitz and therefore differentiable $\mathcal{L}^d$-a.e. Indeed, it is locally the minimum, over a compact set $K$, of a family of locally Lipschitz functions $\{f_z\}_{z\in K}$,$^{13}$
\[
f_z(\mathbf{x}) := \sum_{i=1}^N \lambda_i |x_i - z|^p, \qquad \mathbf{x} \in \mathbb{R}^{Nd} .
\]
$^{13}$ If $B_1,\dots,B_N \subset \mathbb{R}^d$ are open balls, then for any $\mathbf{y} \in B_1 \times \dots \times B_N$, $x_p(\mathbf{y}) = \min_{z\in K} f_z(\mathbf{y})$, where $K$ is the closure of the convex hull of $\bigcup_{i=1}^N B_i$. Indeed, $K$ contains the union of the convex hulls of $\{y_1,\dots,y_N\}$ with $\mathbf{y} \in B_1\times\dots\times B_N$, where the barycenter lies.
3. Let $\mathbf{x} = (x_1,\dots,x_N) \in \mathbb{R}^{Nd}$; then $x_p(\mathbf{x})$ is the only solution of its Euler–Lagrange equation
\[
\sum_{i=1}^N \lambda_i \nabla_z\bigl(|x_i - z|^p\bigr) = 0
\iff
\sum_{i=1}^N \lambda_i |x_i - z|^{p-2}(x_i - z) = 0 .
\tag{2.12}
\]
In general, the solution of (2.12) cannot be written explicitly as a function of $x_1,\dots,x_N$; however,
3.1. if $p = 2$, $x_p(\mathbf{x}) = \sum_{i=1}^N \lambda_i x_i$,
3.2. if $N = 2$,
\[
x_p(\mathbf{x}) = \frac{\lambda_1^{\frac{1}{p-1}}}{(1-\lambda_1)^{\frac{1}{p-1}} + \lambda_1^{\frac{1}{p-1}}}\, x_1 + \frac{(1-\lambda_1)^{\frac{1}{p-1}}}{(1-\lambda_1)^{\frac{1}{p-1}} + \lambda_1^{\frac{1}{p-1}}}\, x_2 .
\tag{2.13}
\]

Let us set
\[
\operatorname{diag}(\mathbb{R}^{dk}) := \bigl\{ (x_{i_1},\dots,x_{i_k}) \in \mathbb{R}^{dk} : x_{i_j} = x_{i_l} \text{ for all } j,l = 1,\dots,k \bigr\} .
\]

Proposition 2.12 (Regularity of $x_p$).
The barycenter map $x_p : \mathbb{R}^{Nd} \to \mathbb{R}^d$ is continuously differentiable over $\mathbb{R}^{dN} \setminus \operatorname{diag}(\mathbb{R}^{dN})$, and over such set we have
\[
\nabla_{x_i} x_p(\cdot) = H(\cdot)^{-1} H_i(\cdot),
\tag{2.14}
\]
where for every $\mathbf{x} \in \mathbb{R}^{dN} \setminus \operatorname{diag}(\mathbb{R}^{dN})$ we define
\[
H_i(\mathbf{x}) := \lambda_i \nabla^2_{x_i}\bigl(|x_i - z|^p\bigr)\Big|_{z = x_p}
= \lambda_i |x_i - z|^{p-2} \Bigl[ (p-2)\, \frac{x_i - z}{|x_i - z|} \otimes \frac{x_i - z}{|x_i - z|} + \mathrm{Id} \Bigr]\Big|_{z = x_p(\mathbf{x})},
\]
\[
H(\mathbf{x}) := \sum_{k=1}^N H_k(\mathbf{x})
= \sum_{k=1}^N \lambda_k |x_k - z|^{p-2} \Bigl[ (p-2)\, \frac{x_k - z}{|x_k - z|} \otimes \frac{x_k - z}{|x_k - z|} + \mathrm{Id} \Bigr]\Big|_{z = x_p(\mathbf{x})} .
\]
We remark that, in terms of the sets $D_S$ defined in (1.5), we have $\operatorname{diag}(\mathbb{R}^{dN}) = D_{\{1,\dots,N\}}$.

Proof. For $\mathbf{x} = (x_1,\dots,x_N) \in \mathbb{R}^{Nd}$, recall that $x_p(\mathbf{x})$ is the unique solution of (2.12). We set $F(\mathbf{x},z) := \sum_{k=1}^N \lambda_k |x_k - z|^{p-2}(x_k - z)$. Then
\[
\nabla_z F(\mathbf{x},z) = - \sum_{k=1}^N \lambda_k |x_k - z|^{p-2} \Bigl[ (p-2)\, \frac{x_k - z}{|x_k - z|} \otimes \frac{x_k - z}{|x_k - z|} + \mathrm{Id} \Bigr],
\]
which degenerates if and only if $x_k = z$ for every $k$. Thus, $\nabla_z F(\mathbf{x}, x_p(\mathbf{x}))$ exists and is invertible for every $\mathbf{x} \in \mathbb{R}^{Nd} \setminus \operatorname{diag}(\mathbb{R}^{dN})$. By the Implicit Function Theorem, there exists an open neighborhood $U_{\mathbf{x}}$ of $\mathbf{x}$ such that $x_p \in C^1(U_{\mathbf{x}})$ and
\[
\nabla_{x_i} x_p(\mathbf{y}) = - \nabla_z F(\mathbf{y}, x_p(\mathbf{y}))^{-1}\, \nabla_{x_i} F(\mathbf{y}, x_p(\mathbf{y})), \qquad \text{for every } \mathbf{y} \in U_{\mathbf{x}} .
\]
Formula (2.14) then follows by a direct computation. ■

2.2.1. One-variable map. In this part, we fix $N-1$ points $\hat{x}_2,\dots,\hat{x}_N \in \mathbb{R}^d$. Call $\hat{x} \in \mathbb{R}^{(N-1)d}$ the vector $\hat{x} = (\hat{x}_2,\dots,\hat{x}_N)$, and define the function $b_{\hat{x}} : \mathbb{R}^d \to \mathbb{R}^d$ simply given by $b_{\hat{x}}(x_1) := x_p(x_1,\hat{x})$. The next proposition sums up the main properties of the map $b_{\hat{x}}$. With the notation $A(x) \lesssim B(x)$ we mean that there exists a constant $C = C(d) \in \mathbb{R}_+$ (in particular independent of $x_1$, $\hat{x}$) such that $A(\cdot) \le C B(\cdot)$ as quadratic forms, everywhere on $\mathbb{R}^d$. Similarly with $A(x) \gtrsim B(x)$. When both are true, we simply write $A(x) \simeq B(x)$.
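The closed forms of Remark 2.11 and the gradient formula (2.14) can be sanity-checked numerically. The following is a minimal sketch, assuming dimension $d = 1$, where (2.14) reduces to $\partial_{x_i} x_p = H_i / \sum_k H_k$ with $H_k$ proportional to $\lambda_k |x_k - x_p|^{p-2}$. The helper names, the bisection solver and the sample points and weights are illustrative choices and do not come from the paper.

```python
import math


def signed_power(t, r):
    """Return sign(t) * |t|**r; with r = p - 1 this is |t|^{p-2} t in 1D."""
    return math.copysign(abs(t) ** r, t)


def euler_lagrange(z, xs, lams, p):
    """Left-hand side of the Euler-Lagrange equation (2.12) in dimension one."""
    return sum(l * signed_power(x - z, p - 1) for x, l in zip(xs, lams))


def p_barycenter_1d(xs, lams, p, iters=200):
    """Solve (2.12) by bisection; the left-hand side is decreasing in z."""
    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if euler_lagrange(mid, xs, lams, p) > 0:
            lo = mid  # the root lies to the right of mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def two_point_formula(x1, x2, lam1, p):
    """Closed form (2.13) for N = 2 with weights lam1 and 1 - lam1."""
    a = lam1 ** (1.0 / (p - 1))
    b = (1.0 - lam1) ** (1.0 / (p - 1))
    return (a * x1 + b * x2) / (a + b)


def gradient_formula(xs, lams, p, i):
    """1D version of (2.14): H_i / sum_k H_k (the common factor cancels)."""
    z = p_barycenter_1d(xs, lams, p)
    h = [l * abs(x - z) ** (p - 2) for x, l in zip(xs, lams)]
    return h[i] / sum(h)


def gradient_finite_difference(xs, lams, p, i, eps=1e-6):
    """Central finite difference of x_p with respect to the i-th point."""
    up, down = list(xs), list(xs)
    up[i] += eps
    down[i] -= eps
    return (p_barycenter_1d(up, lams, p) - p_barycenter_1d(down, lams, p)) / (2 * eps)


# Illustrative data: three points on the line, cubic cost (p = 3).
xs, lams, p = [0.0, 1.0, 3.0], [0.2, 0.3, 0.5], 3.0
z = p_barycenter_1d(xs, lams, p)
print("x_p =", z)
for i in range(len(xs)):
    print(i, gradient_formula(xs, lams, p, i), gradient_finite_difference(xs, lams, p, i))
```

In this toy configuration the barycenter lies strictly between the second and third point, so the configuration is off-diagonal and the two gradient columns printed by the loop should agree up to the finite-difference error.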
We denote by $x_p(\hat{x})$ the barycenter of $N-1$ points with their corresponding weights, i.e.,
\[
x_p(\hat{x}) = \operatorname*{argmin}_{z\in\mathbb{R}^d} \sum_{i=2}^N \lambda_i |\hat{x}_i - z|^p, \qquad \hat{x} = (\hat{x}_2,\dots,\hat{x}_N) \in \mathbb{R}^{(N-1)d} .
\]
Define
\[
G(\hat{x},z) := \sum_{i=2}^N \lambda_i |\hat{x}_i - z|^{p-2} (\hat{x}_i - z) .
\]
For simplicity, throughout the section, we omit the dependence on $\hat{x}$ and simply write $b = b_{\hat{x}}$ and $G(z) := G(\hat{x},z)$. By optimality, $b(x_1)$ is the unique solution $z \in \mathbb{R}^d$ of
\[
\lambda_1 |x_1 - z|^{p-2}(x_1 - z) + G(z) = 0,
\tag{2.15}
\]
which is nothing but (2.12) rewritten for this setting. This in particular implies that
\[
\lambda_1 |x_1 - b(x_1)|^{p-1} = |G(b(x_1))|, \qquad \forall x_1 \in \mathbb{R}^d .
\tag{2.16}
\]
By substituting back into (2.15) with $z = b(x_1)$, we find
\[
x_1 - b(x_1) = - \frac{1}{\lambda_1^{1-\alpha_p}}\, \frac{G(b(x_1))}{|G(b(x_1))|^{\alpha_p}}, \qquad \forall x_1 \in \mathbb{R}^d .
\]
This shows that, for any given $\hat{x} \in \mathbb{R}^{(N-1)d}$, the map $b = b_{\hat{x}}$ is a bijection of $\mathbb{R}^d$, and it has a unique fixed point, given by $x_p(\hat{x})$. Furthermore, the inverse function has the explicit form
\[
b^{-1}(z) = z - \frac{1}{\lambda_1^{1-\alpha_p}}\, \frac{G(z)}{|G(z)|^{\alpha_p}}, \qquad z \in \mathbb{R}^d .
\tag{2.17}
\]

Remark 2.13 (Regularity of $b$). We observe that $b \in C^1(\mathbb{R}^d)$. Indeed, first of all, from Remark 2.11, one can easily infer that in the special cases
\[
p = 2: \quad \nabla b(x_1) = \lambda_1\, \mathrm{Id}, \qquad \text{for every } x_1 \in \mathbb{R}^d,
\]
\[
N = 2: \quad \nabla b(x_1) = \frac{\lambda_1^{1-\alpha_p}}{\lambda_1^{1-\alpha_p} + (1-\lambda_1)^{1-\alpha_p}}\, \mathrm{Id}, \qquad \text{for every } x_1 \in \mathbb{R}^d .
\]
In these cases, therefore, both the gradient of $b$ and that of $b^{-1}$ are positive multiples of the identity. For $N > 2$ and $p \ne 2$, if $\hat{x} \notin \operatorname{diag}(\mathbb{R}^{d(N-1)})$, then $(x_1,\hat{x}) \notin \operatorname{diag}(\mathbb{R}^{dN})$ and regularity follows by an application of the Implicit Function Theorem (via the function $F(x_1,\hat{x},z) := \lambda_1 |x_1 - z|^{p-2}(x_1 - z) + G(z)$, similarly to the proof of Proposition 2.12).
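The inversion formula (2.17) can be tested against a direct solution of (2.15). The following is a minimal sketch in dimension $d = 1$, where $G/|G|^{\alpha_p}$ becomes $\operatorname{sign}(G)\,|G|^{1-\alpha_p}$. The function names, the bisection solver and the sample data are assumptions made for illustration only.

```python
import math


def signed_power(t, r):
    """Return sign(t) * |t|**r."""
    return math.copysign(abs(t) ** r, t)


def G(z, x_hat, lams_hat, p):
    """G(z) = sum_{i>=2} lam_i |x_i - z|^{p-2} (x_i - z), in dimension one."""
    return sum(l * signed_power(x - z, p - 1) for x, l in zip(x_hat, lams_hat))


def b(x1, lam1, x_hat, lams_hat, p, iters=200):
    """Solve (2.15) for z by bisection; its left-hand side is decreasing in z."""
    pts = [x1] + list(x_hat)
    lo, hi = min(pts), max(pts)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if lam1 * signed_power(x1 - mid, p - 1) + G(mid, x_hat, lams_hat, p) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def b_inverse(z, lam1, x_hat, lams_hat, p):
    """Formula (2.17) in 1D: z - sign(G) |G|^{1 - alpha_p} / lam1^{1 - alpha_p}."""
    alpha = (p - 2.0) / (p - 1.0)
    g = G(z, x_hat, lams_hat, p)
    return z - signed_power(g, 1.0 - alpha) / lam1 ** (1.0 - alpha)


# Illustrative data: lam1 is the weight of the free point x1.
lam1, x_hat, lams_hat = 0.2, [1.0, 3.0], [0.3, 0.5]
for p in (1.5, 2.0, 3.0):
    for x1 in (-1.0, 0.0, 0.5, 4.0):
        z = b(x1, lam1, x_hat, lams_hat, p)
        print(p, x1, z, b_inverse(z, lam1, x_hat, lams_hat, p))
```

The last printed column should reproduce the input `x1` in each row, for both the superquadratic and the subquadratic exponent, since the exponent $1 - \alpha_p = 1/(p-1)$ is positive for every $p > 1$.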
If $\hat{x} \in \operatorname{diag}(\mathbb{R}^{d(N-1)})$, we get $b(x_1) = x_p(x_1,\hat{x})$, thus falling back to the case $N = 2$.

Proposition 2.14 (Estimates on $\nabla b^{-1}$, $p \ge 2$). Let us consider $p \ge 2$ and fix $\hat{x} \in \mathbb{R}^{(N-1)d}$, set $b = b_{\hat{x}}$. For every $z \in \mathbb{R}^d$ with $z \ne x_p(\hat{x})$, we have that
\[
\mathrm{Id} \le \nabla b^{-1}(z) \le \Bigl[ 1 + C_p \Bigl(\frac{1-\lambda_1}{\lambda_1}\Bigr)^{1-\alpha_p} \Bigl(\frac{M(z)}{|z - x_p(\hat{x})|}\Bigr)^{p-2} \Bigr] \mathrm{Id},
\tag{2.18}
\]
where $M(z) := \max_{i\ge 2} |\hat{x}_i - z|$ and $C_p \in (0,+\infty)$ is a constant depending only on $p$. If $N > 2$ and $p > 2$, then for every $\hat{x} \notin \operatorname{diag}(\mathbb{R}^{(N-1)d})$ and every compact set $K \subset \mathbb{R}^d$, there exists a constant $C = C(\hat{x},K,p) \in (1,+\infty)$ such that
\[
\frac{1}{C}\, \mathrm{Id} < \bigl( \lambda_1^{1-\alpha_p} |z - x_p(\hat{x})|^{\alpha_p} \bigr) \nabla b^{-1}(z) < C\, \mathrm{Id},
\tag{2.19}
\]
for every $z \in K$.

Proof. For simplicity, we set $z_p := x_p(\hat{x})$. From (2.16), we also see that $b(x_1) = x_1$ is solved uniquely by $x_1 = z_p$, for it is the unique solution $z \in \mathbb{R}^d$ to $G(z) = 0$, as it is given by
\[
z_p \in \operatorname*{argmin}_{z\in\mathbb{R}^d} \Bigl\{ \sum_{i=2}^N \lambda_i |z - \hat{x}_i|^p \Bigr\} .
\]
In particular, $z_p$ is the unique fixed point of the inverse $b^{-1}$ as well. Note that, since $\alpha_p \in [0,1)$ for $p \ge 2$, it readily follows that $b^{-1}$ is also continuous at $z = z_p$.

As $p \ge 2$, $G \in C^1(\mathbb{R}^d)$ and, from (2.17), $b^{-1} \in C^1(\mathbb{R}^d \setminus \{z_p\})$. Indeed, although everywhere continuous, for $N > 2$, $\hat{x} \notin \operatorname{diag}(\mathbb{R}^{(N-1)d})$, and $p > 2$, as $\alpha_p \in (0,1)$, $b^{-1}$ is not differentiable at $z_p$. Using that, for $\alpha \in \mathbb{R}_+$ and $y \in \mathbb{R}^d \setminus \{0\}$,
\[
\nabla\Bigl(\frac{\cdot}{|\cdot|^{\alpha}}\Bigr)(y) = \frac{1}{|y|^{\alpha}}\, \mathrm{Id} - \frac{\alpha}{|y|^{\alpha+1}}\, y \otimes \nabla|y| = \frac{1}{|y|^{\alpha}} \Bigl( \mathrm{Id} - \alpha\, \frac{y \otimes y}{|y|^2} \Bigr),
\]
for $z \ne z_p$ we can differentiate (2.17) and obtain
\[
\nabla b^{-1}(z) = \mathrm{Id} - \frac{1}{\lambda_1^{1-\alpha_p}} \Bigl( \mathrm{Id} - \alpha_p\, \frac{G(z) \otimes G(z)}{|G(z)|^2} \Bigr) \frac{\nabla G(z)}{|G(z)|^{\alpha_p}}, \qquad z \in \mathbb{R}^d \setminus \{z_p\} .
\]
(2.20) For every 𝑧 ∈ R 𝑑 , | 𝐺 ( 𝑧 ) | − 2 𝐺 ( 𝑧 ) ⊗ 𝐺 ( 𝑧 ) is a rank-one matrix with eigenvalues { 0 , 1 } , and therefore , using 𝛼 𝑝 ∈ [ 0 , 1 ) for 𝑝 ≥ 2 , 0 < ( 1 − 𝛼 𝑝 ) Id ≤ Id − 𝛼 𝑝 𝐺 ( 𝑧 ) ⊗ 𝐺 ( 𝑧 ) | 𝐺 ( 𝑧 ) | 2 ≤ Id . (2.21) for every 𝑧 ∈ R 𝑑 \ { 𝑧 𝑝 } . On the other hand, from the very denition of 𝐺 , we have that −∇ 𝐺 ( 𝑧 ) = ( 𝑝 − 2 ) 𝑁  𝑖 = 2 𝜆 𝑖 | 𝑧 − ˆ 𝑥 𝑖 | 𝑝 − 4 ( 𝑧 − ˆ 𝑥 𝑖 ) ⊗ ( 𝑧 − ˆ 𝑥 𝑖 ) +  𝑁  𝑖 = 2 𝜆 𝑖 | 𝑧 − ˆ 𝑥 𝑖 | 𝑝 − 2  Id = 𝑁  𝑖 = 2 𝜆 𝑖 | 𝑧 − ˆ 𝑥 𝑖 | 𝑝 − 2  ( 𝑝 − 2 ) ( 𝑧 − ˆ 𝑥 𝑖 ) ⊗ ( 𝑧 − ˆ 𝑥 𝑖 ) | 𝑧 − ˆ 𝑥 𝑖 | 2 + Id  for 𝑧 ∈ R 𝑑 . Notice that ∇ 𝐺 ( 𝑧 ) = 0 if and only if ˆ 𝑥 ∈ diag ( R ( 𝑁 − 1 ) 𝑑 ) and 𝑧 = 𝑧 𝑝 = ˆ 𝑥 1 = · · · = ˆ 𝑥 𝑁 . For simplicity , we intr oduce the notation 𝐴 ( 𝑧 ) = 𝐴 ( ˆ 𝑥 , 𝑧 ) : = 𝑁  𝑖 = 2 𝜆 𝑖 | 𝑧 − 𝑥 𝑖 | 𝑝 − 2 . The latter computation shows that, for every 𝑧 ∈ R 𝑑 , 𝐴 ( 𝑧 ) Id ≤ − ∇ 𝐺 ( 𝑧 ) ≤ ( 𝑝 − 1 ) 𝐴 ( 𝑧 ) Id , (2.22) as quadratic form. Ther efore, together with ( 2.20 ) and ( 2.21 ) , the latter double b ound ensures that  1 + ( 1 − 𝛼 𝑝 ) 𝐴 ( 𝑧 ) 𝜆 1 − 𝛼 𝑝 1 | 𝐺 ( 𝑧 ) | 𝛼 𝑝  Id ≤ ∇ 𝑏 − 1 ( 𝑧 ) ≤  1 + ( 𝑝 − 1 ) 𝐴 ( 𝑧 ) 𝜆 1 − 𝛼 𝑝 1 | 𝐺 ( 𝑧 ) | 𝛼 𝑝  Id , (2.23) for every 𝑧 ∈ R 𝑑 . Let us show the slightly weaker but more general (as it is claimed to hold uniformly in ˆ 𝑥 ∈ R 𝑑 ( 𝑁 − 1 ) ) lower bound in ( 2.18 ). In fact, it follows directly from ( 2.23 ) as soon as we pro ve | 𝐺 ( 𝑧 ) | ≳ ( 1 − 𝜆 1 ) 2 2 − 𝑝 𝑝 − 1 | 𝑧 − 𝑧 𝑝 | 𝑝 − 1 . (2.24) Indeed, the b ound ( 2.24 ) w ould imply ( 𝑝 − 1 ) 𝐴 ( 𝑧 ) 𝜆 1 − 𝛼 𝑝 1 | 𝐺 ( 𝑧 ) | 𝛼 𝑝 ≤ 𝐶 𝑝 ( 1 − 𝜆 1 ) − 𝛼 𝑝  𝑁 𝑖 = 2 𝜆 𝑖 𝑀 ( 𝑧 ) 𝜆 1 − 𝛼 𝑝 1 | 𝑧 − 𝑧 𝑝 | ( 𝑝 − 1 ) 𝛼 𝑝 = 𝐶 𝑝 ( 1 − 𝜆 1 ) 1 − 𝛼 𝑝 𝑀 ( 𝑧 ) 𝜆 1 − 𝛼 𝑝 1 | 𝑧 − 𝑧 𝑝 | ( 𝑝 − 2 ) , for some 𝐶 𝑃 ∈ ( 0 , +∞) , as claime d in ( 2.18 ). The lower bound on the norm of 𝐺 ( 𝑧 ) instead follos by the strong conv exity of the function 𝑢 ↦→ 𝑔 ( 𝑢 ) : = 1 𝑝 | 𝑢 | 𝑝 , which can be recast as ⟨∇ 𝑔 ( 𝑢 ) − ∇ 𝑔 ( 𝑣 ) , 𝑢 − 𝑣 ⟩ ≥ 2 2 − 𝑝 𝑝 − 1 | 𝑢 − 𝑣 | 𝑝 . 
ON THE 𝑞 -IN TEGRABILIT Y OF 𝑝 - W ASSERSTEIN BARY CEN TERS 15 Using the latter inequality and the fact that 𝐺 ( 𝑧 ) =  𝑁 𝑖 = 2 𝜆 𝑖 ∇ 𝑔 ( 𝑥 𝑖 − 𝑧 ) , we conclude that that ⟨ 𝐺 ( 𝑧 ) − 𝐺 ( 𝑧 𝑝 ) , 𝑧 𝑝 − 𝑧 ⟩ = 𝑁  𝑖 = 2 𝜆 𝑖 ⟨∇ 𝑔 ( 𝑥 𝑖 − 𝑧 𝑝 ) − ∇ 𝑔 ( 𝑥 𝑖 − 𝑧 ) , 𝑧 − 𝑧 𝑝 ⟩ ≥ ( 1 − 𝜆 1 ) 2 2 − 𝑝 𝑝 − 1 | 𝑧 − 𝑧 𝑝 | 𝑝 . Using that 𝐺 ( 𝑧 𝑝 ) = 0 and by means of a simple Cauchy-Schwarz inequality provides the sought lower bound. Let us now assume that 𝑝 > 2 , 𝑁 > 2 , and that ˆ 𝑥 ∉ diag ( R ( 𝑁 − 1 ) 𝑑 ) ). W e want to show the validity of ( 2.19 ), for every compact set 𝐾 ⊂ R 𝑑 . For o-diagonal ˆ 𝑥 , we have that 0 < inf 𝑧 ∈ 𝐾 𝐴 ( 𝑧 ) ≤ sup 𝑧 ∈ 𝐾 𝐴 ( 𝑧 ) < +∞ (2.25) and therefore fr om ( 2.22 ) we conclude that 1 𝑐 Id ≤ − ∇ 𝐺 ( 𝑧 ) ≤ 𝑐 Id ∀ 𝑧 ∈ 𝐾 . (2.26) for some 𝑐 = 𝑐 ( ˆ 𝑥 , 𝐾 ) ∈ ( 0 , +∞) . In particular , −∇ 𝐺 ( 𝑧 ) ≠ 0 everywhere, and therefore for 𝑝 ≠ 2 (as 𝛼 𝑝 ≠ 0 ), ∇ 𝑏 − 1 has a singularity exactly when 𝐺 ( 𝑧 ) = 0 , i.e. in 𝑧 = 𝑧 𝑝 . In a more quantitative way: we have that | 𝐺 ( 𝑧 ) | = | 𝐺 ( 𝑧 ) − 𝐺 ( 𝑧 𝑝 ) | = | ∇ 𝐺 ( 𝑤 𝑧,𝑧 𝑝 ) ( 𝑧 − 𝑧 𝑝 ) | , for some 𝑤 𝑧,𝑧 𝑝 belonging to the segment connecting 𝑧 and 𝑧 𝑝 . Therefore from ( 2.26 ) and the fact that ∇ 𝐺 ( 𝑧 ) is symmetric, we see that 14 1 ˜ 𝑐 | ( 𝑧 − 𝑧 𝑝 ) | ≤ | 𝐺 ( 𝑧 ) | ≤ ˜ 𝑐 | ( 𝑧 − 𝑧 𝑝 ) | , for every 𝑧 ∈ 𝐾 , where ˜ 𝑐 = ˜ 𝑐 ( ˆ 𝑥 , 𝐾 ) ∈ ( 0 , +∞) . In particular , fr om ( 2.23 ) , ( 2.25 ) , and ( 2.26 ) , the validity of ( 2.19 ) readily follows. ■ The case of 𝑝 ∈ ( 1 , 2 ) presents dierent typ e of singularities. In particular , they may occur when 𝑧 = ˆ 𝑥 𝑖 rather than 𝑧 = 𝑥 𝑝 ( ˆ 𝑥 2 , . . . , ˆ 𝑥 𝑁 ) , as explained in the next proposition. For convenience , we set 𝛽 𝑝 : = − 𝛼 𝑝 ∈ ( 0 , +∞) whene ver 𝑝 ∈ ( 1 , 2 ) . W e denote by 𝜆 min : = min 𝑖 = 2 , .. .,𝑁 𝜆 𝑖 > 0 . Proposition 2.15 (Properties of 𝑝 -barycenters, 𝑝 ∈ ( 1 , 2 ) ) . Fix 𝑝 ∈ ( 1 , 2 ) and ˆ 𝑥 ∈ R 𝑑 ( 𝑁 − 1 ) , set 𝑏 = 𝑏 ˆ 𝑥 . For every 𝑧 ∈ R 𝑑 with 𝑧 ≠ ˆ 𝑥 𝑖 for every 𝑖 = 2 , . . . 
, 𝑁 , we have that ( 𝑝 − 1 ) 𝜆 min ( 1 − 𝜆 1 ) 𝛽 𝑝 𝜆 1 + 𝛽 𝑝 1  | 𝑧 − 𝑥 𝑝 ( ˆ 𝑥 ) | 𝑚 ( 𝑧 )  2 − 𝑝 ≤ ∇ 𝑏 − 1 ( 𝑧 ) − Id ≤ ( 1 + 𝛽 𝑝 ) ( 1 − 𝜆 1 ) 𝛽 𝑝 𝜆 1 + 𝛽 𝑝 1  𝑀 ( 𝑧 ) 𝑚 ( 𝑧 )  2 − 𝑝 , (2.27) where 𝑀 ( 𝑧 ) : = max 𝑖 ≥ 2 | ˆ 𝑥 𝑖 − 𝑧 | and 𝑚 ( 𝑧 ) : = min 𝑖 ≥ 2 | ˆ 𝑥 𝑖 − 𝑧 | . Proof. The proof follows the same lines as the proof of the same result for 𝑝 ≥ 2 . Once again, we set 𝑧 𝑝 : = 𝑥 𝑝 ( ˆ 𝑥 ) . Notice that in this case , 𝐺 ∈ 𝐶 0 ( R 𝑑 ) and 𝐺 ∈ 𝐶 1 ( R 𝑑 \ { ˆ 𝑥 2 , . . . , ˆ 𝑥 𝑁 } ) . On the other hand, the function · | · | 𝛼 𝑝 = ( · ) | · | 𝛽 𝑝 is e very wher e dier entiable. Thus 𝑏 − 1 ∈ 𝐶 1 ( R 𝑑 \ { ˆ 𝑥 2 , . . . , ˆ 𝑥 𝑁 } ) . W e have seen in ( 2.20 ) that ∇ 𝑏 − 1 ( 𝑧 ) = Id − 1 𝜆 1 + 𝛽 𝑝 1  Id + 𝛽 𝑝 𝐺 ( 𝑧 ) ⊗ 𝐺 ( 𝑧 ) | 𝐺 ( 𝑧 ) | 2  | 𝐺 ( 𝑧 ) | 𝛽 𝑝 ∇ 𝐺 ( 𝑧 ) , 𝑧 ∈ R 𝑑 \ { ˆ 𝑥 2 , . . . , ˆ 𝑥 𝑁 } . For every 𝑧 ∈ R 𝑑 \ { ˆ 𝑥 2 , . . . , ˆ 𝑥 𝑁 } , arguing as in ( 2.21 ) w e obtained 0 < Id ≤ Id + 𝛽 𝑝 𝐺 ( 𝑧 ) ⊗ 𝐺 ( 𝑧 ) | 𝐺 ( 𝑧 ) | 2 ≤ ( 1 + 𝛽 𝑝 ) Id . 14 For 𝐴 symmetric satisfying 𝛼 Id ≤ 𝐴 ≤ 𝛽 Id , 𝛼 , 𝛽 > 0 , then 𝛼 | 𝑥 | ≤ | 𝐴𝑥 | ≤ 𝛽 | 𝑥 | for every 𝑥 ∈ R 𝑑 . 16 C. BRIZZI AND L. PORTINALE Arguing as in ( 2.22 ) and ( 2.23 ), we obtain that for ev ery 𝑧 ∈ R 𝑑 \ { ˆ 𝑥 2 , . . . , ˆ 𝑥 𝑁 } ,  1 + 𝑝 − 1 𝜆 1 + 𝛽 𝑝 1 ˜ 𝐴 ( 𝑧 ) | 𝐺 ( 𝑧 ) | 𝛽 𝑝  Id ≤ ∇ 𝑏 − 1 ( 𝑧 ) ≤  1 + 1 + 𝛽 𝑝 𝜆 1 + 𝛽 𝑝 1 ˜ 𝐴 ( 𝑧 ) | 𝐺 ( 𝑧 ) | 𝛽 𝑝  Id , (2.28) for every 𝑧 ∈ R 𝑑 , where ˜ 𝐴 ( 𝑧 ) : =  𝑁 𝑖 = 2 𝜆 𝑖 1 | 𝑧 − ˆ 𝑥 𝑖 | 2 − 𝑝 . Now , we observe that 𝜆 min 1 𝑚 ( 𝑧 ) 2 − 𝑝 ≤ ˜ 𝐴 ( 𝑧 ) ≤ 𝑁  𝑖 = 2 𝜆 𝑖 1 𝑚 ( 𝑧 ) 2 − 𝑝 = ( 1 − 𝜆 1 ) 1 𝑚 ( 𝑧 ) 2 − 𝑝 , and | 𝐺 ( 𝑧 ) | 𝛽 𝑝 ≤  𝑁  𝑖 = 2 𝜆 𝑖 𝑀 ( 𝑧 ) 𝑝 − 1  𝛽 𝑝 = ( 1 − 𝜆 1 ) 𝛽 𝑝 𝑀 ( 𝑧 ) 2 − 𝑝 . The inequalities in ( 2.27 ) follow from ( 2.28 ) as well as the lower bound on | 𝐺 | provided in ( 2.24 ) . ■ Note that if 𝑁 = 2 , 𝑚 ( 𝑧 ) = 𝑀 ( 𝑧 ) = | 𝑧 − 𝑥 𝑝 ( ˆ 𝑥 ) | , coherently with Remark 2.13 . 3. 
$L^q$-regularity (and counterexamples) of the $p$-Wasserstein barycenter

In this section, we prove Theorem 1.1, on the $L^q$-integrability of the $p$-Wasserstein barycenter. We start from the case where $\mu_2,\dots,\mu_N$ are single Dirac deltas. We provide a preliminary integrability result for distant supports and a counterexample to the general integrability for $p > 2$, even under the stronger assumption that $f_1 \in L^\infty$ and is compactly supported. This is in stark contrast to what happens in the case of $p$-Wasserstein geodesics, i.e. $N = 2$, or when $p = 2$ (see Remark 3.2). Finally, we extend the proof of the integrability of the density of the barycenter under the assumption of distant supports (cf. ($hp_1$)) beyond the empirical case to general $\mu_2,\dots,\mu_N$, arguing by approximation. This part relies on the preliminary analysis of the map $x_p$ (and thus $b$) which we discussed in the previous section.

Throughout the section, we assume that $\mu_1 \ll \mathcal{L}^d$. By Theorem 2.4 we know that $\nu_p \ll \mathcal{L}^d$. We denote by $f_1, g_p \in L^1(\mathbb{R}^d)$ the corresponding densities, i.e. $\mu_1 = f_1\,\mathrm{d}\mathcal{L}^d$ and $\nu_p = g_p\,\mathrm{d}\mathcal{L}^d$.

3.1. Estimates, integrability, and counterexamples for concentrated measures. Let us assume that there exist $\hat{x}_2,\dots,\hat{x}_N$ such that $\mu_i = \delta_{\hat{x}_i}$ for every $i = 2,\dots,N$. Then $T_i(\cdot) = \hat{x}_i$ is constant for every $i = 2,\dots,N$. Notice that in this case the map $\mathrm{bar}_p$ coincides with the map $b(x_1) := x_p(x_1,\hat{x}_2,\dots,\hat{x}_N)$. Recall that $\mathrm{bar}_p$ is globally injective (see Corollary 2.10) and $C^1$ (see Remark 2.13). In general, it is not a diffeomorphism, as this depends on the location of the supports of the measures. In this section, we see that if the supports are distant enough, $\mathrm{bar}_p$ is a diffeomorphism with a quantitative lower bound on $|\det\nabla\mathrm{bar}_p|$, which provides in this simpler setting a first integrability result (cf. Proposition 3.1).
Otherwise, we may encounter counterexamples (see Example 3.3). We assume compactly supported measures, i.e. there exists $M > 0$ such that
\[
\operatorname{spt}\mu_i \subset B_{\frac{M}{2}}, \qquad \text{for every } i = 1,\dots,N .
\tag{Cpt}
\]
For simplicity, we denote $z_p := x_p(\hat{x}_2,\dots,\hat{x}_N)$.

Proposition 3.1 (Discrete integrability). Let $\mu_1 = f_1\,\mathrm{d}\mathcal{L}^d$ satisfy (Cpt), with $f_1 \in L^q$. Assume
(1) $p \ge 2$ and $z_p \notin \operatorname{spt}\mu_1$, or
(2) $1 < p < 2$ and $\mathrm{bar}_p(x_1) \ne \hat{x}_i$ for every $i = 2,\dots,N$ and for every $x_1 \in \operatorname{spt}\mu_1$.
Then $g_p \in L^q$ and we have
\[
\|g_p\|_{L^q} \le \frac{C}{\lambda_1^{\frac{d}{q'}(1-\alpha_p)}}\, \|f_1\|_{L^q},
\]
where $C \in \mathbb{R}_+$ depends on $M$, $p$, $d$, and, respectively, on
(1) $\operatorname{dist}\bigl(\mathrm{bar}_p(\operatorname{spt}\mu_1), z_p\bigr) > 0$, or
(2) $\min_{i=2,\dots,N} \operatorname{dist}\bigl(\mathrm{bar}_p(\operatorname{spt}\mu_1), \hat{x}_i\bigr) > 0$.

Proof. Using the global injectivity of $\mathrm{bar}_p$ and a change of variables formula, we find that
\[
g_p = \bigl(f_1 \circ \mathrm{bar}_p^{-1}\bigr)\, J_{\mathrm{bar}_p^{-1}},
\qquad\text{where}\qquad
J_{\mathrm{bar}_p^{-1}}(z) = |\det\nabla\mathrm{bar}_p^{-1}(z)| = \bigl( J_{\mathrm{bar}_p}(\mathrm{bar}_p^{-1}(z)) \bigr)^{-1},
\]
for every $z \in \mathbb{R}^d$. We employ this formula to compute the $L^q$-norm of the barycenter $\nu_p$ in this discrete setting: by applying a change of variables formula once again, we find that
\[
\|g_p\|^q_{L^q(\mathbb{R}^d)} = \int \bigl| (f_1 \circ \mathrm{bar}_p^{-1})(z) \bigr|^q \cdot J_{\mathrm{bar}_p^{-1}}(z)^q \,\mathrm{d}z = \int |f_1(x_1)|^q \cdot \bigl( J_{\mathrm{bar}_p^{-1}} \circ \mathrm{bar}_p \bigr)(x_1)^{q-1} \,\mathrm{d}x_1 .
\]
An application of Proposition 2.14 (in particular (2.18)) and the fact that $M \ge M(z) \ge |z - x_p(\hat{x})|$ for every $z \in \mathrm{bar}_p(\operatorname{spt}\mu_1)$ (by (Cpt)) ensure that
\[
J_{\mathrm{bar}_p^{-1}} \circ \mathrm{bar}_p(\cdot) \lesssim M^{d(p-2)}\, \lambda_1^{d(\alpha_p - 1)}\, |\mathrm{bar}_p(\cdot) - z_p|^{d(2-p)} .
\]
As $\mathrm{bar}_p$ is injective and continuous, $\mathrm{bar}_p(z_p) = z_p$, and $\operatorname{spt}\mu_1$ is compact, from $z_p \notin \operatorname{spt}\mu_1$ we deduce that $\operatorname{dist}(\mathrm{bar}_p(\operatorname{spt}\mu_1), z_p) > 0$. Therefore we conclude
\[
\|g_p\|^q_{L^q} \lesssim \frac{C}{\lambda_1^{d(q-1)(1-\alpha_p)}}\, \|f_1\|^q_{L^q}\, \operatorname{dist}\bigl(\mathrm{bar}_p(\operatorname{spt}\mu_1), z_p\bigr)^{-d(q-1)(p-2)},
\]
where $C = C(p,d,M) \in \mathbb{R}_+$, as claimed.
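To see why the distance from $z_p$ enters the estimate, here is a one-dimensional sketch. It assumes $d = 1$, $N = 3$, $p = 3$ and arbitrary illustrative points and weights; the helper names are hypothetical and not taken from the paper. It evaluates the Jacobian of $b^{-1}$ from (2.17) by central finite differences at points approaching $z_p$, and rescales it by $|z - z_p|^{\alpha_p}$, the rate suggested by (2.19).

```python
import math


def signed_power(t, r):
    """Return sign(t) * |t|**r."""
    return math.copysign(abs(t) ** r, t)


def G(z, x_hat, lams_hat, p):
    """G(z) = sum_{i>=2} lam_i |x_i - z|^{p-2} (x_i - z), in dimension one."""
    return sum(l * signed_power(x - z, p - 1) for x, l in zip(x_hat, lams_hat))


def b_inverse(z, lam1, x_hat, lams_hat, p):
    """Formula (2.17) in 1D."""
    alpha = (p - 2.0) / (p - 1.0)
    g = G(z, x_hat, lams_hat, p)
    return z - signed_power(g, 1.0 - alpha) / lam1 ** (1.0 - alpha)


def fixed_point(x_hat, lams_hat, p, iters=200):
    """z_p: the unique zero of the decreasing function G, found by bisection."""
    lo, hi = min(x_hat), max(x_hat)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if G(mid, x_hat, lams_hat, p) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def jacobian_b_inverse(z, lam1, x_hat, lams_hat, p, h):
    """Central finite difference of b^{-1} at z (the 1D Jacobian)."""
    forward = b_inverse(z + h, lam1, x_hat, lams_hat, p)
    backward = b_inverse(z - h, lam1, x_hat, lams_hat, p)
    return (forward - backward) / (2.0 * h)


# Illustrative data (assumptions, not from the paper).
lam1, x_hat, lams_hat, p = 0.2, [1.0, 3.0], [0.3, 0.5], 3.0
alpha = (p - 2.0) / (p - 1.0)
z_p = fixed_point(x_hat, lams_hat, p)

rows = []
for t in (1e-2, 1e-3, 1e-4):
    jac = jacobian_b_inverse(z_p + t, lam1, x_hat, lams_hat, p, h=1e-3 * t)
    rows.append((t, jac, jac * t ** alpha))
    print("distance", t, "Jacobian", jac, "rescaled", jac * t ** alpha)
```

In this toy case the Jacobian grows as $z \to z_p$ while the rescaled values stay in a bounded range. This is the behaviour that the assumption $z_p \notin \operatorname{spt}\mu_1$ keeps away from $\mathrm{bar}_p(\operatorname{spt}\mu_1)$ in the proof above.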
A very similar argument applies to the case $1 < p < 2$, and by Proposition 2.15 we get
\[
J_{\mathrm{bar}_p^{-1}} \circ \mathrm{bar}_p(\cdot) \lesssim \frac{M^{d(2-p)}\, \lambda_1^{d(\alpha_p - 1)}}{(m \circ \mathrm{bar}_p)(\cdot)^{d(2-p)}},
\]
where we recall $m(z) := \min_{i=2,\dots,N} |z - \hat{x}_i|$. We thus obtain
\[
\|g_p\|^q_{L^q} \lesssim \frac{C}{\lambda_1^{d(q-1)(1-\alpha_p)}}\, \|f_1\|^q_{L^q}\, \min_{i=2,\dots,N} \operatorname{dist}\bigl(\mathrm{bar}_p(\operatorname{spt}\mu_1), \hat{x}_i\bigr)^{-d(q-1)(2-p)},
\]
as claimed. ■

Remark 3.2. We remark that
• When $N = 2$, thanks to (2.8) and (2.13), one easily infers that
\[
\nu_p := W_p\mathrm{bar}\bigl((\mu_1,(1-t)),(\mu_2,t)\bigr) = \bigl( (1 - t(p))\,\mathrm{Id} + t(p)\, T_2 \bigr)_\sharp \mu_1,
\tag{3.1}
\]
where $t(p) := t^{\frac{1}{p-1}} \bigl( t^{\frac{1}{p-1}} + (1-t)^{\frac{1}{p-1}} \bigr)^{-1}$ and $T_2$, given by Theorem 2.5, is also the optimal map for $W_p(\mu_1,\mu_2)$. Thus, as expected, the $p$-Wasserstein barycenter with weights $(1-t)$, $t$ is the $p$-Wasserstein geodesic parametrized by $t(p)$. In the case of Proposition 3.1, with $\mu_2 = \delta_{\hat{x}_2}$, as highlighted in Remark 2.13, one has $\nabla\mathrm{bar}_p(x_1) = (1 - t(p))\,\mathrm{Id}$, hence by the change of variables formula
\[
\|g_p\|^q_{L^q} \le (1 - t(p))^{d(1-q)}\, \|f_1\|^q_{L^q},
\]
regardless of any assumptions on $\operatorname{spt}\mu_1$, consistently with [San15, Lemma 4.22].
• When $p = 2$, thanks to (2.8) and to Remark 2.13, from the change of variables formula one directly infers that
\[
\|g_p\|^q_{L^q} \le \lambda_1^{d(1-q)}\, \|f_1\|^q_{L^q},
\]
without extra assumptions such as $\operatorname{dist}\bigl(\mathrm{bar}_p(\operatorname{spt}\mu_1), z_p\bigr) > 0$, consistently with the results in [AC11, Theorem 5.11].

Example 3.3 (Counterexamples to integrability). The following examples show that, when $p \ne 2$, $N > 2$, and $\hat{x} \notin \operatorname{diag}(\mathbb{R}^{d(N-1)})$, assumptions (1) and (2) in Proposition 3.1 cannot be omitted in general. Let $B \subset \mathbb{R}^d$ be an open set of $\mathcal{L}^d$-measure $1$ and consider $\mu_1 := f_1 \mathcal{L}^d$ where $f_1 := \mathbf{1}_B$. We consider two cases of interest.

(Counterexamples for $p > 2$). For superquadratic costs, we assume that $z_p \in B$.
Then w e claim that the associated W asserstein barycenter 𝜈 𝑝 = 𝑔 𝑝 L 𝑑 is such that 𝑔 𝑝 ∈ 𝐿 𝑞 ⇔ 𝑞 < 1 𝛼 𝑝 = 𝑝 − 1 𝑝 − 2 . (3.2) Note that this provides a counter example to the 𝑞 -integrability for 𝑝 > 2 as soon as 𝑞 ≥ 𝛼 − 1 𝑝 . In order to show ( 3.2 ), w e apply Proposition 2.14 , and in particular ( 2.19 ), obtaining that  | 𝑔 𝑝 ( 𝑧 ) | 𝑞 d 𝑧 =  𝐵   ( 𝐽 bar − 1 𝑝 ◦ bar 𝑝 ) ( 𝑥 1 )   𝑞 − 1 d 𝑥 1 =  bar 𝑝 ( 𝐵 )   𝐽 bar − 1 𝑝 ( 𝑧 )   𝑞 d 𝑧 ≃  bar 𝑝 ( 𝐵 ) 1 | 𝑧 − 𝑧 𝑝 | 𝑑 𝑞𝛼 𝑝 d 𝑧 . The claimed conclusion then follows from the fact that bar 𝑝 is an invertible open (it has a continuous inverse) map and the fact that  𝐵 1 ( 0 ) 1 | 𝑧 | 𝑑 𝑞𝛼 𝑝 d 𝑧 < ∞ ⇔ 𝑑 𝑞𝛼 𝑝 < 𝑑 ⇔ 𝑞 < 1 𝛼 𝑝 . ( Counterexamples for 1 < 𝑝 < 2 ). Assume in this case that ˆ 𝑥 𝑖 ∈ bar 𝑝 ( 𝐵 ) , for some 𝑖 = 1 , . . . , 𝑁 . Then we have that 𝑔 𝑝 ∈ 𝐿 𝑞 ⇔ 𝑞 < 1 2 − 𝑝 . Indeed, the estimate ( 2.27 ) , provided by Proposition 2.15 , sho ws that 𝐽 bar − 1 𝑝 ( 𝑧 ) ≃ | 𝑧 − ˆ 𝑥 𝑖 | 𝑑 ( 𝑝 − 2 ) as 𝑧 → ˆ 𝑥 𝑖 . 3.2. 𝐿 𝑞 -estimates for general measures with distant support. In this section we prove Theorem 1.1 . In particular , by using a discr etization approach, we show that Pr oposition 3.1 can be extended to general measures 𝜇 2 , . . . , 𝜇 𝑁 . For 𝑞 ∈ N , we denote by 𝑞 ′ ∈ N its conjugate. As conclusions and assumptions are dier ent depending on whether 𝑝 < 2 or 𝑝 > 2 , we divide the proof into these two cases. 3.2.1. Integrability for 𝑝 ≥ 2 . Recall that we assume compactly supported measures ( C pt ) and work with the assumption D : = inf ( 𝑥 1 ,. . ., 𝑥 𝑁 ) ∈ > 𝑁 𝑖 = 1 spt 𝜇 𝑖   𝑥 𝑝 ( 𝑥 1 , 𝑥 2 , . . . , 𝑥 𝑁 ) − 𝑥 𝑝 ( 𝑥 2 , . . . , 𝑥 𝑁 )   > 0 . ( ℎ𝑝 1 ) Note that, when the measures are compactly supported, ( ℎ𝑝 1 ) is equivalent to dist  spt 𝜇 1 , 𝑥 𝑝  𝑁 ? 𝑖 = 2 spt 𝜇 𝑖   > 0 . ( ℎ𝑝 ∗ 1 ) Indeed, by contradiction: assume D = 0 , which, by compactness and continuity of 𝑥 𝑝 , means that one can nd points 𝑥 1 , . . . , 𝑥 𝑁 with 𝑥 𝑖 ∈ spt 𝜇 𝑖 such that 𝑥 𝑝 ( 𝑥 1 , 𝑥 2 , . . . 
, 𝑥 𝑁 ) = 𝑥 𝑝 ( 𝑥 2 , . . . , 𝑥 𝑁 ) = 𝑥 𝑝  𝑥 𝑝 ( 𝑥 2 , . . . , 𝑥 𝑁 ) , 𝑥 2 , . . . , 𝑥 𝑁  . As the barycenter map 𝑥 1 ↦→ 𝑥 𝑝 ( 𝑥 1 , 𝑥 2 , . . . , 𝑥 𝑁 ) is injective for every 𝑥 2 , . . . , 𝑥 𝑁 , we conclude that 𝑥 1 = 𝑥 𝑝 ( 𝑥 2 , . . . , 𝑥 𝑁 ) , with 𝑥 𝑖 ∈ spt 𝜇 𝑖 , which clearly contradicts the assumption ( ℎ𝑝 ∗ 1 ) . Viceversa , if there exist 𝑥 1 , . . . , 𝑥 𝑁 such that 𝑥 1 = 𝑥 𝑝 ( 𝑥 2 , . . . , 𝑥 𝑁 ) , it must hold 𝑥 𝑝 ( 𝑥 1 , 𝑥 2 , . . . , 𝑥 𝑁 ) = 𝑥 𝑝 ( 𝑥 2 , . . . , 𝑥 𝑁 ) , contradicting ( ℎ 𝑝 1 ). ON THE 𝑞 -IN TEGRABILIT Y OF 𝑝 - W ASSERSTEIN BARY CEN TERS 19 Proof of Theorem 1.1 ( 𝑝 ≥ 2 ) . W e proceed in two steps: rst, we extend the validity of Proposi- tion 3.1 to the case when 𝜇 2 , . . . , 𝜇 𝑁 are empirical measures . Secondly , we will make use of this result and provide an appr oximation argument with respect to which the sought ine quality and our assumption ( ℎ𝑝 1 ) are stable. Step 1. Assume there e xists atoms { 𝑥 𝑖 𝑗 } 𝑖 , 𝑗 ⊂ R 𝑑 and masses { 𝑚 𝑖 𝑗 } 𝑖 , 𝑗 ⊂ R + so that 𝜇 𝑖 = 𝐾 𝑖  𝑗 = 1 𝑚 𝑖 𝑗 𝛿 𝑥 𝑖 𝑗 , for 𝑖 = 2 , . . . , 𝑁 , 𝑗 = 1 , . . . , 𝐾 𝑖 ∈ N . (3.3) As 𝜇 1 ≪ L 𝑑 , we have 𝜈 𝑝 = 𝑔 𝑝 L 𝑑 = ( bar 𝑝 ) # ( 𝜇 1 ) with bar 𝑝 ( 𝑥 1 ) = 𝑥 𝑝 ( 𝑥 1 , 𝑇 2 ( 𝑥 1 ) , . . . , 𝑇 𝑁 ( 𝑥 1 ) ) . As the measures are empirical, for e very 𝑖 = 2 , . . . , 𝑁 the map 𝑇 𝑖 = 𝑅 𝑖 ◦ 𝑆 1 is piecewise constant and assumes precisely the value { 𝑥 𝑖 1 , . . . , 𝑥 𝑖 𝐾 𝑖 } ⊂ R 𝑑 . W e divide the supp ort of 𝜇 1 in  𝑁 𝑖 = 1 𝐾 𝑖 sets, using the maps 𝑇 2 , . . . , 𝑇 𝑁 , as follows: for each choice of multi-index j = ( 𝑗 2 , . . . , 𝑗 𝑁 ) ∈ J with 𝑗 𝑖 ∈ { 1 , . . . , 𝐾 𝑖 } , w e dene Ω j : =  𝑥 ∈ R 𝑑 ∩ spt 𝜇 1 : 𝑇 𝑖 ( 𝑥 1 ) = 𝑥 𝑗 𝑖  . Note that such sets are, by construction, disjoint: Ω j ∩ Ω k = ∅ if j ≠ k . Consider now the image of such sets via bar 𝑝 and set Ω 𝑝 j : = bar 𝑝 ( Ω j ) for 𝑗 ∈ J . 
As bar 𝑝 is injective (Cor ollar y 2.10 ), we conclude that the new sets are also disjoint, and all in all, w e have Ω 𝑝 j ∩ Ω 𝑝 k = ∅ if j ≠ k and 𝜈 𝑝   j ∈ J Ω 𝑝 j  = 𝜈 𝑝  bar 𝑝 ( spt 𝜇 1 )  = 1 . Moreover , for every given multi-index j ∈ J , we have that bar 𝑝   Ω j = 𝑏 j where 𝑏 j ( 𝑥 1 ) : = 𝑥 𝑝 ( 𝑥 1 , 𝑥 𝑗 2 , . . . , 𝑥 𝑗 𝑁 ) , 𝑥 1 ∈ R 𝑑 . In particular , it is the restriction of a locally Lipschitz function in R 𝑑 (cfr . Proposition 2.12 ). A s a consequence, using once again that bar 𝑝 is injective (cfr . Corollar y 2.10 ), w e conclude that 𝜈 𝑝   Ω 𝑝 j inj = ( bar 𝑝 ) #  𝜇 1   Ω j  = ( 𝑏 j ) #  𝜇 1   Ω j  = ( 𝑏 j ) # 𝜇 1   Ω 𝑝 j , ∀ j ∈ J . From the above equality , using the change of variable formula [ AFP00 ] for locally Lipschitz map 𝑏 j , we deduce that, locally on Ω j , the density 𝑔 𝑝 of 𝜈 𝑝 can be written as ( 𝑔 𝑝 ◦ 𝑏 j ) 𝐽 𝑏 j = 𝑓 1 on Ω j ⇐ ⇒ 𝑔 𝑝 =  𝑓 1 ◦ 𝑏 − 1 j  𝐽 𝑏 − 1 j on Ω 𝑝 j . (3.4) Here 𝐽 𝑏 j denotes the Jacobian of 𝑏 j , i.e. 𝐽 𝑏 j ( 𝑥 ) : = | det ∇ 𝑏 j ( 𝑥 ) | = 𝐽 𝑏 − 1 j ( 𝑏 j ( 𝑥 ) ) − 1 , 𝑥 ∈ R 𝑑 . (3.5) Therefore , by ( 3.4 ) and using that { Ω 𝑝 j } j are disjoint, we can compute the norm ∥ 𝑔 𝑝 ∥ 𝑞 𝐿 𝑞 disj =  j ∈ J  Ω 𝑝 j | 𝑔 𝑝 ( 𝑧 ) | 𝑞 d 𝑧 ( 3.4 ) =  j ∈ J  Ω 𝑝 j    𝑓 1 ◦ 𝑏 − 1 j  ( 𝑧 ) 𝐽 𝑏 − 1 j ( 𝑧 )   𝑞 d 𝑧 (3.6) c.o.v . =  j ∈ J  Ω j | 𝑓 1 ( 𝑥 1 ) | 𝑞 ( 𝐽 𝑏 − 1 j ◦ 𝑏 j ) ( 𝑥 1 ) 1 − 𝑞 d 𝑥 1 , From now on, we simply argue in the same way we did for Proposition 3.1 . By a direct application of Proposition 2.14 (in particular ( 2.18 ) ), ( 3.5 ) , and ( ℎ𝑝 1 ) , we conclude that for 𝑥 1 ∈ Ω j , 𝐽 𝑏 − 1 j ◦ 𝑏 j ( ·) ≤ 𝐶 𝜆 𝑑 ( 𝛼 𝑝 − 1 ) 1 | 𝑏 j ( ·) − 𝑧 𝑝 | 𝑑 ( 2 − 𝑝 ) ≤ 𝐶 𝜆 𝑑 ( 𝛼 𝑝 − 1 ) 1 D 𝑑 ( 2 − 𝑝 ) (3.7) 20 C. BRIZZI AND L. PORTINALE for some 𝐶 ∈ R + (depending on 𝑀 , given in ( C pt ) ), where D > 0 is the one given by ( ℎ𝑝 1 ) . 
By inserting this lower bound into ( 3.6 ) we conclude that ∥ 𝑔 𝑝 ∥ 𝑞 𝐿 𝑞 ≤ 𝐶 𝑞 − 1  𝜆 𝑑 ( 1 − 𝛼 𝑝 ) D 𝑑 ( 𝑝 − 2 )  𝑞 − 1  j ∈ J  Ω j | 𝑓 1 ( 𝑥 1 ) | 𝑞 d 𝑥 1 disj = 𝐶 𝑞 − 1  𝜆 𝑑 ( 1 − 𝛼 𝑝 ) D 𝑑 ( 𝑝 − 2 )  𝑞 − 1 ∥ 𝑓 1 ∥ 𝑞 𝐿 𝑞 , which provides the claimed integrability . Step 2 . W e procee d by approximation. Let 𝜇 1 , . . . , 𝜇 𝑁 be measures satisfying ( ℎ𝑝 1 ) . W e construct 𝑁 − 1 se quences of measures 𝜇 𝑛 2 , . . . , 𝜇 𝑛 𝑁 so that, for every 𝑖 = 2 , . . . , 𝑁 , (1) For ev ery 𝑛 ∈ N , 𝜇 𝑛 𝑖 is an empirical measure of the form ( 3.3 ). (2) For ev ery 𝑛 ∈ N , we have spt 𝜇 𝑛 𝑖 ⊂ spt 𝜇 𝑖 . (3) W e have 𝜇 𝑛 𝑖 → 𝜇 𝑖 narrowly (i.e . in duality with 𝐶 𝑏 ( R 𝑑 ) ). The existence of such a construction is standard and ther efore omitted. Observe that by construction, we hence have 𝑁 ? 𝑖 = 1 spt 𝜇 𝑛 𝑖 ⊂ 𝑁 ? 𝑖 = 1 spt 𝜇 𝑖 = ⇒ D 𝑛 : = D ( 𝜇 1 , 𝜇 𝑛 2 , . . . , 𝜇 𝑛 𝑁 ) ≥ D ( 𝜇 1 , 𝜇 2 , . . . , 𝜇 𝑁 ) = D > 0 , for all 𝑛 ∈ N . Denoted by 𝜈 𝑛 𝑝 = 𝑔 𝑛 𝑝 L 𝑑 the corresponding 𝑊 𝑝 -barycenter of the measures 𝜇 1 , 𝜇 𝑛 2 , . . . , 𝜇 𝑛 𝑁 , we can therefor e apply the reasoning from Step 1 and obtain ∥ 𝑔 𝑛 𝑝 ∥ 𝐿 𝑞 ≤ 𝐶  𝜆 𝑑 ( 1 − 𝛼 𝑝 ) D 𝑑 ( 𝑝 − 2 ) 𝑛  𝑞 ′ ∥ 𝑓 1 ∥ 𝐿 𝑞 ≤ 𝐶  𝜆 𝑑 ( 1 − 𝛼 𝑝 ) D 𝑑 ( 𝑝 − 2 )  𝑞 ′ ∥ 𝑓 1 ∥ 𝐿 𝑞 , ∀ 𝑛 ∈ N . (3.8) The proof will thus be complete if we are able to show that lim inf 𝑛 →∞ ∥ 𝑔 𝑛 𝑝 ∥ 𝐿 𝑞 ≥ ∥ 𝑔 𝑝 ∥ 𝐿 𝑞 . W e claim that 𝜈 𝑛 𝑝 → 𝜈 𝑝 narrowly as 𝑛 → ∞ . Inde ed, by a standard stability r esult of optimal transport, see for instance [ Fri24 , Theorem 6.13] or [ ABS24 , Theorem 6.8], if 𝛾 𝑛 𝑝 and 𝛾 𝑝 are respectively the optimal plans for ( C 𝑝 − MM ) with marginals 𝜇 𝑛 1 , . . . , 𝜇 𝑛 𝑁 and 𝜇 1 , . . . , 𝜇 𝑁 , 𝛾 𝑛 𝑝 → 𝛾 𝑝 . W e conclude by observing that 𝜈 𝑛 𝑝 and 𝜈 𝑝 are the push-forward with respect to the continuous map 𝑥 𝑝 of, respectively , 𝛾 𝑛 𝑝 and 𝛾 𝑝 . The lower-semicontinuity property ( 3.8 ) of the 𝐿 𝑞 -norm with respect to the narro wly convergence of measur es then follows by standard arguments. 
Inde ed, without loss of generality , we can assume sup 𝑛 ∈ N ∥ 𝑔 𝑛 𝑝 ∥ 𝐿 𝑞 < ∞ , if not there is nothing to prove. In this case, as 𝐿 𝑞 is reexiv e for 𝑞 ∈ ( 1 , +∞) , up to a non-relabeled subsequence, ther e must exist 𝑔 ∈ 𝐿 𝑞 so that 𝑔 𝑛 𝑝 ⇀ 𝑔 weakly in 𝐿 𝑞 , or in other words lim 𝑛 →∞  𝑔 𝑛 𝑝 𝜑 d 𝑥 =  𝑔𝜑 d 𝑥 , ∀ 𝜑 ∈ 𝐿 𝑞 ′ . As we alr eady know that 𝜈 𝑛 𝑝 → 𝜈 𝑝 narrowly , we conclude that  𝑔𝜑 d 𝑥 =  𝑔 𝑝 𝜑 d 𝑥 , ∀ 𝜑 ∈ 𝐿 𝑞 ′ ∩ 𝐶 𝑏 ( R 𝑑 ) . As 𝐿 𝑞 ′ ( R 𝑑 ) ∩ 𝐶 𝑏 ( R 𝑑 ) ↩ → 𝐿 𝑞 ′ ( R 𝑑 ) densely , this implies 𝑔 = 𝑔 𝑝 . Finally , ( 3.8 ) follows by the standard lower-semicontinuity of the 𝐿 𝑞 -norm with respect to the weak convergence. ■ 3.2.2. Integrability for 𝑝 ∈ ( 1 , 2 ) . By means of the ver y same approach, one can show that a result of the same type as in Proposition 3.1 can b e obtaine d when 𝑝 ∈ ( 1 , 2 ) under suitable assumptions, which take care of the dierent type of singularities, as already illustrate d in Proposition 2.15 . Recall that in this regime we work with once again with compactly supported measures ( C pt ) as well as the slightly stornger geometric assumption 𝑚 : = inf 𝑖 ∈ { 1 ,. . .,𝑁 } inf ( 𝑥 1 ,. . ., 𝑥 𝑁 ) ∈ > 𝑁 𝑖 = 1 spt 𝜇 𝑖   𝑥 𝑖 − 𝑥 𝑝 ( 𝑥 1 , 𝑥 2 , . . . , 𝑥 𝑁 )   > 0 . ( ℎ𝑝 2 ) As usual, for 𝑞 ∈ N , we denote by 𝑞 ′ ∈ N its conjugate. ON THE 𝑞 -IN TEGRABILIT Y OF 𝑝 - W ASSERSTEIN BARY CEN TERS 21 Proof of Theorem 1.1 ( 𝑝 ∈ ( 1 , 2 ) ) . W e just sketch the steps of the pr o of as it contains no substan- tially new ideas to the case with 𝑝 ≥ 2 . The approximation argument used for the case 𝑝 > 2 works practically unchanged, with the only dierence consisting in the low er bound in ( 3.7 ) , which, for 𝑝 ∈ ( 1 , 2 ) , as a consequence of Proposition 2.15 , is replaced by 𝐽 𝑏 − 1 j ◦ 𝑏 j ( ·) ≤ 𝐶 𝜆 𝑑 ( 1 + 𝛽 𝑝 ) 1 ( 𝑚 𝑗 ◦ 𝑏 j ) ( · ) 𝑑 ( 2 − 𝑝 ) ≤ 𝐶 𝜆 𝑑 ( 1 + 𝛽 𝑝 ) 1 𝑚 𝑑 ( 2 − 𝑝 ) where 𝑚 𝑗 ( 𝑧 ) : = min 𝑖 ≥ 2 | 𝑥 𝑗 𝑖 − 𝑧 | satises ( 𝑚 𝑗 ◦ 𝑏 j ) ( 𝑥 1 ) ≥ 𝑚 for every j and 𝑥 1 ∈ spt 𝜇 1 by construction. 
The rest of the proof works verbatim. ■

4. A general estimate on the $L^q$-norm of $p$-Wasserstein barycenters

The goal of this section is to prove the general estimate on the norm of the density of the $p$-Wasserstein barycenter stated in Proposition 1.3. This holds without any assumption on the supports and it highlights possible sources of nonintegrability. This provides (cf. Remark 4.5) an alternative proof of the integrability result given by Theorem 1.1, under the slightly stronger assumption that $(hp_2)$ holds for every $1 < p < \infty$.

First, for every $S \subset \{1, \dots, N\}$, let us consider the definition of the sets $D_S$ and $D_S^1$, given, respectively, by (1.5) and (1.6). Notice that if $\gamma_p = (\mathrm{Id}, T_2, \dots, T_N)_\# \mu_1 \in \mathcal P(\mathbb R^{dN})$ is the optimal coupling for the problem ($\mathcal C_p$-MM), then

$$D_S^1 := \pi_1(D_S \cap \operatorname{spt} \gamma_p) = \big\{ x_1 \in \operatorname{spt} \mu_1 \;:\; T_i(x_1) = \mathrm{bar}_p(x_1) \text{ if } i \in S , \text{ and } T_i(x_1) \ne \mathrm{bar}_p(x_1) \text{ if } i \notin S \big\} ,$$

where $\mathrm{bar}_p$ is given by (2.7). The first result of this section concerns the differentiability properties of $\mathrm{bar}_p$.

Proposition 4.1 (Differentiability of $\mathrm{bar}_p$). For $p > 2$, the map $\mathrm{bar}_p$ is differentiable $\mathcal L^d$-a.e. on the set $\bigcup_{S \notin \mathcal F_1} D_S^1$. If $1 < p \le 2$, $\mathrm{bar}_p$ is differentiable $\mathcal L^d$-a.e. on $\operatorname{spt} \mu_1$.

Proof. By Corollary 2.10 we know that $\mathrm{bar}_p(x_1) = S_1(x_1)$, $\mathcal L^d$-a.e. on $\operatorname{spt} \mu_1$, where $S_1$ is given by (2.6). Let $\widetilde\varphi_1$ be the same as in (2.6). By Theorem 2.1 and Remark 2.3, $\widetilde\varphi_1$ is twice differentiable $\mathcal L^d$-a.e. on $\operatorname{spt} \mu_1$. Therefore, in order to prove the differentiability of $S_1$, it is enough to look at the differentiability of the function $\frac{\,\cdot\,}{|\cdot|^{\alpha_p}}$. If $1 < p \le 2$, then $\alpha_p \le 0$ and the function is everywhere differentiable. If $p > 2$, then $\alpha_p \in (0, 1)$ and $\frac{\,\cdot\,}{|\cdot|^{\alpha_p}}$ is differentiable in $\mathbb R^d \setminus \{0\}$. Let $x_1 \in \operatorname{spt} \mu_1$ be such that $\nabla \widetilde\varphi_1(x_1)$ exists. By Corollary 2.2 and standard duality arguments, we have that

$$\nabla \widetilde\varphi_1(x_1) = \nabla_{x_1} |x_1 - z|^p = p \, |x_1 - z|^{p-2} (x_1 - z) \qquad \text{for } \gamma_1\text{-a.e. } (x_1, z) . \tag{4.1}$$

Moreover, $(x_1, z) \in \operatorname{spt} \gamma_1$ if and only if $z = \mathrm{bar}_p(x_1)$ (see (2.7)). If $x_1 \in D_S^1$, with $S \notin \mathcal F_1$, then $\mathrm{bar}_p(x_1) \ne x_1$ and thus $|x_1 - \mathrm{bar}_p(x_1)| = |x_1 - z| > 0$, and, by (4.1), $|\nabla \widetilde\varphi_1(x_1)| > 0$. This implies that $S_1$ is $\mathcal L^d$-a.e. differentiable on $\bigcup_{S \notin \mathcal F_1} D_S^1$. ■

Remark 4.2 ($D_S^1$ with $S \in \mathcal F_1$). For $p > 2$, on the set $D_S^1$, with $S \in \mathcal F_1$, we cannot generally ensure differentiability of $\mathrm{bar}_p$. On the other hand, by definition of $D_S^1$, $x_1 = \mathrm{bar}_p(x_1) = S_1(x_1)$, meaning that $S_1 = \mathrm{Id}$ on $\bigcup_{S \in \mathcal F_1} D_S^1$. This shows that, when restricted to the set $\bigcup_{S \in \mathcal F_1} D_S^1$, $\nabla \mathrm{bar}_p$ exists, where $\nabla \mathrm{bar}_p(x)$ is identified by the standard property

$$\mathrm{bar}_p(y) - \mathrm{bar}_p(x) - \nabla \mathrm{bar}_p(x) (y - x) = o(|x - y|) \qquad \text{as } y \to x \text{ in } \bigcup_{S \in \mathcal F_1} D_S^1 .$$

With this definition, $\nabla \mathrm{bar}_p$ is uniquely determined whenever $\bigcup_{S \in \mathcal F_1} D_S^1$ has positive density at $x$. Hence, ambiguity occurs only on an $\mathcal L^d$-negligible subset, which is not relevant while integrating. However, we see from Proposition 1.3 that the values of $\nabla \mathrm{bar}_p$ on the set $\bigcup_{S \in \mathcal F_1} D_S^1$ are irrelevant to the analysis.

Next, we seek injectivity estimates on $\mathrm{bar}_p$ which provide upper bounds on the Jacobian of its inverse. This is a direct consequence of the injectivity estimate of Lemma A.2, which is an ad hoc version of [BFR26, Lemma 5.2] and [BFR25, Lemma 2.9].

For any point $\mathbf x = (x_1, \dots, x_N) \in \mathbb R^{dN}$, recall the definition of $H_i(\mathbf x)$ and $\Lambda_i(\mathbf x)$ given in (1.4). Then, given $T_2, \dots, T_N$ optimal for the multimarginal problem ($\mathcal C_p$-MM), we define

$$H_i^1(x_1) := H_i \circ (\mathrm{Id}, T_2, \dots, T_N)(x_1) \qquad \text{and} \qquad \Lambda_i^1(x_1) := \Lambda_i \circ (\mathrm{Id}, T_2, \dots, T_N)(x_1) .$$

Notice that, consistently with (A.2),

$$H_i^1(x_1) \ge 0 \ \ (\text{and thus } \Lambda_i^1(x_1) \ge 0) , \quad \text{for every } x_1 \in \mathbb R^d ,$$
$$H_i^1(x_1) = 0 \ \ (\text{and thus } \Lambda_i^1(x_1) = 0) \iff T_i(x_1) = \mathrm{bar}_p(x_1) .$$
Corollary 4.3 (Local regularity of $\mathrm{bar}_p$). Let $S \subset \{1, \dots, N\}$ with $S \notin \mathcal F_1$. Then for every $x_1 \in D_S^1 \cap \operatorname{spt} \mu_1$, there exists $r(x_1) > 0$ such that, for every $y_1, \tilde y_1 \in B_{r(x_1)}(x_1) \cap D_S^1 \cap \operatorname{spt} \mu_1$,

$$|\mathrm{bar}_p(y_1) - \mathrm{bar}_p(\tilde y_1)| \ge \frac{1}{2} \, \frac{\min_{i \notin S} \Lambda_i^1(x_1)}{\max_{i \notin S} |H_i^1(x_1)|} \, |y_1 - \tilde y_1| . \tag{4.2}$$

Proof. Let $S \notin \mathcal F_1$ and $x_1 \in \operatorname{spt} \mu_1 \cap D_S^1$. Then the point $\mathbf x := (x_1, T_2(x_1), \dots, T_N(x_1))$ belongs to $D_S \cap \operatorname{spt} \gamma_p$. By optimality, $\operatorname{spt} \gamma_p \subset \mathbb R^{dN}$ is $c_p$-monotone. Hence, one can apply Lemma A.2 for every $\mathbf y \in B_{r(\mathbf x)}(\mathbf x) \cap D_S \cap \operatorname{spt} \gamma_p$, for some $r(\mathbf x) > 0$. As $\mathrm{bar}_p(y_1) = x_p(\mathbf y)$ with $\mathbf y = (y_1, T_2(y_1), \dots, T_N(y_1)) \in D_S \cap \operatorname{spt} \gamma_p$ whenever $y_1 \in D_S^1 \cap \operatorname{spt} \mu_1$, the estimate (4.2) follows from (A.1) by choosing $r(x_1) := r(\mathbf x)$ (so that $B_{r(x_1)}(x_1) \subset \pi_1(B_{r(\mathbf x)}(\mathbf x))$). ■

We recall this general and straightforward result.

Remark 4.4 (Push-forward via an injective map). Let $\mu, \nu \in \mathcal P(\mathbb R^d)$ and $f$ a measurable function such that $\nu = f_\# \mu$. If $f$ is injective, then for every Borel set $A$, $\nu|_{f(A)} = (f|_A)_\# \mu$. Indeed, given a Borel set $B \subset \mathbb R^d$,

$$\nu|_{f(A)}(B) = \nu(B \cap f(A)) \overset{\nu = f_\# \mu}{=} \mu\big( f^{-1}(B \cap f(A)) \big) \overset{(*)}{=} \mu\big( f^{-1}(B) \cap f^{-1}(f(A)) \big) \overset{f \text{ injective}}{=} \mu\big( f^{-1}(B) \cap A \big) = \mu\big( f|_A^{-1}(B) \big) = (f|_A)_\# \mu(B) ,$$

where in $(*)$ we simply used that $f^{-1}(C \cap D) = f^{-1}(C) \cap f^{-1}(D)$ for every pair of sets $C, D \subset \mathbb R^d$.

We are finally ready to prove our general $L^q$-estimate.

Proof of Proposition 1.3. Recall that $\nu_p = (x_p)_\# \gamma_p$, where $\gamma_p$ is optimal for ($\mathcal C_p$-MM), and that $(x_p)_\# \gamma_p = (\mathrm{bar}_p)_\# \mu_1$. Since $\mathrm{bar}_p$ is injective and $\mathcal L^d$-a.e. differentiable on the union of the sets $D_S^1$ with $S \notin \mathcal F_1$ (see Corollary 2.10 and Proposition 4.1), using the properties of the push-forward (cf.
Remark 4.4), we conclude that for every measurable function $\varphi : \mathbb R^d \to [0, +\infty]$,

$$\begin{aligned}
\int \varphi(x_1) f_1(x_1) \,\mathrm d x_1
&= \int_{\bigcup_{S \in \mathcal F_1} D_S^1} \varphi(x_1) f_1(x_1) \,\mathrm d x_1 + \int_{\bigcup_{S \notin \mathcal F_1} D_S^1} \varphi(x_1) f_1(x_1) \,\mathrm d x_1 \\
&= \int_{\mathrm{bar}_p\left( \bigcup_{S \in \mathcal F_1} D_S^1 \right)} \varphi(\mathrm{bar}_p^{-1}(z)) \, g_p(z) \,\mathrm d z + \int_{\mathrm{bar}_p\left( \bigcup_{S \notin \mathcal F_1} D_S^1 \right)} \varphi(\mathrm{bar}_p^{-1}(z)) \, g_p(z) \,\mathrm d z \\
&= \int_{\bigcup_{S \in \mathcal F_1} D_S^1} \varphi(z) \, g_p(z) \,\mathrm d z + \int_{\bigcup_{S \notin \mathcal F_1} D_S^1} \varphi(x_1) \, g_p(\mathrm{bar}_p(x_1)) \, J\mathrm{bar}_p(x_1) \,\mathrm d x_1 ,
\end{aligned}$$

for $J\mathrm{bar}_p(x_1) := |\det \nabla \mathrm{bar}_p(x_1)|$, where at last we used that

i) $\mathrm{bar}_p(x_1) = x_1$ for $x_1 \in D_S^1$ with $S \in \mathcal F_1$;
ii) a standard change of variables formula on a set where $\mathrm{bar}_p$ is a.e. differentiable (see for instance [ABS24, Theorem 7.1, Corollary 7.2]).

As this holds for every function $\varphi$, this shows that

$$f_1(x_1) = \begin{cases} g_p(x_1) & \text{if } x_1 \in \bigcup_{S \in \mathcal F_1} D_S^1 , \\ g_p(\mathrm{bar}_p(x_1)) \, J\mathrm{bar}_p(x_1) & \text{if } x_1 \in \bigcup_{S \notin \mathcal F_1} D_S^1 , \end{cases} \qquad \text{for } \mu_1\text{-a.e. } x_1 \in \mathbb R^d .$$

Therefore, we can compute the $L^q$ norm of $g_p$ as

$$\begin{aligned}
\int |g_p(z)|^q \,\mathrm d z
&= \int_{\bigcup_{S \in \mathcal F_1} D_S^1} |g_p(z)|^q \,\mathrm d z + \int_{\mathrm{bar}_p\left( \bigcup_{S \notin \mathcal F_1} D_S^1 \right)} |g_p(z)|^q \,\mathrm d z \\
&= \int_{\bigcup_{S \in \mathcal F_1} D_S^1} |f_1(z)|^q \,\mathrm d z + \int_{\bigcup_{S \notin \mathcal F_1} D_S^1} |g_p(\mathrm{bar}_p(x_1))|^q \, J\mathrm{bar}_p(x_1) \,\mathrm d x_1 \\
&= \int_{\bigcup_{S \in \mathcal F_1} D_S^1} |f_1(z)|^q \,\mathrm d z + \int_{\bigcup_{S \notin \mathcal F_1} D_S^1} \frac{|f_1(x_1)|^q}{J\mathrm{bar}_p(x_1)^{q-1}} \,\mathrm d x_1 \\
&\le \int_{\bigcup_{S \in \mathcal F_1} D_S^1} |f_1(z)|^q \,\mathrm d z + \frac{1}{2} \int_{\bigcup_{S \notin \mathcal F_1} D_S^1} \bigg( \frac{\max_{i \notin S} |H_i^1(x_1)|}{\min_{i \notin S} \Lambda_i^1(x_1)} \bigg)^{d(q-1)} |f_1(x_1)|^q \,\mathrm d x_1 ,
\end{aligned}$$

where the last inequality is given by Corollary 4.3, which concludes the proof. ■

Remark 4.5 (Alternative proof of integrability in case of distant supports). The estimate provided in Proposition 1.3 holds under the simple assumption that $\mu_1 \ll \mathcal L^d$, but, as mentioned, the right-hand side may be $+\infty$ even if $f_1 \in L^q$.
However, under assumptions (Cpt) and $(hp_2)$, Proposition 1.3 provides an alternative proof of $q$-integrability for $g_p$.$^{15}$ Indeed, for every set of indexes $S$,

$$\max_{x_1 \in \operatorname{spt} \mu_1} \max_{i \notin S} |H_i^1(x_1)| \le \begin{cases} (p-1) \, m^{p-2} , & \text{if } 1 < p < 2 , \\ (p-1) \, M^{p-2} , & \text{if } p \ge 2 , \end{cases}$$

and

$$\min_{x_1 \in \operatorname{spt} \mu_1} \min_{i \notin S} \Lambda_i^1(x_1) \ge \begin{cases} \big( \min_{i \notin S} \lambda_i \big) \dfrac{M^{2(p-2)}}{(p-1) \, m^{p-2}} = \big( \min_{i \notin S} \lambda_i \big) \dfrac{m^{2-p}}{(p-1) \, M^{2(2-p)}} , & \text{if } 1 < p < 2 , \\[2ex] \big( \min_{i \notin S} \lambda_i \big) \dfrac{m^{2(p-2)}}{(p-1) \, M^{p-2}} , & \text{if } p \ge 2 . \end{cases}$$

Exploiting these estimates in (3.6) of Proposition 1.3, under the assumption $(hp_2)$, we have

$$\|g_p\|_{L^q(\mathbb R^d)} \le \begin{cases} \bigg( 1 \vee \dfrac{(p-1)^2 M^{2(2-p)}}{(\min_i \lambda_i) \, m^{2(2-p)}} \bigg)^{d/q'} \|f_1\|_{L^q(\mathbb R^d)} , & \text{if } 1 < p < 2 , \\[2ex] \bigg( 1 \vee \dfrac{(p-1)^2 M^{2(p-2)}}{(\min_i \lambda_i) \, m^{2(p-2)}} \bigg)^{d/q'} \|f_1\|_{L^q(\mathbb R^d)} , & \text{if } p \ge 2 , \end{cases}$$

which provides integrability, in a similar spirit as Theorem 1.1.

$^{15}$ Notice that this is a slightly weaker result than Theorem 1.1, as the stronger assumption $(hp_2)$ is required for every $1 < p < \infty$.

5. Optimal maps for $p$-Wasserstein distance and barycenters via affine transformations

In this final section of our work, we first discuss optimal maps for the $p$-Wasserstein distance between two measures, the second being the push-forward of the first through an affine transformation. Accordingly, we then explicitly compute barycenters in the $p$-Wasserstein space for a special class of affine transformations.

5.1. Optimality of affine transformations for the $p$-Wasserstein distance. The first step of this chapter is to discuss the optimality of affine transformations for the $p$-Wasserstein distance. Recall that for $p = 2$, and any $\mu \ll \mathcal L^d$, any $T = \nabla u$ with $u$ convex is optimal between $\mu$ and its image $T_\# \mu$.
In particular, a map of the form $T(x) = Ax + b$, for some $A \in \mathbb M(d)$ ($d \times d$ matrices with real entries) and $b \in \mathbb R^d$, is indeed optimal whenever $A$ is symmetric and $A \ge 0$, as it is the gradient of the convex function $u(x) = \frac{1}{2} \langle Ax, x \rangle + \langle b, x \rangle$. In this chapter, we are going to discuss optimality for $W_p$ for general $1 < p < \infty$. We show in particular that $A \ge 0$ and symmetric does not suffice to guarantee optimality of $T$, and much more restrictive conditions on $A$ must be imposed.

First of all, classical duality arguments show that, given $\mu \in \mathcal P_p(\mathbb R^d)$ so that $\mu \ll \mathcal L^d$ and any other $\nu \in \mathcal P_p(\mathbb R^d)$, there exists a unique optimal map $T$ from $\mu$ to $\nu$, which is of the form

$$T(x) = x - \frac{\nabla \varphi(x)}{|\nabla \varphi(x)|^{\alpha_p}} , \tag{5.1}$$

where $\alpha_p = \frac{p-2}{p-1} \in (-\infty, 1)$ (with $\alpha_p \in (0, 1)$ precisely when $p > 2$) and $\varphi$ is a $p$-concave map, which means that the double $p$-transform coincides with the function itself, i.e. $(\varphi^p)^p = \varphi$, where for a proper function $\varphi$ (somewhere finite) the $p$-transform is defined by

$$\varphi^p(y) = \inf_{x \in \mathbb R^d} \Big\{ \frac{1}{p} |x - y|^p - \varphi(x) \Big\} \in [-\infty, +\infty) , \qquad \forall y \in \mathbb R^d .$$

Note indeed that for $p = 2$ the map is of the form

$$T(x) = x - \nabla \varphi(x) = \nabla \Big( \frac{1}{2} |x|^2 - \varphi(x) \Big) ,$$

where now the $2$-concavity of $\varphi$ simply implies that $x \mapsto \frac{1}{2} |x|^2 - \varphi(x)$ is indeed a convex function. The picture significantly changes when $p \ne 2$, as our next theorem shows. In what follows, for $k \in \mathbb N$, we denote by $\mathrm{Id}_k \in \mathbb M(k)$ the corresponding $k \times k$ identity matrix. For a matrix $A$, we denote by $\sigma(A)$ the spectrum of $A$.

Theorem 5.1 (Optimality of affine maps for $p \ne 2$). Let $A \in \mathbb M(d)$ be symmetric and $b \in \mathbb R^d$, and consider the map $T : \mathbb R^d \to \mathbb R^d$ given by $T(x) = Ax + b$, for $x \in \mathbb R^d$. Then $T$ is an optimal map between any measure $\mu \in \mathcal P_p(\mathbb R^d)$ and its image $T_\# \mu$ if and only if $\sigma(A) \subset \{1, \zeta\}$, for some $\zeta \ge 0$. In particular, it means that, up to orthogonal transformations, the matrix $A$ is of the form

$$A = \begin{pmatrix} \mathrm{Id}_k & 0 \\ 0 & \zeta \, \mathrm{Id}_{d-k} \end{pmatrix} \tag{5.2}$$

for some $k \in \{0, \dots, d\}$ and $\zeta \in [0, +\infty)$.
Cases include

(1) $A = \mathrm{Id}_d$. In this case $T(x) = x + b$ is simply a translation, and it is always optimal between any measure $\mu \in \mathcal P_p$ and its translation, for every $b \in \mathbb R^d$.
(2) $A = 0$ (i.e. $k = 0$ and $\zeta = 0$). In this case $T(x) = b$ is constant, and clearly optimal between $\mu$ and $\delta_b$, for every $\mu \in \mathcal P_p(\mathbb R^d)$.
(3) $A = \zeta \, \mathrm{Id}_d$ for $\zeta \in (0, +\infty)$. In this case $T(x) = \zeta x + b$ consists of a translation and a dilation.
(4) Intermediate cases: $k \notin \{0, d\}$. In this case, the map $T$ is, up to a translation, leaving invariant a subspace of dimension $k$ and (nonnegatively) dilating its orthogonal subspace of dimension $d - k$.

Observe that the optimality of the translations (case (1)) can also be proved directly as a consequence of Jensen's inequality: indeed, let $T(x) = x + b$ and $\mu \in \mathcal P_p(\mathbb R^d)$. Then for every admissible coupling $\pi \in \mathcal P(\mathbb R^d \times \mathbb R^d)$ between $\mu$ and $\nu = T_\# \mu$, by Jensen's inequality we have that

$$\begin{aligned}
\int_{\mathbb R^d \times \mathbb R^d} |x - y|^p \,\mathrm d \pi(x, y)
&\ge \Big| \int (x - y) \,\mathrm d \pi(x, y) \Big|^p = \Big| \int x \,\mathrm d \mu(x) - \int y \,\mathrm d \nu(y) \Big|^p \\
&= \Big| \int x \,\mathrm d \mu(x) - \int (x + b) \,\mathrm d \mu(x) \Big|^p = |b|^p = \int_{\mathbb R^d} |x - T(x)|^p \,\mathrm d \mu(x) ,
\end{aligned}$$

which shows the claimed optimality.

Proof. We start by showing that if $T(x) = Ax + b$ is optimal between $\mu \in \mathcal P_p(\mathbb R^d)$ and its image, then $A$ must necessarily satisfy $A \ge 0$ and its spectrum has at most one eigenvalue different from $1$. First of all, without loss of generality we can assume $b = 0$. If not, it is enough to transform $\mu$ via the translation $\tau_b : x \mapsto x + b$ and then work with its image (note that a map $T$ is optimal between $\mu$ and $\nu$ if and only if $\tau_b \circ T \circ \tau_b^{-1}$ is optimal between $(\tau_b)_\# \mu$ and $(\tau_b)_\# \nu$). According to (5.1), $T$ is optimal if and only if

$$\frac{\nabla \varphi(x)}{|\nabla \varphi(x)|^{\alpha_p}} = \overline A x , \qquad \text{where } \overline A := \mathrm{Id}_d - A , \tag{5.3}$$

for some $p$-concave function $\varphi$.
We observe that if $R \in SO(d)$ and $\varphi$ satisfies (5.3), then the function $\phi$ defined by $\phi(x) := \varphi(Rx)$ solves

$$\frac{\nabla \phi(x)}{|\nabla \phi(x)|^{\alpha_p}} = R^T \frac{\nabla \varphi(Rx)}{|\nabla \varphi(Rx)|^{\alpha_p}} = \big( R^T \overline A R \big) x , \qquad \forall x \in \mathbb R^d , \tag{5.4}$$

hence the problem is rephrased in terms of a new matrix $R^T \overline A R$, which has the same spectrum as $\overline A$. Therefore, without loss of generality we can assume $A$ (and hence $\overline A$) to be diagonal. Secondly, we note that (5.3) implies

$$|\nabla \varphi(x)|^{1 - \alpha_p} = |\overline A x| \quad \Longrightarrow \quad \nabla \varphi(x) = |\overline A x|^{\frac{\alpha_p}{1 - \alpha_p}} \, \overline A x = |\overline A x|^{p-2} \, \overline A x , \tag{5.5}$$

and thus, when $p > 2$, $\varphi \in C^2(\mathbb R^d)$, while for $1 < p < 2$, $\varphi \in C^1(\mathbb R^d) \cap C^2(\mathbb R^d \setminus \ker(\overline A))$. Furthermore, this shows that

$$\nabla \varphi(x) \in \operatorname{range}(\overline A) \qquad \forall x \in \mathbb R^d , \tag{5.6}$$

and $\varphi$ is constant in the directions of $\ker(\overline A)$. Since $\overline A$ is diagonal, up to reordering the coordinates either $\ker(\overline A) = \{0\}$ or $\ker(\overline A) = \operatorname{Span}\{e_1, \dots, e_k\}$, for some $k \in \{1, \dots, d\}$. In the second case, property (5.6) shows that $\partial_j \varphi(x) = 0$ for every $j \le k$, hence $\varphi$ does not depend on the first $k$ variables. In other words, there exists $\hat\varphi : \mathbb R^{d-k} \to \mathbb R$ such that

$$\varphi(x) = \hat\varphi(\hat x) , \qquad \forall \hat x \in \mathbb R^{d-k} , \quad x = (x_1, \dots, x_k, \hat x) \in \mathbb R^d . \tag{5.7}$$

Therefore, either $k = d$, i.e. $\overline A = 0$ and $A = \mathrm{Id}$ (case (1)), or $k < d$. In the latter case, if we denote by $\hat A \in \mathbb M(d-k)$ the (now invertible) linear map so that $\overline A x = (0, \hat A \hat x)$, the considerations in (5.4) ensure that $\hat\varphi$ solves

$$\frac{\nabla \hat\varphi(\hat x)}{|\nabla \hat\varphi(\hat x)|^{\alpha_p}} = \hat A \hat x , \qquad \forall \hat x \in \mathbb R^{d-k} .$$

Note that $A$ has at most one eigenvalue different from $1$ if and only if $\overline A$ has at most one eigenvalue different from zero, which is then equivalent to the fact that $\hat A = \lambda \, \mathrm{Id}_{d-k}$ for some $\lambda \in \mathbb R$. Moreover, for $y = (y_1, \dots, y_k, \hat y) \in \mathbb R^d$, $\hat y \in \mathbb R^{d-k}$,

$$\varphi^p(y) = \inf_{x \in \mathbb R^d} \Big\{ \frac{1}{p} |x - y|^p - \varphi(x) \Big\} = \inf_{\hat x \in \mathbb R^{d-k}} \Big\{ \frac{1}{p} |\hat x - \hat y|^p - \hat\varphi(\hat x) \Big\} = \hat\varphi^p(\hat y) . \tag{5.8}$$

It is then clear that $\varphi$ is $p$-concave (in $\mathbb R^d$) if and only if $\hat\varphi$ is $p$-concave (in $\mathbb R^{d-k}$).
This means that without loss of generality we can assume that $\ker(\overline A) = \{0\}$ (hence $\hat\varphi = \varphi$), or else we simply work with $\hat A$. With this restriction, when $1 < p < 2$, $\varphi \in C^2(\mathbb R^d \setminus \{0\})$. By taking a second derivative in (5.3), a simple chain rule shows that

$$\frac{1}{|\nabla \varphi(x)|^{\alpha_p}} \bigg( \nabla^2 \varphi(x) - \alpha_p \, \frac{\nabla \varphi(x)}{|\nabla \varphi(x)|} \otimes \Big( \nabla^2 \varphi(x) \frac{\nabla \varphi(x)}{|\nabla \varphi(x)|} \Big) \bigg) = \overline A , \qquad \forall x \in \mathbb R^d \setminus \{0\} . \tag{5.9}$$

As $\overline A$ is symmetric, and the Hessian of $\varphi$ as well, this implies (recall that $\alpha_p \ne 0$ for $p \ne 2$) that $\nabla \varphi(x) \otimes \big( \nabla^2 \varphi(x) \nabla \varphi(x) \big)$ is symmetric as well. But this rank-one matrix is symmetric if and only if the two factors of the tensor product are parallel to each other, namely there exists a map $\lambda : \mathbb R^d \to \mathbb R$ such that

$$\nabla^2 \varphi(x) \nabla \varphi(x) = \lambda(x) \nabla \varphi(x) , \qquad \forall x \in \mathbb R^d \setminus \{0\} .$$

In other words, the gradient of $\varphi$ is everywhere an eigenvector of the Hessian of $\varphi$. By plugging this new information into (5.9), we conclude that

$$\frac{1}{|\nabla \varphi(x)|^{\alpha_p}} \bigg( \nabla^2 \varphi(x) - \alpha_p \, \lambda(x) \, \frac{\nabla \varphi(x)}{|\nabla \varphi(x)|} \otimes \frac{\nabla \varphi(x)}{|\nabla \varphi(x)|} \bigg) = \overline A , \qquad \forall x \in \mathbb R^d \setminus \{0\} .$$

In particular, for every $x \in \mathbb R^d \setminus \{0\}$, we infer that

$$\overline A \nabla \varphi(x) = \overline\lambda(x) \nabla \varphi(x) , \qquad \overline\lambda(x) = \frac{1}{|\nabla \varphi(x)|^{\alpha_p}} \, \lambda(x) \, (1 - \alpha_p) ,$$

which shows that for every $x \in \mathbb R^d \setminus \{0\}$, $\nabla \varphi(x)$ is an eigenvector of the (constant) matrix $\overline A$. Our claim is that in fact $\overline\lambda(x) \equiv \overline\lambda$ is constant. To see this, observe that (5.5) and $\ker(\overline A) = \{0\}$ ensure that $\nabla \varphi(x) = 0$ only at $x = 0$. In particular, aside from the value $\overline\lambda(0)$, which is not uniquely determined, the remaining ones are uniquely determined by

$$\overline\lambda(x) = \frac{\langle \overline A \nabla \varphi(x), \nabla \varphi(x) \rangle}{|\nabla \varphi(x)|^2} , \qquad \text{for } x \in \mathbb R^d \setminus \{0\} ,$$

and therefore $\overline\lambda$ is continuous on $\mathbb R^d \setminus \{0\}$ (it is in fact $C^1$, as $\varphi \in C^2$). Assume now there exist $x, y \in \mathbb R^d \setminus \{0\}$ such that $\overline\lambda(x) \ne \overline\lambda(y)$. As $\mathbb R^d \setminus \{0\}$ is connected for $d \ge 2$, pick any continuous curve $\gamma_{xy} : [0, 1] \to \mathbb R^d \setminus \{0\}$ from $x = \gamma_{xy}(0)$ to $y = \gamma_{xy}(1)$.
Then the function $\lambda_{xy} : [0, 1] \to \mathbb R$ defined by $\lambda_{xy} := \overline\lambda \circ \gamma_{xy}$ is a continuous curve satisfying $\lambda_{xy}(0) = \overline\lambda(x) \ne \overline\lambda(y) = \lambda_{xy}(1)$. Denote by $t_0 \in [0, 1)$ the largest time so that $\lambda_{xy}(t) = \lambda_{xy}(0)$, or equivalently

$$t_0 := \max \big\{ t \in [0, 1) \;:\; \lambda_{xy}(t) = \lambda_{xy}(0) \big\} .$$

In particular, by continuity and construction we have that

$$\lambda_{xy}(t) \ne \lambda_{xy}(t_0) , \qquad \forall t > t_0 . \tag{5.10}$$

Now, the spectral theorem for symmetric matrices ensures that eigenvectors corresponding to different eigenvalues are necessarily orthogonal. As $\nabla \varphi(x)$ is an eigenvector of $\overline A$ with eigenvalue $\overline\lambda(x)$, (5.10) would imply, for $t > t_0$,

$$0 = \big\langle \nabla \varphi(\gamma_{xy}(t_0)), \nabla \varphi(\gamma_{xy}(t)) \big\rangle \xrightarrow[t \to t_0]{} \big\langle \nabla \varphi(\gamma_{xy}(t_0)), \nabla \varphi(\gamma_{xy}(t_0)) \big\rangle = \big| \nabla \varphi(\gamma_{xy}(t_0)) \big|^2 \ne 0 ,$$

as $\gamma_{xy}(t_0) \in \mathbb R^d \setminus \{0\}$, which is clearly a contradiction. We thus showed that

$$\overline A \nabla \varphi(x) = \overline\lambda \nabla \varphi(x) , \qquad \forall x \in \mathbb R^d , \quad \text{for some } \overline\lambda \in \mathbb R .$$

In order to conclude the proof that $\overline A = \overline\lambda \, \mathrm{Id}$, we shall prove that the span of $\{ \nabla \varphi(x) : x \in \mathbb R^d \}$ is the whole space $\mathbb R^d$. Indeed, pick a $v \in \mathbb R^d$ such that $\langle v, \nabla \varphi(x) \rangle = 0$ for every $x \in \mathbb R^d$. Then from (5.5) we infer that

$$0 = \langle \nabla \varphi(x), v \rangle = |\overline A x|^{p-2} \langle \overline A x, v \rangle = |\overline A x|^{p-2} \langle x, \overline A v \rangle , \qquad \forall x \in \mathbb R^d ,$$

as $\overline A$ is symmetric. As $\ker(\overline A) = \{0\}$, we infer that $\overline A v = 0$, and therefore that $v = 0$. In the notation introduced before (5.8), this means that $\hat A = \lambda \, \mathrm{Id}_{d-k}$ with $\lambda = \overline\lambda$.

We show now that $A \ge 0$ or, equivalently, that $\lambda \in (-\infty, 1]$. By (5.5) we infer that $\hat\varphi(\hat x) = \frac{1}{p} \lambda |\lambda|^{p-2} |\hat x|^p$, hence, if $\lambda > 1$, computing its $p$-transform,

$$\hat\varphi^p(\hat y) = \frac{1}{p} \inf_{\hat x \in \mathbb R^{d-k}} \big\{ |\hat x - \hat y|^p - \lambda^{p-1} |\hat x|^p \big\} = -\infty , \qquad \text{for all } \hat y ,$$

which shows that there cannot exist any $p$-concave potential satisfying (5.3). Recall indeed that from (5.8) we have that $\varphi$ is $p$-concave if and only if $\hat\varphi$ is. This concludes the proof that $A$ must necessarily satisfy the claimed conditions.
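The last step of the necessity argument (the $p$-transform being $-\infty$ when $\lambda > 1$) can be illustrated numerically. The following is only a minimal sketch in dimension one, not part of the original argument: the function names `objective` and `values_along_ray` and the sample values $p = 3$, $\lambda = 1.5$, $\hat y = 1$ are assumptions chosen for illustration. It evaluates $|\hat x - \hat y|^p - \lambda^{p-1} |\hat x|^p$ along $\hat x = 10, 10^2, \dots, 10^6$ and shows that the values decrease without bound.

```python
def objective(x, y, lam, p):
    """One-dimensional objective |x - y|^p - lam^(p-1) * |x|^p (here lam > 1)."""
    return abs(x - y) ** p - lam ** (p - 1) * abs(x) ** p


def values_along_ray(y, lam, p, exponents=range(1, 7)):
    """Evaluate the objective at x = 10^1, 10^2, ..., 10^6 (hypothetical sample points)."""
    return [objective(10.0 ** k, y, lam, p) for k in exponents]


# Illustrative parameters (assumptions): p = 3, lam = 1.5 > 1, y = 1.
values = values_along_ray(y=1.0, lam=1.5, p=3.0)
for k, v in enumerate(values, start=1):
    print(f"x = 1e{k}: objective = {v:.3e}")
```

Since $\lambda^{p-1} > 1$, the negative term dominates for large $|\hat x|$, which is the mechanism behind the infimum being $-\infty$.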
We are left to show that this is in fact sufficient, namely that if $A \ge 0$ and its spectrum has at most one eigenvalue different from $1$, then $T(x) = Ax + b$ is optimal between any $\mu \in \mathcal P_p(\mathbb R^d)$ and its image $T_\# \mu$. To do that, we are going to explicitly construct a $p$-concave potential $\varphi$ satisfying (5.3). Without loss of generality we can assume $b = 0$ and directly work in the system of coordinates which makes $A$ diagonal, namely we assume that $A$ has the form (5.2) for some $k \in \{0, \dots, d\}$ and $\zeta \in [0, +\infty)$. Following the same steps as in the proof above, it suffices to construct a function $\hat\varphi : \mathbb R^{d-k} \to \mathbb R$ such that

$$\frac{\nabla \hat\varphi(\hat x)}{|\nabla \hat\varphi(\hat x)|^{\alpha_p}} = (1 - \zeta) \hat x , \qquad \forall \hat x \in \mathbb R^{d-k} ,$$

and subsequently define $\varphi$ as in (5.7). Recall indeed that from (5.8) we have that $\varphi$ is $p$-concave if and only if $\hat\varphi$ is. Set $\lambda := (1 - \zeta) \in (-\infty, 1]$; then (5.5) reads

$$\nabla \hat\varphi(\hat x) = \lambda |\lambda|^{p-2} |\hat x|^{p-2} \hat x ,$$

which (up to additive constants) implies that

$$\hat\varphi(\hat x) = \frac{1}{p} \lambda |\lambda|^{p-2} |\hat x|^p , \qquad \forall \hat x \in \mathbb R^{d-k} .$$

The final step is to prove that for any $\lambda \in (-\infty, 1]$, the map $\hat\varphi$ is indeed $p$-concave.

Case 1: $\lambda \le 0$. In this case, we have that $\hat\varphi(\hat x) = -\frac{1}{p} |\lambda|^{p-1} |\hat x|^p$, hence computing its $p$-transform

$$\hat\varphi^p(\hat y) = \frac{1}{p} \inf_{\hat x \in \mathbb R^{d-k}} \big\{ |\hat x - \hat y|^p + |\lambda|^{p-1} |\hat x|^p \big\} ,$$

whose minimiser is nothing but a (suitably weighted) Euclidean $p$-barycenter between $0$ and $\hat y$, and is therefore explicitly computable. Indeed, it is the unique solution of the associated Euler–Lagrange equation

$$|\hat x - \hat y|^{p-2} (\hat x - \hat y) + |\lambda|^{p-1} |\hat x|^{p-2} \hat x = 0 ,$$

which is given by $\hat x = \frac{1}{1 + |\lambda|} \hat y$. This implies that

$$\hat\varphi^p(\hat y) = \frac{1}{p} \, g_p(\lambda) \, |\hat y|^p , \qquad g_p(\lambda) := \Big( \frac{|\lambda|}{1 + |\lambda|} \Big)^{p-1} \in [0, 1) .$$

The second $p$-transform is then given by

$$(\hat\varphi^p)^p(\hat x) = \frac{1}{p} \inf_{\hat y \in \mathbb R^{d-k}} \big\{ |\hat x - \hat y|^p - g_p(\lambda) |\hat y|^p \big\} .$$
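The closed form obtained above for the first $p$-transform in Case 1 can be sanity-checked numerically. The sketch below is only an illustration in dimension one under stated assumptions: the helper names `h` and `ternary_argmin`, the search interval, and the sample values $p = 3$, $\lambda = -2$, $\hat y = 1.5$ are hypothetical choices, and ternary search is used simply because the objective is convex for $\lambda \le 0$. It compares the numerical minimiser with $\hat y / (1 + |\lambda|)$ and the numerical minimum with $\frac{1}{p} g_p(\lambda) |\hat y|^p$.

```python
def h(x, y, lam, p):
    """Objective |x - y|^p + |lam|^(p-1) * |x|^p of the first p-transform (lam <= 0)."""
    return abs(x - y) ** p + abs(lam) ** (p - 1) * abs(x) ** p


def ternary_argmin(f, lo, hi, iters=200):
    """Approximate the minimiser of a convex function f on [lo, hi] by ternary search."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if f(a) < f(b):
            hi = b  # the minimiser lies in [lo, b]
        else:
            lo = a  # the minimiser lies in [a, hi]
    return 0.5 * (lo + hi)


# Illustrative parameters (assumptions): p = 3, lam = -2, y = 1.5.
p, lam, y = 3.0, -2.0, 1.5

x_num = ternary_argmin(lambda x: h(x, y, lam, p), -abs(y) - 1.0, abs(y) + 1.0)
x_formula = y / (1.0 + abs(lam))                 # claimed minimiser y / (1 + |lam|)

g = (abs(lam) / (1.0 + abs(lam))) ** (p - 1)     # g_p(lam)
transform_num = h(x_num, y, lam, p) / p          # numerical value of the p-transform
transform_formula = g * abs(y) ** p / p          # claimed value (1/p) g_p(lam) |y|^p

print(x_num, x_formula)
print(transform_num, transform_formula)
```

The same comparison can be repeated for other $\lambda \le 0$ and $p > 1$; it is not a proof, only a consistency check of the formulas.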
As $g_p(\lambda) \in [0, 1)$, it follows that, for every $\hat x \in \mathbb R^{d-k}$, the function $\hat y \mapsto |\hat x - \hat y|^p - g_p(\lambda) |\hat y|^p$ is coercive and smooth, and therefore admits a minimizer. The corresponding Euler–Lagrange equation reads as

$$|\hat y - \hat x|^{p-2} (\hat y - \hat x) = g_p(\lambda) |\hat y|^{p-2} \hat y ,$$

which in particular implies

$$|\hat y - \hat x|^{p-1} = g_p(\lambda) |\hat y|^{p-1} \iff |\hat y - \hat x| = \frac{|\lambda|}{1 + |\lambda|} |\hat y| .$$

Substituting this back into the previous optimality conditions, we explicitly find

$$\hat y - \hat x = g_p(\lambda)^{1 - \alpha_p} \hat y = \frac{|\lambda|}{1 + |\lambda|} \hat y \iff \hat y = \big( 1 + |\lambda| \big) \hat x ,$$

which is the sought unique minimiser. By computing the corresponding minimal value, we conclude that

$$(\hat\varphi^p)^p(\hat x) = \frac{1}{p} \Big( |\lambda|^p - g_p(\lambda) (1 + |\lambda|)^p \Big) |\hat x|^p = \frac{1}{p} \Big( |\lambda|^p - |\lambda|^{p-1} (1 + |\lambda|) \Big) |\hat x|^p = -\frac{1}{p} |\lambda|^{p-1} |\hat x|^p = \hat\varphi(\hat x) ,$$

for every $\hat x \in \mathbb R^{d-k}$, which proves the claimed $p$-concavity.

Case 2: $\lambda \in (0, 1)$. In this case, we have that $\hat\varphi(\hat x) = \frac{1}{p} \lambda^{p-1} |\hat x|^p$. Alternatively, we can write

$$\hat\varphi(\hat x) = \frac{1}{p} \, g_p(\bar\lambda) \, |\hat x|^p , \qquad \text{where } \bar\lambda \le 0 \text{ is such that } g_p(\bar\lambda) = \lambda^{p-1} \iff \bar\lambda = \frac{\lambda}{\lambda - 1} .$$

But in the proof of Case 1, we have shown that $\hat\varphi = \hat\psi^p$, for $\hat\psi(\hat y) = -\frac{1}{p} |\bar\lambda|^{p-1} |\hat y|^p$, which is then $p$-concave. Therefore $\hat\varphi$ is also $p$-concave.

Case 3: $\lambda = 1$. This corresponds to the degenerate case, corresponding to the matrix $A$ having a nontrivial kernel, and thus $T$ being noninjective. Clearly

$$\hat\varphi^p(\hat y) = \frac{1}{p} \inf_{\hat x \in \mathbb R^{d-k}} \big\{ |\hat y - \hat x|^p - |\hat x|^p \big\} = \begin{cases} 0 & \text{if } \hat y = 0 , \\ -\infty & \text{otherwise} . \end{cases}$$

Taking a second $p$-transform, we find

$$(\hat\varphi^p)^p(\hat x) = \inf_{\hat y \in \mathbb R^{d-k}} \Big\{ \frac{1}{p} |\hat x - \hat y|^p - \hat\varphi^p(\hat y) \Big\} = \frac{1}{p} |\hat x|^p = \hat\varphi(\hat x) ,$$

for every $\hat x \in \mathbb R^{d-k}$, which concludes the proof of the sought $p$-concavity. ■

5.2. Barycenters of measures under affine transformations.
Using the characterization of optimal affine maps for $W_p$ provided in Theorem 5.1, we now describe the $p$-Wasserstein barycenters between measures which are obtained via suitable affine transformations of a common, absolutely continuous measure $\mu \in \mathcal P(\mathbb R^d)$.

Proposition 5.2 (Barycenters under affine transformations). Let $\mu \ll \mathcal L^d$ be a reference probability measure. Let us consider probability measures of the form $\mu_i = \big( A_i \cdot + v_i \big)_\# \mu$, in one of two cases: either $A_i = \mathrm{Id}$ for every $i = 1, \dots, N$, or $v_i = v \in \mathbb R^d$ for every $i = 1, \dots, N$, and in the second case,

• $A_1$ is invertible.
• $A_i$ are symmetric, positive semidefinite, $d \times d$ matrices.
• The spectrum of the matrices satisfies $\sigma(A_i) \subset \{1, \zeta_i\}$ for some $\zeta_i \in [0, +\infty)$.
• The matrices commute, i.e. $[A_i, A_j] = 0$ for every $i, j = 1, \dots, N$, and the eigenspaces associated to the eigenvalue $1$ of every $A_j \ne \mathrm{Id}$ coincide.

Then the $p$-Wasserstein barycenter between $\mu_1, \dots, \mu_N$ is given by

$$\nu_p = \big( A \cdot + v \big)_\# \mu , \qquad A := x_p(A_1, \dots, A_N) \quad \text{and} \quad v := x_p(v_1, \dots, v_N) ,$$

where by the barycenter of matrices we mean

$$x_p(A_1, \dots, A_N) := \operatorname*{argmin}_{B} \sum_{i=1}^N \lambda_i \big| A_i - B \big|^p , \qquad |A| := \operatorname{Tr}(A^T A)^{\frac{1}{2}} = \Big( \sum_{i, j = 1}^d A_{ij}^2 \Big)^{\frac{1}{2}} , \quad \text{for every } A .$$

Proof. Thanks to our assumptions, without loss of generality, we can assume the matrices $A_i$ to be already in diagonal form. Indeed, let $R$ be an orthogonal matrix which simultaneously diagonalises the commuting symmetric matrices $A_i$, and set $D_i := R A_i R^T$. For every matrix $B$, if we set $B' := R B R^T$, then

$$\sum_{i=1}^N \lambda_i \big| R^T D_i R - B \big|^p = \sum_{i=1}^N \lambda_i \big| R^T (D_i - B') R \big|^p = \sum_{i=1}^N \lambda_i \big| D_i - B' \big|^p ,$$

which in particular ensures that $x_p(A_1, \dots, A_N) = R^T x_p(D_1, \dots, D_N) R$. Therefore we can assume $A_i = D_i$ being of the form

$$A_i = \begin{pmatrix} \mathrm{Id}_k & 0 \\ 0 & \zeta_i \, \mathrm{Id}_{d-k} \end{pmatrix}$$

for some common $k \in \{0, \dots, d\}$. A direct consequence of this assumption is that the barycenter of these matrices is also of the same form.
Indeed,

$$\sum_{i=1}^N \lambda_i \big| A_i - B \big|^p = \sum_{i=1}^N \lambda_i \Big( \sum_{m, l = 1}^d |(A_i)_{ml} - B_{ml}|^2 \Big)^{\frac{p}{2}} = \sum_{i=1}^N \lambda_i \Big( \sum_{l=1}^d |(A_i)_{ll} - B_{ll}|^2 + \sum_{m \ne l} |B_{ml}|^2 \Big)^{\frac{p}{2}} .$$

This shows that the optimizers $B$ in the definition of $x_p(A_1, \dots, A_N)$ are also diagonal (i.e. $B_{jk} = 0$ for every $j \ne k$). For $b \in \mathbb R^d$, we denote by $B = b \cdot \mathrm{Id}$ the corresponding diagonal matrix so that $B_{kk} = b_k$. Vice versa, for a given matrix $B$, we denote by $\operatorname{diag}(B)$ the corresponding diagonal vector $(B_{11}, \dots, B_{dd}) \in \mathbb R^d$. With this notation at hand, from the above considerations we conclude that

$$x_p(A_1, \dots, A_N) = \operatorname*{argmin}_{B = b \cdot \mathrm{Id} , \; b \in \mathbb R^d} \sum_{i=1}^N \lambda_i \Big( \sum_{l=1}^d |(A_i)_{ll} - b_l|^2 \Big)^{\frac{p}{2}} = \xi \cdot \mathrm{Id} , \tag{5.11}$$

where $\xi \in \mathbb R^d$ is given by

$$\xi := x_p\big( \operatorname{diag}(A_1), \dots, \operatorname{diag}(A_N) \big) = \big( \underbrace{1, \dots, 1}_{k} , \underbrace{\zeta, \dots, \zeta}_{d-k} \big) \in \mathbb R^d , \qquad \zeta = x_p(\zeta_1, \dots, \zeta_N) \ge 0 .$$

Here the latter equality follows from similar considerations as above, as

$$x_p\big( \operatorname{diag}(A_1), \dots, \operatorname{diag}(A_N) \big) = \operatorname*{argmin}_{b \in \mathbb R^d} \sum_{i=1}^N \lambda_i \Big( \sum_{l=1}^k |b_l - 1|^2 + \sum_{l=k+1}^d |b_l - \zeta_i|^2 \Big)^{\frac{p}{2}} ,$$

and thus minimisers satisfy $b_l = 1$ for every $l \in \{1, \dots, k\}$. Another consequence of (5.11) is that the invertibility of $A_1$, together with the fact that the $A_i$ are all nonnegative definite, ensures that $A := x_p(A_1, \dots, A_N)$ is invertible as well, as the $p$-barycenter of $N$ nonnegative numbers of which at least one is strictly positive is strictly positive as well, hence the triviality of the kernel of $A$.

Let us denote by $R_i$ the affine transformations given by

$$R_i z := A_i A^{-1} z + w_i , \qquad w_i := v_i - A_i A^{-1} v \in \mathbb R^d , \qquad \forall z \in \mathbb R^d , \tag{5.12}$$

which is nothing but the composition of the map $A_i \cdot + v_i$ with the inverse of $A \cdot + v$. Note that thanks to (5.11), the maps $R_i$ satisfy the assumptions of Theorem 5.1, and thus are optimal for $W_p(\nu, \mu_i)$.
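The structure of the matrix barycenter in (5.11) can be checked numerically on a small instance. The following is only a sketch under stated assumptions: the helper names `bary_1d` and `gradient`, the bisection-based solver, and the sample data ($d = 3$, $k = 1$, $p = 3$, weights $(0.2, 0.3, 0.5)$, eigenvalues $\zeta_i \in \{0.5, 2, 3\}$) are hypothetical and not taken from the paper. Since the functional in (5.11) is convex in $b$, a vanishing gradient at $\xi = (1, \zeta, \zeta)$ with $\zeta = x_p(\zeta_1, \dots, \zeta_N)$ is consistent with $\xi$ being the minimiser.

```python
import math


def bary_1d(points, weights, p, iters=200):
    """One-dimensional p-barycenter: the zero of the increasing map
    m -> sum_i w_i |m - a_i|^(p-1) sign(m - a_i), found by bisection."""
    lo, hi = min(points), max(points)

    def deriv(m):
        total = 0.0
        for a, w in zip(points, weights):
            diff = m - a
            if diff != 0.0:
                total += w * abs(diff) ** (p - 1) * math.copysign(1.0, diff)
        return total

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if deriv(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def gradient(b, diagonals, weights, p):
    """Gradient in b of sum_i w_i (sum_l |a_il - b_l|^2)^(p/2), for p > 1."""
    grad = [0.0] * len(b)
    for diag, w in zip(diagonals, weights):
        norm = math.sqrt(sum((bl - al) ** 2 for bl, al in zip(b, diag)))
        if norm == 0.0:
            continue  # the gradient of |.|^p vanishes at the origin when p > 1
        for l, (bl, al) in enumerate(zip(b, diag)):
            grad[l] += w * p * norm ** (p - 2) * (bl - al)
    return grad


# Illustrative data (assumptions): d = 3, k = 1, p = 3.
p, k, d = 3.0, 1, 3
weights = [0.2, 0.3, 0.5]
zetas = [0.5, 2.0, 3.0]
diagonals = [[1.0] * k + [z] * (d - k) for z in zetas]   # diag(A_i) = (1, zeta_i, zeta_i)

zeta = bary_1d(zetas, weights, p)                        # zeta = x_p(zeta_1, ..., zeta_N)
xi = [1.0] * k + [zeta] * (d - k)                        # candidate minimiser from (5.11)
grad_at_xi = gradient(xi, diagonals, weights, p)
print(zeta, grad_at_xi)
```

The check also illustrates the remark on invertibility: the computed $\zeta$ lies between the smallest and largest $\zeta_i$, so it is strictly positive as soon as one of them is.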
Furthermore, as shown in the very proof of Theorem 5.1, we know that there exist Kantorovich potentials $\varphi_i$ (thus optimal for $\frac{1}{p} W_p^p(\nu, \mu_i)$) so that

$$R_i(z) = z - \frac{\nabla \varphi_i(z)}{|\nabla \varphi_i(z)|^{\alpha_p}} , \qquad \forall z \in \mathbb R^d . \tag{5.13}$$

We prove the optimality of $\nu$ using Proposition 2.8. Indeed, the fact that (5.13) holds globally on the whole $\mathbb R^d$ (whereas typically such conditions only hold on the support of the barycenter) allows for the proof of

$$x_p \circ (R_1, \dots, R_N)(z) = z , \qquad \forall z \in \mathbb R^d . \tag{5.14}$$

Using (5.14) and (2.11), one can show that

$$\sum_{i=1}^N \varphi_i(z) = C \in \mathbb R , \qquad \forall z \in \mathbb R^d ,$$

and conclude that $\nu$ is the $p$-Wasserstein barycenter between $\mu_1, \dots, \mu_N$ by Proposition 2.8.

Proof of (5.14). Recalling the definition of $R_i$ as composition of two affine maps, the sought equality is equivalent to

$$x_p(A_1 z + v_1, \dots, A_N z + v_N) = A z + v , \qquad \forall z \in \mathbb R^d . \tag{5.15}$$

In other words, we have to prove that

$$\sum_{i=1}^N \lambda_i \big| A_i z + v_i - (A z + v) \big|^{p-2} \big( A_i z + v_i - (A z + v) \big) = 0 , \qquad \forall z \in \mathbb R^d . \tag{5.16}$$

This is precisely where the structure assumptions on $A_i$ and $v_i$ come into play. In the first case, where we have no dilation and $A_i = \mathrm{Id}$ for every $i = 1, \dots, N$ (and therefore $A = \mathrm{Id}$), (5.16) reduces to the very definition of $v = x_p(v_1, \dots, v_N)$, whence the claimed equality.

Let us show the validity of (5.16) when the shift is constant, i.e. when $v_j = v \in \mathbb R^d$ for every $j = 1, \dots, N$. As the barycenter of vectors which are shifted by the same vector $v$ is the shifted (by $v$) barycenter of the original vectors, in this case (5.16) becomes

$$\sum_{i=1}^N \lambda_i \big| A_i z - A z \big|^{p-2} \big( A_i z - A z \big) = 0 , \qquad \forall z \in \mathbb R^d .$$

As the matrices $A_1, \dots, A_N$ commute, we have already observed that we can work in coordinates which make the matrices diagonal, and where by (5.11) we have that

$$x_p(A_1, \dots, A_N) = \xi \cdot \mathrm{Id} , \qquad \xi = \big( \underbrace{1, \dots, 1}_{k} , \underbrace{\zeta, \dots, \zeta}_{d-k} \big) \in \mathbb R^d , \qquad \zeta = x_p(\zeta_1, \dots, \zeta_N) , \tag{5.17}$$

where we recall that $\zeta_j$ is the only eigenvalue of $A_j$ possibly different from one. In particular, when writing $z = (z_1, \dots, z_k, \tilde z)$ with $\tilde z \in \mathbb R^{d-k}$, by (5.17) the validity of (5.16) in this case is equivalent to

$$0 = \sum_{i=1}^N \lambda_i \big| \zeta_i \tilde z - \zeta \tilde z \big|^{p-2} \big( \zeta_i \tilde z - \zeta \tilde z \big) = \Big( \sum_{i=1}^N \lambda_i \big| \zeta_i - \zeta \big|^{p-2} \big( \zeta_i - \zeta \big) \Big) |\tilde z|^{p-2} \tilde z , \qquad \forall \tilde z \in \mathbb R^{d-k} ,$$

or, in other words,

$$\sum_{i=1}^N \lambda_i \big| \zeta_i - \zeta \big|^{p-2} \big( \zeta_i - \zeta \big) = 0 .$$

This is precisely the condition arising from $\zeta = x_p(\zeta_1, \dots, \zeta_N)$. ■

Remark 5.3 (What goes wrong with general affine transformations). The structural assumptions of the previous proposition are two-fold. On one side, the fact that the matrices must have at most one eigenvalue different from one is intrinsically related to the fact that such a property characterises the optimality of the associated transport maps given by (5.12), as described in Theorem 5.1. The commutation property between the matrices $A_i$ is what ensures that such a property is also preserved for $A$, and therefore for the composition in (5.12). On the other hand, the need for having either pure translations or affine maps with the same shift is a consequence of the lack of linearity of the Euclidean $p$-barycenter. Once working in coordinates which make each $A_i$ in diagonal form, it is clear that proving (5.15) amounts to showing that ($x_p$ on the right-hand side below is meant as a one-dimensional barycenter)

$$\big( x_p(\xi_1 \tilde z + v_1, \dots, \xi_N \tilde z + v_N) \big)_l = x_p(\xi_1, \dots, \xi_N) \, \tilde z_l + x_p(v_1^l, \dots, v_N^l) ,$$

for every $\tilde z \in \mathbb R^{d-k}$ and $l = 1, \dots, d - k$. This is not true when $p \ne 2$, while it is simply a consequence of the linearity of the arithmetic mean for the case $p = 2$. It would be interesting to understand what the $W_p$-barycenters would be in the case of genuinely affine transformations, which unfortunately goes beyond the reach of this work.
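The failure of linearity of the one-dimensional $p$-barycenter mentioned in Remark 5.3 can be seen on a tiny example. The sketch below is an illustration only, under stated assumptions: the helper names `bary_1d` and `linearity_gap`, the bisection solver, and the data ($N = 3$, equal weights, $\xi = (1, 1, 2)$, $v = (0, 1, 0)$, $\tilde z = 1$) are hypothetical choices. It compares $x_p(\xi_i \tilde z + v_i)$ with $x_p(\xi_i)\,\tilde z + x_p(v_i)$ for $p = 2$ and $p = 4$.

```python
import math


def bary_1d(points, weights, p, iters=200):
    """One-dimensional p-barycenter: the zero of the increasing map
    m -> sum_i w_i |m - a_i|^(p-1) sign(m - a_i), found by bisection."""
    lo, hi = min(points), max(points)

    def deriv(m):
        total = 0.0
        for a, w in zip(points, weights):
            diff = m - a
            if diff != 0.0:
                total += w * abs(diff) ** (p - 1) * math.copysign(1.0, diff)
        return total

    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if deriv(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


# Illustrative data (assumptions): three equally weighted points.
weights = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]
xi = [1.0, 1.0, 2.0]   # scalar dilation factors
v = [0.0, 1.0, 0.0]    # scalar shifts
z = 1.0


def linearity_gap(p):
    """|x_p(xi_i * z + v_i) - (x_p(xi_i) * z + x_p(v_i))| for the data above."""
    lhs = bary_1d([x * z + s for x, s in zip(xi, v)], weights, p)
    rhs = bary_1d(xi, weights, p) * z + bary_1d(v, weights, p)
    return abs(lhs - rhs)


gap_p2 = linearity_gap(2.0)   # arithmetic mean: the gap vanishes up to rounding
gap_p4 = linearity_gap(4.0)   # p = 4: the two sides differ
print(gap_p2, gap_p4)
```

For $p = 2$ the barycenter is the weighted mean and the two sides agree; for $p = 4$ they differ in this example, in line with the remark that different shifts $v_i$ cannot be handled by the same argument.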
To conclude this section and the paper, let us highlight the explicit structure of the maps involved in the barycentric problem in the setting of the previous Proposition 5.2. In particular, in this case $g_p$ is clearly regular, as $J\mathrm{bar}_p$ is constant. Let $\mu \ll \mathcal L^d$ be a reference probability measure. Consider probability measures of the form $\mu_i = \big( A_i \cdot + v_i \big)_\# \mu$, in the two different regimes. We first discuss the case of pure translations (i.e. $A_i = \mathrm{Id}$ for every $i$) and then the case of commuting, block matrices.

Translations. Consider $A_i = \mathrm{Id}$ for every $i = 1, \dots, N$. From Proposition 5.2, we know that the $p$-Wasserstein barycenter between $\mu_1, \dots, \mu_N$ is given by

$$\nu_p = \big( \cdot + v \big)_\# \mu = \big( \cdot - v_1 + v \big)_\# \mu_1 , \qquad \text{where } v := x_p(v_1, \dots, v_N) .$$

In particular, whenever $\mu_1 = f_1 \mathcal L^d$ with $f_1 \in L^q$, we have $g_p \in L^q$ and $\|g_p\|_{L^q} = \|f_1\|_{L^q}$. Indeed, in this particular case, the maps $T_i$ are explicitly given by

$$T_i(x) = x - v_1 + v_i , \qquad x \in \mathbb R^d , \quad i = 1, \dots, N ,$$

and the map $\mathrm{bar}_p$ is simply given by

$$\begin{aligned}
\mathrm{bar}_p(x_1) &= x_p(x_1, T_2(x_1), \dots, T_N(x_1)) = x_p(x_1 - v_1 + v_1, x_1 - v_1 + v_2, \dots, x_1 - v_1 + v_N) \\
&= x_1 - v_1 + x_p(v_1, v_2, \dots, v_N) = x_1 - v_1 + v ,
\end{aligned}$$

hence it is the translation by the vector $v - v_1$.

Linear transformations. Consider $v_i = v$; without loss of generality we assume $v = 0$, and we assume that $A_1, \dots, A_N$ satisfy all the assumptions of Proposition 5.2. Then by Proposition 5.2, the $p$-Wasserstein barycenter between $\mu_1, \dots, \mu_N$ is given by

$$\nu_p = \big( A \cdot \big)_\# \mu , \qquad \text{where } A := x_p(A_1, \dots, A_N) .$$

In this case, the maps $T_i$ are given by

$$T_i(x) = A_i A_1^{-1} x , \qquad x \in \mathbb R^d , \quad i = 1, \dots, N .$$

Therefore, the map $\mathrm{bar}_p$ can be computed as

$$\begin{aligned}
\mathrm{bar}_p(x_1) &= x_p(x_1, T_2(x_1), \dots, T_N(x_1)) = x_p(A_1 A_1^{-1} x_1, A_2 A_1^{-1} x_1, \dots, A_N A_1^{-1} x_1) \\
&= x_p(A_1 A_1^{-1}, A_2 A_1^{-1}, \dots, A_N A_1^{-1}) \, x_1 = x_p(A_1, A_2, \dots, A_N) \, A_1^{-1} x_1 = A A_1^{-1} x_1 ,
\end{aligned}$$

hence it is the linear transformation with associated matrix $A A_1^{-1}$. Note that in the latter computation, we used the $1$-homogeneity of the barycenter map (as a map over matrices) for our special choice of linear transformations, which holds true as a consequence of the $p$-homogeneity of the $p$-Euclidean barycenter and the property shown in (5.11). This property generally fails to hold for general matrices $A_1, \dots, A_N$, as the $p$-barycenter between matrices is indeed not $1$-homogeneous in the space of all self-adjoint matrices. Such a property holds more generally if we assume that the matrices commute, the proof being very similar to the proof of (5.11) once the matrices are diagonal in a common basis.

Appendix A. Injectivity estimate and proof of Theorem 2.4

The aim of this appendix is to prove Theorem 2.1. Lemma A.2 below is crucial in Section 4 for the proof of Corollary 4.3, and consequently of Proposition 1.3. The next lemma, whose proof is a direct consequence of the definition of $x_p$ and its local Lipschitz regularity (Remark 2.11), describes the properties of the $p$-barycenter on the sets $D_S$, defined in (1.5).

Lemma A.1. Let $S \subsetneq \{1, \dots, N\}$. Then for every $\mathbf x = (x_1, \dots, x_N) \in D_S$, $x_i$ is a solution of the variational problem

$$x_{p, S^c}(\mathbf x_{S^c}) := \operatorname*{argmin}_{z \in \mathbb R^d} \sum_{j \notin S} \lambda_j |x_j - z|^p$$

for every $i \in S$, where, if $|S| = K < N$, $\mathbf x_{S^c} \in \mathbb R^{(N-K)d}$ is the vector with components $x_j$, $j \notin S$. In particular, $x_{p, S^c}$ is Lipschitz on $\bigtimes_{i \notin S} \operatorname{spt} \mu_i$.

Notice that, since $x_i = x_p(\mathbf x)$ on $D_S$, $x_p(\mathbf x) = x_{p, S^c}(\mathbf x_{S^c})$ for every $\mathbf x \in D_S$. The following Lemma is a refined version of [BFR25, Lemma 2.9] applied to the case $h(\cdot) = |\cdot|^p$.
Lemma 2.9 in [BFR25] provides inequality (A.3) below and is in turn a refined version of [BFR26, Lemma 5.2]. Here we use the notation $B_r(x)$ to denote the open ball centered at $x$ of radius $r$. Recall the definition of $c_p$-monotone sets (2.5).

Lemma A.2 (Local regularity on $c_p$-monotone sets). Let $\Gamma \subset \mathbb{R}^{Nd}$ be a $c_p$-monotone set. Then, for every $S \subsetneq \{1, \dots, N\}$ and $\mathbf{x} \in D_S$, there exists $r(\mathbf{x}) > 0$ such that

$$|x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}})| \geq \frac{1}{2} \, \frac{\min_{i \notin S} \Lambda_i(\mathbf{x})}{\max_{i \notin S} |H_i(\mathbf{x})|} \Big( \sum_{i \notin S} |y_i - \tilde y_i|^2 \Big)^{\frac12} , \tag{A.1}$$

for every $\mathbf{y}, \tilde{\mathbf{y}} \in B_{r(\mathbf{x})}(\mathbf{x}) \cap D_S \cap \Gamma$, where $H_i$ and $\Lambda_i$ are defined in (1.4).

Note that

$$\begin{aligned}
& H_i(\mathbf{x}) \geq 0 \ \ (\text{and thus } \Lambda_i(\mathbf{x}) \geq 0) , \quad \text{for every } \mathbf{x} \in \mathbb{R}^{dN} , \\
& H_i(\mathbf{x}) = 0 \ \ (\text{and thus } \Lambda_i(\mathbf{x}) = 0) \iff x_i = x_p(\mathbf{x}) ,
\end{aligned} \tag{A.2}$$

hence, for every $\mathbf{x} \in D_S$, $H_i(\mathbf{x}) > 0$, i.e. it is a strictly positive definite matrix, and $\Lambda_i(\mathbf{x}) > 0$, for every $i \notin S$.

Proof. Fix a $c_p$-monotone set $\Gamma \subset \mathbb{R}^{Nd}$ and let $\mathbf{x} \in \mathbb{R}^{Nd}$ be such that $x_i \neq x_p(\mathbf{x})$ for some $i = 1, \dots, N$. By [BFR26, Lemma 5.2], we know that for every $\varepsilon > 0$, there exists $r(\mathbf{x}) > 0$ such that for $\mathbf{y}, \tilde{\mathbf{y}} \in B_{r(\mathbf{x})}(\mathbf{x}) \cap \Gamma$, it holds

$$(y_i - \tilde y_i)^T H_i(\mathbf{x}) \big( x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}}) \big) \geq \Lambda_i(\mathbf{x}) |y_i - \tilde y_i|^2 - \varepsilon N (1 + |H_i(\mathbf{x})|) |\mathbf{y} - \tilde{\mathbf{y}}|^2 , \tag{A.3}$$

for $\Lambda_i(\mathbf{x}) > 0$ the minimum eigenvalue of the matrix $H_i(\mathbf{x}) H(\mathbf{x})^{-1} H_i(\mathbf{x})$. Now let $S \subsetneq \{1, \dots, N\}$ and $\mathbf{x} \in D_S$. Then, by (A.3), for every $\mathbf{y}, \tilde{\mathbf{y}} \in B_{r(\mathbf{x})}(\mathbf{x}) \cap D_S \cap \Gamma$ we have

$$\sum_{i \notin S} (y_i - \tilde y_i)^T H_i(\mathbf{x}) \big( x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}}) \big) \geq \sum_{i \notin S} \Lambda_i(\mathbf{x}) |y_i - \tilde y_i|^2 - \varepsilon N^2 (1 + |H_i(\mathbf{x})|) |\mathbf{y} - \tilde{\mathbf{y}}|^2 .$$

Notice now that

$$|\mathbf{y} - \tilde{\mathbf{y}}|^2 = \sum_{i \notin S} |y_i - \tilde y_i|^2 + \sum_{i \in S} |y_i - \tilde y_i|^2 = \sum_{i \notin S} |y_i - \tilde y_i|^2 + \sum_{i \in S} |x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}})|^2 ,$$

where the last equality follows directly from the definition of $D_S$.
By Lemma A.1, we know that $x_p$ is locally Lipschitz with respect to the components whose index does not belong to $S$, i.e.

$$|x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}})|^2 = |x_{p, S^c}(\mathbf{y}_{S^c}) - x_{p, S^c}(\tilde{\mathbf{y}}_{S^c})|^2 \leq L_p(\mathbf{x}) \sum_{i \notin S} |y_i - \tilde y_i|^2 ,$$

where $L_p(\mathbf{x}) = L_p(\boldsymbol{\lambda}, N, \mathbf{x})$ is the common Lipschitz constant

$$L_p(\mathbf{x}) := \sup_{S \subset \{1, \dots, N\}} \operatorname{Lip} \Big( x_{p, S^c} ; \prod_{i \notin S} \pi_i(B_{r'}(\mathbf{x})) \Big) \in \mathbb{R}_+ ,$$

and $B_{r'}(\mathbf{x})$ is a ball of fixed radius centered at $\mathbf{x}$. All in all, putting the estimates together, we get that for $\mathbf{x} \in D_S$ and for every $\varepsilon > 0$, there exists $r = r(\mathbf{x}, \varepsilon) > 0$ such that

$$\sum_{i \notin S} (y_i - \tilde y_i)^T H_i(\mathbf{x}) \big( x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}}) \big) \geq \sum_{i \notin S} \big( \Lambda_i(\mathbf{x}) - \varepsilon L_p(\mathbf{x}) N^2 (1 + |H_i(\mathbf{x})|) \big) |y_i - \tilde y_i|^2 , \tag{A.4}$$

for every $\mathbf{y}, \tilde{\mathbf{y}} \in B_r(\mathbf{x}) \cap D_S \cap \Gamma$. In order to conclude the proof, it is then enough to pick

$$\varepsilon < \frac{1}{2} \, \frac{\min_{i \notin S} \{ \Lambda_i(\mathbf{x}) \}}{L_p(\mathbf{x}) N^2 \max_{i \notin S} \{ 1 + |H_i(\mathbf{x})| \}}$$

and choose $r(\mathbf{x}) = r(\mathbf{x}, \varepsilon(\mathbf{x}))$ accordingly, as the sought estimate (A.1) follows directly from (A.4) and the Cauchy–Schwarz inequality. ■

As a direct consequence we obtain the following:

Corollary A.3. Let $\Gamma \subset \mathbb{R}^{Nd}$ be a $c_p$-monotone set and $S \subset \{1, \dots, N\}$, $S \neq \{1, \dots, N\}$. Then there exists a countable cover $\{U_m\}_{m \in \mathbb{N}}$ of the set $D_S \cap \Gamma$ with the following property: for every $m \in \mathbb{N}$, there exists $L_m > 0$ such that

$$|x_p(\mathbf{y}) - x_p(\tilde{\mathbf{y}})| \geq L_m \Big( \sum_{j \notin S} |y_j - \tilde y_j|^2 \Big)^{\frac12}$$

for every $\mathbf{y} = (y_1, \dots, y_N)$, $\tilde{\mathbf{y}} = (\tilde y_1, \dots, \tilde y_N) \in D_S \cap \Gamma \cap U_m$.

Proof. Clearly, $\Gamma \cap D_S \subset \bigcup_{\mathbf{x} \in D_S} B(\mathbf{x}, r(\mathbf{x}))$. As every subset of $\mathbb{R}^{dN}$ is second countable, we can extract countably many points $\{\mathbf{x}_m\} \subset D_S$ such that $D_S \cap \Gamma \subset \bigcup_{m \in \mathbb{N}} B_{r_m}(\mathbf{x}_m)$, where $r_m := r(\mathbf{x}_m)$. The claim then follows with $U_m := B(\mathbf{x}_m, r_m)$ and

$$L_m = \frac{1}{2} \, \frac{\min_{i \notin S} \Lambda_i(\mathbf{x}_m)}{\max_{i \notin S} |H_i(\mathbf{x}_m)|} .$$

■

Remark A.4.
Corollary A.3 provides the same injectivity estimate as [BFR25, Proposition 3.7], with the difference that in Corollary A.3 the result holds for any $1 < p < \infty$, while in the aforementioned paper it is proved only for $p \geq 2$. This injectivity estimate is the key for the proof of the absolute continuity of the $p$-Wasserstein barycenter $\nu_p$. In particular, it allows for the proof of (1) in Lemma A.5 below. Notice that in [BFR25] the corresponding result is stated in Lemma 3.3 only for $S = \emptyset$, or in Lemma 3.8 for any $S \neq \{1, \dots, N\}$ but with the assumption that $p \geq 2$. The fact that (1) in Lemma A.5 holds for any $S \neq \{1, \dots, N\}$ allows us to weaken the assumption on the marginals $\mu_1, \dots, \mu_N$, by requiring only one of them to be absolutely continuous, for any $1 < p < \infty$.

The proof of (1) in Lemma A.5 below is the same as the one of [BFR25, Lemma 3.9] and it is a consequence of Corollary A.3. The proof of (2) is a direct application of the definition of $D_S$. For it we refer to [BFR25, Lemma 3.4].

Lemma A.5. Let $\Gamma \subset \mathbb{R}^{Nd}$ be a $c_p$-monotone set.

(1) Let $S \subset \{1, \dots, N\}$ be such that $S \neq \{1, \dots, N\}$, and let $\{U_m\}_{m \in \mathbb{N}}$ be the countable cover of $D_S \cap \Gamma$ defined in Corollary A.3. If $E \subset \mathbb{R}^d$ is such that $\operatorname{diam}(E) < \delta$ for some $\delta > 0$, then

$$\operatorname{diam} \big( \pi_i( x_p^{-1}(E) \cap D_S \cap U_m \cap \Gamma ) \big) < \frac{\delta}{L_m} \qquad \text{for every } i \notin S .$$

In particular, if $E \subset \mathbb{R}^d$ is such that $\mathcal{L}^d(E) = 0$, then

$$\mathcal{L}^d \big( \pi_i( x_p^{-1}(E) \cap D_S \cap U_m \cap \Gamma ) \big) = \mathcal{H}^d \big( \pi_i( x_p^{-1}(E) \cap D_S \cap U_m \cap \Gamma ) \big) = 0 ,$$

for every $i \notin S$ and for every $m \in \mathbb{N}$.

(2) Let $S \subset \{1, \dots, N\}$ be such that $S \neq \emptyset$. If $E \subset \mathbb{R}^d$ is such that $\operatorname{diam}(E) < \delta$ for some $\delta > 0$, then

$$\operatorname{diam} \big( \pi_i( x_p^{-1}(E) \cap D_S ) \big) < \delta \qquad \text{for every } i \in S .$$

In particular, if $E \subset \mathbb{R}^d$ is such that $\mathcal{L}^d(E) = 0$, then

$$\mathcal{L}^d \big( \pi_i( x_p^{-1}(E) \cap D_S ) \big) = \mathcal{H}^d \big( \pi_i( x_p^{-1}(E) \cap D_S ) \big) = 0 \qquad \text{for every } i \in S .$$

We are now ready to prove Theorem 2.4.

Proof of Theorem 2.4.
Let us consider a set $E$ such that $\mathcal{L}^d(E) = 0$, and consider the family $\mathcal{F}_1 = \{ S \subset \{1, \dots, N\} : 1 \in S \}$. As $\gamma_p$ is optimal for ($C_p$-MM), its support $\operatorname{spt} \gamma_p$ is $c_p$-monotone. Therefore, let $\{U_m^S\}$ be the countable cover of $D_S \cap \operatorname{spt} \gamma_p$ provided by Corollary A.3. Then,

$$\begin{aligned}
(x_p)_\sharp \gamma_p(E) &= \gamma_p \Big( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap \bigcup_{S \subset \{1, \dots, N\}} D_S \Big) \\
&\leq \sum_{S \notin \mathcal{F}_1} \gamma_p \big( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S \big) + \sum_{S \in \mathcal{F}_1} \gamma_p \big( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S \big) \\
&\leq \sum_{S \notin \mathcal{F}_1} \mu_1 \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S ) \big) + \sum_{S \in \mathcal{F}_1} \mu_1 \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S ) \big) \\
&\leq \sum_{S \notin \mathcal{F}_1} \sum_{m \in \mathbb{N}} \mu_1 \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S \cap U_m^S ) \big) + \sum_{S \in \mathcal{F}_1} \mu_1 \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S ) \big) .
\end{aligned}$$

Notice that the second inequality is due to the marginal constraint $(\pi_1)_\sharp \gamma_p = \mu_1$. By Lemma A.5 (1), applied with $i = 1 \notin S$,

$$\mathcal{L}^d \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S \cap U_m^S ) \big) = \mathcal{H}^d \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S \cap U_m^S ) \big) = 0$$

for every $S \notin \mathcal{F}_1$ and every $m \in \mathbb{N}$. Lemma A.5 (2), applied with $i = 1 \in S$, then gives

$$\mathcal{L}^d \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S ) \big) = \mathcal{H}^d \big( \pi_1( x_p^{-1}(E) \cap \operatorname{spt} \gamma_p \cap D_S ) \big) = 0$$

for every $S \in \mathcal{F}_1$. We conclude thanks to the absolute continuity of $\mu_1$. ■

Acknowledgments

CB gratefully acknowledges funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation), Project-ID 195170736, TRR109. LP gratefully acknowledges funding from the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy, GZ 2047/1, Projekt-ID 390685813. Financial support by the Deutsche Forschungsgemeinschaft (DFG) within the CRC 1060 at the University of Bonn, project number 211504053, is also gratefully acknowledged. LP is also thankful for the support under the U-GOV project, identification number PSR_LINEA8A_25SMANT_05.
Both authors would like to thank the IAM (Institut für Angewandte Mathematik) of the University of Bonn, the mathematics department "Federigo Enriques" of the University of Milan, as well as the TUM (Technische Universität München) for their kind hospitality during the preparation of this project.

References

[ABS24] Luigi Ambrosio, Elia Brué, and Daniele Semola. Lectures on optimal transport, volume 169 of Unitext. Springer, Cham, second edition, 2024. La Matematica per il 3+2.
[AC11] Martial Agueh and Guillaume Carlier. Barycenters in the Wasserstein space. SIAM Journal on Mathematical Analysis, 43(2):904–924, 2011.
[AFP00] Luigi Ambrosio, Nicola Fusco, and Diego Pallara. Functions of bounded variation and free discontinuity problems. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York, 2000.
[BFR25] Camilla Brizzi, Gero Friesecke, and Tobias Ried. $p$-Wasserstein barycenters. Nonlinear Anal., 251:Paper No. 113687, 19, 2025.
[BFR26] Camilla Brizzi, Gero Friesecke, and Tobias Ried. $h$-Wasserstein barycenters. J. Math. Anal. Appl., 553(1):Paper No. 129952, 16, 2026.
[BLGL15] Emmanuel Boissard, Thibaut Le Gouic, and Jean-Michel Loubes. Distribution's template estimate with Wasserstein metrics. Bernoulli, 21(2):740–759, 2015.
[Buz25] Maciej Buze. Constrained Hellinger–Kantorovich barycenters: Least-cost soft and conic multimarginal formulations. SIAM Journal on Mathematical Analysis, 57(1):495–519, 2025.
[BVFRT22] Julio Backhoff-Veraguas, Joaquin Fontbona, Gonzalo Rios, and Felipe Tobar. Bayesian learning with Wasserstein barycenters. ESAIM: Probability and Statistics, 26:436–472, 2022.
[CCE24] G. Carlier, E. Chenchene, and K. Eichinger. Wasserstein medians: Robustness, PDE characterization, and numerics. SIAM Journal on Mathematical Analysis, 56(5):6483–6520, 2024.
[CD14] Marco Cuturi and Arnaud Doucet. Fast computation of Wasserstein barycenters. In Proceedings of the 31st International Conference on Machine Learning, volume 32, pages 685–693. PMLR, 2014.
[CE10] Guillaume Carlier and Ivar Ekeland. Matching for teams. Economic Theory, 42(2):397–418, 2010.
[Chi25] L. Chizat. Doubly regularized entropic Wasserstein barycenter. Found Comput Math, 2025.
[FMS21] Gero Friesecke, Daniel Matthes, and Bernhard Schmitzer. Barycenters for the Hellinger–Kantorovich distance over $\mathbb{R}^d$. SIAM Journal on Mathematical Analysis, 53(1):62–110, 2021.
[Fri24] Gero Friesecke. Optimal Transport: A Comprehensive Introduction to Modeling, Analysis, Simulation, Applications. Society for Industrial and Applied Mathematics, 2024.
[GK25] Michael Goldman and Lukas Koch. Partial regularity for optimal transport with $p$-cost away from fixed points. Proc. Amer. Math. Soc., 153(9):3959–3970, 2025.
[GM96] Wilfrid Gangbo and Robert J. McCann. The geometry of optimal transportation. Acta Mathematica, 177(2):113–161, 1996.
[GŚ98] Wilfrid Gangbo and Andrzej Święch. Optimal maps for the multidimensional Monge–Kantorovich problem. Communications on Pure and Applied Mathematics, 51(1):23–45, 1998.
[HLZ24] Bang-Xian Han, Dengyu Liu, and Zhuonan Zhu. On the geometry of Wasserstein barycenter I, 2024.
[Jia17] Yin Jiang. Absolute continuity of Wasserstein barycenters over Alexandrov spaces. Canad. J. Math., 69(5):1087–1108, 2017.
[Kel84] Hans G. Kellerer. Duality theorems for marginal problems. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete, 67(4):399–432, 1984.
[KP14] Young-Heon Kim and Brendan Pass. A general condition for Monge solutions in the multi-marginal optimal transport problem. SIAM Journal on Mathematical Analysis, 46:1538–1550, 2014.
[KP17] Young-Heon Kim and Brendan Pass. Wasserstein barycenters over Riemannian manifolds. Advances in Mathematics, 307:640–683, 2017.
[Kro18] A. Kroshnin. Fréchet barycenters in the Monge–Kantorovich spaces. Journal of Classical Analysis, 25(4):1371–1395, 2018.
[PZ20] Victor M. Panaretos and Yoav Zemel. An invitation to statistics in Wasserstein space. SpringerBriefs in Probability and Mathematical Statistics. Springer, Cham, 2020.
[RPDB11] Julien Rabin, Gabriel Peyré, Julie Delon, and Marc Bernot. Wasserstein barycenter and its application to texture mixing. In Scale Space and Variational Methods in Computer Vision (SSVM 2011), volume 6667 of Lecture Notes in Computer Science, pages 435–446, 2011.
[San15] Filippo Santambrogio. Optimal transport for applied mathematicians, volume 87 of Progress in Nonlinear Differential Equations and their Applications. Birkhäuser/Springer, Cham, 2015. Calculus of variations, PDEs, and modeling.

(C. Brizzi) Technische Universität München, Department of Mathematics, Boltzmannstrasse 3, 85748 Garching, Germany
Email address: camilla.brizzi@tum.de

(L. Portinale) Università degli studi di Milano, Milano, Italy
Email address: lorenzo.portinale@unimi.it
