Hyper-contractivity and entropy decay in discrete time
Consider a measure-preserving transition kernel $T$ on an arbitrary probability space $(\mathbb X,\mathcal A,\pi)$. At this level of generality, we prove that a one-step hyper-contractivity estimate of the form $\|T\|_{p\to q}\le 1$ with $p< q$ implie…
Authors: Justin Salez
February 20, 2026

Abstract

Consider a measure-preserving transition kernel $T$ on an arbitrary probability space $(\mathbb X,\mathcal A,\pi)$. At this level of generality, we prove that a one-step hyper-contractivity estimate of the form $\|T\|_{p\to q}\le 1$ with $p<q$ implies a one-step entropy contraction estimate of the form $H(\mu T\,|\,\pi)\le\theta\,H(\mu\,|\,\pi)$, with $\theta=p/q$. Neither reversibility, nor any sort of regularity is required. This static implication is simultaneously simpler and stronger than the celebrated dynamic relation between exponential hyper-contractivity and exponential entropy decay along continuous-time Markov semi-groups.

1 Introduction

Transition kernels. Throughout this note, we consider a measure-preserving transition kernel on a probability space $(\mathbb X,\mathcal A,\pi)$, i.e. a map $T\colon\mathbb X\times\mathcal A\to[0,1]$ such that

(i) $A\mapsto T(x,A)$ is a probability measure for each $x\in\mathbb X$;
(ii) $x\mapsto T(x,A)$ is measurable for each $A\in\mathcal A$;
(iii) $\pi$ is fixed by the natural action of $T$ on the set $\mathcal P(\mathbb X)$ of probability measures on $(\mathbb X,\mathcal A)$, i.e. the map $\mu\mapsto\mu T$ given by

$$(\mu T)(A) := \int_{\mathbb X} T(x,A)\,\mu(\mathrm dx). \tag{1}$$

Our aim is to shed new light on the interplay between two fundamental regularization properties of $T$: hyper-contractivity on the one hand, and entropy contraction on the other. Let us first briefly recall what those notions are.

Hyper-contractivity. The transition kernel $T$ naturally acts on non-negative measurable functions $f\colon\mathbb X\to[0,\infty]$ via the familiar formula

$$Tf(x) := \int_{\mathbb X} f(y)\,T(x,\mathrm dy). \tag{2}$$

This definition of course extends to signed functions by linearity, as long as $Tf_+$ or $Tf_-$ is finite. Moreover, for each $p\ge 1$, Jensen's inequality and the stationarity property $\pi T=\pi$ easily and classically guarantee that the above action is a contraction on the Banach space $L^p(\pi)$, equipped with its usual norm,

$$\|f\|_p := \left(\int |f|^p\,\mathrm d\pi\right)^{1/p}.$$
Hyper-contractivity is the stronger requirement that, for some $q>p$ and all $f\in L^p(\pi)$,

$$\|Tf\|_q \le \|f\|_p. \tag{3}$$

The first estimates of this form were discovered by Nelson along the Ornstein-Uhlenbeck semi-group [12], and by Bonami and Beckner on the discrete hypercube [6, 4]. Following the foundational contributions of Gross [9], Bakry and Émery [1], and Diaconis and Saloff-Coste [8], hyper-contractivity has emerged as a fundamental tool in the quantitative study of Markov processes. In recent years, its impact has extended dramatically, yielding remarkable advances in statistical physics and computer science [10, 3].

Entropy contraction. The second regularization property that we shall consider is a classical entropy contraction estimate, which takes the form

$$\forall\mu\in\mathcal P(\mathbb X),\qquad H(\mu T\,|\,\pi)\le\theta\,H(\mu\,|\,\pi), \tag{4}$$

for some constant $\theta<1$. Here, $H(\cdot\,|\,\cdot)$ denotes the Kullback-Leibler divergence:

$$H(\mu\,|\,\pi) := \begin{cases}\displaystyle\int \log\frac{\mathrm d\mu}{\mathrm d\pi}\,\mathrm d\mu & \text{if } \mu\ll\pi,\\[1ex] +\infty & \text{else.}\end{cases}$$

Among several other applications, entropy contraction plays a prominent role in the analysis of mixing times of Markov processes [5, 11], as well as in quantifying the celebrated concentration-of-measure phenomenon under the reference law $\pi$. We refer the unfamiliar reader to the recent lecture notes [7, 13] and the references therein for a self-contained introduction, and many examples.

2 Result and discussion

In the present note, we establish the following simple, general, and seemingly new quantitative relation between hyper-contractivity and entropy contraction.

Theorem 1 (Main result). For any parameters $1\le p\le q$, the hyper-contractivity estimate (3) implies the entropy contraction estimate (4), with $\theta=p/q$.

We emphasize that Theorem 1 applies to any measure-preserving transition kernel on any probability space: neither reversibility, nor any sort of regularity is required.
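As a numerical sanity check of Theorem 1, the following Python sketch considers a toy case that is not discussed in this note: the uniform measure $\pi$ on $\{0,1\}$ and the symmetric noise kernel $T_\rho$, which keeps the current state with probability $(1+\rho)/2$. It assumes the classical two-point inequality of Bonami and Beckner, namely $\|T_\rho\|_{p\to q}\le 1$ whenever $\rho^2\le(p-1)/(q-1)$. With $p=2$ and $q=1+\rho^{-2}$, Theorem 1 then predicts $H(\mu T_\rho\,|\,\pi)\le\theta\,H(\mu\,|\,\pi)$ with $\theta=2/(1+\rho^{-2})$. All function names are illustrative.

```python
import math


def kl_to_uniform(mu0):
    """Relative entropy H(mu | pi) of mu = (mu0, 1 - mu0) w.r.t. uniform pi on {0, 1}."""
    total = 0.0
    for m in (mu0, 1.0 - mu0):
        if m > 0.0:  # convention: 0 log 0 = 0
            total += m * math.log(2.0 * m)  # density of mu w.r.t. pi is 2m
    return total


def noise_step(mu0, rho):
    """Mass at state 0 after one step of T_rho (keep the state w.p. (1 + rho) / 2)."""
    keep = (1.0 + rho) / 2.0
    return keep * mu0 + (1.0 - keep) * (1.0 - mu0)


def worst_ratio(rho, grid=2000):
    """Largest observed H(mu T | pi) / H(mu | pi) over a grid of non-uniform mu."""
    worst = 0.0
    for k in range(grid + 1):
        mu0 = k / grid
        h = kl_to_uniform(mu0)
        if h > 1e-12:  # skip mu = pi, where both entropies vanish
            worst = max(worst, kl_to_uniform(noise_step(mu0, rho)) / h)
    return worst


rho = 0.5
# Assumed exponent pair from the two-point inequality: rho^2 = (p - 1) / (q - 1).
p, q = 2.0, 1.0 + rho ** -2
theta = p / q  # contraction factor predicted by Theorem 1
print("worst observed ratio:", worst_ratio(rho), "predicted bound:", theta)
```

The check only covers a finite grid of measures $\mu$, so it illustrates the statement on one example rather than proving anything.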
In particular, we may think of $T$ as the transition kernel of a discrete-time Markov process on $\mathbb X$ with invariant law $\pi$, in which case the conclusion can readily be iterated to provide a geometric rate of convergence to equilibrium, in relative entropy. Alternatively, we can choose $T=P_t$, where $(P_t)_{t\ge 0}$ is a given measure-preserving Markov semi-group on $(\mathbb X,\mathcal A,\pi)$ and $t\ge 0$ a particular time-scale which we want to investigate. In this well-studied continuous-time setting, the ability to focus on a single instant appears to be new, and makes our static implication stronger than its celebrated dynamic counterpart, which we now review.

To lighten our discussion, we deliberately omit technical details and refer the interested reader to the excellent references [2] (for Markov diffusions on Euclidean spaces or smooth manifolds) and [8] (on finite state spaces). Consider a measure-preserving Markov semi-group $(P_t)_{t\ge 0}$ on a probability space $(\mathbb X,\mathcal A,\pi)$, and assume that it satisfies an exponential hyper-contractivity estimate of the form

$$\forall f\in L^2(\pi),\ \forall t\ge 0,\qquad \|P_t f\|_{1+e^{4\beta t}}\le\|f\|_2, \tag{5}$$

for some $\beta>0$. Then, a classical differentiation leads to the log-Sobolev inequality

$$\forall f\in\mathcal F,\qquad \mathcal E\big(\sqrt f,\sqrt f\big)\ge\beta\,H(f\,\mathrm d\pi\,|\,\pi), \tag{6}$$

where $\mathcal F$ is an appropriate class of probability densities on $(\mathbb X,\mathcal A,\pi)$, and $\mathcal E(\cdot,\cdot)$ the Dirichlet form associated with the semi-group. Now, in view of the elementary estimate

$$\forall a,b\in(0,\infty),\qquad b\log\frac{b}{a}\ge 2\sqrt b\,\big(\sqrt b-\sqrt a\big),$$

the log-Sobolev inequality (6) always implies its "modified" version

$$\forall f\in\mathcal F,\qquad \mathcal E(f,\log f)\ge 2\beta\,H(f\,\mathrm d\pi\,|\,\pi), \tag{7}$$

which, by a Grönwall-type argument, finally guarantees the exponential entropy decay

$$\forall\mu\in\mathcal P(\mathbb X),\ \forall t\ge 0,\qquad H(\mu P_t\,|\,\pi)\le e^{-2\beta t}\,H(\mu\,|\,\pi). \tag{8}$$

In other words, the implication (5) $\Longrightarrow$ (8) always holds along Markov semi-groups.
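The conclusion (8) can be illustrated on a toy semi-group that is not part of this note: the two-state chain on $\{0,1\}$ that flips its state at rate 1, with uniform invariant law $\pi$. For this chain, $P_t$ multiplies the bias $\varepsilon=\mu(0)-\mu(1)$ by $e^{-2t}$, and the two-point inequality of Bonami and Beckner suggests that (5) holds with $\beta=1$ (an assumption of this sketch, as are all function names). The Python code below checks the decay (8) on a grid of initial biases and times.

```python
import math


def entropy_of_bias(eps):
    """H(mu | pi) for mu = ((1 + eps) / 2, (1 - eps) / 2) and pi uniform on {0, 1}."""
    total = 0.0
    for m in ((1.0 + eps) / 2.0, (1.0 - eps) / 2.0):
        if m > 0.0:  # convention: 0 log 0 = 0
            total += m * math.log(2.0 * m)
    return total


def decay_holds(beta=1.0, times=(0.1, 0.5, 1.0, 2.0), grid=200):
    """Check H(mu P_t | pi) <= exp(-2 beta t) H(mu | pi) on a grid of biases."""
    for t in times:
        # Assumed toy dynamics: rate-1 flips shrink the bias by exp(-2 t).
        rho = math.exp(-2.0 * t)
        for k in range(1, grid + 1):
            eps = k / grid
            lhs = entropy_of_bias(rho * eps)
            rhs = math.exp(-2.0 * beta * t) * entropy_of_bias(eps)
            if lhs > rhs + 1e-15:  # small slack for floating-point rounding
                return False
    return True


print("decay (8) holds on the grid with beta = 1:", decay_holds())
```

Passing a larger value of `beta` to `decay_holds` shows where the inequality starts to fail on this example, which gives a rough feel for how much room the bound leaves.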
This general relation between hyper-contractivity and entropy contraction is of course a well-established fact, with many applications. It is important to realize, however, that it is inherently dynamical: the estimates (5) and (8) hold for all $t\ge 0$, and the semi-group structure is crucially used to reduce them to the respective functional inequalities (6) and (7), which can then be appropriately compared. In contrast, fixing a particular time $t\ge 0$ and choosing $T=P_t$ in Theorem 1 directly yields

$$\forall\mu\in\mathcal P(\mathbb X),\qquad H(\mu P_t\,|\,\pi)\le\frac{2}{1+e^{4\beta t}}\,H(\mu\,|\,\pi),$$

which, in view of the inequality $1+e^{4\beta t}\ge 2e^{2\beta t}$, is always stronger than (8). On finite state spaces for example, combining this with Pinsker's inequality and the bound $H(\mu\,|\,\pi)\le\log\frac{1}{\pi_\star}$ readily leads to the mixing-time estimate

$$\forall\varepsilon\in(0,1),\qquad t_{\mathrm{mix}}(\varepsilon)\le\frac{1}{4\beta}\left(\log\log\frac{1}{\pi_\star}+\log\frac{1}{\varepsilon^2}\right),$$

which is better by a factor of two than what the traditional estimate (8) would give. Here, we have used the classical notation

$$\pi_\star := \min_{x\in\mathbb X}\pi(x)$$

for the minimum stationary mass, and

$$t_{\mathrm{mix}}(\varepsilon) := \min\big\{t\ge 0\colon \forall\mu\in\mathcal P(\mathbb X),\ d_{\mathrm{tv}}(\mu P_t,\pi)\le\varepsilon\big\}$$

for the worst-case total-variation mixing time.

More importantly, our result does not require the assumption (5) to hold at all times: a static hyper-contractivity estimate at some fixed time suffices to guarantee an entropy contraction estimate at the very same time. Exploiting this instantaneous relation can be quite interesting in practice, because the large-time regularizing effect of $P_t$ is often much stronger than what an infinitesimal computation at $t=0$ would predict. Finally, in addition to being more general and perhaps more natural, Theorem 1 admits an elementary proof by duality, which completely avoids the use of semi-groups, hence the technical precautions needed in order to safely differentiate along them.

Remark 1 (Converse).
It is well known that the implication (5) $\Longrightarrow$ (8) can be reversed for Markov diffusions, and an approximate version of this was recently established on discrete state spaces as well [14, 15]. In light of this, it is natural to ask for an appropriate converse to our main theorem. More precisely, if a reversible transition kernel $T$ satisfies a one-step entropy contraction estimate, and if it is sufficiently "regular", can one deduce a one-step hyper-contractivity estimate, at a reasonable price?

3 Proof of the theorem

Let us start by recalling that $T$ has an adjoint $T^\star$, characterized by the duality relation

$$\int g\,T^\star f\,\mathrm d\pi=\int f\,Tg\,\mathrm d\pi, \tag{9}$$

for any measurable functions $f,g\colon\mathbb X\to\mathbb R$ such that those integrals make sense. Now, fix $1\le p\le q$ and suppose that $T$ satisfies the hyper-contractivity property

$$\forall g\in L^p(\pi),\qquad \|Tg\|_q\le\|g\|_p. \tag{10}$$

Given a probability measure $\mu\in\mathcal P(\mathbb X)$, our goal is to prove that

$$H(\mu T\,|\,\pi)\le\frac{p}{q}\,H(\mu\,|\,\pi). \tag{11}$$

We may assume that the right-hand side is finite, otherwise there is nothing to prove. In other words, $\mu$ admits a density $f$ w.r.t. $\pi$, and $f\log f\in L^1(\pi)$. First, using the definition of $\mu T$ at (1) and the duality relation (9) with $g=\mathbf 1_A$, we find

$$\forall A\in\mathcal A,\qquad (\mu T)(A)=\int f\,T\mathbf 1_A\,\mathrm d\pi=\int \mathbf 1_A\,T^\star f\,\mathrm d\pi,$$

which shows that $\mu T$ is also absolutely continuous w.r.t. $\pi$, with density $T^\star f$. Next, the hyper-contractivity assumption (10) applied to $g=(T^\star f)^{1/p}$, for which $\|g\|_p^p=\int T^\star f\,\mathrm d\pi=1$, reads

$$\int\left(T e^{\frac{1}{p}\log T^\star f}\right)^{q}\mathrm d\pi\le 1.$$

On the other hand, Jensen's inequality ensures that $e^{Th}\le Te^{h}$ for any measurable function $h\colon\mathbb X\to\mathbb R$ such that $Th_+<\infty$, hence in particular for $h:=\frac{1}{p}\log T^\star f$. Inserting this pointwise estimate into the above integral, we deduce that the function

$$\varphi:=\frac{q}{p}\,T\log T^\star f$$

satisfies $\int e^{\varphi}\,\mathrm d\pi\le 1$.
Finally, by the variational formulation of entropy (or just the convexity estimate $u\log u\ge u-1$ applied to $u=fe^{-\varphi}$), this last condition implies

$$\int f\varphi\,\mathrm d\pi\le\int f\log f\,\mathrm d\pi.$$

The right-hand side is $H(\mu\,|\,\pi)$, and the left-hand side equals $\frac{q}{p}H(\mu T\,|\,\pi)$ because

$$\int f\,T(\log T^\star f)\,\mathrm d\pi=\int (T^\star f)\log(T^\star f)\,\mathrm d\pi=H(\mu T\,|\,\pi),$$

where we have used the duality relation (9) with $g=\log T^\star f$. Thus, (11) is established.

Acknowledgment

The author warmly thanks Liming Wu for raising the question answered in the present note. This work was supported by the ERC consolidator grant CUTOFF (101123174).

References

[1] Dominique Bakry and Michel Émery. Diffusions hypercontractives. In Séminaire de probabilités, XIX, 1983/84, volume 1123 of Lecture Notes in Math., pages 177–206. Springer, Berlin, 1985.
[2] Dominique Bakry, Ivan Gentil, and Michel Ledoux. Analysis and geometry of Markov diffusion operators, volume 348 of Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences]. Springer, Cham, 2014.
[3] Roland Bauerschmidt, Thierry Bodineau, and Benoit Dagallier. Stochastic dynamics and the Polchinski equation: An introduction. Probability Surveys, 21:200–290, 2024.
[4] William Beckner. Inequalities in Fourier analysis. Ann. of Math. (2), 102(1):159–182, 1975.
[5] Sergey G. Bobkov and Prasad Tetali. Modified logarithmic Sobolev inequalities in discrete settings. J. Theoret. Probab., 19(2):289–336, 2006.
[6] Aline Bonami. Étude des coefficients de Fourier des fonctions de $L^p(G)$. Ann. Inst. Fourier (Grenoble), 20:335–402, 1970.
[7] Pietro Caputo. Lecture notes on entropy and Markov chains. Available at: http://www.mat.uniroma3.it/users/caputo/entropy.pdf, 2022.
[8] Persi Diaconis and Laurent Saloff-Coste. Logarithmic Sobolev inequalities for finite Markov chains. Ann. Appl. Probab., 6(3):695–750, 1996.
[9] Leonard Gross.
Logarithmic Sobolev inequalities. Amer. J. Math., 97(4):1061–1083, 1975.
[10] Jeff Kahn, Gil Kalai, and Nathan Linial. The influence of variables on Boolean functions. In 29th Annual Symposium on Foundations of Computer Science, pages 68–80. IEEE Comput. Soc. Press, Washington, DC, 1988.
[11] Ravi Montenegro and Prasad Tetali. Mathematical aspects of mixing times in Markov chains. Found. Trends Theor. Comput. Sci., 1(3):x+121, 2006.
[12] Edward Nelson. A quartic interaction in two dimensions. In Mathematical Theory of Elementary Particles (Proc. Conf., Dedham, Mass., 1965), pages 69–73. MIT Press, Cambridge, Mass.-London, 1966.
[13] Justin Salez. Modern aspects of Markov chains: entropy, curvature and the cutoff phenomenon, 2025.
[14] Justin Salez, Konstantin Tikhomirov, and Pierre Youssef. Upgrading MLSI to LSI for reversible Markov chains. J. Funct. Anal., 285(9):Paper No. 110076, 15, 2023.
[15] Justin Salez and Pierre Youssef. Intrinsic regularity in the discrete log-Sobolev inequality, 2025.