Regret and Jeffreys Integrals in Exp. Families
The problem of whether minimax redundancy, minimax regret, and Jeffreys integrals are finite or infinite is discussed.
Authors: Peter Grünwald, Peter Harremoës
Regret and Jeffreys Integrals in Exp. Families

Peter Grünwald and Peter Harremoës
Centrum Wiskunde & Informatica, Amsterdam, The Netherlands
Emails: Peter.Grunwald@cwi.nl and Peter.Harremoes@cwi.nl

I. PRELIMINARIES

Let $\{P_\beta \mid \beta \in \Gamma_{can}\}$ be a 1-dimensional exponential family given in a canonical parameterization,

$$\frac{dP_\beta}{dQ} = \frac{1}{Z(\beta)} e^{\beta x}, \qquad (1)$$

where $Z$ is the partition function $Z(\beta) = \int \exp(\beta x)\, dQ(x)$, and $\Gamma_{can} := \{\beta \mid Z(\beta) < \infty\}$ is the canonical parameter space. We let $\beta_{sup} = \sup\{\beta \mid \beta \in \Gamma_{can}\}$, and $\beta_{inf}$ likewise. The elements of the exponential family are also parametrized by their mean value $\mu$. We write $\mu_\beta$ for the mean value corresponding to the canonical parameter $\beta$ and $\beta_\mu$ for the canonical parameter corresponding to the mean value $\mu$. For any $x$ the maximum likelihood distribution is $P_{\beta_x}$. The Shtarkov integral $S$ is defined as

$$S = \int \frac{1}{Z(\beta_x)} e^{\beta_x x}\, dQ(x). \qquad (2)$$

The variance function $V$ is the function that maps $\mu \in M$ into the variance of $P_\mu$. The Fisher information of an exponential family in its canonical parametrization is $I_\beta = V(\mu_\beta)$ and the Fisher information of the exponential family in its mean-value parametrization is $I_\mu = (V(\mu))^{-1}$. The Jeffreys integral $J$ is defined as

$$J = \int_{\Gamma_{can}} I_\beta^{1/2}\, d\beta = \int_M (I_\mu)^{1/2}\, d\mu. \qquad (3)$$

More on Fisher information can be found in [2]. As first established by [3], if the parameter space is restricted to a compact subset of the interior of the parameter space with non-empty interior (called an ineccsi set in [2]), then the minimax regret is finite and equal to the logarithm of the Shtarkov integral, which in turn is equal to

$$\frac{1}{2} \log \frac{n}{2\pi} + \log J + o(1). \qquad (4)$$

It thus becomes quite relevant to investigate whether the same still holds if the parameter spaces are not restricted to an ineccsi set.
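The two forms of the Jeffreys integral in (3) are related by the change of variables $d\mu = V(\mu_\beta)\, d\beta$. As a sanity check (our illustration, not from the paper), the following sketch computes both sides numerically for the geometric family, a concrete 1-dimensional exponential family where $Z$, $\mu_\beta$, and $V$ are available in closed form:

```python
import numpy as np

# Geometric family: Q = counting measure on {0, 1, 2, ...}.
# Then Z(beta) = 1/(1 - e^beta) for beta < 0, mean value
# mu_beta = e^beta / (1 - e^beta), variance function V(mu) = mu(1 + mu),
# so I_beta = V(mu_beta) and I_mu = 1/V(mu).

def mu_of_beta(beta):
    return np.exp(beta) / (1.0 - np.exp(beta))

def I_beta(beta):
    m = mu_of_beta(beta)
    return m * (1.0 + m)

def I_mu(m):
    return 1.0 / (m * (1.0 + m))

def trapezoid(y, x):
    # plain trapezoidal rule, to stay independent of NumPy version details
    return float(np.sum((y[1:] + y[:-1]) * np.diff(x)) / 2.0)

# Jeffreys integral over the compact canonical interval [-2, -0.5], computed
# in both parametrizations; by eq. (3) the two values must coincide.
betas = np.linspace(-2.0, -0.5, 200001)
J_can = trapezoid(np.sqrt(I_beta(betas)), betas)

mus = np.linspace(mu_of_beta(-2.0), mu_of_beta(-0.5), 200001)
J_mean = trapezoid(np.sqrt(I_mu(mus)), mus)

print(J_can, J_mean)  # the two integrals agree to numerical precision
```

Over the full canonical space $(-\infty, 0)$ this family's Jeffreys integral diverges, which is exactly the kind of boundary behaviour the theorems below address.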
Whether or not this is so is discussed at length and posed as an open problem in [2, Chapter 11, Section 11.1].

II. RESULTS

Theorem 1: For a 1-dimensional left-truncated exponential family, the following statements are all equivalent:
1) The Shtarkov integral is finite.
2) The minimax individual-sequence regret is finite.
3) The minimax expected redundancy is finite.
4) The exponential family has a dominating distribution $Q_{dom}$ in terms of information divergence, i.e. $\sup_{\beta \in \Gamma_{can}} D(P_\beta \,\|\, Q_{dom}) < \infty$.
5) There is a distribution $P_\beta$ with $\beta \in \Gamma_{can}$ that dominates the exponential family in terms of information divergence.
6) The information channel $\beta \to P_\beta$ has finite capacity.
7) There exists $\beta_0 \in \Gamma_{can}$ such that $\lim_{\beta \uparrow \beta_{sup}} D(\beta_0 \,\|\, \beta) < \infty$ or $\lim_{\beta \uparrow \beta_{sup}} D(\beta \,\|\, \beta_0) < \infty$.

Most of the equivalences between (1)–(6) are quite straightforward. The surprising part is the fact that statements (1)–(6) are also equivalent to (7).

Theorem 2: Let $(\Gamma_{can}^0, Q)$ represent a left-truncated exponential family. If the Shtarkov integral is infinite, then the Jeffreys integral is infinite. The converse does not hold in general.

Theorem 3: Let $(\Gamma_{can}^0, Q)$ represent a left-truncated exponential family such that $\beta_{sup} = 0$ and $Q$ admits a density $q$ either with respect to Lebesgue measure or counting measure. If $q(x) = O(1/x^{1+\alpha})$ for some $\alpha > 0$, then the Jeffreys integral $\int_\beta^0 I(\gamma)^{1/2}\, d\gamma$ is finite. In most cases finite Shtarkov implies finite Jeffreys.

Theorem 4: Let $Q$ be a measure on the real line with support $I$. Assume that $\mu_{inf}$ is the left endpoint of $I$. If $Q$ has density $f(x) = (x - \mu_{inf})^{\gamma - 1} g(x)$ in an interval just to the right of $a$, where $g$ is an analytic function and $g(\mu_{inf}) > 0$, then the left end of the interval $I$ gives a finite contribution to Jeffreys' integral if and only if $Q$ has a point mass in $a$.
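Condition (7) of Theorem 1 can be checked in closed form for a concrete family (our illustration, not from the paper): the exponential distributions form a left-truncated family with $\beta_{sup} = 0$, and both directed divergences blow up as $\beta \uparrow 0$, so by Theorem 1 this family's Shtarkov integral is infinite.

```python
import numpy as np

# Exponential distributions as a left-truncated family: dP_beta/dx = -beta * e^{beta x}
# on (0, inf), with canonical space Gamma_can = (-inf, 0) and beta_sup = 0.
# With rates l0 = -beta0 and l = -beta, the KL divergence has the closed form
#   D(P_beta0 || P_beta) = log(l0/l) + l/l0 - 1.

def kl_exp(beta0, beta):
    l0, l = -beta0, -beta
    return np.log(l0 / l) + l / l0 - 1.0

beta0 = -1.0
for beta in [-0.1, -0.01, -0.001]:
    print(beta, kl_exp(beta0, beta), kl_exp(beta, beta0))
# Both directed divergences grow without bound as beta -> 0: the log(l0/l)
# term diverges in one direction and l0/l diverges in the other, so
# condition (7) fails for every choice of beta0.
```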
If $Y$ is a Cauchy-distributed random variable, then $X = \exp(Y)$ has density

$$\frac{1}{\pi} \cdot \frac{1}{x \left(1 + \log^2(x)\right)}.$$

A probability measure $Q$ is defined as a $1/2$ and $1/2$ mixture of a point mass in $0$ and an exponentiated Cauchy distribution. The exponential family based on $Q$ has redundancy upper bounded by 1 bit, but the Jeffreys integral is infinite. For exponential families in more dimensions the analysis becomes more involved, and one may even have exponential families with finite redundancy and infinite regret.

REFERENCES

[1] O. Barndorff-Nielsen, Information and Exponential Families in Statistical Theory. New York: John Wiley, 1978.
[2] P. Grünwald, The Minimum Description Length Principle. MIT Press, 2007.
[3] J. Rissanen, "Fisher information and stochastic complexity," IEEE Trans. Inform. Theory, vol. 42, no. 1, pp. 40–47, 1996.
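The exponentiated-Cauchy density above can be verified by a quick Monte Carlo sanity check (our addition, not from the paper): integrating the stated density gives the CDF $F(t) = 1/2 + \arctan(\log t)/\pi$, which should match the empirical distribution of $\exp(Y)$ for Cauchy samples $Y$.

```python
import numpy as np

# Integrating the stated density 1/(pi * x * (1 + log^2 x)) gives the CDF
#   F(t) = 1/2 + arctan(log t)/pi.
def cdf(t):
    return 0.5 + np.arctan(np.log(t)) / np.pi

rng = np.random.default_rng(0)
# standard Cauchy samples via the inverse CDF: tan(pi * (U - 1/2))
y = np.tan(np.pi * (rng.uniform(size=1_000_000) - 0.5))
x = np.exp(y)

for t in [0.5, 1.0, np.e, 10.0]:
    print(t, (x <= t).mean(), cdf(t))  # empirical and analytic CDFs agree
```

Note the heavy tail: the density decays like $1/(\pi x \log^2 x)$, slower than any $1/x^{1+\alpha}$, so the sufficient condition of Theorem 3 for a finite Jeffreys integral does not apply, consistent with the Jeffreys integral being infinite here.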