Projective Limit Random Probabilities on Polish Spaces

A pivotal problem in Bayesian nonparametrics is the construction of prior distributions on the space M(V) of probability measures on a given domain V. In principle, such distributions on the infinite-dimensional space M(V) can be constructed from the…

Authors: Peter Orbanz

Electronic Journal of Statistics, Vol. 5 (2011) 1354–1373. ISSN: 1935-7524. DOI: 10.1214/11-EJS641

Peter Orbanz∗
Computational and Biological Learning Laboratory, University of Cambridge
e-mail: p.orbanz@eng.cam.ac.uk

Abstract: A pivotal problem in Bayesian nonparametrics is the construction of prior distributions on the space $M(V)$ of probability measures on a given domain $V$. In principle, such distributions on the infinite-dimensional space $M(V)$ can be constructed from their finite-dimensional marginals; the most prominent example is the construction of the Dirichlet process from finite-dimensional Dirichlet distributions. This approach is both intuitive and applicable to the construction of arbitrary distributions on $M(V)$, but also hamstrung by a number of technical difficulties. We show how these difficulties can be resolved if the domain $V$ is a Polish topological space, and give a representation theorem directly applicable to the construction of any probability distribution on $M(V)$ whose first moment measure is well-defined. The proof draws on a projective limit theorem of Bochner, and on properties of set functions on Polish spaces to establish countable additivity of the resulting random probabilities.

AMS 2000 subject classifications: Primary 62C10; secondary 60G57.
Keywords and phrases: Bayesian nonparametrics, Dirichlet processes, random probability measures.

Received January 2011.

1. Introduction

A variety of ways exists to construct the Dirichlet process. For this particular case of a random probability measure, the spectrum of construction approaches ranges from the projective limit construction from finite-dimensional Dirichlet distributions proposed by Ferguson [8] to the stick-breaking construction of Sethuraman [25]; see e.g. the survey by Walker et al. [27] for an overview.
Most of these constructions are bespoke representations more or less specific to the Dirichlet. An exception is the projective limit representation, which can represent any probability distribution on the space of probability measures. However, several authors [e.g. 12, 13] have noted technical problems arising for this construction. The key role of the Dirichlet process, and the proven utility of its representation by stick-breaking or by Poisson processes, may account for the slightly surprising fact that these problems have not yet been addressed in the literature.

∗ Research supported by EPSRC grant EP/F028628/1.

The purpose of this paper is to provide a projective limit result directly applicable to the construction of any probability distribution on $M(V)$. We do so by first modifying and then proving a construction idea put forth by Ferguson [8]. Intuitively speaking, our main result (Theorem 1.1) allows us to construct distributions on $M(V)$ by substituting the Dirichlet distributions used in the derivation of the Dirichlet process by other families of distributions, and by verifying that these families satisfy the two necessary and sufficient conditions of the theorem. Stick-breaking, urn schemes [3] and other specialized representations of the Dirichlet process all rely on the latter's particular discreteness and spatial decorrelation properties. Our approach may facilitate the derivation of models for which no such representations can be expected to exist, for example, of smooth random measures. For Bayesian nonparametrics, the result provides what currently seems to be the only available tool to construct an arbitrary prior distribution on the set $M(V)$.
It also makes Bayesian methods based on random measures more readily comparable to other types of nonparametric priors constructed in a similar fashion, notably to Gaussian processes [2, 28].

The technical difficulties arising for the construction proposed in [8] can be summarized as three separate problems, which Appendix A reviews in detail. In short:

(i) Product spaces. The product space setting of the standard Kolmogorov extension theorem is not well-adapted to the problem of constructing random probability measures.
(ii) Measurability problems. A straightforward formalization of the construction in terms of an extension or projective limit theorem results in a space whose dimensions are labeled by the Borel sets of $V$, and is hence of uncountable dimension. As a consequence, the constructed measure cannot resolve most events of interest. In particular, singletons, and hence the event that the random measure assumes a specific measure as its value, are not measurable [13, Sec. 2.3.2].
(iii) $\sigma$-additivity. The constructed measure is supported on finitely additive probabilities (charges), rather than $\sigma$-additive probabilities (measures); see Ghosal [12, Sec. 2.2]. Further conditions are necessary to obtain a measure on probability measures.

To make the projective limit construction feasible, we have to impose some topological requirements on the domain $V$ of the random measure. Specifically, we require that $V$ is a Polish space, i.e. a topological space which is complete, separable and metrizable [17]. This setting is sufficiently general to accommodate any applications in Bayesian nonparametrics: Bayesian methods do not solicit the generality of arbitrary measurable spaces, since no useful notion of conditional probability can be defined without a modicum of topological structure.
Polish spaces are in many regards the natural habitat of Bayesian statistics, whether parametric or nonparametric, since they guarantee both the existence of regular conditional probabilities and the validity of de Finetti's theorem [16, Theorem 11.10]. The restriction to Polish spaces is hence unlikely to incur any loss of generality. We address problem (i) by means of a generalization of Kolmogorov's extension theorem, due to Bochner [4]; problem (ii) by means of the fact that the Borel $\sigma$-algebra of a Polish space $V$ is generated by a countable subsystem of sets, which allows us to substitute the uncountable-dimensional projective limit space by a countable-dimensional surrogate; and problem (iii) using a result of Harris [14] on $\sigma$-additivity of set functions on Polish spaces.

The remainder of the article is structured as follows: The main result is stated in Sec. 1.1, which is meant to provide all information required to apply the theorem, without going into the details of the proof. Related work is summarized in Sec. 1.3. A brief overview of projective limit constructions is given in Sec. 2, to the extent relevant to the proof. Secs. 3 and 4 contain the actual proof of Theorem 1.1: The projective limit construction of random set functions is described in Sec. 3. A necessary and sufficient condition for these random set functions to be $\sigma$-additive is given in Sec. 4. Appendix A reviews problems (i)-(iii) above in more detail.

1.1. Main result

To state our main theorem, we must introduce some notation, and specify the relevant notion of a marginal distribution in the present context. Let $M(V)$ be the set of Borel probability measures over a Polish topological space $(V, \mathcal{T}_V)$; recall that the space is Polish if $\mathcal{T}_V$ is a metrizable topology under which $V$ is complete and separable [1, 17].
Throughout, the underlying model of randomness is an abstract probability space $(\Omega, \mathcal{A}, \mathbb{P})$. A random variable $X : \Omega \to M(V)$, with the image measure $P := X\mathbb{P}$ as its distribution, is called a random probability measure on $V$. Our main result, Theorem 1.1, is a general representation result for the distribution $P$ of such a random measure.

To define measures on the space $M(V)$, we endow it with the weak$^*$ topology $\mathcal{T}_{w^*}$ (which in the context of probability is often called the topology of weak convergence) and with the corresponding Borel $\sigma$-algebra $\mathcal{B}_{w^*} := \sigma(\mathcal{T}_{w^*})$. Since $V$ is Polish, the topological space $(M(V), \mathcal{T}_{w^*})$ is Polish as well [17, Theorem 17.23].

Let $I = (A_1, \ldots, A_n)$ be a measurable partition of $V$, i.e. a partition of $V$ into a finite number of measurable, disjoint sets. Denote the set of all such partitions $\mathcal{H}(\mathcal{B}_V)$. Any probability measure $x \in M(V)$ can be evaluated on a partition $I$ to produce a vector $x_I := (x(A_1), \ldots, x(A_n))$, and we write $\phi_I : x \mapsto x_I$ for the evaluation functional so defined. Clearly, $x_I$ represents a probability measure on the finite $\sigma$-algebra $\sigma(I)$ generated by the partition. Let $\triangle_I$ be the set of all measures $x_I = \phi_I(x)$ obtained in this manner, where $x$ runs through all measures in $M(V)$. This set, $\triangle_I = \phi_I M(V)$, is precisely the unit simplex in the $n$-dimensional Euclidean space $\mathbb{R}^I$,

$$\triangle_I = \Big\{ x_I \in \mathbb{R}^I \;\Big|\; x_I(A_i) \ge 0 \text{ and } \sum_{A_i \in I} x_I(A_i) = 1 \Big\}. \tag{1.1}$$

Let $J = (B_1, \ldots, B_m)$ and $I = (A_1, \ldots, A_n)$ be partitions such that $I$ is a coarsening of $J$, that is, for each $A_i \in I$, there is a set $\mathcal{J}_i \subset \{1, \ldots, m\}$ of indices such that $A_i = \cup_{j \in \mathcal{J}_i} B_j$. The sets $\mathcal{J}_i$ form a partition of the index set $\{1, \ldots, m\}$. If $I$ is a coarsening of $J$, we write $I \preceq J$. Let $x, x' \in M(V)$. If $I \preceq J$, then $\phi_J x = \phi_J x'$ implies $\phi_I x = \phi_I x'$. In other words, $\phi_I x$ is completely determined by $\phi_J x$, and invariant under any changes to $x$ which do not affect $\phi_J x$. Therefore, the implicit definition $f_{JI}(\phi_J(x)) := \phi_I(x)$ determines a well-defined mapping $f_{JI} : \triangle_J \to \triangle_I$. With notation for $J$ and $I$ as above, $f_{JI}$ can equivalently be defined as

$$(f_{JI} x_J)(A_i) = \sum_{j \in \mathcal{J}_i} x_J(B_j). \tag{1.2}$$

Fig 1. Left: The simplex $\triangle_J \subset \mathbb{R}^J$ for a partition $J = (B_1, B_2, B_3)$. Right: A new simplex $\triangle_I = f_{JI} \triangle_J$ is obtained by merging the sets $B_1$ and $B_2$, producing the partition $I = (B_1 \cup B_2, B_3)$. The mapping $f_{JI}$ is given by $f_{JI} x_J = (x_J(B_1) + x_J(B_2), x_J(B_3))$. Its image $\triangle_I$ is a subset of the product space $\mathbb{R}^I$, which shares only a single axis, $B_3$, with the space $\mathbb{R}^J$.

Figure 1 illustrates the mapping $f_{JI}$ and the simplices $\triangle_J$ and $\triangle_I$. The image $f_{JI} x_J \in \triangle_I$ constitutes a probability distribution on the events in $I$. The following intuition is often helpful: The space $M(V)$ is convex, with the Dirac measures on $V$ as its extreme points, and we can roughly think of $M(V)$ as the infinite-dimensional analogue of the simplices $\triangle_I$. Similarly, we can regard the evaluation maps $\phi_I : M(V) \to \triangle_I$ as analogues of the maps $f_{JI} : \triangle_J \to \triangle_I$. Even though both $M(V)$ and the spaces $\triangle_I$ are Polish, however, we have to keep in mind that the weak$^*$ topology on $M(V)$ is, in many regards, quite different from the topology which $\triangle_I$ inherits from Euclidean space. For further properties of the space $M(V)$, we refer to the excellent exposition given by Aliprantis and Border [1, Chapter 15].

Suppose that $P$ is a probability measure on $M(V)$. Denote by $\phi_I P$ the image measure of $P$ under $\phi_I$, i.e.
the measure on $\triangle_I$ defined by $(\phi_I P)(A_I) := P(\phi_I^{-1} A_I)$ for all $A_I \in \mathcal{B}(\triangle_I)$. We refer to $\phi_I P$ as the marginal of $P$ on $\triangle_I$. Similarly, if $P_J$ is a measure on $\triangle_J$, then for any $I \preceq J$, the image measure $f_{JI} P_J$ is called the marginal of $P_J$ on $\triangle_I$. The following theorem, our main result, states that a measure $P$ on $M(V)$ can be constructed from a suitable family of marginals $P_I$ on the simplices $\triangle_I$. The notation $E_Q[\,\cdot\,]$ refers to expectation with respect to the law $Q$.

Theorem 1.1. Let $V$ be a Polish space with Borel sets $\mathcal{B}_V$. Let $M(V)$ be the set of probability measures on $(V, \mathcal{B}_V)$, and $\mathcal{B}_{w^*}$ the Borel $\sigma$-algebra generated by the weak$^*$ topology on $M(V)$. Let $\langle P_I \rangle_{\mathcal{H}(\mathcal{B}_V)} := \{ P_I \mid I \in \mathcal{H}(\mathcal{B}_V) \}$ be a family of probability measures on the finite-dimensional simplices $\triangle_I$. The following statements are equivalent:

(1) The family $\langle P_I \rangle_{\mathcal{H}(\mathcal{B}_V)}$ is projective,

$$P_I = f_{JI} P_J \quad \text{whenever } I \preceq J, \tag{1.3}$$

and satisfies

$$E_{P_I}[X_I] = \phi_I G_0 \quad \text{for all } I \in \mathcal{H}(\mathcal{B}_V). \tag{1.4}$$

(2) There exists a unique probability measure $P$ on $(M(V), \mathcal{B}_{w^*})$ satisfying

$$P_I = \phi_I P \quad \text{for all } I \in \mathcal{H}(\mathcal{B}_V) \tag{1.5}$$

and

$$E_P[X] = G_0 \quad \text{for some } G_0 \in M(V). \tag{1.6}$$

If either statement holds, $P$ is a Radon measure.

Remark 1.2. Theorem 1.1 is applicable to the construction of any random probability measure $X$ on $V$ whose first moment $E_P[X]$ exists. In particular, the random measure $X$ need not be discrete. See Sec. 1.2 for examples.

The two conditions of Theorem 1.1 serve two separate purposes: Condition (1.3) guarantees that the family $\langle P_I \rangle_{\mathcal{H}(\mathcal{B}_V)}$ defines a unique probability measure $P_{\mathcal{H}(\mathcal{B}_V)}$.
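As a small sanity check of conditions (1.3) and (1.4), the following sketch uses Dirichlet marginals, the example mentioned in the introduction. It assumes the standard parametrization in which a Dirichlet distribution with concentration $\alpha$ and expectation $\phi_I G_0$ has parameter vector $\alpha \cdot \phi_I G_0$. The function names (`dirichlet_mean`, `dirichlet_cov`, `merge`, `merge_cov`) and the numbers are hypothetical and chosen for illustration only. The check compares the first two moments of $f_{JI} P_J$ and $P_I$ exactly; matching moments are necessary for (1.3) but do not by themselves prove it.

```python
from fractions import Fraction


def dirichlet_mean(a):
    """Mean vector of a Dirichlet distribution with parameter vector a."""
    a0 = sum(a)
    return [ai / a0 for ai in a]


def dirichlet_cov(a):
    """Covariance matrix of a Dirichlet distribution with parameter vector a."""
    a0 = sum(a)
    scale = a0 ** 2 * (a0 + 1)
    n = len(a)
    return [[(a[i] * (a0 - a[i]) if i == j else -a[i] * a[j]) / scale
             for j in range(n)] for i in range(n)]


def merge(blocks, v):
    """f_JI on vectors, as in (1.2): add the coordinates of v within each index block."""
    return [sum(v[j] for j in block) for block in blocks]


def merge_cov(blocks, c):
    """Covariance of the merged vector f_JI X_J, given the covariance c of X_J."""
    return [[sum(c[i][j] for i in bi for j in bj) for bj in blocks]
            for bi in blocks]


# Illustrative values (assumptions, not taken from the paper).
alpha = Fraction(5, 2)
g0_J = [Fraction(1, 6), Fraction(1, 6), Fraction(1, 3), Fraction(1, 3)]  # phi_J G_0
blocks = [[0, 1], [2, 3]]      # index sets defining the coarsening I of J
g0_I = merge(blocks, g0_J)     # phi_I G_0 = f_JI (phi_J G_0)

a_J = [alpha * g for g in g0_J]   # Dirichlet parameters on the finer simplex
a_I = [alpha * g for g in g0_I]   # Dirichlet parameters on the coarser simplex

# First two moments of the pushforward f_JI P_J agree with those of P_I.
mean_matches = merge(blocks, dirichlet_mean(a_J)) == dirichlet_mean(a_I)
cov_matches = merge_cov(blocks, dirichlet_cov(a_J)) == dirichlet_cov(a_I)
# Condition (1.4): the mean of P_I is phi_I G_0.
mean_is_g0 = dirichlet_mean(a_I) == g0_I
```

Exact rational arithmetic is used so that the comparisons are equalities rather than tolerance checks. A family of marginals that fails even this moment comparison cannot be projective.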
The support of $P_{\mathcal{H}(\mathcal{B}_V)}$ is not actually $M(V)$, but a larger set: specifically, the set $C(\mathcal{Q})$ of finitely additive probability measures (charges) defined on a certain subsystem $\mathcal{Q} \subset \mathcal{B}_V$, which we will make precise in Sec. 3. The set $C(\mathcal{Q})$ contains the set $M(\mathcal{Q})$ of $\sigma$-additive probability measures on $\mathcal{Q}$ as a measurable subset, and $M(\mathcal{Q})$ is in turn isomorphic to $M(V)$, by Carathéodory's extension theorem [16, Theorem 2.5]. To obtain the distribution of a random measure, we need to ensure that $P_{\mathcal{H}(\mathcal{B}_V)}$ concentrates on the subset $M(\mathcal{Q}) \cong M(V)$, or in other words, that draws from $P_{\mathcal{H}(\mathcal{B}_V)}$ are $\sigma$-additive almost surely. Condition (1.4) is sufficient (and in fact necessary) for $P_{\mathcal{H}(\mathcal{B}_V)}$ to concentrate on $M(V)$, and therefore for a random variable $X_{\mathcal{H}(\mathcal{B}_V)}$ with distribution $P_{\mathcal{H}(\mathcal{B}_V)}$ to constitute a random measure. If (1.4) is satisfied, the measure constructed on $C(\mathcal{Q})$ can be restricted to a measure on $M(V)$, resulting in the measure $P$ described by Theorem 1.1. Sec. 3 provides more details.

The technical restriction that $V$ be Polish is a mild one for all practical purposes, a fact best illustrated by some concrete examples of Polish spaces: The real line is Polish, and so are $\mathbb{R}^n$ and $\mathbb{C}^n$; any finite space; all separable Banach spaces (since Banach spaces are complete metric spaces), in particular $L_2$ and any other separable Hilbert space; the space $M(V)$ of probability measures over a Polish domain $V$, in the weak$^*$ topology [1, Chapter 15]; the spaces $C([0,1], \mathbb{R})$ and $C(\mathbb{R}_+, \mathbb{R})$ of continuous functions, in the topology of compact convergence [2, §38]; and the Skorohod space $D(\mathbb{R}_+, \mathbb{R})$ of càdlàg functions [24, Chapter VI]. Any countable product of Polish spaces is Polish, in particular $\mathbb{R}^{\mathbb{N}}$, $\mathbb{C}^{\mathbb{N}}$, and the Hilbert cube $[0,1]^{\mathbb{N}}$.
A subset of a given Polish space is Polish in the relative topology if and only if it is a $G_\delta$ set [17, Theorem 3.11]. Borderline examples are the spaces $C(T, E)$ of continuous functions with Polish range $E$. Such a space is Polish if $T = \mathbb{R}_{\ge 0}$ or if $T$ is compact and Polish, but not e.g. for $T = \mathbb{R}$ [10, §454]. In Bayesian nonparametrics, this distinction may be relevant in the context of the "dependent Dirichlet process" model of MacEachern [21], which involves Dirichlet processes on spaces of continuous functions. For more background on Polish spaces, see [1, 10, 17].

1.2. Examples

Theorem 1.1 yields straightforward constructions for several models studied in the literature, and we consider three specific examples to illustrate the result. First, by choosing the finite-dimensional marginals $P_I$ in Theorem 1.1 as a suitable family of Dirichlet distributions, we obtain a construction of the Dirichlet process in the spirit of Ferguson [8].

Corollary 1.3 (Dirichlet Process). Let $V$ be a Polish space, $G_0$ a probability measure on $\mathcal{B}_V$, and let $\alpha \in \mathbb{R}_{>0}$. For each $I \in \mathcal{H}(\mathcal{B}_V)$, define $P_I$ as the Dirichlet distribution on $\triangle_I \subset \mathbb{R}^I$, with concentration $\alpha$ and expectation $\phi_I G_0 \in \triangle_I$. Then there is a uniquely determined probability measure $P$ on $M(V)$ with expectation $G_0$ and the distributions $P_I$ as its marginals, that is, $\phi_I P = P_I$ for all $I \in \mathcal{H}(\mathcal{B}_V)$.

A similar construction yields the normalized inverse Gaussian process of Lijoi et al. [20]. The inverse Gaussian distribution on $\mathbb{R}_{\ge 0}$ is given by the density

$$p_{IG}(z \mid \alpha, \gamma) = \frac{\alpha}{\sqrt{2\pi}}\, z^{-3/2} \exp\Big( -\frac{1}{2}\Big( \frac{\alpha^2}{z} + \gamma^2 z \Big) + \gamma\alpha \Big)$$

with respect to Lebesgue measure. Lijoi et al. [20] define a normalized inverse Gaussian distribution $\mathrm{NIG}(\alpha_1, \ldots, \alpha_n)$ on the simplex $\triangle_n \subset \mathbb{R}^n$ as the distribution of the vector $w = \big( \frac{z_1}{\sum_i z_i}, \ldots, \frac{z_n}{\sum_i z_i} \big)$, where $z_i$ is distributed according to $p_{IG}(z_i \mid \alpha_i, \gamma = 1)$. The density of $w$ can be derived explicitly [20, Equation (4)]. Applicability of Theorem 1.1 is a direct consequence of the results of Lijoi et al. [20], which imply conditions (1.3) [20, (C3)] and (1.4) [20, Proposition 2].

Corollary 1.4 (Normalized Inverse Gaussian Process). Let $\alpha \in \mathbb{R}_+$ and $G_0 \in M(V)$. For any partition $I = (A_1, \ldots, A_n)$ in $\mathcal{H}(\mathcal{B}_V)$, choose the measure $P_I$ as the normalized inverse Gaussian distribution $\mathrm{NIG}(\alpha G_0(A_1), \ldots, \alpha G_0(A_n))$. There is a uniquely determined probability measure $P$ on $M(V)$ with expectation $G_0$ and $\phi_I P = P_I$ for all $I \in \mathcal{H}(\mathcal{B}_V)$.

Although both the Dirichlet process and the normalized inverse Gaussian process are discrete almost surely, Theorem 1.1 is applicable to the construction of continuous random measures. The Pólya tree random measures introduced by Ferguson [9] provide a convenient example. They can be obtained as projective limits as follows: Choose $V = \mathbb{R}$ and let $G_0 \in M(\mathbb{R})$ be a probability measure with cumulative distribution function $g_0$. For each $n$, let $I_n$ be the partition of $\mathbb{R}$ into intervals $\big[ g_0^{-1}\big(\frac{k-1}{2^n}\big), g_0^{-1}\big(\frac{k}{2^n}\big) \big)$, where $k = 1, \ldots, 2^n$. All sets in $I_n$ have identical probability $1/2^n$ under $G_0$. Since each partition $I_n$ is obtained from $I_{n-1}$ by splitting each set in $I_{n-1}$ at a single point, the sequence $(I_n)$ satisfies $I_1 \preceq I_2 \preceq \ldots$. It can be represented as a binary tree whose $n$th level corresponds to $I_n$, each node representing one constituent set. There are two natural ways of indexing sets in the partitions: One is to write $A_{n,k}$ for the $k$th set in $I_n$, i.e. $n$ indexes tree levels and $k$ enumerates sets within each level.
The other is to index sets as $A_{m_1, \ldots, m_n}$ by a binary sequence encoding the unique path from the root node $\mathbb{R}$ to the set in question, where $m_i = 1$ indicates passing to a right child node. Let $[m]_2$ denote the binary representation of an arbitrary positive integer $m$. Then

$$A_{n,k} = \Big[ g_0^{-1}\Big(\frac{k-1}{2^n}\Big), g_0^{-1}\Big(\frac{k}{2^n}\Big) \Big) = A_{[2^n + (k-1)]_2} \quad \text{and} \quad I_n = (A_{n,1}, \ldots, A_{n,2^n}).$$

It is useful to use both index conventions interchangeably. With each node $A_{m_1 \cdots m_n}$, we associate a pair $(Y_{m_1 \cdots m_n 0}, Y_{m_1 \cdots m_n 1}) \sim \mathrm{Beta}(\alpha_{m_1 \cdots m_n 0}, \alpha_{m_1 \cdots m_n 1})$ of beta random variables:

[Tree diagram: the root is $A_{0,1} = A_1 = \mathbb{R}$. Its children are $A_{1,1} = A_{10}$ and $A_{1,2} = A_{11}$, reached along edges labeled $Y_{10}$ and $Y_{11}$. The children of $A_{10}$ are $A_{2,1} = A_{100}$ and $A_{2,2} = A_{101}$, reached along edges $Y_{100}$ and $Y_{101}$; the children of $A_{11}$ are reached along edges $Y_{110}$ and $Y_{111}$. The tree continues in the same manner.]

To apply Theorem 1.1, define probability measures $P_{I_n}$ on the simplices $\triangle_{I_n}$ as follows: Suppose a particle slides down the tree, moving along each edge with the associated probability $Y_{m_1 \cdots m_n}$. The probability of reaching the set $A_{n,k}$ is a random variable $X_{n,k}$, defined recursively in terms of the beta variables as $X_{m_1 \cdots m_n m_{n+1}} := X_{m_1 \cdots m_n} Y_{m_1 \cdots m_n m_{n+1}}$. Choose $P_{I_n}$ as the distribution of $X_{I_n} = (X_{n,1}, \ldots, X_{n,2^n})$. Applicability of Theorem 1.1 follows from two results of Ferguson [9]: (a) The partitions $I_n$ generate the Borel sets $\mathcal{B}(\mathbb{R})$ and (b) each random measure $X_{I_n} \in \triangle_{I_n}$ has expectation $E[X_{I_n}] = (G_0(A_{n,1}), \ldots, G_0(A_{n,2^n}))$. Property (a) implies that the sequence $P_{I_n}$ induces a complete family $\langle P_I \rangle$ of probability measures on all simplices $\triangle_I$, $I \in \mathcal{H}(\mathcal{B}(\mathbb{R}))$. By construction, $\langle P_I \rangle$ satisfies (1.3). According to (b), (1.4) holds. Theorem 1.1 and the well-known continuity properties of Pólya trees [19, Theorem 3] yield:

Corollary 1.5 (Pólya tree). Let $\langle P_I \rangle$ be a family of measures defined as above.
There is a unique probability measure $P$ on $M(\mathbb{R})$ satisfying $\phi_I P = P_I$. The distribution $P$ is a Pólya tree in the sense of Ferguson [9], with parameters $G_0$ and $(\alpha_{[n]_2})_{n \in \mathbb{N}}$. The random probability measure $X$ on $\mathbb{R}$ with distribution $P$ has expected measure $E_P[X] = G_0$. If $\alpha_{n,k} = c n^2$ for some $c > 0$, then $X$ is absolutely continuous with respect to Lebesgue measure on $\mathbb{R}$ almost surely.

1.3. Related work

Theorem 1.1 was effectively conjectured by Ferguson [8]. Although he only considered the special case of the Dirichlet process, and despite the technical difficulties already mentioned, he recognized both the usefulness of indexing spaces by measurable partitions (a key ingredient of the construction in Sec. 3), and the connection between $\sigma$-additivity of random draws from the Dirichlet process and $\sigma$-additivity of its parameter measure [cf. 8, Proposition 2]. Authors who have recognized problems to the effect that such a construction is not feasible on an arbitrary measurable space $V$ include Ghosh and Ramamoorthi [13] and Ghosal [12]; both references also provide excellent surveys of the different construction approaches available for the Dirichlet process. Ghosal [12] additionally points out, in the context of problem (ii), that a countable generator may be substituted for $\mathcal{B}_V$, provided the underlying space is separable and metrizable. To resolve the $\sigma$-additivity problem (iii), we appeal to a result of Harris [14], which reduces the conditions for $\sigma$-additivity of random set functions to their behavior on a countable number of sequences. This result is well-known in the theory of point processes and random measures [7, 15].
Although Sethuraman was aware of Harris' work and referenced it in his well-known article [25], it has to our knowledge never been followed up on in the nonparametric Bayesian literature.

For the specific problem of defining the Dirichlet process, it is possible to forego the projective limit construction altogether and invoke approaches specifically tailored to the properties of the Dirichlet [12, 13, 27]. On the real line, both the Dirichlet process and the closely related Poisson-Dirichlet distribution of Kingman [18] arise in a variety of contexts throughout mathematics, each of which can be regarded as a possible means of definition [e.g. 23, 26]. On arbitrary Polish spaces, the Dirichlet process can be derived implicitly as the de Finetti mixing measure of an urn scheme [3], or as a special case of a Pólya tree [9].

Sethuraman's stick-breaking scheme [25] is remarkable not only for its simplicity. In contrast to all other constructions listed above, it does not require $V$ to be Polish, but is applicable on an arbitrary measurable space with measurable singletons. The stick-breaking and projective limit representations of the Dirichlet process trade off two different types of generality: Stick-breaking imposes fewer restrictions on the choice of $V$, but is not applicable to represent other types of distributions on $M(V)$. The projective limit approach requires more structure on $V$, but can represent any probability measure on $M(V)$. The trade-off is reminiscent of similar phenomena encountered throughout stochastic process theory. For example, probability measures on infinite-dimensional product spaces can be constructed by means of Kolmogorov's extension theorem. If the measure to be constructed is factorial over the product, the component spaces of the product may be chosen as arbitrary measurable spaces [2, Theorem 9.2].
To model stochastic dependence across different subspaces, however, a minimum of topological structure is indispensable, and Kolmogorov's theorem hence requires the component spaces to be Polish [16, Theorem 6.16]. The Dirichlet process, as a purely atomic random measure whose different atoms are stochastically dependent only through the global normalization constraint, can be regarded as the closest analogue of a factorial measure on the space $M(V)$. In analogy to a factorial measure, it can be constructed on very general spaces, whereas the projective limit approach, which can represent arbitrary correlation structure, requires stronger topological properties.

2. Background: Projective limits

A projective limit is constructed from a family of mathematical structures, indexed by the elements of an index set $D$ [5, 22]. For our purposes, the structures in question will be topological measurable spaces $(\mathcal{X}_I, \mathcal{B}_I)$, with $I \in D$. The projective limit defined by this family is again a measurable space, denoted $(\mathcal{X}_D, \mathcal{B}_D)$. This projective limit space is the smallest space containing all spaces $(\mathcal{X}_I, \mathcal{B}_I)$ as its substructures, in a sense to be made precise shortly. To obtain a meaningful notion of a limit, the index set $D$ need not be totally ordered, but it must be possible to form infinite sequences of suitably chosen elements. The set is therefore required to be directed: There is a partial order relation $\preceq$ on $D$ and, whenever $I, J \in D$, there exists $K \in D$ such that $I \preceq K$ and $J \preceq K$. A simple example of a directed set is the set $D := \mathcal{F}(L)$ of all finite subsets of an infinite set $L$, where $D$ is partially ordered by inclusion. The component spaces $\mathcal{X}_I$ used to define the projective limit need to "fit in" with each other in a suitable manner.
This idea is formalized by defining a family of mappings $f_{JI}$ between the spaces which are regular with respect to the structure posited on the point sets $\mathcal{X}_I$. For measurable spaces, the adequate notion of regularity is measurability. Since we assume each $\sigma$-algebra $\mathcal{B}_I$ to be generated by an underlying topology $\mathcal{T}_I$, we slightly strengthen this requirement to continuity.

Definition 2.1 (Projective limit set). Let $D$ be a directed set and $(\mathcal{X}_I, \mathcal{T}_I)$, with $I \in D$, a family of topological spaces. For any pair $I \preceq J \in D$, let $f_{JI} : \mathcal{X}_J \to \mathcal{X}_I$ be a function such that

1. $f_{JI}$ is $\mathcal{T}_J$-$\mathcal{T}_I$-continuous.
2. $f_{II} = \mathrm{Id}_{\mathcal{X}_I}$.
3. $f_{JI} \circ f_{KJ} = f_{KI}$ whenever $I \preceq J \preceq K$.

The functions $f_{JI}$ are called generalized projections. The family $\{ \mathcal{X}_I, \mathcal{T}_I, f_{JI} \mid I \preceq J \in D \}$, which we denote $\langle \mathcal{X}_I, \mathcal{T}_I, f_{JI} \rangle_D$, is called a projective system of topological spaces. Define a set $\mathcal{X}_D$ as follows: For each collection $\{ x_I \in \mathcal{X}_I \mid I \in D \}$ of points satisfying

$$x_I = f_{JI} x_J \quad \text{whenever } I \preceq J, \tag{2.1}$$

identify the set $\{ x_I \in \mathcal{X}_I \mid I \in D \}$ with a point $x_D$, and let $\mathcal{X}_D$ be the collection of all such points. The set $\mathcal{X}_D$ is called the projective limit set of $\langle \mathcal{X}_I, f_{JI} \rangle_D$.

Denote the Borel $\sigma$-algebras on the topological spaces $\mathcal{X}_I$ by $\mathcal{B}_I := \sigma(\mathcal{T}_I)$. For each $I \in D$, the map defined as $f_I : x_D \mapsto x_I$ is a well-defined function $f_I : \mathcal{X}_D \to \mathcal{X}_I$. These functions are called canonical mappings. They define a topology $\mathcal{T}_D$ and a $\sigma$-algebra on the projective limit space $\mathcal{X}_D$, as the smallest topology (resp. $\sigma$-algebra) which makes all canonical mappings $f_I$ continuous (resp. measurable). In particular,

$$\mathcal{B}_D := \sigma(f_I \mid I \in D) = \sigma\big( \cup_{I \in D} f_I^{-1} \mathcal{B}_I \big) = \sigma(\mathcal{T}_D). \tag{2.2}$$

In analogy to the set $\mathcal{X}_D$, the topological space $(\mathcal{X}_D, \mathcal{T}_D)$ is called the projective limit of $\langle \mathcal{X}_I, \mathcal{T}_I, f_{JI} \rangle_D$, and the measurable space $(\mathcal{X}_D, \mathcal{B}_D)$ the projective limit of $\langle \mathcal{X}_I, \mathcal{B}_I, f_{JI} \rangle_D$.

A measure $P_D$ on the projective limit $(\mathcal{X}_D, \mathcal{B}_D)$ can be constructed by defining a measure $P_I$ on each space $(\mathcal{X}_I, \mathcal{B}_I)$. By simultaneously applying the projective limit to the projective system $\langle \mathcal{X}_I, \mathcal{B}_I, f_{JI} \rangle_D$ and to the measures $P_I$, the family $\langle P_I \rangle_D$ is assembled into the measure $P_D$. The only requirement is that the measures $P_I$ satisfy a condition analogous to the one imposed on points by (2.1). More precisely, $P_I$ has to coincide with the image measure of $P_J$ under $f_{JI}$,

$$P_I = f_{JI} P_J = P_J \circ f_{JI}^{-1} \quad \text{whenever } I \preceq J. \tag{2.3}$$

A family of measures $\langle P_I \rangle_D$ satisfying (2.3) is called a projective family. The existence and uniqueness of $P_D$ on $(\mathcal{X}_D, \mathcal{B}_D)$ are guaranteed by the following result [6, IX.4.3, Theorem 2].

Theorem 2.2 (Bochner). Let $\langle \mathcal{X}_I, \mathcal{B}_I, f_{JI} \rangle_D$ be a projective system of measurable spaces with countable, directed index set $D$, and $\langle P_I \rangle_D$ a projective family of probability measures on these spaces. Then there exists a uniquely defined measure $P_D$ on the projective limit space $(\mathcal{X}_D, \mathcal{B}_D)$ such that

$$P_I = f_I P_D = P_D \circ f_I^{-1} \quad \text{for all } I \in D. \tag{2.4}$$

We refer to the measures in the family $\langle P_I \rangle_D$ as the marginals of the stochastic process $P_D$. Since the marginals completely determine $P_D$, some authors refer to $\langle P_I \rangle_D$ as the weak distribution of the process, or as a promeasure [6]. Theorem 2.2 was introduced by Bochner [4, Theorem 5.1.1], for a possibly uncountable index set $D$. The uncountable case requires an additional condition known as sequential maximality, which ensures the projective limit space is non-empty.
For our purposes, however, countability of the index set is essential: Measurability problems (problem (ii) in Sec. 1) arise whenever $D$ is uncountable, and are not resolved by sequential maximality.

The most common example of a projective limit theorem in probability theory is Kolmogorov's extension theorem [16, Theorem 6.16], which can be regarded as the special case of Bochner's theorem obtained for product spaces: Let $D$ be the set of all finite subsets of an infinite set $L$, partially ordered by inclusion. Choose any Polish measurable space $(\mathcal{X}_0, \mathcal{B}_0)$, and set $\mathcal{X}_I := \prod_{i \in I} \mathcal{X}_0$. The resulting projective limit space is the infinite product $\mathcal{X}_D = \prod_{i \in L} \mathcal{X}_0$, and $\mathcal{B}_D$ coincides with the Borel $\sigma$-algebra generated by the product topology. For product spaces, the sequential maximality condition mentioned above holds automatically, so $L$ may be either countable or uncountable. Once again, though, the measurability problem (ii) arises unless $L$ is countable. The product space form of the theorem is typically used in the construction of Gaussian process distributions on random functions [2]. For random measures, a more adequate projective system is constructed in the following section.

3. Projective limits of probability simplices

This section constitutes the first part of the proof of Theorem 1.1: The construction of a projective limit space $\mathcal{X}_D$ from simplices $\triangle_I$, and the analysis of its properties. The space $\mathcal{X}_D$ turns out to consist of set functions which are not necessarily $\sigma$-additive, and the remaining part of the proof in Sec. 4 will be the derivation of a criterion for $\sigma$-additivity. The distinction between finitely additive and $\sigma$-additive set functions will be crucial to the ensuing discussion.
We consider two types of set systems $\mathcal{Q}$ on the space $V$: Algebras, which contain both $\emptyset$ and $V$, and are closed under complements and finite unions, and $\sigma$-algebras, which are algebras and additionally closed under countable unions. A non-negative set function $\mu$ on either an algebra or $\sigma$-algebra $\mathcal{Q}$ is called a charge if it satisfies $\mu(\emptyset) = 0$ and is finitely additive. If a charge is normalized, i.e. if $\mu(V) = 1$, it is called a probability charge. A charge is a measure if and only if it is $\sigma$-additive. If $\mathcal{Q}$ is an algebra, and not closed under countable unions, the definition of $\sigma$-additivity only requires $\mu$ to be additive along those countable sequences of pairwise disjoint sets $A_n \in \mathcal{Q}$ whose union is in $\mathcal{Q}$.

3.1. Definition of the projective system

For the choice of components in a projective system, it can be helpful to regard the elements $x_D$ of the projective limit space $X_D$ as mappings, from a domain defined by the index set $D$ to a range defined by the spaces $X_I$. The simplest example is once again the product space $X_D = X_0^L$ in Kolmogorov's theorem, for which each $x_D \in X_D$ can be interpreted as a function $x_D : L \to X_0$. Probability measures on $(V, \mathcal{B}_V)$ are in particular set functions $\mathcal{B}_V \to [0,1]$, so it is natural to construct $D$ from the sets in $\mathcal{B}_V$. It is not necessary to include all measurable sets: If $\mathcal{Q}$ is an algebra that generates $\mathcal{B}_V$, any probability measure on $\mathcal{Q}$ has, by Carathéodory's theorem [16, Theorem 2.5], a unique extension to a probability measure on $\mathcal{B}_V$. In other words, the space $M(\mathcal{Q})$ of probability measures on $\mathcal{Q}$ is isomorphic to $M(V)$, and $\mathcal{Q}$ can be substituted for $\mathcal{B}_V$ in the projective limit construction. Desiderata for the projective limit are:
(1) The projective limit space $X_D$ should contain all measures on $\mathcal{Q}$ (and hence on $\mathcal{B}_V$).
(2) $\mathcal{Q}$ should be countable, to address the measurability problem (ii) in Sec. 1.
(3) The marginal spaces $X_I$ should consist of the finite-dimensional analogues of measures on $\mathcal{Q}$, and hence of measures on finite subsets of events in $\mathcal{Q}$.
(4) The definition of the system should facilitate a proof of $\sigma$-additivity.
In this section, we will recapitulate the projective limit specified in Sec. 1.1 and show it indeed satisfies (1)-(3); that (4) is satisfied as well will be shown in Sec. 4.

Choice of $\mathcal{Q}$. We start with the prototypical choice of basis for any Polish topology: Let $W \subset V$ be a countable, dense subset of $V$. Fix a metric $d : V \times V \to \mathbb{R}_+$ which generates the topology $\mathcal{T}_V$, and denote by $B(v, r)$ the open $d$-ball of radius $r$ around $v$. Denote the system of open balls with rational radii and centers in $W$ by
\[ \mathcal{U} := \{ B(v, r) \mid v \in W,\ r \in \mathbb{Q}_+ \} \cup \{ \emptyset \} . \tag{3.1} \]
Since $V$ is separable and metrizable, $\mathcal{U}$ forms a countable basis of the topology $\mathcal{T}_V$ [1, Lemma 3.4]. Let $\mathcal{Q}(\mathcal{U})$ be the algebra generated by $\mathcal{U}$. Then $\mathcal{U} \subset \mathcal{Q}(\mathcal{U}) \subset \mathcal{B}_V$. In particular, $\mathcal{Q}(\mathcal{U})$ is a countable generator of $\mathcal{B}_V$.

Index set. As the index set $D$, we do not choose $\mathcal{Q}(\mathcal{U})$ itself, but rather the set of all finite partitions of $V$ consisting of disjoint sets $A_i \in \mathcal{Q}(\mathcal{U})$,
\[ D := \mathcal{H}(\mathcal{Q}) = \bigl\{ (A_1, \dots, A_n) \bigm| n \in \mathbb{N},\ A_i \in \mathcal{Q}(\mathcal{U}),\ \dot{\cup}_i A_i = V \bigr\} . \tag{3.2} \]
Each element $I \in D$ is a finite partition, and the set of probability measures on the events in this partition is precisely the simplex $\triangle_I$. To define a partial order on $D$, let $I = (A_1, \dots, A_m)$ and $J = (B_1, \dots, B_n)$ be any two partitions in $D$, and denote their intersection (common refinement) by $I \cap J := (A_i \cap B_j)_{i,j}$. Since $\mathcal{Q}(\mathcal{U})$ forms an algebra, $I \cap J$ is again an element of $D$. Now define a partial order relation $\preceq$ as
\[ I \preceq J \ :\Leftrightarrow\ I \cap J = J , \tag{3.3} \]
that is, $I \preceq J$ iff $J$ is a refinement of $I$.
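The common refinement and the order (3.3) can be tried out on a toy example. The following is a minimal sketch, assuming a six-point ground set in place of the Polish space $V$ and arbitrary subsets in place of the algebra $\mathcal{Q}(\mathcal{U})$; the names `make_partition`, `common_refinement` and `precedes` are hypothetical helpers, not notation from the paper, and empty intersections are dropped for convenience.

```python
from itertools import product

# Assumption: a six-point ground set stands in for V. In the paper, V is a
# Polish space and the blocks are taken from the countable algebra Q(U).
V = frozenset(range(6))

def make_partition(*blocks):
    """Build a partition as a set of nonempty, disjoint blocks covering V."""
    part = frozenset(frozenset(b) for b in blocks)
    assert all(part), "blocks must be nonempty"
    assert sum(len(b) for b in part) == len(V), "blocks must be disjoint"
    assert frozenset().union(*part) == V, "blocks must cover V"
    return part

def common_refinement(I, J):
    """I ∩ J := (A_i ∩ B_j)_{i,j}, with empty intersections dropped."""
    return frozenset(a & b for a, b in product(I, J) if a & b)

def precedes(I, J):
    """Order (3.3): I ⪯ J iff I ∩ J = J, i.e. J is a refinement of I."""
    return common_refinement(I, J) == J

I = make_partition({0, 1, 2}, {3, 4, 5})
J = make_partition({0, 1}, {2, 3}, {4, 5})
K = common_refinement(I, J)

print(sorted(sorted(b) for b in K))                      # [[0, 1], [2], [3], [4, 5]]
print(precedes(I, K), precedes(J, K), precedes(I, J))    # True True False
```

In this example neither of $I$ and $J$ refines the other, yet both precede their common refinement $K$.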
The set $(D, \preceq)$ is a valid index set for a projective limit system, because it is directed: $K := I \cap J$ always satisfies $I \preceq K$ and $J \preceq K$.

Projection functions. What remains to be done is to specify the functions $f_{JI}$. Consider a partition $J = (B_1, \dots, B_n)$ in $D$, and any $x_J \in \triangle_J$. Each entry $x_J(B_j)$ assigns a number (a probability) to the event $B_j$, and we define $f_{JI}$ accordingly to preserve this property. To this end, let $I = (A_1, \dots, A_m)$ be a coarsening of $J$ (that is, $I \preceq J$). For each $A_i$, let $\mathcal{J}_i \subset \{1, \dots, n\}$ be the subset of indices for which $A_i = \cup_{j \in \mathcal{J}_i} B_j$. Then define $f_{JI}$ as
\[ (f_{JI} x_J)(A_i) := \sum_{j \in \mathcal{J}_i} x_J(B_j) . \tag{3.4} \]
We choose $X_I := \triangle_I$ as defined in (1.1), and endow $\triangle_I$ with the relative topology $\mathcal{T}_I := \mathcal{T}(\mathbb{R}^I) \cap \triangle_I$ and the corresponding Borel sets $\mathcal{B}_I := \mathcal{B}(\mathcal{T}_I) = \mathcal{B}(\mathbb{R}^I) \cap \triangle_I$. The relative topology makes additions on $\triangle_I$, and hence the mappings $f_{JI}$, continuous. Each $f_{II}$ is the identity on $\triangle_I$, and $f_{KI} = f_{JI} \circ f_{KJ}$ whenever $I \preceq J \preceq K$. For any pair $I \preceq J \in D$, $\triangle_I = f_{JI} \triangle_J$ and conversely, $\triangle_J = f_{JI}^{-1} \triangle_I$. Therefore, $\langle \triangle_I, \mathcal{B}_I, f_{JI} \rangle_D$ is a projective system.

3.2. Structure of the projective limit space

Let $(X_D, \mathcal{B}_D)$ be the projective limit of $\langle \triangle_I, \mathcal{B}_I, f_{JI} \rangle_D$. We observe immediately that $X_D$ contains $M(\mathcal{Q})$: If $x$ is a probability measure on $\mathcal{Q}(\mathcal{U})$, let $x_I := f_I x$ for each partition $I \in D$. The collection $\{ x_I \mid I \in D \}$ satisfies (2.1), and hence constitutes a point in $X_D$. The following result provides more details about the constructed measurable space $(X_D, \mathcal{B}_D)$, which turns out to be the space $C(\mathcal{Q})$ of all probability charges defined on $\mathcal{Q}(\mathcal{U})$. By $\mathcal{B}_{w^*}$, we again denote the Borel $\sigma$-algebra on $M(V)$ generated by the weak$^*$ topology.

Proposition 3.1.
Let $V$ be a Polish space, and $(X_D, \mathcal{B}_D)$ the projective limit of the projective system $\langle \triangle_I, \mathcal{B}_I, f_{JI} \rangle_D$ defined in Sec. 3.1. Denote by $\psi : M(V) \to M(\mathcal{Q})$ the restriction mapping which takes each measure $x$ on $\mathcal{B}_V$ to its restriction $x_D = x|_{\mathcal{Q}}$ on $\mathcal{Q} \subset \mathcal{B}_V$. Then the following hold:
(i) $X_D = C(\mathcal{Q})$, the space of probability charges on $\mathcal{Q}(\mathcal{U})$.
(ii) $M(\mathcal{Q})$ is a measurable subset of $C(\mathcal{Q})$.
(iii) $\psi$ is a Borel isomorphism of $(M(V), \mathcal{B}_{w^*})$ and $(M(\mathcal{Q}), \mathcal{B}_D \cap M(\mathcal{Q}))$.

Part (ii) implies that a projective limit measure $P_D$ constructed on $C(\mathcal{Q})$ by means of Theorem 2.2 can be restricted to a measure on $M(\mathcal{Q})$ without further complications, in particular without appealing to outer measures. According to (iii), there is a measure $P$ on $M(V)$ which can be regarded as equivalent to $P_D$, namely the image measure $P := \psi^{-1} P_D$ under the inverse of the restriction map $\psi$. This is of course the measure $P$ described in Theorem 1.1, though some details still remain to be established later on. Since $\psi$ is a Borel isomorphism, $P$ constitutes a measure with respect to the "natural" topology on $M(V)$.

Proof. Part (i). Let $x_D \in X_D$. The trivial partition $I_0 := (V)$ is in $D$, which implies $x_D(V) = f_{I_0} x_D = 1$ and $x_D(\emptyset) = 0$. To show finite additivity, let $A_1, A_2 \in \mathcal{Q}(\mathcal{U})$ be disjoint sets and choose a partition $J \in D$ such that $A_1, A_2 \in J$. Let $I \preceq J$ be the coarsening of $J$ obtained by joining the two sets. As the elements of each space $\triangle_I$ are finitely additive,
\[ x_D(A_1) + x_D(A_2) = (f_J x_D)(A_1) + (f_J x_D)(A_2) \overset{(3.4)}{=} (f_I x_D)(A_1 \cup A_2) = x_D(A_1 \cup A_2) . \]
Hence, $x_D$ is a charge. Conversely, assume that $x_D$ is a probability charge on $\mathcal{Q}(\mathcal{U})$. The evaluation $f_I x_D$ of $x_D$ on a partition $I \in D$ defines a probability measure on the finite $\sigma$-algebra $\sigma(I)$, and thus $f_I x_D \in \triangle_I$.
Since additionally $f_{JI}(f_J x_D) = f_I x_D$, the set $\langle f_I x_D \rangle_D$ forms a collection of points $f_I x_D \in \triangle_I$ satisfying (2.1), and hence $x_D \in X_D$.

Part (ii). Regard the restriction map $\psi$ as a mapping into $C(\mathcal{Q})$, with image $M(\mathcal{Q})$. By Carathéodory's extension theorem, $\psi$ is injective [16, Theorem 2.5]. If an injective mapping between Polish spaces is measurable, its inverse is measurable as well [16, Theorem A1.3]. Thus, if we can show $\psi$ to be measurable, $M(\mathcal{Q}) = \psi(M(V))$ is a measurable set. First observe that $\psi$ relates the evaluation functionals $f_I : C(\mathcal{Q}) \to \triangle_I$ on probability charges to the evaluation functionals $\phi_I : M(V) \to \triangle_I$ on probability measures via the equations
\[ \phi_I = f_I \circ \psi \quad \text{for all } I \in D . \tag{3.5} \]
We will show that the mappings $\phi_I$ generate the $\sigma$-algebra $\mathcal{B}_{w^*}$ on $M(V)$. Since the canonical mappings $f_I$ generate $\mathcal{B}_D$ on $C(\mathcal{Q})$ by definition, (3.5) then implies $\mathcal{B}_{w^*}$-$\mathcal{B}_D$-measurability of $\psi$: Let $\phi_A : M(V) \to [0,1]$ be the evaluation functional $x \mapsto x(A)$. Since $M(V)$ is separable, the Borel sets of the weak$^*$ topology coincide with those generated by the maps $\phi_A$ [11, Theorem 2.3], thus $\mathcal{B}_{w^*} = \sigma(\phi_A \mid A \in \mathcal{B}_V)$. Each mapping $\phi_A$ can be identified with $\phi_I$ for $I = (A, A^c)$, because $\phi_{(A, A^c)}(x) = (x(A), 1 - x(A))$. Hence equivalently, $\mathcal{B}_{w^*} = \sigma(\phi_{(A, A^c)} \mid A \in \mathcal{B}_V)$, and with (3.5),
\[ \mathcal{B}_{w^*} = \psi^{-1} \sigma( f_{(A, A^c)} \mid A \in \mathcal{B}_V ) . \tag{3.6} \]
Clearly, the maps $f_{(A, A^c)}$ for $A \in \mathcal{Q}$ are sufficient to express all information expressible by the larger family of maps $f_I$, $I \in D$, and thus generate the projective limit $\sigma$-algebra,
\[ \sigma( f_{(A, A^c)} \mid A \in \mathcal{Q} ) = \mathcal{B}_D . \tag{3.7} \]
In summary, $\psi$ is $\mathcal{B}_{w^*}$-$\mathcal{B}_D$-measurable, and we deduce $M(\mathcal{Q}) = \psi(M(V)) \in \mathcal{B}_D$.

Part (iii).
As shown above, $\psi$ is injective and measurable, and regarded as a mapping onto its image $M(\mathcal{Q})$, it is trivially surjective. What remains to be shown is measurability of the inverse. By part (ii), the image $\psi(M(V)) = M(\mathcal{Q})$ is a Borel subset of $C(\mathcal{Q})$. As a countable projective limit of Polish spaces, $(C(\mathcal{Q}), \mathcal{T}_D)$ is Polish [5, Chapter IX]. Since $M(V)$ is Polish, $(M(V), \mathcal{B}_{w^*})$ is a standard Borel space, i.e. a Borel space generated by a Polish topology. The space $(M(\mathcal{Q}), \mathcal{B}_D \cap M(\mathcal{Q}))$ is standard Borel as well, since $M(\mathcal{Q})$ is a Borel subset of a Polish space [16, Theorem A1.2]. As noted above, measurable bijections between standard Borel spaces are automatically bimeasurable [16, Theorem A1.3], which shows $\psi$ to be a Borel isomorphism.

4. $\sigma$-additivity of random charges

The previous section provides the means to construct the distribution $P_D$ of a random charge $X_D : \Omega \to C(\mathcal{Q})$ as a projective limit measure. To obtain random measures rather than random charges in this manner, we need to additionally ensure that $P_D$ concentrates on the measurable subspace $M(\mathcal{Q}) \cong M(V)$, or in other words, that $X_D$ is $\sigma$-additive $P$-almost surely.

Consider a projective limit random charge $X_D$, distributed according to a projective limit measure $P_D$ on $C(\mathcal{Q})$. The following proposition gives a necessary and sufficient condition for almost sure $\sigma$-additivity of $X_D$, formulated in terms of its expectation $\mathbb{E}_{P_D}[X_D]$. It also shows that the expected values of $P_D$ and the projective family $\langle P_I \rangle_D$ are themselves projective, in the sense that $f_I \mathbb{E}_{P_D}[X_D] = \mathbb{E}_{P_I}[X_I]$, and accordingly $f_{JI} \mathbb{E}_{P_J}[X_J] = \mathbb{E}_{P_I}[X_I]$ for any pair $I \preceq J$.
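The projectivity of expectations, $f_{JI} \mathbb{E}_{P_J}[X_J] = \mathbb{E}_{P_I}[X_I]$, can be checked on a small example. The sketch below assumes, purely for illustration, Dirichlet marginals whose parameters are $\alpha \, G_0(A_i)$ on each block of a partition; the base measure values, the concentration `alpha` and the block labels are hypothetical. It uses only the standard fact that a Dirichlet distribution with parameters $(a_1, \dots, a_n)$ has mean $a_i / \sum_k a_k$, and it verifies the means only, not the full projectivity condition (2.3).

```python
from fractions import Fraction

# Hypothetical expected measure G0 on the blocks of J = (B1, B2, B3).
G0_J = {"B1": Fraction(1, 2), "B2": Fraction(1, 3), "B3": Fraction(1, 6)}
alpha = Fraction(5)  # assumed concentration parameter

# Coarsening I = (B1 ∪ B2, B3): each block of I lists the blocks of J it merges,
# playing the role of the index sets J_i in (3.4).
coarsening = {"B1uB2": ("B1", "B2"), "B3": ("B3",)}

def f_JI(x_J, coarsening):
    """Projection (3.4): add up the masses of the merged blocks."""
    return {A: sum(x_J[B] for B in blocks) for A, blocks in coarsening.items()}

def dirichlet_params(alpha, G0):
    """Assumed marginal parameters alpha * G0(A_i) on a partition."""
    return {A: alpha * mass for A, mass in G0.items()}

def dirichlet_mean(params):
    """Mean of a Dirichlet distribution: each parameter divided by their sum."""
    total = sum(params.values())
    return {A: a / total for A, a in params.items()}

G0_I = f_JI(G0_J, coarsening)
mean_J = dirichlet_mean(dirichlet_params(alpha, G0_J))
mean_I = dirichlet_mean(dirichlet_params(alpha, G0_I))

# Projecting the mean on J agrees with the mean on I, and both equal f_I G0.
print(f_JI(mean_J, coarsening) == mean_I == G0_I)   # True
```

Exact rational arithmetic via `Fraction` is used so that the equality check is not affected by floating-point rounding.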
The latter makes the criterion directly applicable to construction problems: If we initiate the construction by choosing an expected measure $G_0 \in M(V)$ for the prospective measure $P_D$, and then choose the projective family such that $\mathbb{E}_{P_I}[X_I] = f_I G_0$, random draws from $P_D$ will take values in $M(V)$ almost surely.

Proposition 4.1. Let $(X_D, \mathcal{B}_D)$ be the projective limit of finite-dimensional probability simplices defined in Proposition 3.1, and let $\langle P_I \rangle_D$ be a projective family of probability measures on the spaces $(\triangle_I, \mathcal{B}_I)$. Denote by $P_D$ the projective limit measure, and by $G_0 := \mathbb{E}_{P_D}[X_D]$ its expectation. Then:
(i) The expectation $G_0$ is an element of $X_D$ and
\[ f_I G_0 = \mathbb{E}_{P_I}[X_I] \quad \text{for any } I \in D . \tag{4.1} \]
(ii) $X_D$ is $\sigma$-additive $P$-almost surely if and only if $G_0$ is $\sigma$-additive.

The proof requires a criterion for $\sigma$-additivity of probability charges expressible in terms of a countable number of conditions. Assuming that $G_0$ is $\sigma$-additive, we will deduce from the projective limit construction that, if a fixed sequence of sets is given, the random content $X_D$ is countably additive along this sequence with probability one. This only implies almost sure $\sigma$-additivity of $X_D$ on $\mathcal{Q}(\mathcal{U})$ if the condition for $\sigma$-additivity can be reduced to a countable subset of sequences in $\mathcal{Q}(\mathcal{U})$ (cf. Appendix A.3). Such a reduction was derived by Harris [14, Lemma 6.1]. For our particular choice of $\mathcal{Q}(\mathcal{U})$, his result can be stated as follows:

Lemma 4.2 (Harris). Let $V$ be any Polish space and $\mathcal{Q}(\mathcal{U})$ the countable algebra generated by the open balls (3.1).
Then the set of all sequences of elements of $\mathcal{Q}(\mathcal{U})$ contains a countable subset of sequences $(A^m_n)_n$, where $A^m_n \searrow \emptyset$ for all $m \in \mathbb{N}$, such that any probability charge $\mu$ on $\mathcal{Q}(\mathcal{U})$ is $\sigma$-additive if and only if it satisfies
\[ \lim_{n \to \infty} \mu(A^m_n) = 0 \quad \text{for all } m \in \mathbb{N} . \tag{4.2} \]

Proof of Proposition 4.1. Part (i). The expectation $\mathbb{E}_{P_D}[X_D]$ is finitely additive: For any finite number of disjoint sets $A_i \in \mathcal{Q}(\mathcal{U})$,
\[ \sum_{i=1}^n \mathbb{E}_{P_D}[X_D](A_i) = \int_{C(\mathcal{Q})} \sum_{i=1}^n x_D(A_i) \, P_D(dx_D) = \mathbb{E}_{P_D}[X_D](\cup_i A_i) . \tag{4.3} \]
Since clearly also $\mathbb{E}_{P_D}[X_D](\emptyset) = 0$ and $\mathbb{E}_{P_D}[X_D](V) = 1$, the expectation is an element of $X_D$. To verify (4.1), note the mappings $f_{JI} : \triangle_J \to \triangle_I$ are affine, and hence
\[ f_{JI} \mathbb{E}_{P_J}[X_J] \overset{f \text{ affine}}{=} \mathbb{E}_{P_J}[f_{JI} X_J] = \int_{\triangle_J = f_{JI}^{-1} \triangle_I} f_{JI} x_J \, P_J(dx_J) = \int_{\triangle_I} x_I \, (f_{JI} P_J)(dx_I) = \int_{\triangle_I} x_I \, P_I(dx_I) . \tag{4.4} \]
Therefore, the expectations of a projective family $\langle P_I \rangle_D$ satisfy $f_{JI} \mathbb{E}_{P_J}[X_J] = \mathbb{E}_{P_I}[X_I]$. By the same device, $f_I G_0 = f_I \mathbb{E}_{P_D}[X_D] = \mathbb{E}_{P_I}[X_I]$ holds for the projective limit measure $P_D$.

Part (ii). First assume that $G_0$ is $\sigma$-additive. Let $(A^m_n)_n$ be any of the set sequences given by Lemma 4.2. As $n \to \infty$, the random sequence $(X_D(A^m_n))$ converges to 0 almost surely: $\sigma$-additivity of $G_0$ implies
\[ \lim_{n \to \infty} \mathbb{E}_{P_D}[X_D](A^m_n) = \lim_{n \to \infty} G_0(A^m_n) = G_0(\emptyset) = \mathbb{E}_{P_D}[X_D](\emptyset) , \tag{4.5} \]
hence $X_D(A^m_n) \xrightarrow{L_1} 0$. The sequence $(A^m_n)$ is decreasing and the random variable $X_D$ is charge-valued, which implies $X_D(A^m_{n+1}) \le X_D(A^m_n)$ a.s. In particular, the sequence $(X_D(A^m_n))$ forms a supermartingale when endowed with its canonical filtration. For supermartingales, convergence in the mean implies almost sure convergence [2, Theorem 19.3], and thus indeed $X_D(A^m_n) \xrightarrow{\text{a.s.}} 0$.
Consequently, there is a $P$-null subset $N_m$ of the abstract probability space $\Omega$ such that
\[ (X_D(\omega))(A^m_n) \xrightarrow{n \to \infty} (X_D(\omega))(\emptyset) \quad \text{for } \omega \notin N_m . \tag{4.6} \]
The union $N := \cup_{m \in \mathbb{N}} N_m$ of these null sets, taken over all sequences $(A^m_n)$ required by Lemma 4.2, is again a $P$-null set. The charge $X_D(\omega)$ satisfies (4.2) for all $m$ whenever $\omega \notin N$. Therefore, $X_D$ is $\sigma$-additive $P$-a.s. by Lemma 4.2, and hence almost surely a probability measure.

Conversely, let $X_D$ assume values in $M(V) \cong M(\mathcal{Q})$ almost surely. Since $A^m_n \searrow \emptyset$, the sequence of measurable functions $\omega \mapsto (X_D(\omega))(A^m_n)$ converges to 0 almost everywhere. By hypothesis, $C(\mathcal{Q}) \setminus M(\mathcal{Q})$ is a null set, hence
\[ \lim_{n \to \infty} \mathbb{E}_{P_D}[X_D](A^m_n) = \lim_{n \to \infty} \int_{X_D^{-1} M(\mathcal{Q})} (X_D(\omega))(A^m_n) \, P(d\omega) = 0 , \tag{4.7} \]
where the second identity holds by dominated convergence [16, Theorem 1.21]. Since $\mathbb{E}_{P_D}[X_D]$ is a probability charge according to part (i) and satisfies (4.7), it satisfies the conditions of Lemma 4.2, and we conclude $\mathbb{E}_{P_D}[X_D] \in M(\mathcal{Q})$.

Theorem 1.1 is now finally obtained by deducing the properties of $P$ from those of $P_D$ as established by Proposition 4.1.

Proof of Theorem 1.1. First suppose that (1.3) and (1.4) hold. By Theorem 2.2, a unique projective limit measure $P_D$ exists on $X_D = C(\mathcal{Q})$, with $f_I P_D = P_I$. Proposition 4.1(ii) shows $P_D$ is concentrated on the measurable subset $M(\mathcal{Q})$. By Proposition 3.1(iii), it uniquely defines an equivalent measure $P := \psi^{-1} P_D$ on $M(V)$, which satisfies (1.5). As a probability measure on a Polish space, $P$ is a Radon measure [6, IX.3.3, Proposition 3].

Conversely, assume that $P$ is given. Then (1.3) follows from (1.5). The expectation $G_0 = \mathbb{E}_P[X]$ is in $M(V)$ by Proposition 4.1(ii).
Any measure on $M(V)$ can be represented as a measure on $X_D = C(\mathcal{Q})$, hence by Proposition 4.1(i), the expectation $G_0$ and the marginals $P_I = f_I P$ satisfy (4.1). Thus, (1.4) holds, and the proof is complete.

Appendix A: Review of technical problems

This appendix provides a more detailed description of problems (i)–(iii) listed in Sec. 1. The discussion addresses readers of passing familiarity with measure-theoretic probability; to the probabilist, it will only state the obvious.

The approach proposed in [8] is, in summary, the following: A probability measure on $(V, \mathcal{B}_V)$ is a set function $\mathcal{B}_V \to [0,1]$. The set $M(V)$ of probability measures can be regarded as a subset of the space $[0,1]^{\mathcal{B}_V}$ of all such functions. More precisely, the space chosen in [8] is $[0,1]^{\mathcal{H}(\mathcal{B}_V)}$, where $\mathcal{H}(\mathcal{B}_V)$ again denotes the set of all measurable, finite partitions of $V$. This space contains one axis for each partition, and hence is a larger space than $[0,1]^{\mathcal{B}_V}$, but redundantly encodes the same information. The Kolmogorov extension theorem [16, Theorem 6.16] is then applied to a family of Dirichlet distributions defined on the finite-dimensional subspaces of the product space $[0,1]^{\mathcal{H}(\mathcal{B}_V)}$.

A.1. Product spaces

The Kolmogorov extension theorem used in the construction is not well-adapted to the problem of constructing measures on measures, because the setting assumed by the theorem is that of a product space: A finite-dimensional marginal of a measure $P$ on $M(V)$ is a measure $P_I$ on the set of measures over a finite $\sigma$-algebra $\mathcal{C}$ of events. Any such $\sigma$-algebra can be generated by a partition $I$ of events in $\mathcal{B}_V$. The set consisting of the marginals on $I$ of all measures $x \in M(V)$ is necessarily isomorphic to the unit simplex in $|I|$-dimensional Euclidean space.
Hence, the marginals of a measure $P$ defined on $M(V)$ always live on simplices of the form $\triangle_I$ as described in Sec. 1.1. In other words, when we set up a projective limit construction for measures on $M(V)$, the choice of possible finite-dimensional marginal spaces is limited: either the simplices are used directly, as in Sec. 1.1, or they are embedded into some other finite-dimensional space. If the projective limit result to be applied is the Kolmogorov extension theorem, the simplices must be embedded into Euclidean product spaces, as proposed in [8]. The problem here is that it is difficult to properly formalize marginalization to subspaces, as required by the theorem. For constructions on $[0,1]^{\mathcal{B}_V}$, the problem can be illustrated by the example in Fig. 1: For $J = (B_1, B_2, B_3)$, the simplex $\triangle_J$ is a subspace of $\mathbb{R}^J \cong \mathbb{R}^3$. Marginalization corresponds to merging two events, such as $B_1$ and $B_2$ in the example. The resulting simplex $\triangle_I$ for $I = (B_1 \cup B_2, B_3)$ is a subspace of $\mathbb{R}^I$. However, $\mathbb{R}^I$ is not a subspace of $\mathbb{R}^J$, nor is $\triangle_I$ a subspace of $\triangle_J$. Hence, in the product space setting of the Kolmogorov theorem, the natural way to formalize a reduction in dimension for measures on a finite number of events does not correspond to a projection onto a subspace.

A.2. Measurability problems

A general property of projective limit constructions of stochastic processes is that the index set (intuitively, the set of axis labels of a product, or of dimensions in a more general setting) must be countable to obtain a useful probability measure. This is due to the fact that all projective limit theorems implicitly generate a $\sigma$-algebra on the infinite-dimensional space, namely the $\sigma$-algebra $\mathcal{B}_D$ specified by (2.2), based on the $\sigma$-algebras on the marginal spaces used in the construction. The constructed measure lives on this $\sigma$-algebra.

Fig 2. Three-dimensional analogue of a cylinder set in the product space setting. An event $A_D \subset \mathbb{R}^D$ is independent of the random variable $X_3$ if it is the preimage $A_D = f_I^{-1} A_I$ of some event $A_I \subset \mathbb{R}^I$, that is, if the set $A_D$ is of "axis parallel" shape in direction of $X_3$. The event $A_I$ in the figure occurs if $(X_1, X_2) \in A_I$, or equivalently, if $(X_1, X_2, X_3) \in f_{JI}^{-1} A_I$.

If the dimension is uncountable, the resolution of the $\sigma$-algebra is too coarse to resolve most events of interest. In particular, it does not contain singletons. The problem is most readily illustrated in the product space setting: Suppose the Kolmogorov theorem is used to define a measure $P_D$ on an infinite-dimensional product space $X_D := \mathbb{R}^L$, where $L$ is some infinite set. The measure $P_D$ is constructed from given measures $P_I$ defined on the finite-dimensional sub-products $\mathbb{R}^I$, where $I \in D$ are finite subsets of $L$. The $\sigma$-algebra on $\mathbb{R}^L$ on which $P_D$ is defined is generated as follows: Denote by $f_I$ the product space projector $\mathbb{R}^L \to \mathbb{R}^I$. For any measurable set $A_I \subset \mathbb{R}^I$, the preimage $f_I^{-1} A_I$ is a subset of $\mathbb{R}^L$, which is of "axis-parallel" shape in direction of all axes not contained in $I$. The finite-dimensional analogue of this situation is illustrated in Fig. 2, where $A_I$ is assumed to be an elliptically shaped set in the plane $\mathbb{R}^I$, and the overall space $\mathbb{R}^L$ is depicted as three-dimensional. Preimages $f_I^{-1} A_I$ of measurable sets are, for obvious reasons, called cylinder sets in the probability literature. The $\sigma$-algebra defined by the Kolmogorov theorem is the smallest $\sigma$-algebra containing all cylinder sets $f_I^{-1} A_I$, for all measurable sets $A_I \subset \mathbb{R}^I$ and all finite sub-products $\mathbb{R}^I$.
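The defining feature of a cylinder set, namely that membership depends only on the coordinates indexed by $I$, can be made concrete in a few lines. This is a minimal sketch assuming the three-dimensional picture of Fig. 2, with $L = \{1, 2, 3\}$, $I = (1, 2)$ and an ellipse as $A_I$; the function names and the particular ellipse are hypothetical choices for illustration.

```python
def project(x, I):
    """Coordinate projection f_I: keep only the coordinates of x indexed by I."""
    return tuple(x[i] for i in I)

def cylinder(I, in_A_I):
    """Indicator of the cylinder set f_I^{-1} A_I, built from an indicator of A_I."""
    return lambda x: in_A_I(project(x, I))

# Assumption: A_I is an ellipse in the (x1, x2)-plane with half-axes 2 and 1.
I = (1, 2)

def in_ellipse(p):
    x1, x2 = p
    return 1.0 >= (x1 / 2.0) ** 2 + x2 ** 2

in_cylinder = cylinder(I, in_ellipse)

# Points are stored as dicts mapping an axis label in L to a coordinate value.
a = {1: 1.0, 2: 0.5, 3: -7.0}
b = {1: 1.0, 2: 0.5, 3: 42.0}   # same (x1, x2) as a, different x3
c = {1: 3.0, 2: 0.0, 3: 0.0}    # (x1, x2) lies outside the ellipse

print(in_cylinder(a), in_cylinder(b), in_cylinder(c))   # True True False
```

Changing the third coordinate never changes membership, which is the "axis-parallel" shape described in the text.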
Since $\sigma$-algebras are defined by closure under countable operations, the sets in this $\sigma$-algebra can be thought of as cylinder sets that are of axis-parallel shape along all but a countable number of dimensions. If the overall space is of countable dimension, any set of interest can be expressed in this form. If the dimension is uncountable, however, these events only specify the joint behavior of a countable subset of random variables; in Fig. 2, $\mathbb{R}^I$ would represent a subspace of countable dimension of the uncountable-dimensional space $\mathbb{R}^L$. For example, consider the set $\mathbb{R}^L := \mathbb{R}^{\mathbb{R}}$, regarded as the set of all functions $x_D : \mathbb{R} \to \mathbb{R}$, which arises in the construction of Gaussian processes. Although the constructed measure $P_D$ is a distribution on random functions $x_D$, this measure cannot assign a probability to events of the form $\{ X_D = x_D \}$, i.e. to the event that the outcome of a random draw is a particular function $x_D$. The measurable events are those, such as $\{ X_D(s_1) = t_1, X_D(s_2) = t_2, \dots \}$, that constrain the value of the function only at a countable subset of points $s_1, s_2, \dots \in \mathbb{R}$.

A.3. $\sigma$-additivity

The marginal distributions used in the construction specify the joint behavior of the constructed measure $P_D$ on any finite subset of measurable sets. $\sigma$-additivity requires additivity along an infinite sequence, and cannot be deduced directly from additivity of the marginals. Suppose that some sequence $A_1, A_2, \dots$ of measurable sets in $V$ is given, and that $x_D$ is a random set function drawn from $P_D$. Countable additivity of $x_D$ along the sequence can be shown to hold almost surely (with respect to $P_D$) by means of a simple convergence argument [8, Proposition 2]. However, as a $\sigma$-algebra, $\mathcal{B}_V$ is either finite or uncountable.
Hence, if $V$ is infinite, $\mathcal{B}_V$ contains an uncountable number of such sequences. Even though $x_D$ is additive along any given sequence with probability one, the null sets of exceptions can aggregate into a non-null set over all sequences, so that almost sure $\sigma$-additivity of $x_D$ does not follow. Substituting a countable generator $\mathcal{Q}$ for $\mathcal{B}_V$ does not resolve the problem, since the number of sequences in $\mathcal{Q}$ remains uncountable.

Acknowledgments

I would like to thank the associate editor and two referees for valuable suggestions, in particular for pointing out the example in Corollary 1.4. I am grateful to Daniel M. Roy for helpful comments and corrections.

References

[1] Aliprantis, C. D. and Border, K. C. (2006). Infinite Dimensional Analysis. Springer, 3rd edition. MR2378491
[2] Bauer, H. (1996). Probability Theory. W. de Gruyter. MR1385460
[3] Blackwell, D. and MacQueen, J. B. (1973). Ferguson distributions via Pólya urn schemes. Ann. Statist., 1, 353–355. MR0362614
[4] Bochner, S. (1955). Harmonic Analysis and the Theory of Probability. University of California Press. MR0072370
[5] Bourbaki, N. (1966). Elements of Mathematics: General Topology. Hermann (Paris) and Addison-Wesley.
[6] Bourbaki, N. (2004). Elements of Mathematics: Integration. Springer.
[7] Crauel, H. (2002). Random probability measures on Polish spaces. Taylor & Francis. MR1993844
[8] Ferguson, T. S. (1973). A Bayesian analysis of some nonparametric problems. Ann. Statist., 1(2). MR0350949
[9] Ferguson, T. S. (1974). Prior distributions on spaces of probability measures. Ann. Statist., 2(4), 615–629. MR0438568
[10] Fremlin, D. H. (2000–2006). Measure Theory, volume I–IV. Torres Fremlin. MR2462372
[11] Gaudard, M. and Hadwin, D. (1989). Sigma-algebras on spaces of probability measures. Scand. J. Stat., 16, 169–175. MR1028976
[12] Ghosal, S. (2010). Dirichlet process, related priors and posterior asymptotics. In N. L. Hjort et al., editors, Bayesian Nonparametrics. Cambridge University Press. MR2730660
[13] Ghosh, J. K. and Ramamoorthi, R. V. (2002). Bayesian Nonparametrics. Springer. MR1992245
[14] Harris, T. E. (1968). Counting measures, monotone random set functions. Probab. Theory Related Fields, 10, 102–119. MR0235592
[15] Kallenberg, O. (1983). Random Measures. Academic Press. MR0818219
[16] Kallenberg, O. (2001). Foundations of Modern Probability. Springer, 2nd edition. MR1464694
[17] Kechris, A. S. (1995). Classical Descriptive Set Theory. Springer. MR1321597
[18] Kingman, J. F. C. (1975). Random discrete distributions. J. R. Stat. Soc. Ser. B Stat. Methodol., 37, 1–22. MR0368264
[19] Lavine, M. (1992). Some aspects of Pólya tree distributions for statistical modelling. Ann. Statist., 20(3), 1222–1235. MR1186248
[20] Lijoi, A., Mena, R. H., and Prünster, I. (2005). Hierarchical mixture modeling with normalized inverse-Gaussian priors. J. Amer. Statist. Assoc., 100, 1278–1291. MR2236441
[21] MacEachern, S. N. (2000). Dependent Dirichlet processes. Technical report, Ohio State University.
[22] Mallory, D. J. and Sion, M. (1971). Limits of inverse systems of measures. Ann. Inst. Fourier (Grenoble), 21(1), 25–57. MR0284557
[23] Olshanski, G. (2003). An introduction to harmonic analysis on the infinite symmetric group. In Asymptotic Combinatorics with Applications to Mathematical Physics, volume 1815 of Lecture Notes in Mathematics, pages 127–160. Springer. MR2009838
[24] Pollard, D. (1984). Convergence of Stochastic Processes. MR0762984
[25] Sethuraman, J. (1994). A constructive definition of Dirichlet priors. Statist. Sinica, 4, 639–650. MR1309433
[26] Talagrand, M.
(2003). Spin Glasses: A Challenge for Mathematicians. Springer. MR1993891
[27] Walker, S. G., Damien, P., Laud, P. W., and Smith, A. F. M. (1999). Bayesian nonparametric inference for random distributions and related functions. J. R. Stat. Soc. Ser. B Stat. Methodol., 61(3), 485–527. MR1707858
[28] Zhao, L. H. (2000). Bayesian aspects of some nonparametric problems. Ann. Statist., 28, 532–552. MR1790008
