On empirical meaning of randomness with respect to a real parameter
We study the empirical meaning of randomness with respect to a family of probability distributions $P_\theta$, where $\theta$ is a real parameter, using algorithmic randomness theory. In the case when for a computable probability distribution $P_\the…
Authors: Vladimir Vyugin
On empirical meaning of sets of algorithmically random and non-random sequences

Vladimir V. V'yugin$^1$

Institute for Information Transmission Problems, Russian Academy of Sciences, Bol'shoi Karetnyi per. 19, Moscow GSP-4, 127994, Russia. e-mail vyugin@iitp.ru

Abstract

We study the a priori semimeasure of sets of $P_\theta$-random infinite sequences, where $P_\theta$ is a family of probability distributions depending on a real parameter $\theta$. In the case when for a computable probability distribution $P_\theta$ an effectively strictly consistent estimator exists, we show that Levin's a priori semimeasure of the set of all $P_\theta$-random sequences is positive if and only if the parameter $\theta$ is a computable real number. For the Bernoulli family $B_\theta$, we show that the a priori semimeasure of the set $\cup_\theta I_\theta$, where $I_\theta$ is the set of all $B_\theta$-random sequences and the union is taken over all non-random $\theta$, is positive.

1. Introduction

We use algorithmic randomness theory to analyze "the size" of sets of infinite sequences random with respect to parametric families of probability distributions.

Let a parametric family of probability distributions $P_\theta$, where $\theta$ is a real number, be given such that an effectively strictly consistent estimator exists for this family. The Bernoulli family with a real parameter $\theta$ is an example of such a family. Theorem 1 shows that Levin's a priori semimeasure of the set of all $P_\theta$-random sequences is positive if and only if the parameter value $\theta$ is a computable real number.

We say that a property of infinite sequences has no "empirical meaning" if Levin's a priori semimeasure of the set of all sequences possessing this property is 0.
In this respect, the model of the biased coin with "a prespecified" probability $\theta$ of head is meaningless when $\theta$ is a noncomputable real number; noncomputable parameters $\theta$ can have empirical meaning only in their totality, i.e., as elements of some uncountable sets. For example, $P_\theta$-random sequences with noncomputable $\theta$ can be generated by a Bayesian mixture of these $P_\theta$ using a computable prior. In this case, evidently, the semicomputable semimeasure of the set of all sequences random with respect to this mixture is positive.

$^1$ This paper was presented in part at the 2nd International Computer Science Symposium in Russia, Ekaterinburg, Russia, September 3-7, 2007 [14]. This research was partially supported by the Russian Foundation for Fundamental Research: 06-01-00122-a. Preprint submitted to Elsevier 13 November 2021.

We give in Appendix A a simple proof of our previous result (formulated in Theorem 3), which says that Levin's a priori semimeasure of the set of all infinite binary sequences that are not Turing equivalent to Martin-Löf random sequences is positive. In particular, these sequences are non-random with respect to each computable probability distribution. We use this result to prove Theorem 4. This theorem shows that a probabilistic machine can be constructed which with probability close to 1 outputs a random $\theta$-Bernoulli sequence such that the parameter $\theta$ is not random with respect to any computable probability distribution. This result can be interpreted as saying that the Bayesian statistical approach is insufficient to cover all possible "meaningful" cases for $\theta$-random sequences.

2. Preliminaries

Let $\Xi$ be the set of all finite binary sequences, $\Lambda$ be the empty sequence, and $\Omega$ be the set of all infinite binary sequences. We write $x \subseteq y$ if a sequence $y$ is an extension of a sequence $x$; $l(x)$ is the length of $x$. For any $\omega \in \Omega$, $\omega^n = \omega_1 \ldots \omega_n$.
A real-valued function $P(x)$, where $x \in \Xi$, is called a semimeasure if

$$P(\Lambda) \le 1, \quad P(x0) + P(x1) \le P(x) \qquad (1)$$

for all $x$, and the function $P$ is semicomputable from below; this means that the set $\{(r, x) : r < P(x)\}$, where $r$ is a rational number, is recursively enumerable. A definition of upper semicomputability is analogous.

Solomonoff proposed ideas for defining the a priori probability distribution on the basis of the general theory of algorithms. Levin [3,15] gave a precise form of Solomonoff's ideas in the concept of a maximal semimeasure semicomputable from below (see also Li and Vitányi [7], Section 4.5, Shen et al. [10]). Levin proved that there exists a semimeasure $M$ semicomputable from below that is maximal to within a multiplicative positive constant factor, i.e., such that for every semimeasure $P$ semicomputable from below a positive constant $c$ exists such that the inequality

$$cM(x) \ge P(x) \qquad (2)$$

holds for all $x$. The semimeasure $M$ is called the a priori or universal semimeasure.

For any semimeasure $Q$, its support set $E_Q$ is the set of all infinite sequences $\omega$ such that $Q(\omega^n) > 0$ for all $n$, i.e., $E_Q = \Omega \setminus \cup_{Q(x)=0} \Gamma_x$, where $\Gamma_x = \{\omega \in \Omega : x \subseteq \omega\}$.

A function $P$ is a measure if (1) holds with both inequality signs $\le$ replaced by $=$. Any function $P$ satisfying (1) (with equalities) can be extended to all Borel subsets of $\Omega$ if we define $P(\Gamma_x) = P(x)$, where $x \in \Xi$ and $\Gamma_x = \{\omega \in \Omega : x \subseteq \omega\}$; after that, we use the standard method for extending $P$ to all Borel subsets of $\Omega$. By a simple set in $\Omega$ we mean a union of intervals $\Gamma_x$ over a finite set of $x$. A measure $P$ is computable if it is both lower and upper semicomputable.

For technical reasons, for any semimeasure $P$, we consider the maximal measure $\bar P$ such that $\bar P \le P$. This measure satisfies

$$\bar P(x) = \inf_n \sum_{l(y)=n,\ x \subseteq y} P(y).$$

In general, the measure $\bar P$ is noncomputable (and it is not a probability measure).
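To make condition (1) and the passage from $P$ to $\bar P$ concrete, here is a minimal Python sketch. The toy semimeasure `toy_P` and all helper names are illustrative assumptions, not objects from the paper. Since $\bar P$ is in general noncomputable, `bar_P` only evaluates the infimum over levels $n \le$ `depth`, which gives an upper bound on the true value.

```python
from fractions import Fraction
from itertools import product

def extensions(x, n):
    """All binary strings of length n that extend the string x."""
    return [x + "".join(t) for t in product("01", repeat=n - len(x))]

def is_semimeasure(P, depth):
    """Check condition (1) for all strings of length < depth."""
    if P("") > 1:
        return False
    for k in range(depth):
        for x in extensions("", k):
            if P(x + "0") + P(x + "1") > P(x):
                return False
    return True

def bar_P(P, x, depth):
    """Finite-depth upper approximation of bar P(x):
    min over len(x) <= n <= depth of the sum of P(y) over y of length n extending x."""
    return min(sum(P(y) for y in extensions(x, n))
               for n in range(len(x), depth + 1))

def toy_P(x):
    """Hypothetical semimeasure: uniform weight, halved once more for every 1 in x.
    It loses a quarter of its mass at each node, so it is not a measure."""
    return Fraction(1, 2 ** len(x)) * Fraction(1, 2 ** x.count("1"))

# Level n carries total mass (3/4)^n, so the finite-depth bound shrinks with depth.
print(is_semimeasure(toy_P, 5))   # True
print(bar_P(toy_P, "", 4))        # 81/256
```

For this toy example the level sums tend to 0, so the maximal measure below `toy_P` is the zero measure; the sketch only illustrates why $\bar P$ can be much smaller than $P$.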
By (2), for each lower semicomputable semimeasure $P$, the inequality $c\bar M(A) \ge \bar P(A)$ holds for every Borel set $A$, where $c$ is a positive constant.

In the manner of Levin's papers [4–6,15] (see also [13]), we consider combinations of probabilistic and deterministic processes as the most general class of processes for generating data. With any probabilistic process some computable probability distribution can be associated. Any deterministic process is realized by means of an algorithm. Algorithmic processes transform sequences generated by probabilistic processes into new sequences. More precisely, a probabilistic computer is a pair $(P, F)$, where $P$ is a computable probability distribution and $F$ is a Turing machine supplied with an additional input tape. In the process of computation this machine reads on this tape a sequence $\omega$ distributed according to $P$ and produces a sequence $\omega' = F(\omega)$ (for a correct definition see [4,7,10,13]). So, we can compute the probability

$$Q(x) = P\{\omega \in \Omega : x \subseteq F(\omega)\}$$

that the result $F(\omega)$ of the computation begins with a finite sequence $x$. It is easy to see that $Q(x)$ is a semimeasure semicomputable from below.

Generally, the semimeasure $Q$ need not be a probability distribution on $\Omega$, since $F(\omega)$ may be finite for some infinite $\omega$.

The converse result is proved in Zvonkin and Levin [15]: for every semimeasure $Q(x)$ semicomputable from below a probabilistic computer $(L, F)$ exists such that

$$Q(x) = L\{\omega : x \subseteq F(\omega)\}$$

for all $x$, where $L(x) = 2^{-l(x)}$ is the uniform probability distribution on the set of all binary sequences.

Analogously, for any Borel set $A \subseteq \Omega$ consisting of infinite sequences, we consider the probability

$$Q(A) = L\{\alpha \in \Omega : F(\alpha) \in A\} \qquad (3)$$

of generating a sequence $\omega \in A$ by means of a probabilistic computer $F$. Obviously, we have $c\bar M(A) \ge Q(A)$ for all such $A$, where $c$ is a positive constant.
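The output semimeasure $Q(x) = L\{\omega : x \subseteq F(\omega)\}$ of a probabilistic computer $(L, F)$ can be illustrated with a small Python sketch. The machine `F` below is a hypothetical toy (it copies its input and halts after the first occurrence of `11`), chosen only because its output can be finite, so that $Q$ is a semimeasure but not a measure. `Q_lower` is an assumed helper that approximates $Q(x)$ from below by running `F` on all inputs of a fixed length.

```python
from fractions import Fraction
from itertools import product

def F(p):
    """Toy monotone machine (hypothetical): copy the input bits and
    stop producing output right after the first occurrence of '11'."""
    i = p.find("11")
    return p if i < 0 else p[: i + 2]

def Q_lower(x, n):
    """Lower bound on Q(x) = L{omega : x is a prefix of F(omega)},
    obtained from all 2^n input prefixes of length n under the uniform measure L."""
    hits = sum(1 for t in product("01", repeat=n)
               if F("".join(t)).startswith(x))
    return Fraction(hits, 2 ** n)

print(Q_lower("0", 3))    # 1/2
print(Q_lower("11", 4))   # 1/4
print(Q_lower("110", 5))  # 0: the machine never outputs anything after '11'
```

Here $Q(11) = 1/4$ while $Q(110) + Q(111) = 0$, so the inequality in (1) is strict at $x = 11$: the missing mass corresponds to inputs on which the output stays finite.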
Therefore, by (2) and (3), $M(x)$ and $\bar M(A)$ define universal upper bounds of the probability of generating $x$ and $\omega \in A$ by probabilistic computers. We distinguish between subsets of $\Omega$ of $\bar M$-measure 0 and subsets of positive $\bar M$-measure. If $\bar M(A) = 0$ then the probability of generating a sequence $\omega \in A$ by means of any probabilistic computer is equal to 0.

The simplest example of a set of $\bar M$-measure 0 is $A = \{\omega\}$, where $\omega$ is a noncomputable sequence. Indeed, if $\bar M\{\omega\} > 0$ then there exists a rational $r > 0$ such that $M(\omega^n) > r$ for all $n$. Obviously, any set of pairwise incomparable strings $x$ with $M(x) > r$ is finite (it has fewer than $1/r$ elements). Then there exists a $k$ such that $\omega^k \subseteq x$ and $M(x) > r$ imply $x \subseteq \omega$. We can compute each bit of $\omega$ by enumerating all such $x$.

The sets of $\bar M$-measure 0 were described by Levin [4,6] in terms of quantity of information.

We refer readers to Li and Vitányi [7] and to Shen et al. [10] for the theory of algorithmic randomness. We use the definition of a random sequence in terms of universal probability. Let $P$ be some computable measure on $\Omega$. The deficiency of randomness of a sequence $\omega \in \Omega$ with respect to $P$ is defined as

$$d(\omega | P) = \sup_n \frac{M(\omega^n)}{P(\omega^n)}, \qquad (4)$$

where $\omega^n = \omega_1 \omega_2 \ldots \omega_n$. This definition leads to the same class of random sequences as the original Martin-Löf [8] definition. Let $R_P$ be the set of all infinite binary sequences random with respect to a measure $P$:

$$R_P = \{\omega \in \Omega : d(\omega | P) < \infty\}.$$

We also consider parametric families of probability distributions $P_\theta(x)$, where $\theta$ is a real number; we suppose that $\theta \in [0, 1]$. An example of such a family is the Bernoulli family $B_\theta(x) = \theta^k (1 - \theta)^{n-k}$, where $n$ is the length of $x$ and $k$ is the number of ones in it.

We associate with a binary sequence $\theta_1 \theta_2 \ldots$ the real number with the binary expansion $0.\theta_1 \theta_2 \ldots$. When the sequence $\theta_1 \theta_2 \ldots$ is computable or random with respect to some measure, we say that the number $0.\theta_1 \theta_2 \ldots$ is computable or random with respect to the corresponding measure on $[0, 1]$.

We consider probability distributions $P_\theta$ computable with respect to a parameter $\theta$. Informally, this means that there exists an algorithm enumerating all triples $(x, r_1, r_2)$, where $x \in \Xi$ and $r_1, r_2$ are rational numbers, such that $r_1 < P_\theta(x) < r_2$. This algorithm uses an infinite sequence $\theta$ as an additional input; if some triple $(x, r_1, r_2)$ is enumerated by this algorithm then only a finite initial fragment of $\theta$ was used in the process of computation (for a correct definition, see also Shen et al. [10] and Vovk and V'yugin [11]).

Analogously, we consider parametric lower semicomputable semimeasures. It can be proved that there exists a universal parametric lower semicomputable semimeasure $M_\theta$. This means that for each parametric lower semicomputable semimeasure $R_\theta$ there exists a positive constant $C$ such that $C M_\theta(x) \ge R_\theta(x)$ for all $x$ and $\theta$. The corresponding definition of randomness with respect to a family $P_\theta$ is obtained by relativization of (4) with respect to $\theta$:

$$d_\theta(\omega) = \sup_n \frac{M_\theta(\omega^n)}{P_\theta(\omega^n)}$$

(see also [3]). This definition leads to the same class of random sequences as the original Martin-Löf [8] definition relativized with respect to a parameter $\theta$.

For any $\theta$, let $I_\theta = \{\omega \in \Omega : d_\theta(\omega) < \infty\}$ be the set of all infinite binary sequences random with respect to the measure $P_\theta$. In the case of the Bernoulli family, we call elements of this set $\theta$-Bernoulli sequences.

3. Randomness with respect to a parameter family

We need some statistical notions (see Cox and Hinkley [2]). Let $P_\theta$ be some computable parametric family of probability distributions. A function $\hat\theta(x)$ from $\Xi$ to $[0, 1]$ is called an estimator.
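As a concrete instance of a parametric family and of an estimator in the sense just defined, the following Python sketch implements the Bernoulli measure $B_\theta(x) = \theta^k(1-\theta)^{n-k}$ and the relative frequency of ones as a map from $\Xi$ to $[0, 1]$. The function names, the use of exact rationals, and the value returned on the empty string are assumptions made for illustration only.

```python
from fractions import Fraction

def bernoulli(theta, x):
    """B_theta(x) = theta^k (1 - theta)^(n - k), with n = len(x), k = number of ones."""
    k, n = x.count("1"), len(x)
    return theta ** k * (1 - theta) ** (n - k)

def freq_estimator(x):
    """Relative frequency of ones in x; an estimator mapping finite strings to [0, 1].
    The value 1/2 on the empty string is an arbitrary convention (assumption)."""
    return Fraction(x.count("1"), len(x)) if x else Fraction(1, 2)

theta = Fraction(1, 3)
print(bernoulli(theta, "101"))   # 2/27
# B_theta is a measure: condition (1) holds with equality.
print(bernoulli(theta, "10") + bernoulli(theta, "11") == bernoulli(theta, "1"))  # True
print(freq_estimator("1011"))    # 3/4
```

Exact `Fraction` arithmetic is used so that the measure identity can be checked without rounding; for a noncomputable $\theta$ no such exact representation exists, which is the situation the results of this section address.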
An estimator $\hat\theta$ is called strictly consistent if for each parameter value $\theta$, for $P_\theta$-almost all $\omega$, $\hat\theta(\omega^n) \to \theta$ as $n \to \infty$.

Let $\epsilon$ and $\delta$ be rational numbers. An estimator $\hat\theta$ is called effectively strictly consistent if there exists a computable function $N(\epsilon, \delta)$ such that for each $\theta$ and for all $\epsilon$ and $\delta$

$$P_\theta\left\{\omega \in \Omega : \sup_{n \ge N(\epsilon, \delta)} |\hat\theta(\omega^n) - \theta| > \epsilon\right\} \le \delta. \qquad (5)$$

The strong law of large numbers (Borovkov [1], Chapter 5)

$$B_\theta\left\{\sup_{k \ge n} \left|\frac{1}{k} \sum_{i=1}^{k} \omega_i - \theta\right| \ge \epsilon\right\} < \frac{1}{\epsilon^4 n}$$

shows that the function $\hat\theta(\omega^n) = \frac{1}{n} \sum_{i=1}^{n} \omega_i$ is a computable, effectively strictly consistent estimator for the Bernoulli family $B_\theta$; for instance, one can take $N(\epsilon, \delta) = \lceil 1/(\epsilon^4 \delta) \rceil$ in (5).

Proposition 1. For any effectively strictly consistent estimator $\hat\theta$, $\lim_{n \to \infty} \hat\theta(\omega^n) = \theta$ for each $\omega \in I_\theta$.

Proof. Assume an infinite sequence $\omega$ is Martin-Löf random with respect to $P_\theta$ for some $\theta$. First, we prove that $\lim_{n \to \infty} \hat\theta(\omega^n)$ exists. For $j = 1, 2, \ldots$, let

$$W_j = \{\alpha \in \Omega : (\exists n, k \ge N(1/j, 2^{-(j+1)}))\ |\hat\theta(\alpha^n) - \hat\theta(\alpha^k)| > 1/j\}.$$

By (5), for any $\theta$, $P_\theta(W_j) < 2^{-j}$ for all $j$. Define $V_i = \cup_{j > i} W_j$ for all $i$. By definition, for any $\theta$, $P_\theta(V_i) < 2^{-i}$ for all $i$. Also, any set $V_i$ can be represented as a recursively enumerable union of intervals of type $\Gamma_x$. To reduce this definition of a Martin-Löf test to the definition of the test (4), define a sequence of uniformly lower semicomputable parametric semimeasures

$$R_{\theta,i}(x) = \begin{cases} 2^i P_\theta(x) & \text{if } \Gamma_x \subseteq V_i, \\ 0 & \text{otherwise} \end{cases}$$

and consider the mixture $R_\theta(x) = \sum_{i=1}^{\infty} \frac{1}{i(i+1)} R_{\theta,i}(x)$.

Suppose that $\lim_{n \to \infty} \hat\theta(\omega^n)$ does not exist. Then for each sufficiently large $j$, $|\hat\theta(\omega^n) - \hat\theta(\omega^k)| > 1/j$ for infinitely many $n$ and $k$. This implies that $\omega \in V_i$ for all $i$, and then for some positive constant $c$,

$$d_\theta(\omega) = \sup_n \frac{M_\theta(\omega^n)}{P_\theta(\omega^n)} \ge \sup_n \frac{R_\theta(\omega^n)}{c P_\theta(\omega^n)} = \infty,$$

i.e., $\omega$ is not Martin-Löf random with respect to $P_\theta$.

Suppose that $\lim_{n \to \infty} \hat\theta(\omega^n) \ne \theta$.
Then rational numbers $r_1, r_2$ exist such that $r_1 < \lim_{n \to \infty} \hat\theta(\omega^n) < r_2$ and $\theta \notin [r_1, r_2]$. Since the estimator $\hat\theta$ is consistent, $P_\theta\{\alpha : r_1 < \lim_{n \to \infty} \hat\theta(\alpha^n) < r_2\} = 0$, and we can effectively (using $\theta$) enumerate an infinite sequence of positive integers $n_1 < n_2 < \ldots$ such that for

$$W'_j = \cup\{\Gamma_x : l(x) \ge n_j,\ r_1 < \hat\theta(x) < r_2\}$$

we have $P_\theta(W'_j) < 2^{-j}$ for all $j$. Define $V'_i = \cup_{j > i} W'_j$ for all $i$. We have $P_\theta(V'_i) \le 2^{-i}$ and $\omega \in V'_i$ for all $i$. Then $\omega$ cannot be Martin-Löf random with respect to $P_\theta$.

The two contradictions obtained above prove the proposition. △

The following theorem generalizes the simplest example of a set of $\bar M$-measure 0 presented in Section 2. It can be interpreted as saying that $P_\theta$-random sequences with "a prespecified" noncomputable parameter $\theta$ cannot be obtained by any combination of stochastic and deterministic processes.

Theorem 1. Assume a computable parametric family $P_\theta$ of probability distributions has an effectively strictly consistent estimator. Then for each $\theta$, $\bar M(I_\theta) > 0$ if and only if $\theta$ is computable.

Proof. If $\theta$ is computable then the probability distribution $P_\theta$ is also computable and by (2) $c\bar M(I_\theta) \ge P_\theta(I_\theta) = 1$, where $c$ is a positive constant.

The proof of the converse assertion is more complicated. Assume $\bar M(I_\theta) > 0$. There exist a simple set $V$ (a union of a finite set of intervals) and a rational number $r$ such that $\frac{1}{2} \bar M(V) < r < \bar M(I_\theta \cap V)$. For any finite set $X \subseteq \Xi$, let $\bar X = \cup_{x \in X} \Gamma_x$. Let $n$ be a positive integer. We compute a rational approximation $\theta_n$ of $\theta$ up to $\frac{1}{2n}$ as follows. Using exhaustive search, we find a finite set $X_n$ of pairwise incomparable finite sequences of length $\ge N(1/n, 2^{-n})$ such that

$$\bar X_n \subseteq V, \quad \sum_{x \in X_n} M(x) > r, \quad |\hat\theta(x) - \hat\theta(x')| \le \frac{1}{2n} \qquad (6)$$

for all $x, x' \in X_n$.
If such a set $X_n$ is found, we put $\theta_n = \hat\theta(x)$, where $x$ is the minimal element of $X_n$ with respect to some natural (lexicographic) ordering of all finite binary sequences.

Let us prove that for each $n$ some such set $X_n$ exists. Since $\bar M(I_\theta \cap V) > r$, there exists a closed (in the topology defined by intervals $\Gamma_x$) set $E \subseteq I_\theta \cap V$ such that $\bar M(E) > r$. Consider the function

$$f_k(\omega) = \inf\left\{m : m \ge k,\ |\hat\theta(\omega^m) - \theta| \le \frac{1}{4n}\right\}.$$

By Proposition 1 this function is continuous on $\Omega$ and, since the set $E$ is compact, it is bounded on $E$. Hence, for each $k$, there exists a finite set $X \subseteq \Xi$ consisting of pairwise incomparable sequences of length $\ge k$ such that $E \subseteq \bar X$ and $|\hat\theta(x) - \hat\theta(x')| \le \frac{1}{2n}$ for all $x, x' \in X$. Since $E \subseteq \bar X$, we have $\sum_{x \in X} M(x) > r$. Therefore, the set $X_n$ can be found by exhaustive search.

Lemma 1. For any Borel set $V \subseteq \Omega$, $\bar M(V) > 0$ and $V \subseteq I_\theta$ imply $P_\theta(V) > 0$.

Proof. By definition of $M_\theta$ each computable parametric measure $P_\theta$ is absolutely continuous with respect to the measure $\bar M_\theta$, and so we have the representation

$$P_\theta(X) = \int_X \frac{dP_\theta}{d\bar M_\theta}(\omega)\, d\bar M_\theta(\omega), \qquad (7)$$

where $\frac{dP_\theta}{d\bar M_\theta}(\omega)$ is the Radon-Nikodym derivative; it exists for $\bar M_\theta$-almost all $\omega$. By definition we have for $\bar M_\theta$-almost all $\omega \in I_\theta$

$$\frac{dP_\theta}{d\bar M_\theta}(\omega) = \lim_{n \to \infty} \frac{P_\theta}{\bar M_\theta}(\omega^n) \ge \liminf_{n \to \infty} \frac{P_\theta}{\bar M_\theta}(\omega^n) \ge C_{\theta,\omega} > 0. \qquad (8)$$

By definition $c_\theta \bar M_\theta(X) \ge \bar M(X)$ for all Borel sets $X$, where $c_\theta$ is some positive constant (depending on $\theta$). Then by (7) and (8) the inequality $\bar M(X) > 0$ implies $P_\theta(X) > 0$ for each Borel set $X \subseteq I_\theta$. △

We rewrite (5) in the form

$$E_n = \left\{\omega \in \Omega : \sup_{N \ge N(1/(2n),\, 2^{-n})} |\hat\theta(\omega^N) - \theta| \ge \frac{1}{2n}\right\}. \qquad (9)$$

By definition $P_\theta(E_n) \le 2^{-n}$ for all $n$. We prove that $X_n \not\subseteq E_n$ for almost all $n$. Suppose that the opposite assertion holds.
Then there exists an increasing infinite sequence of positive integers $n_1, n_2, \ldots$ such that $X_{n_i} \subseteq E_{n_i}$ for all $i = 1, 2, \ldots$. This implies $P_\theta(X_{n_i}) \le 2^{-n_i}$ for all $i$. For any $k$, define $U_k = \cup_{i \ge k} X_{n_i}$. Clearly, we have for all $k$, $\bar M(\bar U_k) > r$ and $P_\theta(\bar U_k) \le \sum_{i \ge k} 2^{-n_i} \le 2^{-n_k + 1}$. Let $U = \cap U_k$. Then $P_\theta(U) = 0$ and $\bar M(U) \ge r > \frac{1}{2} \bar M(V)$. From $U \subseteq V$ and $\bar M(I_\theta \cap V) > \frac{1}{2} \bar M(V)$ the inequality $\bar M(I_\theta \cap U) > 0$ follows. Then the set $I_\theta \cap U$ consists of $P_\theta$-random sequences, $P_\theta(I_\theta \cap U) = 0$ and $\bar M(I_\theta \cap U) > 0$. This is a contradiction with Lemma 1.

Assume $X_n \not\subseteq E_n$ for all $n \ge n_0$. Let also a finite sequence $x_n \in X_n$ be defined such that $\Gamma_{x_n} \cap (\Omega \setminus E_n) \ne \emptyset$. Then from $l(x_n) \ge N(\frac{1}{2n}, 2^{-n})$ the inequality

$$\left|\frac{1}{l(x_n)} \sum_{i=1}^{l(x_n)} (x_n)_i - \theta\right| < \frac{1}{2n}$$

follows. By (6) we obtain $|\theta_n - \theta| < \frac{1}{n}$. This means that the real number $\theta$ is computable. The theorem is proved. △

Let $Q$ be a computable probability distribution on $\theta$s (i.e., on the set $\Omega$). Then the Bayesian mixture with respect to the prior $Q$

$$P(x) = \int P_\theta(x)\, dQ(\theta)$$

is also a computable probability distribution. Recall that $R_Q$ is the set of all infinite sequences Martin-Löf random with respect to a computable probability measure $Q$. Obviously, $P(\cup_{\theta \in R_Q} I_\theta) = 1$, and then $\bar M(\cup_{\theta \in R_Q} I_\theta) > 0$. Moreover, the following theorem follows from Corollary 4 of Vovk and V'yugin [11].

Theorem 2. For any computable measure $Q$, a sequence $\omega$ is random with respect to the Bayesian mixture $P$ if and only if $\omega$ is random with respect to a measure $P_\theta$ for some $\theta$ random with respect to the measure $Q$; in other words, $R_P = \cup_{\theta \in R_Q} I_\theta$.

Notice that each computable $\theta$ is Martin-Löf random with respect to the computable probability distribution concentrated on this sequence.

4. Randomness with respect to non-random parameters

We show in this section that the Bayesian approach is insufficient to cover all possible "meaningful" cases: a probabilistic machine can be constructed which with probability close to one outputs a random $\theta$-Bernoulli sequence, where the parameter $\theta$ is not random with respect to any computable probability distribution.

Let $\mathcal{P}(\Omega)$ be the set of all computable probability measures on $\Omega$ and let

$$S = \cup_{P \in \mathcal{P}(\Omega)} R_P$$

be the set of all sequences Martin-Löf random with respect to computable probability measures. We call these sequences stochastic. Let $S^c$ be the complement of $S$, the set of non-stochastic sequences.

An infinite binary sequence $\alpha$ is Turing reducible to an infinite binary sequence $\beta$ if $\alpha = F(\beta)$ for some computable operation $F$; we denote this $\alpha \le_T \beta$. Two infinite sequences $\alpha$ and $\beta$ are Turing equivalent if $\alpha \le_T \beta$ and $\beta \le_T \alpha$.

Let

$$Cl(S) = \{\alpha : \exists \beta\, (\beta \in S\ \&\ \beta \le_T \alpha)\}. \qquad (10)$$

The complement of the set (10), $Cl(S)^c = \Omega \setminus Cl(S)$, consists of sequences non-random with respect to all computable probability distributions, i.e., $Cl(S)^c \subseteq S^c$; moreover, it consists of sequences which cannot be Turing equivalent to stochastic sequences. Also, no stochastic sequence can be Turing reducible to a sequence from $Cl(S)^c$. V'yugin [12], [13] proved that $\bar M(Cl(S)^c) > 0$.

Theorem 3. For any $\epsilon$, $0 < \epsilon < 1$, a lower semicomputable semimeasure $Q$ exists such that $\bar Q(E_Q) > 1 - \epsilon$ and $E_Q \subseteq Cl(S)^c$.

For completeness of presentation we give in Appendix A a new simplified proof of this theorem.

We show that this result can be extended to parameters of the Bernoulli family.

Theorem 4. Let $I_\theta$ be the set of all $\theta$-Bernoulli sequences. Then $\bar M(\cup_{\theta \in Cl(S)^c} I_\theta) > 0$.
In terms of probabilistic computers, for any $\epsilon$, $0 < \epsilon < 1$, a probabilistic machine $(L, F)$ can be constructed which with probability $\ge 1 - \epsilon$ generates a $\theta$-Bernoulli sequence, where $\theta \in Cl(S)^c$ (i.e., $\theta$ is nonstochastic).

Proof. For any $\epsilon$, $0 < \epsilon < 1$, we define a lower semicomputable semimeasure $P$ such that

$$\bar P(\cup_{\theta \in Cl(S)^c} I_\theta) > 1 - \epsilon.$$

The proof of the theorem is based on Theorem 3. Let $Q$ be the semimeasure defined in this theorem. For any $\omega \notin E_Q$ we have $Q(\omega^n) = 0$ for all sufficiently large $n$.

For the measure

$$R^-(x) = \int B_\theta(x)\, d\bar Q(\theta), \qquad (11)$$

where $B_\theta$ is the Bernoulli measure, we have $R^-(\Omega) > 1 - \epsilon$ by Theorem 3, and $R^-(\cup_{\theta \in Cl(S)} I_\theta) = 0$. Unfortunately, we cannot conclude that $c\bar M \ge R^-$ for some constant $c$, since the measure $R^-$ is not represented in the form $R^- = \bar P$ for some lower semicomputable semimeasure $P$. To overcome this problem, we consider a semicomputable approximation of this measure.

For any finite binary sequences $\alpha$ and $x$, let $B^-_\alpha(x) = (\theta^-)^K (1 - \theta^+)^{N-K}$, where $N$ is the length of $x$ and $K$ is the number of ones in it, $\theta^-$ is the left endpoint of the subinterval corresponding to the sequence $\alpha$ and $\theta^+$ is its right endpoint. By definition $B^-_\alpha(x) \le B_\theta(x)$ for all $\theta^- \le \theta \le \theta^+$.

Let $\epsilon$ be a rational number. Let $Q_s(x)$ be equal to the maximal rational number $r < Q(x)$ computed in $s$ steps of enumeration of $Q(x)$ from below. Using Theorem 3, we can define for $n = 1, 2, \ldots$ and for each $x$ of length $n$ a computable sequence of positive integers $s_x \ge n$ and a sequence of finite binary sequences $\alpha_{x,1}, \alpha_{x,2}, \ldots, \alpha_{x,k_x}$ of length $\ge n$ such that the function $P(x)$ defined by

$$P(x) = \sum_{i=1}^{k_x} B^-_{\alpha_{x,i}}(x)\, Q_{s_x}(\alpha_{x,i}) \qquad (12)$$

is a semimeasure, i.e., such that condition (1) holds for all $x$, and such that

$$\sum_{l(x)=n} P(x) > 1 - \epsilon \qquad (13)$$

holds for all $n$.
These sequences exist, since the limit function $R^-$ defined by (11) is a measure satisfying $R^-(\Omega) > 1 - \epsilon$.

By definition the semimeasure $P(x)$ is lower semicomputable. Then $cM(x) \ge P(x)$ holds for all $x \in \Xi$, where $c$ is a positive constant.

To prove that $\bar P(\Omega \setminus \cup_\theta I_\theta) = 0$ we consider some probability measure $Q^+ \ge Q$. Since (1) holds, it is possible to define some noncomputable measure $Q^+$ satisfying these properties in many different ways. Define the mixture of the Bernoulli measures with respect to $Q^+$

$$R^+(x) = \int B_\theta(x)\, dQ^+(\theta). \qquad (14)$$

By definition $R^+(\Omega \setminus \cup_\theta I_\theta) = 0$. Using definitions (12) and (14), it can be easily proved that $\bar P \le R^+$. Then $\bar P(\Omega \setminus \cup_\theta I_\theta) = 0$.

By Theorem 3, $Cl(S) \subseteq \Omega \setminus E_Q$, and then $\bar Q(Cl(S)) = 0$. By (12) we have $\bar P(\cup_{\theta \in Cl(S)} I_\theta) = 0$. By (13) we have $\bar P(\Omega) > 1 - \epsilon$. Then $\bar P(\cup_{\theta \in Cl(S)^c} I_\theta) > 1 - \epsilon$. Therefore, $\bar M(\cup_{\theta \in Cl(S)^c} I_\theta) > 0$. △

Appendix A. Proof of Theorem 3

Recall that $E_Q$ is the support set of a semimeasure $Q$. In what follows we define a semicomputable semimeasure $Q$ such that
– 1) $\bar Q(E_Q) > 0$;
– 2) for each $\omega \in E_Q$ and for each computable operation $F$ such that $F(\omega)$ is infinite, the sequence $F(\omega)$ is not Martin-Löf random with respect to the uniform probability measure $L$ on $\Omega$.

By Theorem 4.2 from [15], for each computable measure $P$ on $\Omega$ there exist two computable operations $F$ and $G$ such that
– 3) $F(\omega) \in \Omega$ for each $\omega$ random with respect to $L$, and $G(F(\omega)) = \omega$;
– 4) for each sequence $\omega$ random with respect to $P$ (and such that $P\{\omega\} = 0$), the sequence $G(\omega)$ is random with respect to $L$.

By 1)–4) each sequence $\omega \in E_Q$ cannot be Martin-Löf random with respect to any computable probability measure $P$.

We will construct a semicomputable semimeasure $Q$ as a sort of network flow. We define an infinite network on the base of the infinite binary tree.
This network has no sink; the top of the tree (the empty sequence) is the source. Each $x \in \Xi$ defines two edges $(x, x0)$ and $(x, x1)$ of length one. In the construction below we will add to the network extra edges $(x, y)$ of length $> 1$, where $x, y \in \Xi$, $x \subseteq y$ and $y \ne x0, x1$. By the length of the edge $(x, y)$ we mean the number $l(y) - l(x)$. For any edge $\sigma = (x, y)$ we denote by $st(\sigma) = x$ its starting vertex and by $ter(\sigma) = y$ its terminal vertex. A computable function $q(\sigma)$ defined on all edges of length one and on all extra edges and taking rational values is called a network if for all $x \in \Xi$

$$\sum_{\sigma :\, st(\sigma) = x} q(\sigma) \le 1.$$

Let $G$ be the set of all extra edges of the network $q$ (it is a part of the domain of $q$). By the $q$-flow we mean the minimal semimeasure $P$ such that $P \ge R$, where the function $R$ is defined by the following recursive equations

$$R(\Lambda) = 1; \quad R(y) = \sum_{\sigma :\, ter(\sigma) = y} q(\sigma) R(st(\sigma)) \qquad (A.1)$$

for $y \ne \Lambda$. It is easy to see that this semimeasure $P$ is lower semicomputable if $q$ is computable. A network $q$ is called elementary if the set of extra edges is finite and $q(\sigma) = 1/2$ for almost all edges of unit length. For any network $q$, we define the network flow delay function ($q$-delay function)

$$d(x) = 1 - q(x, x0) - q(x, x1).$$

The construction below works with all programs $i$ computing the operations $F_i(x)$.$^2$

$^2$ The existence of an effectively computable sequence $\{F_i\}$ such that for each computable operation $F$, $F = F_i$ for some $i$, is proved in [9].

We define some function $p(n)$ such that for each positive integer $m$ we have $p(n) = m$ for infinitely many $n$. For example, we can define $p(\langle m, k \rangle) = m$ and $p'(\langle m, k \rangle) = k$ for all $m$ and $k$, where $\langle m, k \rangle$ is some computable one-to-one enumeration of all pairs of nonnegative integers.
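One possible choice of the enumeration $\langle m, k \rangle$ is the Cantor pairing function; the paper does not fix a particular one, so the Python sketch below is an assumption made for illustration. It defines $p$ and $p'$ as the two projections and shows that every value $m$ is taken by $p$ at infinitely many arguments, which is the only property the construction needs from $p$.

```python
from math import isqrt

def pair(m, k):
    """Cantor pairing <m, k>: a computable bijection from pairs of
    nonnegative integers onto the nonnegative integers (assumed choice)."""
    return (m + k) * (m + k + 1) // 2 + k

def unpair(n):
    """Inverse of pair: returns (m, k) with pair(m, k) == n."""
    w = (isqrt(8 * n + 1) - 1) // 2      # index of the diagonal, w = m + k
    k = n - w * (w + 1) // 2
    return w - k, k

def p(n):
    """First projection: p(<m, k>) = m."""
    return unpair(n)[0]

def p_prime(n):
    """Second projection: p'(<m, k>) = k."""
    return unpair(n)[1]

# Every m is hit infinitely often: p(n) == 3 at n = <3,0>, <3,1>, <3,2>, ...
print([n for n in range(50) if p(n) == 3])   # [6, 11, 17, 24, 32, 41]
```

With this choice, a task number $I = p(n)$ is itself a code of a pair, so applying the projections once more recovers a program index $p(p(n))$ and a session number $p'(p(n))$.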
Then for each step $n$ we compute $\langle i, s \rangle = p(n)$, where $i$ is a program and $s$ is a number (we call $s$ the number of a session); so, $i = p(p(n))$ and $s = p'(p(n))$.

Let a program $i$, a number $s$, finite binary sequences $x$ and $y$, an elementary network $q$, and a nonnegative integer $n$ be given. Define $B(\langle i, s \rangle, x, y, q, n)$ to be true if the following conditions hold
– (i) $l(y) = n$, $x \subseteq y$,
– (ii) $d(y^k) < 1$ for all $k$, $1 \le k \le n$, where $d$ is the $q$-delay function and $y^k = y_1 \ldots y_k$;
– (iii) $l(F_i(y)) > \langle x, s \rangle$.
Let $B(\langle i, s \rangle, x, y, q, n)$ be false otherwise. Define

$$\beta(x, q, n) = \min\{y : p(l(y)) = p(l(x)),\ B(\langle p(p(l(x))), p'(p(l(x))) \rangle, x, y, q, n)\}.$$

Here $p(p(l(x)))$ is a program and $p'(p(l(x)))$ is a number of a session; the minimum is taken with respect to the lexicographic ordering of strings; we suppose that $\min \emptyset$ is undefined.

Lemma 2. For each computable operation $F_i$ and for each finite sequence $x$ such that $F_i(\omega) \in \Omega$ for some infinite extension $\omega$ of $x$ (i.e., $x \subseteq \omega$), $\beta(x, q, n)$ is defined for all sufficiently large $n$ such that $p(p(n)) = i$.

Proof. The needed sequence $y$ exists for all sufficiently large $n$, since $l(F_i(\omega^n)) > \langle x, s \rangle$ holds for all sufficiently large $n$, $p(n) = \langle i, s \rangle$. △

The goal of the construction below is the following. Each extra edge $\sigma$ will be assigned to some task number $I = \langle i, s \rangle$ such that $p(l(st(\sigma))) = p(l(ter(\sigma))) = I$. The goal of the task $I$ is to define a finite set of extra edges $\sigma$ such that for each infinite binary sequence $\omega$ one of the following conditions holds: either $\omega$ contains some extra edge as a subword, or the network flow delay function $d$ equals 1 on some initial fragment of $\omega$. For each extra edge $\sigma$ added to the network $q$, $B(I, st(\sigma), ter(\sigma), q_{n-1}, n)$ is true; it is false otherwise.
Lemma 5 shows that $\bar Q(E_Q) > 1 - \epsilon$, where $Q$ is the $q$-flow and $E_Q$ is its support set.

Construction. Let $\rho(n) = (n + n_0)^2$ for some sufficiently large $n_0$ (the value $n_0$ will be specified below in the proof of Lemma 5).

Using mathematical induction on $n$, we define a sequence $q_n$ of elementary networks. Put $q_0(\sigma) = 1/2$ for all edges $\sigma$ of length one.

Assume $n > 0$ and a network $q_{n-1}$ is defined. Let $d_{n-1}$ be the $q_{n-1}$-delay function and let $G_{n-1}$ be the set of all extra edges. We suppose also that $l(ter(\sigma)) < n$ for all $\sigma \in G_{n-1}$.

Let us define a network $q_n$. First, we define a network flow delay function $d_n$ and a set $G_n$.

Let $w(I, q_{n-1})$ be equal to the minimal $m$ such that $p(m) = I$ and $m > l(ter(\sigma))$ for each extra edge $\sigma \in G_{n-1}$ such that $p(l(st(\sigma))) < I$.

The inequality $w(I, q_m) \ne w(I, q_{m-1})$ can be induced by some task $J < I$ that adds an extra edge $\sigma = (x, y)$ such that $l(y) > w(I, q_{m-1})$ and $p(l(x)) = p(l(y)) = J$. Lemma 3 (below) will show that this can happen only at finitely many steps of the construction.

The construction can be split up into three cases.

Case 1. $w(p(n), q_{n-1}) = n$ (the goal of this part is to start a new task $I = p(n)$ or to restart the existing task $I = p(n)$ if it was destroyed by some task $J < I$ at some preceding step).

Put $d_n(y) = 1/\rho(n)$ for $l(y) = n$ and define $d_n(y) = d_{n-1}(y)$ for all other $y$. Put also $G_n = G_{n-1}$.

Case 2. $w(p(n), q_{n-1}) < n$ (the goal of this part is to process the task $I = p(n)$).

Let $C_n$ be the set of all $x$ such that $w(I, q_{n-1}) \le l(x) < n$, $0 < d_{n-1}(x) < 1$, the function $\beta(x, q_{n-1}, n)$ is defined$^3$ and there is no extra edge $\sigma \in G_{n-1}$ such that $st(\sigma) = x$.
In this case for each $x \in C_n$ define $d_n(\beta(x, q_{n-1}, n)) = 0$, and for all other $y$ of length $n$ such that $x \subseteq y$ define

$$d_n(y) = d_{n-1}(x) / (1 - d_{n-1}(x)).$$

Define $d_n(y) = d_{n-1}(y)$ for all other $y$. We add an extra edge to $G_{n-1}$, namely, define

$$G_n = G_{n-1} \cup \{(x, \beta(x, q_{n-1}, n)) : x \in C_n\}.$$

We say that the task $I = p(n)$ adds the extra edge $(x, \beta(x, q_{n-1}, n))$ to the network and that all existing tasks $J > I$ are destroyed by the task $I$.

After Case 1 and Case 2, define for each edge $\sigma$ of unit length

$$q_n(\sigma) = \frac{1}{2}(1 - d_n(st(\sigma)))$$

and $q_n(\sigma) = d_n(st(\sigma))$ for each extra edge $\sigma \in G_n$.

Case 3. Cases 1 and 2 do not hold. Define $d_n = d_{n-1}$, $q_n = q_{n-1}$, $G_n = G_{n-1}$.

Using this construction, we define the network $q = \lim_{n \to \infty} q_n$, the network flow delay function $d = \lim_{n \to \infty} d_n$, and the set of extra edges $G = \cup_n G_n$. The functions $q$ and $d$ are computable and the set $G$ is recursive by their definitions. Let $Q$ denote the $q$-flow.

The following lemma shows that any task can add new extra edges only at a finite number of steps.

Let $G(I)$ be the set of all extra edges added by the task $I$, $w(I, q) = \lim_{n \to \infty} w(I, q_n)$.

Lemma 3. The set $G(I)$ is finite and $w(I, q) < \infty$ for all $I$.

Proof. Note that if $G(J)$ is finite for all $J < I$ then $w(I, q) < \infty$. Then we must prove that the set $G(I)$ is finite for all $I$. Suppose that the opposite assertion holds. Let $I$ be the minimal number such that $G(I)$ is infinite. By the choice of $I$ the sets $G(J)$ for all $J < I$ are finite. Then $w(I, q) < \infty$.

By definition if $d(\omega^m) \ne 0$ then $p_m = 1/d(\omega^m)$ is a positive integer. Besides, if $(\omega^n, y), (\omega^m, y') \in G(I)$, where $n < m$ and $l(y) = m$, then $p_n > p_m$. Hence, for each $\omega \in \Omega$ a maximal $m$ exists such that $(\omega^m, y) \in G(I)$ for some $y$ or no such extra edge exists. In the latter case put $m = w(I, q)$.
Define $u(\omega) = 1/d(\omega^m)$. By the construction, the integer-valued function $u(\omega)$ is constant on the interval $\Gamma_{\omega^m}$. Hence, it is continuous in the topology generated by such intervals. Since $\Omega$ is compact in this topology, $u(\omega)$ is bounded. Then for some $m'$, $u(\omega) = u(\omega^{m'})$ for all $\omega$. By the construction, if any extra edge of $I$th type was added to $G(I)$ at some step then $d(y) > d(x)$ holds for some new pair $(x, y)$ such that $x \subseteq y$. This is a contradiction if $G(I)$ is infinite. △

$^{3}$ In particular, $p(l(x)) = I$ and $l(\beta(x, q_{n-1}, n)) = n$.

An infinite sequence $\alpha \in \Omega$ is called an $I$-extension of a finite sequence $x$ if $x \subseteq \alpha$ and $B(I, x, \alpha^n, n)$ is true for almost all $n$. A sequence $\alpha \in \Omega$ is called $I$-closed if $d(\alpha^n) = 1$ for some $n$ such that $p(n) = I$, where $d$ is the $q$-delay function. Note that if $\sigma \in G(I)$ is some extra edge then $B(I, \mathrm{st}(\sigma), \mathrm{ter}(\sigma), n)$ is true, where $n = l(\mathrm{ter}(\sigma))$.

Lemma 4. Assume that for each initial fragment $\omega^n$ of an infinite sequence $\omega$ some $I$-extension exists. Then either the sequence $\omega$ will be $I$-closed in the process of the construction or $\omega$ contains an extra edge of $I$th type (i.e., such that $\mathrm{ter}(\sigma) \subseteq \omega$ for some $\sigma \in G(I)$).

Proof. Assume a sequence $\omega$ is not $I$-closed. By Lemma 3, a maximal $m$ exists such that $p(m) = I$ and $d(\omega^m) > 0$. Since the sequence $\omega^m$ has an $I$-extension and $d(\omega^k) < 1$ for all $k$, by Case 2 of the construction a new extra edge $(\omega^m, y)$ of $I$th type must be added to the binary tree. By the construction, $d(y) = 0$ and $d(z) \neq 0$ for all $z$ such that $\omega^m \subseteq z$, $l(z) = l(y)$, and $z \neq y$. By the choice of $m$ we have $y \subseteq \omega$. △

Obviously, $Q(y) = 0$ if and only if $q(\sigma) = 0$ for some edge $\sigma$ of unit length located on $y$ (this edge satisfies $\mathrm{ter}(\sigma) \subseteq y$ and $d(\mathrm{st}(\sigma)) = 1$). Then the relation $Q(y) = 0$ is recursive and $E_Q = \Omega \setminus \cup_{d(x)=1} \Gamma_x$.
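The construction only asks that $n_0$ in $\rho(n) = (n + n_0)^2$ be "sufficiently large", and the proof of Lemma 5 uses this in the form $\sum_n (n + n_0)^{-2} \le \epsilon$. The following minimal sketch makes one sufficient choice concrete. It assumes the sum runs over steps $n \ge 1$ and uses the standard integral bound $\sum_{n \ge 1} (n + n_0)^{-2} < \int_{n_0}^{\infty} x^{-2}\,dx = 1/n_0$, so any $n_0 \ge 1/\epsilon$ works. The function names `sufficient_n0` and `delay_mass` are hypothetical and do not come from the paper.

```python
import math


def sufficient_n0(eps: float) -> int:
    """Return one n0 with sum_{n>=1} (n + n0)**-2 < eps (hypothetical helper).

    Relies on the integral bound sum_{n>=1} (n + n0)**-2 < 1/n0, so any
    n0 >= 1/eps is enough. This is a sufficient choice, not the minimal one.
    """
    if not 0.0 < eps < 1.0:
        raise ValueError("eps must lie strictly between 0 and 1")
    return math.ceil(1.0 / eps)


def delay_mass(n0: int, steps: int) -> float:
    """Partial sum of 1/rho(n) = (n + n0)**-2 over n = 1..steps."""
    return sum(1.0 / (n + n0) ** 2 for n in range(1, steps + 1))


if __name__ == "__main__":
    eps = 0.01
    n0 = sufficient_n0(eps)
    # Every partial sum stays below eps, since the full series is below 1/n0.
    print(n0, delay_mass(n0, 100_000) < eps)
```

Under these assumptions, the total delay introduced by all Case 1 steps is at most $\epsilon$, which is the quantity that the induction in Lemma 5 subtracts from 1.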
Lemma 5. It holds that $\bar{Q}(E_Q) > 1 - \epsilon$.

Proof. We bound $\bar{Q}(\Omega)$ from below. For any $n$, let $q_n$ be the network defined at step $n$, let $R_n$ be defined by (A.1), and let $d_n$ be the corresponding $q_n$-delay function. If $w(p(n), q_{n-1}) = n$ (i.e., Case 1 holds at step $n$) then
$$\sum_{l(u)=n} d_n(u) R_n(u) = (n + n_0)^{-2} \sum_{l(u)=n} R_n(u) \le (n + n_0)^{-2}. \tag{A.2}$$
Assume Case 2 holds at step $n$ and $x \in C_n$ is such that $(x, y) \in G$ for some $y$, $l(y) = n$. Since by the construction $d_n(y) = 0$,
$$\sum_{l(z)=n,\, x \subseteq z} d_n(z) R_n(z) \le \frac{d_{n-1}(x)}{1 - d_{n-1}(x)} \sum_{l(z)=n,\, x \subseteq z,\, z \neq y} R_n(z). \tag{A.3}$$
We have
$$\sum_{l(z)=n,\, x \subseteq z,\, z \neq y} R_{n-1}(z) \le (1 - d_{n-1}(x)) R_{n-1}(x). \tag{A.4}$$
By the construction, $R_n(z) = R_{n-1}(z)$ for $z$ such that $l(z) = n$, $x \subseteq z$, $z \neq y$. Then
$$\sum_{l(z)=n,\, x \subseteq z} d_n(z) R_n(z) \le d_{n-1}(x) R_{n-1}(x). \tag{A.5}$$
By the choice of $n_0$, $\sum_n (n + n_0)^{-2} \le \epsilon$. Using (A.2) and (A.5), we can then prove by induction on $n$ that
$$\bar{Q}(\Omega) = \inf_n \sum_{l(u)=n} Q(u) \ge \inf_n \sum_{l(u)=n} R(u) \ge 1 - \epsilon.$$
The lemma is proved. △

Lemma 6. For any infinite sequence $\omega \in E_Q$ and for any computable operation $F$, if the sequence $F(\omega)$ is infinite then it is not Martin-Löf random with respect to the uniform probability distribution.

Proof. Assume that $\omega$ is an infinite sequence and $F$ is a computable operation such that $F(\omega)$ is infinite. Then $F_i = F$ for some $i$. Define
$$U_s = \cup \{\Gamma_{\beta(x, q_{n-1}, n)} : x \in C_n,\ p(n) = \langle i, s \rangle\},$$
where $C_n$ is the set from Case 2 of the construction. By definition,
$$L(U_s) = \sum_{x \in C_n} 2^{-\langle x, s \rangle} \le 2^{-cs}$$
for some positive constant $c$, and $F_i(\omega) \in \cap_s U_s$. Therefore, the sequence $F(\omega)$ is not Martin-Löf random. Lemma 6 and Theorem 3 are proved. △

References

[1] Borovkov, A.A.: Theory of Probability. Nauka. 1999 (Probability Theory. Gordon and Breach. 1998).
[2] Cox, D.R., Hinkley, D.V.: Theoretical Statistics. London. Chapman and Hall. 1974
[3] Levin, L.A.: On the notion of random sequence. Soviet Math. Dokl. 14 (1973) 1413–1416
[4] Levin, L.A.: Laws of information conservation (non-growth) and aspects of the foundation of probability theory. Problems Inform. Transmission. 10 (1974) 206–210
[5] Levin, L.A., V'jugin, V.V.: Invariant Properties of Informational Bulks. LNCS. 53 (1977) 359–364
[6] Levin, L.A.: Randomness conservation inequalities; information and independence in mathematical theories. Inform. and Control. 61 (1984) 15–37
[7] Li, M., Vitányi, P.: An Introduction to Kolmogorov Complexity and Its Applications. 2nd ed. New York. Springer-Verlag, 1997
[8] Martin-Löf, P.: The definition of random sequences. Inform. and Control. 9 (1966) 602–619
[9] Rogers, H.: Theory of Recursive Functions and Effective Computability. New York: McGraw Hill, 1967
[10] Shen, A., Uspensky, V.A., Vereshchagin, N.K.: Lecture Notes on Kolmogorov Complexity. http://lpes.math.msu.su/~ver/kolm-book. 2007
[11] Vovk, V.G., V'yugin, V.V.: On the empirical validity of the Bayesian rule. J. R. Statist. Soc. B. 55 (1993) 317–351
[12] V'yugin, V.V.: On Turing invariant sets. Soviet Math. Dokl. 17 (1976) 1090–1094
[13] V'yugin, V.V.: The algebra of invariant properties of binary sequences. Problems Inform. Transmission. 18 (1982) 147–161
[14] V'yugin, V.V.: On Empirical Meaning of Randomness with Respect to a Real Parameter. CSR 2007 (V. Diekert, M. Volkov, and A. Voronkov (Eds.)), LNCS 4649, 387–396. Springer-Verlag, Berlin Heidelberg 2007.
[15] Zvonkin, A.K., Levin, L.A.: The complexity of finite objects and the algorithmic concepts of information and randomness. Russ. Math. Surv. 25 (1973) 83–124