Randomness is valid at large numbers

Randomness is a central concept to statistics and physics. Here, a statistical analysis shows experimental evidence that tossing coins and finding last digits of prime numbers are identical regarding statistics for equally likely outcomes. This analy…

Authors: Yeseul Kim, Byung Mook Weon

Yeseul Kim (1) and Byung Mook Weon (1, 2, 3, *)
(1) Soft Matter Physics Laboratory, School of Advanced Materials Science and Engineering, SKKU Advanced Institute of Nanotechnology (SAINT), Sungkyunkwan University, Suwon 16419, South Korea
(2) Research Center for Advanced Materials Technology, Sungkyunkwan University, Suwon 16419, South Korea
(3) Department of Biomedical Engineering, Johns Hopkins University, Baltimore, Maryland 21218, USA
(Dated: November 26, 2021)

Randomness is a central concept to statistics and physics. Here, a statistical analysis shows experimental evidence that tossing coins and finding last digits of prime numbers are identical regarding statistics for equally likely outcomes. This analysis explains why randomness in equally likely outcomes can be valid only at large numbers.

Keywords: randomness, coin tossing, prime number

Randomness is essential in statistics as well as in making a fair decision [1–4] and in making pseudo-random numbers [5, 6]. Coin tossing is a basic example of a random phenomenon [2]: by flipping a coin, one expects to choose randomly between heads and tails. Coin tossing is a simple and fair way of deciding between two arbitrary options [3]. It is commonly assumed that coin tossing is random. For a fair coin, the probabilities of heads and tails are equal, i.e., Prob(heads) = Prob(tails) = 50%, as illustrated in Fig. 1. This situation is valid only under the condition that all possible orientations of the coin are equally likely [4]. In fact, real coins spin in three dimensions and have finite thickness, so that coin tossing is a physical phenomenon governed by Newtonian mechanics [1–4]. Making a choice by flipping a coin is still important in quantum mechanical statistics [6, 7].
The randomness in coin tossing or rolling dice is of great interest in physics and statistics [8–13]: coin or dice tossing is commonly believed to be random but can be chaotic in the real world [14].

A similar situation appears in the distribution of prime numbers. Prime numbers are positive integers larger than 1 that are divisible only by 1 and themselves. All primes except 2 and 5 end in a last digit ($j$) of 1, 3, 7, or 9. In mathematics, the last digits are believed (without a proof) to be random or evenly distributed when numbers are large enough [15]. If the last digits of prime numbers occur with the same frequency, then the probabilities of the four last digits would be equal, i.e., Prob($j$) = 25%, as illustrated in Fig. 2. The study of the distribution of prime numbers has fascinated mathematicians and physicists for many centuries [15–19]. The distribution of prime numbers is essential to mathematics as well as physics and biology. Particularly in many disparate natural datasets and mathematical sequences, the leading digit ($d$) is not uniformly distributed, but instead has a biased probability $P(d) = \log_{10}(1 + 1/d)$ with $d = 1, 2, \ldots, 9$, known as Benford's law [16–18]. The distribution of last digits of prime numbers is another important topic: in particular, it is unclear whether the four last digits are random or evenly distributed when numbers are large enough.

(*) Electronic address: bmweon@skku.edu

In this article, we present experimental evidence from a statistical analysis, as highlighted in Fig. 3, indicating that tossing coins and finding last digits of primes are intrinsically identical in statistics with respect to equally likely outcomes. This analysis explains that randomness can be valid only at large numbers.

There are many examples of equally likely outcomes: representatively, coin tossing is believed to occur with a probability of 50% between heads and tails.
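
The statement that every prime other than 2 and 5 ends in 1, 3, 7, or 9 can be checked directly on a small range. The following is a minimal sketch, not the authors' procedure: the sieve, the function names `primes_up_to` and `last_digit_counts`, and the limit of 10^5 (chosen only to keep the run short) are all assumptions made for illustration.

```python
import math
from collections import Counter


def primes_up_to(limit):
    """Return all primes <= limit using a sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = bytearray([1]) * (limit + 1)
    is_prime[0] = is_prime[1] = 0
    for p in range(2, math.isqrt(limit) + 1):
        if is_prime[p]:
            # Cross out multiples of p, starting at p*p.
            is_prime[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if is_prime[n]]


def last_digit_counts(primes):
    """Tally the base-10 last digit of each prime."""
    return Counter(p % 10 for p in primes)


# Hypothetical limit for illustration; the text analyses integers up to 10**7.
primes = primes_up_to(10**5)
counts = last_digit_counts(primes)
total = len(primes)
for digit in sorted(counts):
    # Print digit, frequency, and relative frequency.
    print(digit, counts[digit], round(counts[digit] / total, 4))
```

The digits 2 and 5 each appear exactly once (for the primes 2 and 5 themselves), and the remaining primes are shared among the last digits 1, 3, 7, and 9.
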
For repeated experiments with the same sample, if the frequencies of the expected outcomes are equal, one can say that the expected outcome of the sample is random. Here we suggest a simple way to define the randomness concerning equally likely outcomes at large numbers.

The frequency of each outcome ($n_i$) can vary in a complicated way according to experiments and conditions. The relative frequency of an outcome ($f_i = n_i/N$) is obtained by dividing $n_i$ by the total number of repetitions ($N$, or equivalently the size of the sample). The range of frequency ($R$) is defined as the difference between the maximum frequency ($n_i^{\max}$) and the minimum frequency ($n_i^{\min}$), i.e., $R = n_i^{\max} - n_i^{\min}$. In statistics, it is well known that the range ($R$) tends to be larger, the larger the size of the sample ($N$) [20, 21]. This tendency can be described by a power-law scaling $R \sim N^{\alpha}$ where $0 < \alpha < 1$. Such a power-law scaling commonly appears in statistics and physics [22, 23]. Additionally, the range of relative frequency ($R/N$) between equally likely outcomes is defined as $R/N = f_i^{\max} - f_i^{\min}$, which is equivalent to $(n_i^{\max} - n_i^{\min})/N$. From $R \sim N^{\alpha}$, $R/N$ should follow the simple relation $R/N \sim N^{\beta}$, where $\beta = \alpha - 1$ (note that $\beta < 0$ because $\alpha < 1$). The statistical constraint $R/N \sim N^{\beta}$ ($\beta < 0$) implies that the frequency of each outcome should become equal (because $R/N \to 0$) as the total number of repetitions increases ($N \to \infty$). Consequently, the condition of $R/N \to 0$ at $N \to \infty$ explains why randomness is valid only at large numbers, which is known as the law of large numbers in probability theory. In this study, we would like to identify the $\beta$ exponents for equally likely outcomes; in particular, coin tosses (with two outcomes) and last digits of prime numbers (with four outcomes).

FIG. 1: Coin tosses. (a) Schematic illustration of a fair coin with two equally likely outcomes (heads or tails): each outcome has a probability of 50%. (b) The relative frequency of heads versus the number of tosses (logarithmic axis), taken from five experiments (tossing each coin up to $10^4$ repetitions). Different experiments are illustrated by different colors. The relative frequency of heads gradually approaches 50% [toward the dashed line]. The raw data for the five experiments are summarized in the supplementary tables S1–S5.

First, we conducted experiments for coin tossing. To rule out physical and mechanical aspects of tossed coins, we used an online virtual coin toss simulation application (http://www.virtualcointoss.com) with an ideal coin of zero thickness, where there is no bias between heads and tails, ensuring equal probabilities for heads and tails. Our experiments with perfectly thin coins enable us to consider only the statistical features of the coin-tossing problem. We carried out five separate experiments. The frequency of heads ($n_H$) or tails ($n_T$) for each experiment was recorded with the number of tosses ($N$) (equivalently the size of the sample). The relative frequencies ($f_H = n_H/N$ or $f_T = n_T/N$ for heads or tails), the range of frequency [$R = n_i^{\max} - n_i^{\min}$ where $i$ = heads or tails], and the range of relative frequency [$R/N = (n_i^{\max} - n_i^{\min})/N$] are summarized in the supplementary tables S1–S5. Each experiment is illustrated by a different color.

FIG. 2: Last digits of prime numbers. (a) Schematic illustration of last digits (1, 3, 7, and 9) of prime numbers up to $10^7$. The probability of each last digit is expected to be equal at 25% at large numbers. (b) The relative frequency of each last digit (1, 3, 7, and 9) versus the number of primes (logarithmic axis) gradually approaches 25% [toward the dashed line]. The raw data are summarized in the supplementary table S6.

In turn, we examined the last digits of prime numbers. As is well known, all prime numbers except 2 and 5 end in a last digit of 1, 3, 7, or 9, and the last digits are expected to be random when numbers are large enough, which suggests that the frequencies of the four last digits should be equal, i.e., Prob($j$) = 25% [Fig. 2(a)]. For the prime numbers in base 10 for integers up to $10^7$ (where 664,579 prime numbers exist in total), we counted the frequency of each last digit ($n_j$ where $j$ = 1, 3, 7, or 9), the range of frequency [$R = n_j^{\max} - n_j^{\min}$], and the range of relative frequency [$R/N = (n_j^{\max} - n_j^{\min})/N$], as summarized in the supplementary table S6. Here the number of prime numbers ($N$) (including 2 and 5) is equivalent to the size of the sample.

The statistical uncertainties were checked for the coin tossing experiments in the plot of $R/N$ versus $N$ [Fig. 3(a)] by measuring one standard deviation from five experiments (from five data points for $R/N$ for a given $N$). However, the prime numbers and the range of relative frequency are completely deterministic for integers up to $10^7$, which implies no errors in the plot of $R/N$ versus $N$ [Fig. 3(b)].

For coin tosses, the relative frequency of heads for the five experiments varies differently at small numbers but converges similarly on the expected value (50%) at large numbers [toward the dashed line in Fig. 1(b)], which supports that coin tossing is a problem of equally likely outcomes. The well-known statistical feature that the range ($R$) tends to be larger, the larger the size of the sample ($N$), suggests a power-law scaling $R \sim N^{\alpha}$ ($0 < \alpha < 1$).
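
The coin-tossing procedure and the range of relative frequency defined above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's `random` module stands in for the online virtual coin toss application, and the function names, seeds, and checkpoint values of $N$ are hypothetical choices, not the authors' settings.

```python
import random


def range_of_relative_frequency(counts):
    """R/N = (n_max - n_min) / N for a list of outcome counts."""
    total = sum(counts)
    return (max(counts) - min(counts)) / total


def coin_toss_experiment(checkpoints, seed):
    """Toss a simulated fair coin and record R/N at each checkpoint N."""
    rng = random.Random(seed)  # assumption: pseudo-random stand-in for the website
    heads = 0
    results = {}
    for n in range(1, max(checkpoints) + 1):
        heads += rng.random() < 0.5  # True counts as one head
        if n in checkpoints:
            results[n] = range_of_relative_frequency([heads, n - heads])
    return results


# Hypothetical checkpoints up to 10**4 tosses, matching the range in the text.
checkpoints = {10, 100, 1000, 10000}
for seed in range(5):  # five independent runs, as in the text
    run = coin_toss_experiment(checkpoints, seed)
    print(seed, [round(run[n], 4) for n in sorted(run)])
```

Each printed row lists $R/N$ at increasing $N$ for one run; the spread across the five rows at a given $N$ corresponds to the error bars described for Fig. 3(a).
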
On this basis, we expected a simple relation for the range of relative frequency for heads and tails, denoted $R/N = (n_i^{\max} - n_i^{\min})/N$ ($i$ = heads or tails), as $R/N \sim N^{\beta}$ where $\beta = \alpha - 1 < 0$. As illustrated in Fig. 3(a), with $R/N = 3.1461\,N^{-0.6237}$ for the trend line, we obtained $\beta = -0.6237$ (standard error = ±0.0272) for the five coin tossing experiments (error bars come from one standard deviation). This result clearly supports the validity of Prob($i$) = 50% by $R/N \to 0$ at $N \to \infty$, indicating statistical evidence of randomness for coin tosses at large numbers, which is consistent with a common belief about coin tossing [9].

For last digits of prime numbers, the relative frequency of last digits finally approaches the ultimately expected value (25%) [toward the dashed line in Fig. 2(b)]. The range of frequency among last digits increases with the total number of primes as a power-law scaling $R \sim N^{\alpha}$ with $\alpha \approx 0.4$, which is similar to the case of coin tossing. The range of relative frequency among last digits, denoted $R/N = (n_j^{\max} - n_j^{\min})/N$ (where $j$ = 1, 3, 7, or 9), shows $R/N \sim N^{\beta}$ where $\beta = -0.5832$ (standard error = ±0.0094) for last digits [Fig. 3(b): $R/N = 0.5294\,N^{-0.5832}$ for the trend line], which is identical to the case of coin tossing. This result supports the validity of Prob($j$) = 25% for each of the four last digits by $R/N \to 0$ at $N \to \infty$, indicating that the last digits of primes would occur with the same frequency at large numbers.

The above two examples of equally likely outcomes lead to the same results: as the size of the sample ($N$) increases, the range of relative frequency ($R/N$) decreases, following the power-law scaling $R/N \sim N^{\beta}$. Here the $\beta$ exponents were found to be approximately −0.6 for coin tossing experiments with Prob($i$) = 50% for two outcomes and for last digits of primes with Prob($j$) = 25% for four outcomes.
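
One common way to estimate an exponent such as $\beta$ is an ordinary least-squares line through the points on log–log axes. The text does not state how the trend lines were obtained, so the sketch below is an assumption about the method, and `fit_power_law` is a hypothetical helper. To keep the example checkable, it uses synthetic points generated from the reported coin-toss trend line rather than the supplementary data.

```python
import math


def fit_power_law(ns, ys):
    """Least-squares fit of log10(y) = log10(a) + beta * log10(n)."""
    xs = [math.log10(n) for n in ns]
    ls = [math.log10(y) for y in ys]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_l = sum(ls) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, ls))
    beta = sxy / sxx  # slope on log-log axes
    prefactor = 10 ** (mean_l - beta * mean_x)  # intercept mapped back from log10
    return prefactor, beta


# Synthetic points on the coin-toss trend line reported in the text (not raw data).
ns = [10, 100, 1000, 10000]
ys = [3.1461 * n ** -0.6237 for n in ns]
prefactor, beta = fit_power_law(ns, ys)
alpha = beta + 1  # from beta = alpha - 1
print(round(prefactor, 4), round(beta, 4), round(alpha, 4))
```

Because the synthetic points lie exactly on a power law, the fit recovers the prefactor and exponent used to generate them; real data such as tables S1–S6 would scatter around the line and yield a standard error for $\beta$.
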
(The difference in the prefactor is mostly uninteresting in the power-law scaling [22].) This result shows that randomness can be valid only at large numbers for both cases. This result provides experimental evidence that tossing coins and finding last digits of prime numbers are intrinsically identical with respect to equally likely outcomes.

FIG. 3: Analogy between coin tosses and last digits of primes. The range of relative frequency ($R/N$) on log–log axes (a) for coin tosses between heads and tails, versus the number of tosses (with error bars coming from one standard deviation from five experiments), and (b) for last digits among the four last digits, versus the number of primes (with no error bars because the prime numbers are deterministic). In both cases, a power-law scaling $R/N \sim N^{\beta}$ is found with $\beta = -0.6237$ (standard error = ±0.0272) for coin tosses and $\beta = -0.5832$ (standard error = ±0.0094) for last digits. The trend line is taken as $R/N = 3.1461\,N^{-0.6237}$ (adjusted $R^2$ = 0.94085) in (a) and $R/N = 0.5294\,N^{-0.5832}$ (adjusted $R^2$ = 0.98607) in (b). This result shows that randomness can be valid only at large numbers for both cases.

In conclusion, we introduced a simple expression for randomness at large numbers. From statistical analyses of coin tosses and last digits of primes, we showed that the range of relative frequency between equally likely outcomes ($R/N$) decreases as the total repetition number ($N$) increases. A power-law scaling for $R/N$ versus $N$ in both cases was found as $R/N \sim N^{\beta}$ ($\beta \approx -0.6$), implying that the frequency of each outcome becomes equal ($R/N \to 0$) as the total number of repetitions increases ($N \to \infty$).
The condition of $R/N \to 0$ at $N \to \infty$ explains why randomness is valid only at large numbers. This result consequently supports that finding last digits of primes is intrinsically identical to tossing coins in statistics: both cases are the same problem of equally likely outcomes. Finally, our finding of the power-law relation between the range of relative frequency among equally likely outcomes and the total number of repetitions would be significant for understanding the validity of randomness at large numbers (also known as the law of large numbers), which would be important in statistics, physics, and mathematics.

Acknowledgments. This research was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (Grant No. NRF-2016R1D1A1B01007133 and Grant No. 2019R1A6A1A03033215).

[1] J. Ford, Phys. Today 36, 40–47 (1983).
[2] P. Diaconis, S. Holmes, and R. Montgomery, SIAM Rev. 49, 211–235 (2007).
[3] J. Strzalko et al., Phys. Rep. 469, 59–92 (2008).
[4] L. Mahadevan and E. H. Yong, Phys. Today 64, 66–67 (2011).
[5] M. Falcioni, L. Palatella, S. Pigolotti, and A. Vulpiani, Phys. Rev. E 72, 016220 (2005).
[6] T. E. Murphy and R. Roy, Nature Photonics 2, 714–715 (2008).
[7] C. Ferrie and J. Combes, Phys. Rev. Lett. 113, 120404 (2014).
[8] T. O'Hagan, Significance 1, 132–133 (2004).
[9] B. Hayes, Am. Scientist 99, 282–287 (2011).
[10] V. Z. Vulovic and R. E. Prange, Phys. Rev. A 33, 576–582 (2008).
[11] J. Nagler and P. Richter, Phys. Rev. E 78, 036207 (2008).
[12] J. Strzalko, J. Grabski, A. Stefanski, and T. Kapitaniak, Int. J. Bifurcation Chaos 20, 1175 (2010).
[13] M. Kapitaniak, J. Strzalko, J. Grabski, and T. Kapitaniak, Chaos 22, 047504 (2012).
[14] M. Le Bellac, Prog. Biophys. Mol. Biol. 110, 97–105 (2012).
[15] A. Granville and G. Martin, Am. Math. Monthly 113, 1–33 (2006).
[16] J.-C. Pain, Phys. Rev. E 77, 012102 (2008).
[17] B. Luque and L. Lacasa, Proc. R. Soc. A 465, 2197–2216 (2009).
[18] L. Shao and B.-Q. Ma, Physica A 389, 3109–3116 (2010).
[19] T. Tao, in An Invitation to Mathematics, D. Schleicher and M. Lackmann (eds.) (Springer-Verlag, Berlin Heidelberg, 2011).
[20] S. P. Hozo, B. Djulbegovic, and I. Hozo, BMC Med. Res. Methodol. 5, 13 (2005).
[21] X. Wan, W. Wang, J. Liu, and T. Tong, BMC Med. Res. Methodol. 14, 135 (2014).
[22] M. E. J. Newman, Contemp. Phys. 46, 323–351 (2005).
[23] S. A. Frank, J. Evol. Biol. 22, 1563–1585 (2009).
