Using GPU Simulation to Accurately Fit to the Power-Law Distribution

This article describes a methodology for fitting experimental data to the discrete power-law distribution and provides the results of a detailed simulation exercise used to calculate accurate cutoff values used to assess the fit to a power-law distri…

Authors: Efstratios Rappos, Stephan Robert

Using GPU Simulation to Accu rately Fit to the P o w er-La w Distribution Efstra tios Rapp os and Stepha n Rob ert 1 Institut des tec hnologies de l’i nformation et de la comm unication Haute Ecole d’Ing ´ enierie et de G estion du Can ton de V aud (HEIG-VD) Haute Ecole Sp´ ec ialis ´ ee de Suisse occidentale Institute for Information and Co mmunicatio n T echnolo gies School of Business and Engineering V a ud (HEIG-VD) Universit y of Applied Sciences of W estern Swi tzerland Abstract This article describes a metho dology for fi t ting exp erimental data to the discrete p ow er-la w dis- tribution and provides the results of a detailed sim ulation exercise used to calculate accurate cutoff v alues used to assess the fit t o a p ow er-la w distribution when using the maximum likeli ho o d estimation for the exp onent of t he distribution. Using massivel y parallel programming computing, we were able to acceler- ate by a factor of 60 th e computational t ime required for these calculations across a range of parameters and constru ct a series of detailed tables containing the test va lues to b e u sed in a Kolmogoro v-Smirnov goo dness-of-fit test, allo wing for an accurate assessment of the p o w er-la w fit from empirical data. Keyw o rds: Zipf distribution; p o w er law; Kolmogoro v-Smirnov t est; parallel compu t ing simulation; graph - ics pro cessing unit programming P A CS: 02.50.Ng– Distribution th eory and Monte Carlo stud ies; 05.10.Ln–Mon te Carlo metho ds; 89.75.- k–Complex systems MSC 2010: 62F03, 68W10, 62P10, 62P30, 62P35, 62Q05 Ma y 2013 1 Intro duction Po w er-law distributions and their extensions characterize many ph ysical, biolo gical and so cial phenomena [2, 8, 1 2, 7, 9, 11] but the pro ces s of accur ately fitting a p ow er-law distribution to empirical da ta is not straig htf orward, and in some ca ses very imprecise methods a re known to b e used, namely ‘estima ting ’ the p ow er-law exponent and fit via linea r reg ression on a lo g-log plot [2]. A popular metho d to fi t a pow er-law is by calculat- ing t he maximum lik eliho o d estimator (MLE) for the distribution exp onent and then using the Kolmo gorov- Smirnov (KS) test to a ssess the go o dness-of-fit by com- paring against sim ulatio n-derived cutoff v alues. The prac- ticalities of this appro a ch ar e des crib ed in [2] and [3]. T o pro duce these cutoff v alues, a large num ber of s ta- tistical simulations needs to b e run. How ever, gene r ic ta- bles cannot alw ays b e used accurately , as the cutoff v alues depe nd on the sa mple size and the es timated v alue o f the exp onent o f the data. Pro ducing suc h ta bles for the power-law is computa- tionally c ha llenging. The most complete set of tables to date was pro duced by [3 ]; ho w ever, pres umably due to limitations of the co mputer technology of the time, aggre- gate v alues were obtaine d acro ss a range of v a lues for the estimated ex po nent. W e extend this work b y providing the calculated cutoff tables for a v ariety of sample sizes and v alues for the exp onent, a task that would require ov er 2.5 years of computational time on a t ypical PC. W e also describe the metho dology and provide computer co de which ena bles r e searchers to calc ula te the corresp o nding tables for v alue s of the exp o nent other than the ones we considered. Recent techn ologica l de velopments in the field of Graphics Pro cess ing Units (GPU), hav e resulted in consumer-level graphical car ds being able to assist with computationally in tensive tasks, becaus e their massively parallel design ca n o utp er form traditional CPU alg o- rithms. The use of g raphics ca rds to improv e the com- putational p ower for sim ulation methods has been stud- ied in many ar eas s uch as Mo nte Carlo techniques [6] and Bay esian estima tio n [10]. W e demonstrate the use of GP U alg orithms for the estimation of the KS cutoff v alues for assessing the go o dness-o f-fit of p ow er-law data. The use o f pa rallel metho ds allows m uc h larger simulations to be pro duced 1 Contact information: Haute Eco le d’Ing´ enierie et de Gestion du Can ton de V aud, HEIG-VD, Rout e de Cheseaux 1, CH-1401 Yv erdon- les-Bains, Switzerland; Efstratios.Rapp os@heig-vd.ch, Steph an.Robert@heig-vd.ch 1 in a shorter time, pro ducing more accurate results and higher pr ecision. F urthermore, we cons ider the case of the truncated power-law distribution where there is an upp er limit to the distribution v alues. This v ariation allows for ca ses where the exp onent γ < 1 to be fitted, as is the case in some phenomena such as the world-wide-web [1]. W e consider t wo v ersions o f the discrete pow er-low dis- tribution, known as the Zipf distribution, descr ib e d by: p ( k ) = k − γ ζ ( γ ) (1) where – k is a p ositive integer 1, 2 , 3, . . . ; – p ( k ) is the pro bability of obser ving the v alue k ; – γ > 1 is the p ow er-law e xp onent; – ζ ( γ ) is an appro pr iate scaling facto r. In the traditiona l version of the pow er-law the v alue of the integer k is un bounded ( k ≥ 1 ) and in that case the sca ling factor is the Riemann zeta function ζ ( γ ) = P ∞ k =1 k − γ and for convergence we must hav e γ > 1. If we assume that the r ange of v alues for k is finite i.e., k = 1 , 2 , . . . , K , then in this truncated Zipf distribution the scaling factor is ζ ( γ ) = P K k =1 k − γ and we o nly require the exp o nent to b e γ > 0 for conv ergence. 2 Estimating the p ow er-la w exp on e n t from the data The maximum likeliho o d estimato r for the p ower-la w pa- rameter is des crib ed in [3] and applies to b oth v aria tions of the Zipf distribution. If the obse r ved dataset consists of N observ ations x 1 , x 2 , . . . , x N , the b est estima te for γ is the v alue that satisfies the equation ζ ′ ( γ ) ζ ( γ ) = − 1 N N X i =1 log( x i ) (2) where ζ ( γ ) is either the sca ling factor describ e d in the previous section. The ab ove differen tial equa tion c a n ea s- ily b e solved for γ using the standa rd Newton-Raphso n metho d. 3 A KS g o o dne s s - of-fit test fo r p o w er- la w distributions The Kolmogor ov-Smirnov test is a traditional statistical test for go o dness-of-fit, r e lying on calculating the s ta tistic K = sup x | F ∗ ( x ) − S ( x ) | (3) where F ∗ is the hypothesized cumulativ e distribution function and S is the empirical cumulativ e distribution based o n the sample data, whic h is then compared with sp ecific cutoff v a lues. There a re alternative approaches, such a s the genera l Khmaladze transfo r mation [4, 5], but are o utside o f the scop e of this ar ticle. The standard ta- bles of cutoff v alues for the KS test ca nnot be directly used when the model parameter s (the γ in o ur case) hav e bee n estimated from the data , and b esp o ke tables hav e to be created using Mo nt e-Carlo simulation. Moreov er, the tables to b e used als o dep end o n the estimated v a lue o f γ and the sample size. Cutoff v alues provided in [3] were obtained b y sim u- lating 1 0,000 Zipf distributions with a rando m exp onent γ = 1 . 5 to 4 . 0, for 14 logarithmica lly-spaced c ho ices of the sample size. Whilst this metho d pr o duces rea sonable r e- sults, we cannot ig nore the fact that the KS cutoff v a lues depe nd on the calc ulated v a lue of γ and therefore average v alues do not work well for cas es where the p ow er-law fit is marg inal. W e extend the results by providing the corres p o nd- ing test v alues, simulating 50,000 Zipf distributions for 15 similar choices of sample size, and in each case for 1 2 p os- sible v alues of γ . In addition, we consider the case of the truncated distribution where observ atio ns are bo unded at K = 20 , 50 , 100 , 5 0 0 a nd 100 0. W e r ep eat each exp er iment 10 times for each case and tabulate the av erage v alue o b- tained in e ach c a se. In total, this results to a total o f ov er 10,000 sepa r ate simulations compar ed to the 14 used in the above-men tioned resear ch, each one containing five times the num b er of p oints. 4 A CUDA algo rithm fo r the calculation of the KS test values T o a chieve this lev el o f exp erimentation, the simulations were per fo rmed in a parallel computing environmen t con- sisting of t w o GTX59 0 g raphics pro cessing units (GPU) on a PC using the CUD A/ C progra mming languag e. This approach carr ies o ut the calc ulations in a high-end co m- puter g raphics card r ather than in the CPU and the inher- ent para lle l a r chitecture of the GPU makes it well suited for simulation exp erimentation, allowing for a 60 times faster pro gram executio n sp eed compar e d to CPU c a lcu- lations. Indeed, we were able to pr o duce these sim ulation results in just ov er 373 hours of computational time; using traditional CP U progr amming this would hav e taken 2.5 years. The algo rithm, av ailable as a s upplemen tary mater ial to this article, separa tes the simulations in to 7 82 blocks of 6 4 simulations (threads) ea ch. The last 48 s imu lations are discarded to give the required 50,000 simulations. The progra m is r ep eated for the differen t v alues of N , K a nd γ . Care is taken in the co de to e ns ure an efficient execution, for example, the natural logar ithms of the first K integers are pr e-computed and store d in an ar ray: this sp eeds up considerably the calculation since the terms k − γ , which 2 app ear in ζ ( γ ) and its deriv ativ es, can b e calculated as e − γ ln k . Car e should also b e taken, as expla ined in the at- tached co de, to adjust a compiler para meter when running the co de in order to ensure all calc ula tions are car ried out in double-precis ion rather than single-prec is ion b y default and av oid numerical underflow in the calcula tions. T able 1 presents the test v alues to use for the pure Zipf distribution (whic h co rresp onds to a truncated Zipf distri- bution with K = ∞ ) for v ar io us choices of the estimated v alue o f the ex po nent γ . T ables 2 to 6 pres ent the cor re- sp onding tables for the truncated p ower-la w distr ibution with K = 20, 50 , 100 , 500 and 10 00 resp ectively . These refinements extend the accur acy of the imple- men tation. W e note the v ar iation in the c utoff v a lues of T able 1 dep ending on the exp onent γ : for example, the 90% cutoff v alue for a sa mple size of 1,000 ra nges from 0.0056 when γ = 4 to 0.056 9 when γ = 1 . 25, a difference of a factor of 10. In co ntrast, the cor resp onding figure in [3], calcula ted for an ‘average’ exp o nent is re p o rted to be 0.0186 . This demonstra tes the imp o rtance of using cutoff tables that ar e pa rticular not o nly to the sp ecific sample size but also the v alue o f the exp onent γ . In practice, the v a lue of γ calc ula ted from the data will probably not be an exa ct matc h with an y of th e tabu- lated v alues. Ideally , to achiev e the b est level of accura cy , a meticulo us resea rcher w ould hav e to cre a te a besp o ke table containing the cutoff v alues that co rresp ond to the exact v alue of γ a s calcula ted from the sa mple. Neverthe- less, our tables provide a useful appro ximation fo r cases where this level o f pr ecision is no t r equired, a nd a s imple gauge o f how go o d the p ow er-law fit is required. In an y case, marg inal cases aside, using these tables with a close approximate v alue for γ can b e a lo t more pr ecise than log-log plots or the Pearson’s test. Finally , it is worth no ting that the tables prese nt ed ap- ply only when the exp onent γ has be e n calcula ted using the MLE metho d descr ibe d in Section 2 and w ould not b e relev ant if a different metho d was us ed ins tea d. The wa y to use these ta ble s in practice is describ ed in [2] and [3]. Assuming one has a s e t of discrete o bserv a tions and wis hes t o test if they follow th e Zipf distribution, they would first calculate the maximum likelihoo d estimator for the exp onent γ us ing (2). Then, they would calc ula te the test statistic (3) by deter mining the max im um deviatio n of the empiric a l c umulative distribution function against the theor etical Zipf one. This tes t statistic will then b e compared with the cut- off v alue in the tables that corr e s p o nds to the v alues of N , K and estimated γ of the obs e r ved dataset. If the test v alue is less than the tabulated v alue, ther e is insufficient evidence to r eject the hypo thesis that the da ta follow a Zipf distribution, a t the r equired level of significanc e . As men tioned ear lie r , fo r maximum accuracy a b esp o ke cutoff v alue would ideally need to b e ca lculated matching exactly the v alues o f N , K , γ of the s a mple. This can b e achieved using the acco mpa nying co de. 5 Conclusions W e pr esented the results of a detailed simulation to cal- culate the cutoff v alue s o f the Kolmogorov-Smirnov test when used to asses s the fit of empirical data to the dis- crete Zipf o r pow er-law distribution. W e carry o ut a muc h larger set of simulations that the state- of-the art and fur- ther extend previous r esearch by brea king down the cutoff tables acco rding to the estimated v alue of the Zipf exp o- nent and further consider tw o versions o f the Zipf distr i- bution. This le vel of c o mplexity w as only p ossible us ing Gr aph- ical Pr o cessing Unit (GPU) alg orithms to massively par- allelize the sim ulations. In doing so, we pro duced a 60- fold fas ter simulation algo r ithm co mpared with traditional progra mming tec hniques, which demonstrates the huge po tent ial v alue of GPU techniques in improving the p er- formance of statistical s im ulations and other co mplex al- gorithms. The pr ovided computer co de is als o of b e nefit to any res earcher who needs, for mo r e accurac y , to crea te their own Kolmog orov-Smirnov cuto ff v alue which is sp e- cific to the sample siz e and estimated exp onent of their datasets. 6 Supplementa ry Materials CUD A/C co de: The annex con tains the CUDA pro gram that can b e used to r eplicate the res ults presen ted in this article. The instructions for compilatio n and use are included in the co de. References [1] L Breslau, P Ca o, L F an, G Phillips, a nd S Shenker. W eb caching and Zipf-like distributions: Evidence and implications. In Pr o c e e dings of the IEEE INFO- COM 99 Confer enc e on Computer Communic ations: the F u tur e is Now , volume 1–3 , pages 1 26–13 4, 19 99. [2] Aaron Clauset, Cosma Rohilla Sha liz i, and M. E. J. Newman. P ow e r-law distributions in e mpirical data. SIAM Revi ew , 51 (4):661–7 03, December 200 9. [3] M. L. Goldstein, S. A. Morris, and G. G. Y en. P rob- lems with fitting to the p ow er-law distribution. Eu - r op e an Physic al Journal B , 4 1:255– 258, 20 04. [4] E.V. Khmaladze . Martingale approach to the theory of go o dness o f fit tests. The ory of Pr ob ability and its Applic ations , 26 (2):240– 257, 19 8 1. [5] E.V. Khmaladze. Go o dnes s of fit pro ble m and sca n- ning innov a tion martingales. The Annals of Statis- tics , 2 1(2):798 – 829, 1993. [6] Anthon y Lee, Christopher Y au, Michael B. Giles, Ar- naud Doucet, and Christopher C. Holmes. On the utilit y of gra phics cards to p erform massively par- allel simulation of adv anced Monte Ca rlo metho ds. 3 T able 1: K S tes t statistic for the pure p ower-la w distribution γ = 1 . 25, Quantiles γ = 1 . 5, Quan tiles γ = 1 . 75, Quantiles γ = 2 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2792 .3092 .36 68 .4315 .2410 .271 0 .3308 .3994 .2115 .2411 .3027 .3761 .1835 .2116 .2739 .3515 20 .2043 .2262 .2692 .3198 .1702 .1913 .2342 .2865 .1489 .1694 .2127 .2656 .1289 . 1484 .1897 .2426 30 .1716 .1896 .2251 .2668 .1392 .1564 .1916 .2346 .1217 .1383 .1739 .2173 .1054 . 1207 .1541 .1960 40 .1522 .1678 .1993 .2353 .1207 .1354 .1665 .2048 .1054 .1195 .1504 .1893 .0911 . 1045 .1331 .1697 50 .1391 .1532 .1808 .2136 .1080 .1211 .1491 .1834 .0943 .1071 .1343 .1688 .0815 . 0934 .1189 .1526 100 .1073 .1174 .1373 .1610 .0766 .0860 .1058 .1307 .0666 .0756 .0949 .1192 .0576 .0658 .0836 .1062 500 .0662 .0708 .0799 .0907 .0352 .0396 .0485 .0597 .0298 .0338 . 0424 .0533 .0258 .0294 .0373 .0468 1000 .0569 . 0602 .0667 .0742 .0257 .0288 .0353 .0433 .0211 .0239 . 0299 .0376 .0182 .0208 .0264 .0334 2000 .0505 . 0529 .0575 .0629 .0192 .0215 .0261 .0317 .0149 .0169 . 0212 .0266 .0129 .0147 .0187 .0236 3000 .0478 . 0497 .0535 .0579 .0164 .0183 .0220 .0266 .0122 .0138 . 0173 .0217 .0105 .0120 .0152 .0192 4000 .0461 . 0478 .0511 .0548 .0148 .0164 .0196 .0237 .0106 .0120 .0150 .0189 .0091 .0104 .0131 .0167 5000 .0450 . 0466 .0495 .0529 .0137 .0151 .0181 .0217 .0095 .0107 . 0135 .0168 .0081 .0093 .0118 .0149 10000 .0424 .0435 .0456 .0479 . 0109 .0119 .0140 .0165 .0067 .0076 .0095 .0120 .0058 . 0066 .0083 .0105 20000 .0406 .0413 .0428 .0445 . 0090 .0098 .0112 .0130 .0048 .0054 .0068 .0085 .0041 . 0047 .0059 .0075 50000 .0390 .0395 .0405 .0415 . 0074 .0078 .0087 .0098 .0031 .0035 .0044 .0055 .0026 . 0029 .0037 .0047 γ = 2 . 5, Quantiles γ = 3 . 0, Quan tiles γ = 3 . 5, Quantiles γ = 4 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1375 .1591 .2141 .2876 .1005 .1281 .1695 .2351 .0819 .0992 .1365 .1921 .0819 . 0819 .1092 .1530 20 .0963 .1117 .1472 .1936 .0724 .0849 .1123 .1528 .0534 .0643 .0879 .1245 .0440 . 0533 .0735 .0995 30 .0781 .0903 .1175 .1540 .0573 .0678 .0897 .1188 .0430 .0532 .0699 .0950 .0311 . 0409 .0550 .0763 40 .0674 .0782 .1012 .1314 .0501 .0584 .0771 .1014 .0372 .0444 .0593 .0800 .0281 . 0351 .0463 .0639 50 .0603 .0697 .0901 .1173 .0448 .0521 .0679 .0898 .0330 .0391 .0526 .0704 .0264 . 0306 .0408 .0552 100 .0427 .0491 .0630 .0815 .0315 .0365 .0471 .0608 .0235 .0273 .0357 .0467 .0178 .0207 .0277 .0369 500 .0190 .0219 .0280 .0353 .0141 .0162 .0207 .0261 .0105 .0121 . 0155 .0196 .0079 .0092 .0118 .0151 1000 .0135 . 0155 .0198 .0251 .0100 .0115 .0146 .0185 .0074 .0086 . 0109 .0137 .0056 .0064 .0083 .0105 2000 .0095 . 0109 .0140 .0177 .0070 .0081 .0103 .0130 .0052 .0060 . 0077 .0097 .0039 .0046 .0058 .0073 3000 .0078 . 0089 .0114 .0144 .0057 .0066 .0084 .0106 .0043 .0049 . 0063 .0079 .0032 .0037 .0047 .0059 4000 .0067 . 0077 .0098 .0125 .0050 .0057 .0073 .0092 .0037 .0043 . 0054 .0068 .0028 .0032 .0041 .0052 5000 .0060 . 0069 .0088 .0112 .0045 .0051 .0065 .0082 .0033 .0038 .0048 .0061 .0025 .0029 .0037 .0046 10000 .0043 .0049 .0062 .0079 . 0031 .0036 .0046 .0058 .0023 .0027 .0034 .0043 .0018 . 0020 .0026 .0032 20000 .0030 .0035 .0044 .0056 . 0022 .0026 .0033 .0041 .0017 .0019 .0024 .0030 .0012 . 0014 .0018 .0023 50000 .0019 .0022 .0028 .0035 . 0014 .0016 .0021 .0026 .0010 .0012 .0015 .0019 .0008 . 0009 .0012 .0015 Journal of Computational and Gr aphic al St atistics , 19(4):769 –789 , 2010. [7] Jo anne L e e , W endy K. T am Cho, and Geor ge G. Judge. Stigler’s appro ach to r ecov ering the distri- bution o f fir st s ignificant digits in natural da ta sets. Statistics & Pr ob ability L et ters , 80(2):82– 88, Jan uary 2010. [8] M. E. J. Newman. Po wer laws, Pareto distributions and Zipf ’s law. Contemp or ary Physics , 46(5):3 23– 351, 20 05. [9] Andrew I. Su and Jo hn B. Hog enesch. Pow e r -law-lik e distributions in biomedica l publicatio ns a nd r esearch funding. Genome Biolo gy , 8:4 04, April 200 7 . [10] Marc A. Suchard, Quanli W ang, Cliburn Chan, Ja- cob F relinger, Andrew Cron, and Mik e W est. Under- standing GP U progr amming for statistica l computa- tion: Studies in massively para llel mass ive mix tur es. Journal of Co mputational and Gr aphic al Statistics , 19(2):419 –438 , 201 0. [11] Hsiaw-Chan Y eh. Six multiv ariate Zipf distr ibutions and their r elated pro pe rties. Statistics & Pr ob ability L ett ers , 56(2):1 31–14 1, Ja nuary 200 2. [12] Peter Z ¨ ornig and Gabriel Altmann. Unified represen- tation of Zipf distributions. Computational S tatistics and Data Analysi s , 19:4 6 1–47 3, 1995 . 4 T able 2: KS test statistic fo r the truncated p ow er-law distribution with K = 20 K = 20: γ = 0 . 25, Quantiles γ = 0 . 5, Quan tiles γ = 0 . 75, Quantiles γ = 1 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2486 .2751 .3286 .3915 .2387 .2640 .3159 .3770 .2266 .2503 .2990 .3595 .2128 . 2353 .2812 .3387 20 .1752 .1943 .2336 .2796 .1684 .1865 .2239 .2684 .1599 .1767 .2114 .2530 .1504 . 1662 .1983 .2380 30 .1436 .1594 .1914 .2310 .1376 .1525 .1828 .2197 .1303 .1442 .1722 .2062 .1224 . 1354 .1615 .1932 40 .1245 .1381 .1664 .2006 .1195 .1323 .1587 .1915 .1131 .1252 .1495 .1796 .1061 . 1174 .1400 .1682 50 .1114 .1237 .1490 .1792 .1067 .1183 .1425 .1702 .1012 .1119 .1339 .1602 .0949 . 1049 .1252 .1504 100 .0788 .0876 .1054 .1267 .0755 .0838 .1007 .1212 .0716 .0792 .0947 .1137 .0671 .0742 .0886 .1059 500 .0352 .0392 .0471 .0569 .0338 .0375 .0451 .0543 .0320 .0354 . 0424 .0508 .0300 .0332 .0396 .0474 1000 .0249 . 0277 .0333 .0403 .0239 .0265 .0318 .0387 .0226 .0250 . 0299 .0361 .0212 .0235 .0280 .0334 2000 .0176 . 0196 .0236 .0285 .0169 .0187 .0225 .0271 .0160 .0177 . 0212 .0254 .0150 .0166 .0198 .0236 3000 .0144 . 0160 .0192 .0233 .0138 .0153 .0184 .0221 .0130 .0144 . 0173 .0208 .0122 .0135 .0161 .0193 4000 .0125 . 0139 .0167 .0202 .0119 .0133 .0159 .0192 .0113 .0125 .0150 .0180 .0106 .0117 .0140 .0167 5000 .0111 . 0124 .0149 .0181 .0107 .0119 .0142 .0171 .0101 .0112 . 0134 .0160 .0095 .0105 .0125 .0150 10000 .0079 .0088 .0106 .0128 . 0076 .0084 .0101 .0121 .0072 .0079 .0095 .0114 .0067 . 0074 .0089 .0106 20000 .0056 .0062 .0075 .0091 . 0053 .0059 .0071 .0086 .0051 .0056 .0067 .0081 .0047 . 0052 .0063 .0075 50000 .0035 .0039 .0047 .0057 . 0034 .0037 .0045 .0054 .0032 .0035 .0042 .0051 .0030 . 0033 .0039 .0047 K = 20: γ = 1 . 25, Quantiles γ = 1 . 5, Quan tiles γ = 1 . 75, Quantiles γ = 2 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1985 .2205 .2649 .3222 .1836 .2053 .2509 .3115 .1695 .1901 .2347 .3001 .1531 . 1727 .2183 .2869 20 .1400 .1554 .1865 .2263 .1295 .1444 .1761 .2173 .1189 .1336 .1653 .2070 .1077 . 1226 .1535 .1962 30 .1143 .1267 .1519 .1833 .1058 .1179 .1434 .1769 .0968 .1089 .1345 .1688 .0880 . 0998 .1249 .1586 40 .0989 .1096 .1313 .1587 .0915 .1020 .1241 .1524 .0838 .0943 .1164 .1467 .0761 . 0863 .1081 .1377 50 .0884 .0981 .1175 .1423 .0817 .0912 .1109 .1363 .0749 .0843 .1040 .1310 .0681 . 0771 .0962 .1228 100 .0624 .0693 .0833 .1005 .0577 .0643 .0783 .0963 .0530 .0596 .0735 .0916 .0480 .0544 .0680 .0855 500 .0279 .0310 .0371 .0449 .0258 .0288 .0350 .0432 .0236 .0265 . 0327 .0409 .0215 .0243 .0303 .0379 1000 .0197 . 0219 .0263 .0317 .0182 .0203 .0248 .0305 .0167 .0188 . 0232 .0291 .0152 .0172 .0215 .0271 2000 .0140 . 0155 .0186 .0224 .0129 .0144 .0174 .0214 .0118 .0133 . 0164 .0204 .0107 .0122 .0152 .0190 3000 .0114 . 0126 .0151 .0183 .0105 .0117 .0143 .0175 .0096 .0108 . 0134 .0166 .0088 .0099 .0124 .0156 4000 .0099 . 0109 .0132 .0159 .0091 .0102 .0124 .0153 .0083 .0094 . 0116 .0145 .0076 .0086 .0107 .0135 5000 .0088 . 0098 .0118 .0142 .0082 .0091 .0111 .0136 .0075 .0084 .0104 .0129 .0068 .0077 .0096 .0121 10000 .0062 .0069 .0083 .0100 . 0058 .0064 .0078 .0096 .0053 .0059 .0073 .0091 .0048 . 0054 .0068 .0085 20000 .0044 .0049 .0059 .0071 . 0041 .0045 .0055 .0068 .0037 .0042 .0052 .0065 .0034 . 0038 .0048 .0060 50000 .0028 .0031 .0037 .0045 . 0026 .0029 .0035 .0043 .0024 .0027 .0033 .0041 .0021 . 0024 .0030 .0038 K = 20: γ = 2 . 5, Quantiles γ = 3 . 0, Quan tiles γ = 3 . 5, Quantiles γ = 4 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1240 .1447 .1867 .2474 .0956 .1170 .1519 .2007 .0821 .0955 .1310 .1726 .0821 . 0821 .1074 .1455 20 .0871 .1003 .1290 .1662 .0695 .0801 .1048 .1363 .0524 .0630 .0852 .1129 .0440 . 0523 .0722 .0955 30 .0706 .0814 .1030 .1330 .0547 .0651 .0839 .1086 .0421 .0519 .0669 .0878 .0310 . 0411 .0539 .0732 40 .0612 .0702 .0891 .1137 .0476 .0552 .0721 .0928 .0363 .0438 .0567 .0748 .0280 . 0351 .0456 .0617 50 .0547 .0627 .0797 .1015 .0427 .0493 .0635 .0822 .0326 .0388 .0506 .0661 .0263 . 0302 .0403 .0531 100 .0386 .0442 .0559 .0712 .0301 .0347 .0443 .0563 .0231 .0267 . 0346 .0447 .0178 .0206 .0272 .0360 500 .0172 .0197 .0248 .0312 .0134 .0154 .0195 .0245 .0103 .0119 .0151 .0190 .0078 .0091 .0117 .0149 1000 .0122 . 0140 .0176 .0222 .0095 .0109 .0138 .0173 .0073 .0084 . 0106 .0134 .0055 .0064 .0082 .0104 2000 .0086 . 0099 .0124 .0157 .0067 .0077 .0098 .0123 .0051 .0059 . 0075 .0095 .0039 .0045 .0058 .0073 3000 .0070 . 0080 .0101 .0128 .0055 .0063 .0080 .0100 .0042 .0048 . 0061 .0077 .0032 .0037 .0047 .0059 4000 .0061 . 0070 .0088 .0111 .0047 .0054 .0069 .0086 .0036 .0042 . 0053 .0066 .0028 .0032 .0041 .0051 5000 .0054 . 0062 .0079 .0099 .0042 .0049 .0062 .0078 .0032 .0037 .0047 .0059 .0025 .0029 .0036 .0046 10000 .0039 .0044 .0056 .0070 . 0030 .0034 .0044 .0054 .0023 .0026 .0033 .0042 .0017 . 0020 .0026 .0032 20000 .0027 .0031 .0039 .0049 . 0021 .0024 .0031 .0039 .0016 .0019 .0024 .0030 .0012 . 0014 .0018 .0023 50000 .0017 .0020 .0025 .0031 . 0013 .0015 .0020 .0024 .0010 .0012 .0015 .0019 .0008 . 0009 .0011 .0014 5 T able 3: KS test statistic fo r the truncated p ow er-law distribution with K = 50 K = 50: γ = 0 . 25, Quantiles γ = 0 . 5, Quan tiles γ = 0 . 75, Quantiles γ = 1 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2685 .2960 .3508 .4146 .2572 .2833 .3362 .3985 .2421 .2660 .3157 .3755 .2258 . 2479 .2928 .3504 20 .1917 .2113 .2513 .2986 .1834 .2021 .2405 .2853 .1724 .1895 .2248 .2665 .1606 . 1763 .2080 .2475 30 .1567 .1730 .2058 .2451 .1501 .1653 .1968 .2349 .1410 .1550 .1837 .2193 .1312 . 1441 .1701 .2024 40 .1357 .1498 .1788 .2136 .1300 .1434 .1707 .2042 .1222 .1344 .1593 .1902 .1137 . 1248 .1473 .1758 50 .1208 .1336 .1597 .1906 .1160 .1280 .1529 .1819 .1092 .1201 .1427 .1698 .1015 . 1116 .1321 .1568 100 .0857 .0948 .1134 .1355 .0821 .0907 .1081 .1292 .0771 .0849 .1007 .1203 .0717 .0788 .0932 .1107 500 .0383 .0424 .0507 .0607 .0367 .0405 .0483 .0578 .0345 .0380 . 0451 .0538 .0321 .0352 .0417 .0494 1000 .0271 . 0300 .0358 .0430 .0259 .0286 .0341 .0410 .0244 .0269 . 0318 .0382 .0227 .0249 .0294 .0350 2000 .0192 . 0212 .0254 .0304 .0183 .0203 .0242 .0289 .0172 .0190 . 0225 .0269 .0160 .0176 .0208 .0248 3000 .0156 . 0173 .0207 .0249 .0150 .0165 .0197 .0237 .0141 .0155 . 0184 .0219 .0131 .0144 .0170 .0202 4000 .0136 . 0150 .0179 .0216 .0130 .0143 .0171 .0205 .0122 .0134 .0159 .0190 .0113 .0125 .0148 .0175 5000 .0121 . 0134 .0161 .0192 .0116 .0128 .0153 .0183 .0109 .0120 . 0143 .0171 .0101 .0112 .0132 .0156 10000 .0086 .0095 .0113 .0136 . 0082 .0091 .0108 .0129 .0077 .0085 .0101 .0120 .0072 . 0079 .0093 .0110 20000 .0061 .0067 .0080 .0096 . 0058 .0064 .0077 .0092 .0055 .0060 .0071 .0085 .0051 . 0056 .0066 .0078 50000 .0038 .0042 .0051 .0061 . 0037 .0041 .0048 .0058 .0034 .0038 .0045 .0054 .0032 . 0035 .0042 .0050 K = 50: γ = 1 . 25, Quantiles γ = 1 . 5, Quan tiles γ = 1 . 75, Quantiles γ = 2 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2104 .2323 .2775 .3335 .1953 .2180 .2670 .3307 .1805 .2035 .2546 .3234 .1634 . 1863 .2384 .3105 20 .1492 .1646 .1962 .2373 .1383 .1541 .1877 .2342 .1273 .1438 .1794 .2252 .1155 . 1318 .1667 .2139 30 .1219 .1345 .1603 .1931 .1131 .1261 .1539 .1902 .1040 .1174 .1468 .1845 .0944 . 1075 .1362 .1727 40 .1056 .1165 .1388 .1672 .0978 .1091 .1331 .1652 .0901 .1017 .1268 .1610 .0817 . 0930 .1178 .1512 50 .0943 .1041 .1241 .1497 .0874 .0976 .1192 .1472 .0805 .0909 .1135 .1435 .0730 . 0832 .1051 .1348 100 .0666 .0737 .0879 .1061 .0618 .0690 .0843 .1046 .0569 .0643 .0801 .1014 .0516 .0587 .0741 .0941 500 .0298 .0329 .0393 .0475 .0276 .0308 .0377 .0470 .0254 .0287 . 0358 .0452 .0231 .0263 .0332 .0417 1000 .0211 . 0233 .0278 .0336 .0195 .0218 .0267 .0332 .0180 .0203 . 0254 .0321 .0163 .0186 .0234 .0297 2000 .0149 . 0164 .0196 .0238 .0138 .0154 .0189 .0235 .0127 .0143 . 0179 .0225 .0115 .0131 .0166 .0209 3000 .0122 . 0134 .0160 .0195 .0113 .0126 .0154 .0190 .0104 .0117 . 0146 .0184 .0094 .0107 .0135 .0171 4000 .0105 . 0116 .0139 .0169 .0098 .0109 .0134 .0166 .0090 .0101 . 0127 .0159 .0082 .0093 .0117 .0148 5000 .0094 . 0104 .0124 .0150 .0087 .0097 .0119 .0148 .0080 .0091 .0113 .0142 .0073 .0083 .0105 .0133 10000 .0067 .0074 .0088 .0106 . 0062 .0069 .0085 .0104 .0057 .0064 .0080 .0101 .0052 . 0059 .0074 .0094 20000 .0047 .0052 .0062 .0075 . 0044 .0049 .0060 .0074 .0040 .0045 .0057 .0071 .0036 . 0042 .0052 .0066 50000 .0030 .0033 .0039 .0047 . 0028 .0031 .0038 .0047 .0025 .0029 .0036 .0045 .0023 . 0026 .0033 .0042 K = 50: γ = 2 . 5, Quantiles γ = 3 . 0, Quan tiles γ = 3 . 5, Quantiles γ = 4 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1307 .1508 .2037 .2672 .0993 .1243 .1612 .2154 .0819 .0983 .1350 .1860 .0819 . 0819 .1085 .1510 20 .0920 .1067 .1373 .1791 .0715 .0836 .1096 .1461 .0532 .0642 .0872 .1217 .0440 . 0530 .0732 .0984 30 .0747 .0863 .1107 .1433 .0566 .0672 .0881 .1146 .0428 .0530 .0693 .0929 .0311 . 0409 .0550 .0759 40 .0646 .0744 .0955 .1227 .0494 .0574 .0750 .0981 .0371 .0442 .0587 .0782 .0281 . 0351 .0463 .0635 50 .0578 .0665 .0852 .1094 .0441 .0512 .0666 .0867 .0329 .0390 .0521 .0688 .0264 . 0305 .0407 .0548 100 .0408 .0468 .0598 .0770 .0311 .0359 .0461 .0591 .0234 .0272 . 0354 .0460 .0178 .0207 .0276 .0367 500 .0182 .0209 .0266 .0335 .0139 .0160 .0203 .0256 .0104 .0121 .0154 .0194 .0079 .0092 .0118 .0150 1000 .0129 . 0148 .0188 .0238 .0098 .0113 .0143 .0181 .0074 .0085 . 0108 .0136 .0055 .0064 .0083 .0104 2000 .0091 . 0105 .0133 .0168 .0069 .0080 .0101 .0128 .0052 .0060 . 0077 .0096 .0039 .0045 .0058 .0073 3000 .0074 . 0085 .0109 .0137 .0057 .0065 .0083 .0104 .0043 .0049 . 0062 .0078 .0032 .0037 .0047 .0059 4000 .0064 . 0074 .0094 .0119 .0049 .0056 .0072 .0090 .0037 .0043 . 0054 .0067 .0028 .0032 .0041 .0052 5000 .0058 . 0066 .0084 .0107 .0044 .0050 .0064 .0081 .0033 .0038 .0048 .0060 .0025 .0029 .0037 .0046 10000 .0041 .0047 .0059 .0075 . 0031 .0036 .0045 .0057 .0023 .0027 .0034 .0043 .0018 . 0020 .0026 .0032 20000 .0029 .0033 .0042 .0053 . 0022 .0025 .0032 .0040 .0017 .0019 .0024 .0030 .0012 . 0014 .0018 .0023 50000 .0018 .0021 .0027 .0034 . 0014 .0016 .0020 .0026 .0010 .0012 .0015 .0019 .0008 . 0009 .0012 .0015 6 T able 4: KS test statistic for the tr unca ted p ow er-law distribution with K = 100 K = 100: γ = 0 . 25, Quantiles γ = 0 . 5, Quan tiles γ = 0 . 75, Quantiles γ = 1 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2772 .3052 .3602 .4243 .2657 .2922 .3460 .4089 .2490 .2732 .3235 .3846 .2313 . 2531 .2978 .3546 20 .1990 .2190 .2596 .3076 .1905 .2095 .2483 .2940 .1781 .1955 .2312 .2736 .1649 . 1806 .2124 .2508 30 .1631 .1798 .2132 .2538 .1562 .1719 .2037 .2427 .1462 .1604 .1896 .2247 .1351 . 1480 .1737 .2053 40 .1416 .1559 .1855 .2218 .1355 .1491 .1772 .2113 .1268 .1393 .1644 .1963 .1172 . 1283 .1506 .1785 50 .1265 .1397 .1662 .1975 .1213 .1337 .1588 .1887 .1135 .1246 .1474 .1750 .1049 . 1148 .1349 .1594 100 .0893 .0986 .1174 .1402 .0857 .0945 .1122 .1337 .0802 .0882 .1043 .1240 .0742 .0813 .0956 .1130 500 .0400 .0441 .0526 .0629 .0383 .0422 .0502 .0600 .0358 .0394 . 0466 .0555 .0331 .0363 .0427 .0506 1000 .0283 . 0312 .0372 .0444 .0271 .0299 .0355 .0425 .0253 .0278 . 0329 .0393 .0234 .0256 .0302 .0355 2000 .0200 . 0221 .0263 .0315 .0192 .0211 .0251 .0301 .0179 .0197 . 0233 .0278 .0166 .0182 .0213 .0252 3000 .0163 . 0180 .0214 .0257 .0156 .0172 .0205 .0245 .0146 .0161 . 0190 .0226 .0135 .0148 .0174 .0205 4000 .0141 . 0156 .0186 .0223 .0135 .0149 .0178 .0212 .0127 .0139 .0165 .0196 .0117 .0128 .0151 .0178 5000 .0126 . 0140 .0166 .0199 .0121 .0134 .0159 .0190 .0113 .0125 . 0148 .0175 .0105 .0115 .0135 .0159 10000 .0090 .0099 .0118 .0141 . 0086 .0094 .0112 .0134 .0080 .0088 .0104 .0124 .0074 . 0081 .0095 .0113 20000 .0063 .0070 .0083 .0100 . 0061 .0067 .0080 .0095 .0057 .0062 .0074 .0088 .0052 . 0057 .0067 .0080 50000 .0040 .0044 .0053 .0063 . 0038 .0042 .0050 .0060 .0036 .0039 .0047 .0056 .0033 . 0036 .0043 .0051 K = 100: γ = 1 . 25, Quantiles γ = 1 . 5, Quan tiles γ = 1 . 75, Quantiles γ = 2 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2161 .2381 .2832 .3405 .2022 .2259 .2770 .3420 .1869 .2115 .2663 .3366 .1697 . 1941 .2491 .3212 20 .1536 .1691 .2012 .2424 .1433 .1599 .1955 .2443 .1323 .1496 .1877 .2349 .1196 . 1370 .1737 .2232 30 .1256 .1384 .1644 .1977 .1172 .1310 .1602 .1991 .1082 .1224 .1535 .1931 .0978 . 1116 .1420 .1803 40 .1088 .1199 .1425 .1723 .1015 .1134 .1388 .1727 .0937 .1059 .1329 .1684 .0846 . 0966 .1229 .1573 50 .0973 .1072 .1274 .1543 .0907 .1014 .1243 .1548 .0838 .0948 .1188 .1511 .0757 . 0864 .1096 .1407 100 .0689 .0758 .0904 .1090 .0642 .0717 .0880 .1096 .0592 .0670 .0840 .1062 .0535 .0610 .0773 .0982 500 .0307 .0339 .0404 .0487 .0287 .0320 .0394 .0493 .0264 .0299 . 0375 .0473 .0239 .0273 .0345 .0434 1000 .0217 . 0240 .0286 .0347 .0203 .0227 .0279 .0348 .0187 .0212 . 0266 .0336 .0169 .0193 .0244 .0311 2000 .0154 . 0169 .0202 .0244 .0143 .0160 .0197 .0246 .0132 .0150 . 0188 .0237 .0120 .0137 .0173 .0219 3000 .0126 . 0138 .0165 .0198 .0117 .0131 .0161 .0200 .0108 .0122 . 0153 .0193 .0098 .0111 .0141 .0178 4000 .0109 . 0120 .0143 .0173 .0101 .0113 .0140 .0175 .0094 .0106 . 0133 .0168 .0085 .0096 .0122 .0155 5000 .0097 . 0107 .0128 .0154 .0091 .0102 .0125 .0156 .0084 .0095 .0119 .0150 .0076 .0086 .0109 .0138 10000 .0069 .0076 .0090 .0110 . 0064 .0072 .0088 .0109 .0059 .0067 .0084 .0106 .0054 . 0061 .0077 .0098 20000 .0049 .0054 .0064 .0077 . 0045 .0051 .0062 .0078 .0042 .0047 .0059 .0074 .0038 . 0043 .0055 .0069 50000 .0031 .0034 .0040 .0049 . 0029 .0032 .0039 .0049 .0026 .0030 .0038 .0047 .0024 . 0027 .0035 .0044 K = 100: γ = 2 . 5, Quantiles γ = 3 . 0, Quan tiles γ = 3 . 5, Quantiles γ = 4 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1335 .1548 .2093 .2716 .1001 .1263 .1655 .2272 .0819 .0989 .1356 .1894 .0819 . 0819 .1088 .1521 20 .0942 .1094 .1410 .1849 .0721 .0844 .1111 .1502 .0534 .0643 .0878 .1233 .0440 . 0532 .0734 .0991 30 .0764 .0883 .1140 .1481 .0571 .0676 .0892 .1168 .0429 .0531 .0698 .0943 .0311 . 0409 .0550 .0762 40 .0660 .0762 .0981 .1263 .0498 .0580 .0762 .0997 .0372 .0444 .0592 .0793 .0281 . 0351 .0463 .0637 50 .0590 .0680 .0874 .1132 .0445 .0518 .0674 .0883 .0330 .0391 .0525 .0699 .0264 . 0306 .0408 .0551 100 .0417 .0479 .0612 .0789 .0314 .0363 .0467 .0600 .0234 .0273 . 0356 .0465 .0178 .0207 .0276 .0368 500 .0186 .0214 .0273 .0345 .0140 .0161 .0205 .0259 .0105 .0121 .0154 .0195 .0079 .0092 .0118 .0151 1000 .0132 . 0152 .0193 .0245 .0099 .0114 .0145 .0183 .0074 .0085 . 0109 .0137 .0056 .0064 .0083 .0105 2000 .0093 . 0107 .0136 .0173 .0070 .0081 .0103 .0129 .0052 .0060 . 0077 .0097 .0039 .0045 .0058 .0073 3000 .0076 . 0087 .0111 .0141 .0057 .0066 .0084 .0105 .0043 .0049 . 0062 .0079 .0032 .0037 .0047 .0059 4000 .0066 . 0076 .0096 .0122 .0049 .0057 .0072 .0091 .0037 .0043 . 0054 .0068 .0028 .0032 .0041 .0052 5000 .0059 . 0068 .0086 .0109 .0044 .0051 .0065 .0082 .0033 .0038 .0048 .0061 .0025 .0029 .0037 .0046 10000 .0042 .0048 .0061 .0077 . 0031 .0036 .0046 .0057 .0023 .0027 .0034 .0043 .0018 . 0020 .0026 .0032 20000 .0029 .0034 .0043 .0055 . 0022 .0026 .0032 .0041 .0017 .0019 .0024 .0030 .0012 . 0014 .0018 .0023 50000 .0019 .0021 .0027 .0035 . 0014 .0016 .0020 .0026 .0010 .0012 .0015 .0019 .0008 . 0009 .0012 .0015 7 T able 5: KS test statistic for the tr unca ted p ow er-law distribution with K = 500 K = 500: γ = 0 . 25, Quantiles γ = 0 . 5, Quan tiles γ = 0 . 75, Quantiles γ = 1 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2877 .3160 .3718 .4371 .2773 .3048 .3596 .4237 .2588 .2837 .3351 .3972 .2383 . 2599 .3046 .3595 20 .2073 .2279 .2695 .3183 .1999 .2195 .2595 .3068 .1859 .2038 .2406 .2841 .1705 . 1860 .2172 .2556 30 .1706 .1876 .2221 .2632 .1643 .1807 .2134 .2533 .1530 .1676 .1976 .2345 .1400 . 1527 .1783 .2101 40 .1483 .1632 .1935 .2307 .1429 .1571 .1863 .2216 .1330 .1458 .1720 .2042 .1216 . 1326 .1546 .1823 50 .1330 .1465 .1738 .2061 .1281 .1410 .1669 .1983 .1193 .1307 .1545 .1831 .1090 . 1188 .1389 .1631 100 .0946 .1041 .1236 .1468 .0912 .1003 .1188 .1412 .0848 .0930 .1097 .1301 .0773 .0844 .0985 .1157 500 .0423 .0466 .0553 .0656 .0408 .0449 .0532 .0632 .0380 .0417 . 0492 .0583 .0346 .0378 .0441 .0518 1000 .0299 . 0330 .0390 .0466 .0289 .0317 .0376 .0448 .0269 .0294 . 0347 .0414 .0245 .0267 .0311 .0366 2000 .0212 . 0233 .0277 .0329 .0204 .0224 .0266 .0317 .0190 .0208 . 0246 .0293 .0173 .0189 .0220 .0259 3000 .0173 . 0190 .0225 .0270 .0167 .0183 .0217 .0259 .0155 .0170 . 0201 .0238 .0141 .0154 .0180 .0211 4000 .0150 . 0165 .0196 .0234 .0144 .0159 .0188 .0225 .0134 .0147 .0174 .0207 .0122 .0134 .0156 .0183 5000 .0134 . 0147 .0175 .0209 .0129 .0142 .0168 .0201 .0120 .0132 . 0155 .0185 .0110 .0119 .0139 .0164 10000 .0095 .0104 .0124 .0147 . 0091 .0101 .0119 .0142 .0085 .0093 .0110 .0130 .0077 . 0084 .0099 .0116 20000 .0067 .0074 .0088 .0104 . 0065 .0071 .0084 .0100 .0060 .0066 .0078 .0092 .0055 . 0060 .0070 .0082 50000 .0042 .0047 .0055 .0066 . 0041 .0045 .0053 .0063 .0038 .0042 .0049 .0058 .0035 . 0038 .0044 .0052 K = 500: γ = 1 . 25, Quantiles γ = 1 . 5, Quan tiles γ = 1 . 75, Quantiles γ = 2 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2259 .2487 .2961 .3548 .2150 .2409 .2954 .3637 .1987 .2251 .2834 .3560 .1780 . 2055 .2621 .3370 20 .1609 .1772 .2108 .2533 .1525 .1708 .2101 .2591 .1405 .1595 .2002 .2510 .1255 . 1438 .1832 .2343 30 .1317 .1450 .1728 .2069 .1248 .1400 .1722 .2118 .1149 .1304 .1637 .2045 .1025 . 1173 .1493 .1899 40 .1142 .1257 .1500 .1808 .1081 .1212 .1490 .1853 .0996 .1129 .1418 .1798 .0887 . 1015 .1291 .1646 50 .1022 .1125 .1339 .1618 .0968 .1084 .1336 .1658 .0891 .1011 .1268 .1597 .0794 . 0908 .1153 .1481 100 .0723 .0797 .0953 .1151 .0685 .0767 .0947 .1182 .0629 .0715 .0897 .1130 .0561 .0640 .0813 .1032 500 .0324 .0357 .0426 .0518 .0306 .0343 .0422 .0528 .0281 .0319 . 0401 .0505 .0251 .0287 .0363 .0456 1000 .0229 . 0252 .0301 .0366 .0217 .0243 .0299 .0374 .0199 .0226 . 0283 .0357 .0177 .0203 .0257 .0325 2000 .0162 . 0178 .0213 .0258 .0153 .0172 .0212 .0264 .0141 .0160 . 0200 .0252 .0125 .0143 .0182 .0230 3000 .0132 . 0146 .0174 .0210 .0125 .0140 .0173 .0214 .0115 .0131 . 0164 .0205 .0102 .0117 .0148 .0187 4000 .0114 . 0126 .0151 .0183 .0108 .0121 .0150 .0188 .0100 .0113 . 0142 .0179 .0089 .0101 .0128 .0162 5000 .0102 . 0113 .0135 .0163 .0097 .0108 .0134 .0167 .0089 .0101 .0127 .0160 .0079 .0091 .0115 .0145 10000 .0072 .0080 .0095 .0115 . 0068 .0077 .0095 .0117 .0063 .0071 .0090 .0113 .0056 . 0064 .0081 .0103 20000 .0051 .0056 .0067 .0082 . 0048 .0054 .0067 .0083 .0045 .0051 .0063 .0079 .0040 . 0045 .0057 .0072 50000 .0032 .0036 .0043 .0052 . 0031 .0034 .0042 .0052 .0028 .0032 .0040 .0050 .0025 . 0029 .0036 .0046 K = 500: γ = 2 . 5, Quantiles γ = 3 . 0, Quan tiles γ = 3 . 5, Quantiles γ = 4 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1362 .1584 .2123 .2809 .1004 .1278 .1688 .2332 .0819 .0992 .1363 .1915 .0819 . 0819 .1091 .1529 20 .0959 .1115 .1457 .1914 .0723 .0848 .1122 .1525 .0534 .0642 .0879 .1245 .0440 . 0533 .0734 .0994 30 .0779 .0898 .1168 .1522 .0573 .0678 .0897 .1183 .0430 .0532 .0699 .0949 .0311 . 0409 .0550 .0764 40 .0672 .0778 .1005 .1301 .0501 .0584 .0770 .1011 .0372 .0444 .0593 .0799 .0281 . 0351 .0463 .0639 50 .0601 .0694 .0895 .1161 .0447 .0520 .0678 .0897 .0330 .0391 .0526 .0703 .0264 . 0306 .0408 .0552 100 .0425 .0489 .0627 .0808 .0315 .0365 .0471 .0607 .0235 .0273 . 0357 .0467 .0178 .0207 .0277 .0369 500 .0190 .0218 .0278 .0351 .0141 .0162 .0207 .0261 .0105 .0121 .0155 .0196 .0079 .0092 .0118 .0151 1000 .0134 . 0154 .0197 .0250 .0100 .0115 .0145 .0185 .0074 .0086 . 0109 .0137 .0056 .0064 .0083 .0105 2000 .0095 . 0109 .0139 .0177 .0070 .0081 .0103 .0130 .0052 .0060 . 0077 .0097 .0039 .0046 .0058 .0073 3000 .0077 . 0089 .0113 .0144 .0057 .0066 .0084 .0106 .0043 .0049 . 0063 .0079 .0032 .0037 .0047 .0059 4000 .0067 . 0077 .0098 .0124 .0050 .0057 .0073 .0092 .0037 .0043 . 0054 .0068 .0028 .0032 .0041 .0052 5000 .0060 . 0069 .0088 .0112 .0045 .0051 .0065 .0082 .0033 .0038 .0048 .0061 .0025 .0029 .0037 .0046 10000 .0042 .0049 .0062 .0079 . 0031 .0036 .0046 .0058 .0023 .0027 .0034 .0043 .0018 . 0020 .0026 .0032 20000 .0030 .0034 .0044 .0056 . 0022 .0026 .0033 .0041 .0017 .0019 .0024 .0030 .0012 . 0014 .0018 .0023 50000 .0019 .0022 .0028 .0035 . 0014 .0016 .0021 .0026 .0010 .0012 .0015 .0019 .0008 . 0009 .0012 .0015 8 T able 6: KS test statistic for the tr unca ted p ow er-law distr ibution with K = 10 00 K = 1000: γ = 0 . 25, Quant iles γ = 0 . 5, Quan tiles γ = 0 . 75, Quantiles γ = 1 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2901 .3184 .3745 .4403 .2806 .3082 .3635 .4279 .2616 .2868 .3387 .4000 .2402 . 2617 .3062 .3604 20 .2091 .2299 .2717 .3207 .2023 .2223 .2627 .3103 .1881 .2063 .2435 .2873 .1719 . 1874 .2188 .2568 30 .1722 .1894 .2239 .2656 .1664 .1830 .2164 .2565 .1549 .1697 .2001 .2375 .1413 . 1539 .1795 .2110 40 .1497 .1647 .1952 .2322 .1448 .1592 .1887 .2244 .1347 .1476 .1743 .2073 .1227 . 1337 .1557 .1841 50 .1343 .1479 .1754 .2077 .1298 .1429 .1691 .2009 .1208 .1325 .1565 .1851 .1100 . 1198 .1398 .1645 100 .0957 .1053 .1248 .1484 .0925 .1017 .1204 .1429 .0860 .0943 .1112 .1319 .0781 .0851 .0992 .1164 500 .0430 .0472 .0561 .0666 .0416 .0457 .0541 .0642 .0387 .0424 . 0500 .0594 .0350 .0382 .0445 .0522 1000 .0304 . 0334 .0395 .0470 .0294 .0323 .0382 .0455 .0273 .0299 . 0353 .0420 .0248 .0270 .0315 .0369 2000 .0215 . 0236 .0280 .0333 .0208 .0228 .0271 .0322 .0193 .0212 . 0250 .0297 .0175 .0191 .0222 .0260 3000 .0175 . 0193 .0228 .0273 .0170 .0186 .0221 .0263 .0158 .0173 . 0204 .0242 .0143 .0156 .0182 .0213 4000 .0152 . 0167 .0198 .0237 .0147 .0161 .0191 .0228 .0137 .0150 .0177 .0210 .0124 .0135 .0158 .0185 5000 .0136 . 0149 .0177 .0211 .0131 .0144 .0171 .0204 .0122 .0134 . 0158 .0188 .0111 .0121 .0141 .0165 10000 .0096 .0106 .0125 .0149 . 0093 .0102 .0121 .0144 .0086 .0095 .0112 .0132 .0078 . 0085 .0100 .0117 20000 .0068 .0075 .0089 .0105 . 0066 .0072 .0086 .0102 .0061 .0067 .0079 .0094 .0055 . 0060 .0070 .0083 50000 .0043 .0047 .0056 .0067 . 0042 .0046 .0054 .0065 .0039 .0042 .0050 .0059 .0035 . 0038 .0044 .0052 K = 1000: γ = 1 . 25, Quant iles γ = 1 . 5, Quan tiles γ = 1 . 75, Quantiles γ = 2 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .2296 .2529 .3016 .3594 .2197 .2464 .3019 .3696 .2022 .2295 .2884 .3610 .1800 . 2077 .2662 .3408 20 .1635 .1801 .2145 .2581 .1558 .1747 .2144 .2643 .1427 .1622 .2036 .2551 .1268 . 1455 .1854 .2369 30 .1338 .1475 .1759 .2119 .1275 .1430 .1759 .2162 .1168 .1325 .1668 .2079 .1036 . 1185 .1510 .1914 40 .1161 .1279 .1525 .1841 .1105 .1239 .1522 .1894 .1012 .1148 .1444 .1824 .0895 . 1026 .1306 .1664 50 .1039 .1144 .1363 .1646 .0988 .1108 .1366 .1683 .0906 .1028 .1290 .1624 .0802 . 0917 .1166 .1496 100 .0736 .0811 .0970 .1173 .0699 .0784 .0968 .1209 .0640 .0726 .0913 .1146 .0566 .0647 .0821 .1045 500 .0329 .0363 .0434 .0525 .0313 .0351 .0432 .0541 .0286 .0324 . 0407 .0511 .0253 .0290 .0367 .0461 1000 .0233 . 0257 .0307 .0372 .0221 .0248 .0306 .0382 .0202 .0230 . 0288 .0363 .0179 .0205 .0259 .0328 2000 .0165 . 0182 .0217 .0262 .0156 .0176 .0217 .0269 .0143 .0162 . 0204 .0256 .0127 .0145 .0184 .0232 3000 .0134 . 0148 .0177 .0214 .0128 .0143 .0176 .0219 .0117 .0133 . 0166 .0208 .0103 .0118 .0150 .0189 4000 .0116 . 0128 .0153 .0186 .0110 .0124 .0153 .0191 .0101 .0115 . 0144 .0181 .0089 .0102 .0129 .0164 5000 .0104 . 0115 .0137 .0166 .0099 .0111 .0137 .0170 .0090 .0103 .0129 .0162 .0080 .0092 .0116 .0147 10000 .0074 .0081 .0097 .0117 . 0070 .0078 .0097 .0120 .0064 .0073 .0091 .0115 .0057 . 0065 .0082 .0104 20000 .0052 .0057 .0069 .0083 . 0049 .0056 .0068 .0085 .0045 .0051 .0064 .0081 .0040 . 0046 .0058 .0073 50000 .0033 .0036 .0043 .0052 . 0031 .0035 .0043 .0054 .0029 .0032 .0041 .0051 .0025 . 0029 .0037 .0046 K = 1000: γ = 2 . 5, Quantiles γ = 3 . 0, Quan tiles γ = 3 . 5, Quantiles γ = 4 . 0, Quantiles N 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 0.9 0.95 0.99 0.999 10 .1368 .1587 .2131 .2843 .1005 .1279 .1692 .2340 .0819 .0992 .1364 .1918 .0819 . 0819 .1092 .1530 20 .0961 .1116 .1464 .1924 .0723 .0849 .1123 .1527 .0534 .0642 .0879 .1245 .0440 . 0533 .0734 .0994 30 .0780 .0900 .1172 .1531 .0573 .0678 .0897 .1185 .0430 .0532 .0699 .0950 .0311 . 0409 .0550 .0763 40 .0673 .0780 .1009 .1307 .0501 .0584 .0770 .1013 .0372 .0444 .0593 .0800 .0281 . 0351 .0463 .0639 50 .0602 .0696 .0898 .1167 .0447 .0520 .0679 .0897 .0330 .0391 .0526 .0703 .0264 . 0306 .0408 .0552 100 .0426 .0490 .0629 .0812 .0315 .0365 .0471 .0608 .0235 .0273 . 0357 .0467 .0178 .0207 .0277 .0369 500 .0190 .0218 .0279 .0352 .0141 .0162 .0207 .0261 .0105 .0121 .0155 .0196 .0079 .0092 .0118 .0151 1000 .0134 . 0155 .0198 .0251 .0100 .0115 .0146 .0185 .0074 .0086 . 0109 .0137 .0056 .0064 .0083 .0105 2000 .0095 . 0109 .0139 .0177 .0070 .0081 .0103 .0130 .0052 .0060 . 0077 .0097 .0039 .0046 .0058 .0073 3000 .0077 . 0089 .0114 .0144 .0057 .0066 .0084 .0106 .0043 .0049 . 0063 .0079 .0032 .0037 .0047 .0059 4000 .0067 . 0077 .0098 .0125 .0050 .0057 .0073 .0092 .0037 .0043 . 0054 .0068 .0028 .0032 .0041 .0052 5000 .0060 . 0069 .0088 .0112 .0045 .0051 .0065 .0082 .0033 .0038 .0048 .0061 .0025 .0029 .0037 .0046 10000 .0042 .0049 .0062 .0079 . 0031 .0036 .0046 .0058 .0023 .0027 .0034 .0043 .0018 . 0020 .0026 .0032 20000 .0030 .0034 .0044 .0056 . 0022 .0026 .0033 .0041 .0017 .0019 .0024 .0030 .0012 . 0014 .0018 .0023 50000 .0019 .0022 .0028 .0035 . 0014 .0016 .0021 .0026 .0010 .0012 .0015 .0019 .0008 . 0009 .0012 .0015 9 Listing 1: CUDA/C co de 1 2 #includ e < stdio.h > 3 #includ e < stdlib.h > 4 #includ e < time.h > 5 #includ e ”cuda runtime.h” 6 #includ e ”device launch pa rameters.h” 7 #includ e < cuda runtime api.h > 8 #includ e < curand.h > 9 #includ e < curand k ernel.h > 10 11 / ∗ 12 −−−−−−−−−−−−− − −−−−−−−−−−−−− − −−−−−−−−−−−−− −−−−−−−−−−−−− − −−−−−−−−−−−−−−−−−−−−−−−−− 13 14 CUD A C program used fo r the results of the a rticle 15 16 DISCRETE TRUNCA TED ZIPF DISTRIBUTION: 17 18 Calculates the quantiles for a given value of K, gamma, and random seed. 19 20 Syntax is: 21 program.ex e K Gamma Random Seed integer 22 23 The value of N is fixed i n the co de . 24 25 − K must b e less than 32766 (in the pap er, it’s 20 , 30, 50, 100, 500, 1 0 00). 26 − The value of N is fixed at the start of the co d e b elow. 27 In the paper, it’s 10, 20, 30, 40, 50, 100, 500, 1 0 00, 2000, 30 0 0, 4000, 5000, 10 0 00, 20000. 28 − Gamma should b e > 0.25 (for meaningful resul ts) 29 30 −−−−−−−−−−−−− − −−−−−−−−−−−−− − −−−−−−−−−−−−− −−−−−−−−−−−−− − −−−−−−−−−−−−−−−−−−−−−−−−−−− 31 32 T echnical no te: Imp ortant when compil ing this CU DA program: 33 34 − The p rogram requires a GPU that supp orts CUDA, and the (freely do wnloadable) CUDA develop er softw are installed 35 (for Vi s ual Studio it is available as a n add − on) 36 − The p rogram requires a CUDA GPU tha t sup p orts ’double’ floa tin g − p oint numb ers. Some GPU only sup p ort ’float’ − this is not g o o d enough 37 and will p roduce inco rrect results (zeros, i nfinities) due to the accumulation of roun d ing errors 38 − By default, CUDA may demote ’double’ to ’float’ to conserve resources. If a compilation wa rning: 39 ’double is not s upp orte d, demoting to flo a t’ is p roduced, the foll owing compila tion paramete rs need to b e ad justed: 40 code generation = compute 20, s m 20 41 compiler optio n s = − arch=sm 20 42 (20 refers t o the cude computational ab ility l evel of the card; level 13 or more supp orts ’ double’) 43 − The standard CUDA libara y curand.lib must be included (us ed for random numb er generation) 44 45 −−−−−−−−−−−−− − −−−−−−−−−−−−− − −−−−−−−−−−−−− −−−−−−−−−−−−− − −−−−−−−−−−−−−−−−−−−−−−−−−−−− 46 47 Dr. Efs tratios Rapp os and Prof. Ste phan Robert 48 HEIG − VD 49 Switzerland 50 51 Efstratios.Rapp os at heig − vd < dot > ch 52 Stephan.Rob ert a t heig − vd < dot > ch 53 54 May 201 3 55 56 ∗ / 57 58 59 // The number of points in each simulation (s ample siz e N) 60 61 #define N 2000 62 63 #define CUDA GPU DEVICE 2 // If you have multiple NVIDIA cards, sp ecify which to use. Sta rt with 0 = ”first card”, 1 = ”second card” etc. 64 65 #define BLOCKS 782 66 #define THREADS PER BLOCK 64 67 68 #define SIMULA TIONS BLOCKS ∗ THREADS PER BLOCK //numb er o f simula tions (a multipl e of NTHREADS) 69 #define SIMULA TIONS REQUIRED 50000 70 71 / ∗ 72 SIM = 1 ∗ 64 = 64 73 SIM = 2 ∗ 64 = 12 8 74 SIM = 4 ∗ 64 = 25 6 75 SIM = 5 ∗ 64 = 32 0 76 SIM = 7 ∗ 64 = 44 8 77 SIM = 8 ∗ 64 = 51 2 78 SIM = 16 ∗ 64 = 1 024 79 SIM = 32 ∗ 64 = 2 048 80 SIM = 63 ∗ 64 = 4 032 81 SIM = 79 ∗ 64 = 5 056 82 SIM = 157 ∗ 64 = 10048 10 83 SIM = 313 ∗ 64 = 20032 84 SIM = 782 ∗ 64 = 50048 85 ∗ / 86 87 88 // Nothing really to change from here on. 89 90 //#define MAX POINTS 32767 91 92 int sort dbl( const void ∗ x, cons t void ∗ y) { 93 double t = ( ∗ ( double ∗ )x − ∗ ( double ∗ )y); 94 return ( int ) ( (t > 0) − (t < 0)) ; 95 } 96 97 device dou ble NewtonRaphson( doub le initial guess, double RHS data, int K, cons t dou ble ∗ LOGS ); 98 device dou ble KolmogorovSmirnoff sho rt( const un signed short ∗ da ta, int K, dou b le gamma, cons t dou ble ∗ LOGS ); 99 100 host void che ck cuda(cudaError t c udaStatus, c har ∗ mess age, b o ol &fa il) { 101 102 if (cudaStatus) { 103 printf(”Error i n %s (%d) − %s \ n ”, mess a ge, cudaStatus, c udaGetErro rString(cudaStatus)); 104 fai l = true ; 105 system (”paus e”); 106 } 107 } 108 109 110 global void setup kernel ( int seed, curand State ∗ state ) { 111 112 int id = threadIdx .x + blo ckIdx .x ∗ THREADS PER BLOCK ; 113 unsigned long long seed1 = seed; 114 // Each thre ad g ets the same seed, but a different sequence number, no offset 115 curand init (see d1 , id , 0, &state[id]); 116 } 117 118 global void generate k ernel ( curandState ∗ state , d o u ble ∗ dev results, const int K, cons t double gamma, cons t double ∗ L OGS) { 119 120 int id = threadIdx .x + blo ckIdx .x ∗ THREADS PER BLOCK; 121 curandState l o calState = state[id]; // C opy state to local memory for efficiency 122 unsigned short p oints[N]; 123 int i, t; 124 double x, c; 125 126 for (i=0;i < N;i++) 127 points[i] = 0; 128 129 int KMAX=0; 130 c = 0.0; 131 132 for (i=1; i < =K; i++) 133 c = c + exp( − ( double ) gamma ∗ LOGS[i]); // c = c + (1.0 / pow((double) i, (double ) gamma)); 134 c = 1.0 / c; 135 136 for (t=0; t < N; t++) { 137 x = curand uniform doub l e (& localState ); 138 double sum prob = 0; 139 fo r (i=1; ; i++) { 140 sum prob = sum prob + c ∗ exp( − ( double )gamma ∗ LOGS[i] ); 141 if (sum prob > = x) { 142 p oints[t]= i; 143 if (i > KMAX) KMAX = i ; 144 break ; 145 } 146 } 147 } 148 149 // We store the value of KMAX, the max observation i n the current generated series. 150 // As all observations are < =K a nyw ay (an inp ut parameter), we wil l have KMAX < = K, 151 // H owe ver, when using lo ops 1 to K, we can lo op u p to KMAX o nly, rather than K , as there are n o observations in the range KMAX to K (more efficient ). 152 153 // Copy state back to global memory 154 state[id] = localState ; 155 156 // We now have p oi n ts[] 157 158 // FIRST FIND Ma ximum Li ke liho o d Estimator 159 // RH S 160 161 int NPOIN TS=0; 162 double RHS data = 0.0; 163 164 for (t=0;t < N;t++) { 165 // check: should n ever happ en 11 166 if ((poi nts[t] < 1) || (p oints[t] > 32766)) { printf(” \ n \ n ERROR p oi nts[%d] < 1 (=%d) ! \ n \ n”,t,p oints[t]); de v results[id] = − 1.; return ; } 167 168 RHS data += L OGS[p oints[t]] ; 169 NPOINTS++; 170 } 171 172 173 // If all p o ints are = 1, adjust RH S so that it’s > 0..(for RHS=0 t he estimated gamma is infini ty ). 174 // adjust by a factor of ln(2) − − as if one point was 2 instead of 1. This g i ven a max ga mma of ˜ 15 for 50,000 total points 175 // This only a ffects very small values of N, eg 10, 20, 30, 40. 176 177 // if(RHS data < =0) { printf(”RHS is < =0 ! (%f) \ nPOINTS=”,RHS data); 178 // for(t =0;t < N;t++) printf(”%d,”,p oints[t]); printf(”. −\ n”); 179 // } 180 181 if (RHS data < =0) 182 RHS data += L OGS[2]; 183 184 RHS data = RHS da ta / ( double ) N POINTS; 185 186 // N ewton − Ra p hson to o b tain estimated value for gamma 187 double estimated gamma = Newt onRaphson( 0.5 , RHS data , K, L O GS); //must b e K, not KMAX 188 189 //Kolmogorov − Smirnoff test statistic 190 double KStest = Kol mogorovSmirnoff short(points, K, estimated gamma, LOGS); 191 dev results[id] = KStest; 192 } 193 194 in t main( int argc, char ∗ argv[]) { 195 196 if (argc != 4) { 197 p rintf(”s yntax is p rogram.exe K GAMMA SEED \ nby e \ n”); 198 sy stem(”pause”); 199 return 1; 200 } 201 202 const in t K = atoi(a rgv[1]); 203 if (K==0) { 204 p rintf(”Cann ot read v alue f or K \ nby e \ n”); 205 sy stem(”pause”); 206 return 1; 207 } 208 209 const dou ble gamma = atof(argv[2]); 210 if (gamma < 0.1) { 211 p rintf(”Cann ot read v alue f or Gamma, o r Gamma < 0.1 \ nby e \ n”); 212 sy stem(”pause”); 213 return 1; 214 } 215 216 const in t see d = atoi(argv [3]); 217 if (seed < 1) { 218 p rintf(”Cann ot read v alue f or SEED, o r SEED < 1 \ nby e \ n” ); 219 sy stem(”pause”); 220 return 1; 221 } 222 223 if (K > 32766) { 224 p rintf(”Value for K mus t be < 3276 6 \ nb y e \ n”); //mu s t b e < 32,767 as with the f a st implementation, the CUD A sample p oi n ts are coded ’short’ 225 sy stem(”pause”); 226 return 1; 227 } 228 229 if (SIMULA TIONS REQUIRED > SIMULA TIONS) { 230 p rintf(”SIMULA TIO NS REQUIRED must be < = SIMULA TIONS \ nby e \ n”); 231 sy stem(”pause”); 232 return 1; 233 } 234 235 cudaError t cudaStatus ; 236 237 int i; 238 239 // Pre − compute Logarithms for 1 −− K, K < 32 767, for faster ex ecution 240 241 double ∗ LOGS; 242 243 LOGS = new dou ble [K+2]; 244 for (i=0;i < =(K+1);i++) 245 LOGS [i] = log(( double ) i); 246 247 cudaGetDeviceC ount (&i); 248 pr intf(”Found %d Grap hics cards that su p p ort CUDA \ n”,i); 12 249 pr intf(”Checking capabilities of chosen GPU device (CUD A GP U DEVICE = % d): \ n”,CUD A GPU D EVICE); 250 251 cudaDeviceProp properties ; 252 cudaGetDeviceP rop erties (&properties, CUDA GPU DEVICE); 253 p rintf(” Name: %s \ n”, properties.na me); 254 p rintf(” T otal global Memory: %d \ n ”, ( int ) p ro p erties.totalGlob a lMem); 255 p rintf(” Sha red Memory p er blo ck: % d \ n”, ( int ) p roperties.sharedMemP erBlo ck ); 256 p rintf(” T otal Const M emo ry: %d \ n”, ( int ) properties.totalConstMem); 257 p rintf(” Multip rocessors: %d \ n”, prop erties.multiPro cessorCount); 258 p rintf(” Max # threads per mul tiprocessor: %d \ n”, p roperties.maxThreadsPerMultiProcessor); 259 p rintf(” Max #threads p er p er block: %d \ n”, ( int ) properties.maxThreadsPerBlock); 260 p rintf(” Compute capability: % d.%d \ n”, properties.major, properties.mi nor); 261 p rintf(” Kernel ti meout enabled: %d \ n”, p roperties.ke rnelExecTimeoutEnabled ); 262 263 b ool fail = false ; 264 265 cudaStatus = c udaSetDevice(CUDA GPU DEVICE); check cuda(cudaStatus, ”setde vice”, fail); 266 267 time t t1 = clo ck(); 268 269 // Copy logarithms to C UDA device 270 double ∗ dev logs = 0; 271 cudaStatus = c udaMallo c(( void ∗∗ )&dev logs, (K+2) ∗ s izeof ( double )); 272 cudaStatus = c udaMemcpy(dev logs, LO GS, (K+2) ∗ sizeof ( double ), cudaM emcp yHostT oDevice); 273 274 delete [] LO GS; 275 276 // Simul ation Setup 277 double KStest sim[SIMULA TIONS]; // store s the K − Smirnoff statisi tc 278 for (i=0;i < SIMU LA TIONS;i++) 279 KS test sim[i] = − 1.0 ; 280 281 pr intf(”Generating % d p ow er − law distrib utions, with N =%d, K=%d, gamma=%f \ n”, SIMULA TION S, N, K, gamma ); 282 // generate #SIMULA TIONS random seed valu es using the CUDA random g enerator 283 curandState ∗ dev States ; 284 cudaMallo c(( void ∗∗ )&devStates, SI MULA TIONS ∗ sizeof (curandState)); 285 286 setup k ernel <<< BLOCKS, THREADS PER BLOCK >>> ( seed, de vStates ); 287 288 cudaStatus = c udaDeviceSynchronize(); check cuda(cudaStatus, ” cudaMemcp y 2”, fail); 289 cudaStatus = c udaGetLastError(); check cuda(cudaStatus, ” lastError i”, fail); 290 291 double ∗ dev results; 292 cudaMallo c(( void ∗∗ )&dev results, SIMULA TIONS ∗ s i zeof ( double )); check cuda(cudaStatus, ”cudaMemcpy 4”, fai l); 293 cudaMemset(dev results, 0, SIMU LA TIONS ∗ siz eof ( double )); check cuda(cudaStatus, ”cudaMemcpy 5”, fail); 294 cudaStatus = c udaGetLastError(); check cuda(cudaStatus, ”las tErro r ii”, fail); 295 296 generate k ernel <<< BLOCKS, THREADS PER BLOCK >>> ( devStates , d ev results, K, gamma , dev logs ); 297 298 cudaStatus = c udaGetLastError(); check cuda(cudaStatus, ” lastError iii”, fail); 299 cudaStatus = c udaDeviceSynchronize(); check cuda(cudaStatus, ”cudaDeviceSunchronize 6”, fail); 300 cudaStatus = c udaGetLastError(); check cuda(cudaStatus, ” lastError iv 2”, fa il); 301 302 303 cudaMemcp y (KStest sim, dev results, SIMULA TIO NS ∗ sizeof ( double ), cudaMemcpy DeviceT oHo s t); che ck cuda(cudaStatus, ”cudaMemcpy 7”, fail); 304 305 cudaF ree(dev logs); 306 cudaF ree(dev results); 307 cudaF ree(devStates); 308 309 310 // error catching − shou ld never happen 311 //for(i=0;i < SIMULA TIONS;i++) { 312 // if(KStest sim[i] < 0.000) { 313 // printf(” \ n \ n KStest sim is < 0.0 (%f)!! \ n”, KStest sim[i]); 314 // return 1; 315 // } 316 // } 317 318 319 //Calculate Qu antiles 320 321 // As # s i mulations is a mul ti p le of 64, we must disca rd s ome sumu l aitons to have the required numb er 322 323 double KStest2[SIMULA TION S REQUIRED]; 324 325 for (i=0;i < SIMU LA TIONS REQUIRED;i++) 326 KS test2[i] = KSt est sim[i]; 327 328 qsort(KSt est2, S IMULA TIONS REQUIRED, s izeof ( double ), sort dbl); 329 330 pr intf(”Quantile 90 %% is at %6.4f \ n”, KStest2[ SIMULA TIONS REQUIRED ∗ 9/10 ]); 331 pr intf(”Quantile 95 %% is at %6.4f \ n”, KStest2[ SIMULA TIONS REQUIRED ∗ 95/100 ]); 13 332 pr intf(”Quantile 99 %% is at %6.4f \ n”, KStest2[ SIMULA TIONS REQUIRED ∗ 99/100 ]); 333 pr intf(”Quantile 99.9%% is at %6.4f \ n”, KSte st2[ SIMULA TION S REQUIRED ∗ 999/1000 ]); 334 335 time t t2 = clo ck(); 336 double duration = ( double )(t2 − t1) / C LOCKS PER SEC; 337 338 // if output to a tex t file is de sired 339 FILE ∗ fout; 340 fout = fopen(”output.txt”, ”a +”); 341 342 fprintf(fout, ”%d & %d & %6.2f & %6.4f & %6.4f & %6.4f & %6.4f & %10.4 f \\\\\ n”,K, N, g amma, 343 KS test2[ SIMULA TIONS REQUIRED ∗ 9/10 ], 344 KS test2[ SIMULA TIONS REQUIRED ∗ 95/100 ], 345 KS test2[ SIMULA TIONS REQUIRED ∗ 99/100 ], 346 KS test2[ SIMULA TIONS REQUIRED ∗ 999/1000 ], 347 duration ); 348 349 fclose(fout); 350 pr intf(”Time taken: %10.4f seconds \ n”, duration); 351 return 0; 352 } 353 354 355 356 // Newton − Rapshson al gorithm : produces the estima te the pow er − law exp onent gamma from the data 357 358 device dou ble NewtonRaphson( doub le initial guess, double RHS data, int K, cons t dou ble ∗ LOGS ) { 359 360 c onst doubl e absolute tolerance = 0.00001; // the required lev el of acc uracy in the estimation of ga mma 361 int t; 362 double x, xne w; 363 x = xnew = initial guess; // i n itial guess es for gamma 364 double A, B , C; 365 366 do { 367 368 x = xnew; 369 dou ble f, f1 ; 370 f = 0 .0; 371 372 A=0.0; B=0.0; C=0.0; 373 374 for (t=1;t < =K;t++) { 375 376 double p owt = e xp( − x ∗ LOGS[t]); 377 378 A += ( − p owt ∗ LOGS[t] ); 379 B += powt; // C 380 C += powt ∗ LOGS[t] ∗ LOGS[t] ; 381 } 382 383 //f(x) 384 f = A/B + RHS data ; 385 386 //f’(x) − the derivative 387 f1 = C/B − A /B ∗ A/B; 388 xnew = x − f / f1; 389 } 390 391 w hile (( abs(x − xnew) > absolute tolerance)); 392 393 return xnew; 394 } 395 396 397 398 // Kolmogo rov − Smi rn off test: returns the test value o f the test 399 device dou ble KolmogorovSmirnoff sho rt( const un signed short ∗ da ta, int K, dou b le gamma, cons t dou ble ∗ LOGS ) { 400 401 402 double c = 0.0; 403 int i; 404 405 int t; 406 double xnew = ga mma; 407 408 409 fo r (t=1;t < =K;t++) 410 c += exp( − xnew ∗ LOGS[t]); 411 412 c = 1.0 / c; 413 414 double actual pre v, theoretical pre v; 14 415 int actual; 416 double KStest = − 2.0; 417 418 int NPOINTS = N; 419 420 fo r (t=1;t < =K;t++) { //K here is the max observation 421 422 if (t==1) { 423 424 theo retical pre v = c ∗ exp( − xnew ∗ LOGS[t]) ; 425 actual = 0; 426 for (i=0;i < NPOINTS;i++) { 427 if (data[i] == t) 428 actual++; 429 } 430 actual pre v = ( double ) actual / ( doub le ) NPOINTS; 431 } 432 els e { 433 434 theo retical pre v += c ∗ exp( − xnew ∗ LOGS[t]) ; 435 actual = 0; 436 for (i=0;i < NPOINTS;i++) { 437 if (data[i] == t) 438 actual++; 439 } 440 actual pre v += ( double ) actual / ( doub le ) NPOINTS; 441 } 442 443 // Find SUP 444 if (abs(theoretical pre v − actual prev) > K Stest ) 445 KStest = abs(theo retical pre v − actual prev); 446 } 447 return KStest; 448 } 15

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment