Probability, Statistics and Planet Earth, I: Geotemporal covariances

The study of covariances (or positive definite functions) on the sphere (the Earth, in our motivation) goes back to Bochner and Schoenberg (1940--42) and to the first author (1969, 1973), among others. Extending to the geotemporal case (sphere cross …

Authors: N. H. Bingham, Tasmin L. Symons

Probabilit y , Statistics and Planet Earth, I: Geotemp oral co v ariances. N. H. Bingham a nd T asmin L. Symons September 17 , 201 8 Abstract The study of co v a riances (or p ositiv e definite functions) on the sphere (the Earth, in our motiv at ion) goes bac k to Bo c hner and Sc ho en- b erg (1940–4 2) and to the first auth o r (1 969, 1973), among others. Extending to the geotemporal case (sphere cross line, for p osition and time) w as for a long time an obstacle to geostati stical mo delling. T he c haracterisatio n qu e stion here w as raised by the authors and Mija- to vi ´ c in 20 16, and answ ered b y Berg and P orcu in 2017. Extens ions to m ultiple p ro ducts (of spheres and lin es) follo ws similarly (Guella, Menegatto and Peron, 2016). W e su rv ey results of this t yp e, and related applications e.g. in numerical w eather pr ediction. MSC2010 Classification : 60-02; 60G6 0; 62H11; 60G 15. 1. Intro duction It has been kno wn f or millennia that the Earth is spherical, and since then our natural interes t in predicting w eather and climate phenomena has driv en the study of the interaction b et w een spherical geometry and randomness, surv ey ed here and in the sequel. When one considers sto c hastic pro cesses para metrised by the Earth (plus time) the first question is naturally one o f existenc e. F or the prototypic al case of Brownian motion on S d × R , see t he recen t surv ey [BinMS]. The application of probability and statistics to the Earth natura lly b e- gins with means and (co-)v ariances , b efore progressing to distributions, with Gaussianit y a s a first approx imation. The study of means, and their c hanges o v er time, is ob viously of prime imp ortance b ecause of climate c hange. W e 1 defer t his and man y other matters to Part I I of this surv ey , and address co v ariances here. In the case of Gaussian pro cesses, c haracterised completely by their mean and co v ariance, the question is o ne o f p ositiv e definiteness on spheres. In or- der to bring in time, w e need to w ork in a geotempora l con text – on a pro duct space, ‘sphere cross (half-)line’, where the spherical fa ctor is p osition on the Earth and t he (half-)line factor is time. Until recen tly , the study o f p ositive definite functions was fo cussed on spheres and other t w o-p oint homogeneous Riemannian manifolds, but this is to o restrictiv e – the in teraction across space and time is of prime imp o rtance f or the study of t he Earth’s atmo- sphere and o ceans, from the short term (micro: w eat her) to the medium term (meso: prediction of next ye ar’s crops etc.) to the long term (macro: climate, and climate change ) . Our treatment hinges on three results. The first, f r om 194 0-42, is classi- cal; the second and t hir d, from 2 0 16 and 2017, are m uc h more recen t. 2. The Bo c hner-Sc ho en b erg theorem The study o f p ositiv e definite functions on general metric spaces dates bac k to Sc ho enberg in 1938 [Sc h1], and to spheres S d in particular to Bo c hner and Sc ho en b erg 1940-42 ([Bo cS] in 1 940; [Sc h3] in 1940, [Sc h4] in 1942; Bo c hner [Bo c2] in 1941, [Bo c3] in 1942). Later extensions to probability theory w ere made by Kennedy [Ken], Lamperti [La m], G angolli [Gan] and Bingham [Bin1, 2]. The imp ortance of this classical area has b een widely recognised b y the geostatistical comm unity recently , notably by Gneiting [Gne1, 2, 3]. It is mathematically almost immaterial whether one c ho oses to w ork with co v ariances or with correlations. The first is more con v enient for a pplications, our motiv ation here, as they retain the natural units of the data; the la tter are scalars, and scale-free. They differ here merely b y a scale-factor, which w e write b elo w as c ∈ (0 , ∞ ). Another p ossibilit y is the incremen tal v ariance (or v a riogram), whic h mus t b e of negativ e t yp e . This is preferable for some purp oses; see [BinMS] fo r bac kground and references. W e restrict atten tion to the isotr o pic case, where the p o sitiv e definite function f is of the form f ( x , y ) = g ( d ( x , y )) ( x , y ∈ S d ) , where d denotes geo desic distance on the sphere; here g is called the isotr opic p art of f . F or con v enience, w e adopt the usual normalisation of taking the 2 sphere to ha v e radius 1; th us f is defined o n S , g on [ − 1 , 1], a nd x = d ( x , y ) ∈ [ − 1 , 1]. It turns out that the general case can b e simply built up fro m a coun table set of sp ecial cases. The relev ant f here ar e sp h eric al harmoni c s , or (zonal) s p heric al functions ( § 8.1 , [BinMS], [Hel1,2]). The relev an t g here are ort ho gonal p o lynomials (f or ba ckground on whic h see e.g. [Sze ]), the Ge genb auer (o r ultr aspheric a l ) p olynomials , of index λ = 1 2 ( d − 1) , on S d ⊂ R d +1 as ab ov e. F or the 2-sphere in 3 -space (i.e. the Earth), these reduce ( λ = 1 2 ) t o the familiar L e gendr e p olynom ials , P n ( x ). F or the circle (1-sphere) in the plane, they r educe to the Tchebycheff p olynom ials T n ( x ) = cos ( n cos − 1 x ) ( x ∈ [ − 1 , 1]) . All t hese are orthogonal p o lynomials with ortho gonality in t erv a l [ − 1 , 1] [Sze]. Bo chn er-Sc ho en b erg Theorem . F or S d , the general isotropic p ositiv e- definite function, equiv alen tly , the g eneral isotropic co v ariance, is giv en (to within a scale factor) b y a c onvex c omb i n ation of Ge genb auer p olynomials P λ n ( x ): c ∞ X n =0 a n P λ n ( x ) , a n ≥ 0 , ∞ X n =0 a n = 1 , ( B S ) with Legendre p olynomials when d = 2. In probabilistic language, this says that the general case is a mixtur e o f spherical harmonics. In particular, taking the mixing law to concen tr a te at one p oin t: the Gegenbauer (Legendre) p olynomials are themselv es p ositiv e definite [Sch4, 177]. T he gist of the Bo c hner-Sc ho en b erg theorem is th us that the F ourier-Ge genb auer c o efficients a n o ccurring in this mixture ar e non-ne gative . Ask ey and Bingham [AskB, § 2] ex tended this result to Gaussian pro cesses on the other compact symmetric spaces of rank 1 (tw o-p o in t homogenous Rie- mannian manifolds). 3. The Berg-P orcu theorem 3 Recall some basic closure prop erties of the class P ( M ) of p ositiv e definite functions on a symmetric space M (or semigroup, as in Berg et al. [BerCR]): P ( M ) is closed under p ositiv e scaling and conv ex comb inations ( b oth touc hed on in the a b ov e). F urther, P ( M ) is closed under p oin t wise pro ducts, whic h corresp onds to con v olution (of probability measures), and to tensor pro ducts (of represen ta tions); see e.g. [AskB, 5.2]. Similarly for product space s (as here): by the Sc hur pro duct theorem ( see e.g. [HorJ, 458]), if f 1 ∈ P ( M 1 ), f 2 ∈ P ( M 2 ), then f 1 f 2 ∈ P ( M 1 × M 2 ). W e turn now from sphere S d to sphere cross line S d × R . As b efore, w e confine attention to the isotropic case for the space v ariable on the sphere, and similarly , we restrict to t he statio nary case for the time-v ariable on the line. The Bo chner-Sc ho enberg theorem giv es us a full kno wledge of P ( S d ), and we kno w P ( R ) ( o r P ( R + )) from Bo c hner’s theorem [Bo c1] of 1 933 fo r c haracteristic functions on the line. By t a king pro ducts, we o btain elemen ts of P ( S d × R ). These are cov a r ia nces on sphere cross line, but where the spatial and temp oral asp ects are indep enden t. These – called sep ar able space-time co v ariances b y Gneiting [Gne2] – are ob viously unrealistic: for space-time mo delling, one needs non -sep ar abl e co v ariances, not of pro duct form. By the ab ov e, we can write do wn a large class of suc h functions b y for ming c on v e x c ombinations of sep ar abl e ones – the class P ( S d × R ) b eing closed under con- v ex comb inations. In view of the Bo chne r-Schoen b erg theorem, the nat ural conjecture here is that t his give s the general case; in an y case, it give s a ric h enough class to b e amply flexible enough for mo delling purp oses (at least in the first instance). With this in mind, the question w as raised by Mijatovi ´ c and the a uthors [BinMS, 4.4] in 2 016 of extending the results of Ask ey and Bingham [AskB] to sphere cross line. This has b een done by Berg and P orcu [BerP], 2017. Their result confirms the conjecture ab ov e: Berg-P orcu Theorem . The class of isotro pic stationary sphere-cross-line co v ariances coincides with the class of con v ex com binatio ns of se parable ones: c ∞ X n =0 a n P λ n ( x ) φ n ( t ) , a n ≥ 0 , ∞ X n =0 a n = 1 . ( B P ) with the φ n c haracteristic functions on the line. Comparing this with the Bo c hner-Sc ho en b erg theorem, one can readily conjecture what will ha pp en with multiple pro ducts, of a num b er of spheres 4 and o f (half- )lines, represen ting time(s), heigh t(s), etc. This is the conten t of the Guel la-Mene gatto-Per on the or em , to whic h w e no w turn. 4. Multiple pro ducts: the Guella-Menegatto-P eron theorem An enormous quantit y of geostatistical da ta is collected for use in w eather forecasting ( § 6 b elo w) – curren t operatio nal w eather forecast mo dels use O (10 7 ) observ ations of rainfall, temperature, press ure, snow depth (when relev an t), wind direction, sp eed and strength of gusts, visibility , cloud (type, amoun t, heigh t), su nshine, etc, whic h are then comb ined with a dynamical mo del based on atmospheric fluid dynamics [V al]. The multiv aria te nature of this data motiv ates the extension o f the w ork ab ov e to pro ducts of spheres. This was done for t w o spheres in 2016 b y G uella, Mene gatto and P eron [GueMP1]. Their result extends, b y their metho ds, to multiple pro ducts of spheres, and also to m ultiple pro ducts of lines, by the metho ds ab o v e. W e summarise all this as follow s: Guella-Mene gatto-Peron Theorem . F o r a m ultiple pro duct of N spheres and M (half-)lines, the g eneral p ositiv e definite function isotropic in the space v ariables and stationary in the time v a riables is a multiple of a con vex com bination of separable o nes, c ∞ X n 1 ,...,n M + N =0 a n 1 P λ 1 n 1 ( x 1 ) · · · P λ r n N ( x r ) φ n N +1 ( t 1 ) · · · φ n N + M ( t s ) , (GMP) where a n ≥ 0 , P ∞ n =0 a n = 1, as in the ab o v e. The authors o f [GueMP1] remark (p.672-3) that they do not hav e an y practical applications to offer. In fact, their result is v ery timely and ve ry ric h in practical applications, to whic h we turn in § 6. The 1-sphere is relev a n t here for capturing se a s onal effects, wind direction etc. A related extension of the Berg- Porcu theorem to the multiv ar iate case, motiv ated as here b y Planet Earth, has recen tly b een giv en by Alegria, P orcu, F urrer and Mateu [AlePFM]. 5. Implemen tation: three kinds of truncation 5 As with a n y result in volving an infinite op eration (summation, integra- tion etc.), ( B P ) calls for trunction in general b efore implemen tation can tak e place. Index trunc ation . In F ourier analysis, the trigonometric functions (or complex expo nen t ia ls) o ccur as eigenfunctions with eigen v alue − n 2 . This enables most F ourier series to b e tr uncated fo r n umerical use after comparativ ely few terms; for bac k- ground, see e.g. the journal Applied a nd Computatio nal Harmonic Analy- sis . With the ultraspherical p olynomials, t he corresp onding eigen v alues are − n ( n + 2 λ ) [Sze, (4.7.4)], whic h increase as fast. So , truncation of the F ourier- Gegen bauer series after a suitable nu m b er of t erms is as straightforw ar d a s in the F ourier case. As one may expect the dominance of the ear ly harmonics to b e reflected in decreasing size of the co efficien ts a n , one ma y use the tec hnique of monotone r e gr essi o n to build this in to the mo del. F or details o n this, see [BarBBB], [RobWD], and for the p ow er of this t echniq ue in actio n, see e.g. [BinP]. Sp atial trunc ation . The arg umen t x in ( B S ), ( B P ) , ( GM P ) is in [ − 1 , 1], and is x = cos( d ( x 1 , x 2 )) , with d geo desic distance and x i p oin ts on the surface of the Ear t h, nor- malised as a b ov e to ha ve radius 1 (cf. e.g. [AskB, 133]). Using the further normalisation W n ( x ) := P λ n ( x ) /P n (1) , the spherical functions W n are uniformly b ounded [AskB, 130], and | W n ( x ) | ≤ W n (1) = 1 for all n and x . Th us the spatial t erms P λ n ( x ) in ( B S ), ( B P ), ( GM P ) are largest (in mo dulus) when their arg umen ts, the t wo spatial p oin ts x i , are close. So these g ive the most significant terms, and it mak es sense to restrict also to nearby pairs of p oin ts when t runcating (how close will dep end on con text; see b elo w). T e mp or al trunc ation . W eather (as distinct from seasonal or climatic aspects) is notorio usly 6 short-term: one migh t say that the system generating the Earth’s w eather has short-t erm memory . It thus mak es sense also to t r uncate the expansions to drop terms where the t ime-separatio n b et w een the tw o space-time p oints is to o long. One p opular metho d in geostatistical modelling is to mo del the time deca y b y a mem b er of the Cauch y scale family with densit y and ch aracteristic function f ( x ) = c π (1 + c 2 x 2 ) , φ ( t ) = exp {−| t | /c } . Th us temp oral deca y lik e a n in v erse square seems a ppro priate here – one ma y exp ect b oth short- and long-range dep endence. Combination . F or practical use (b elow), a ll three fo rms of truncation – index, spatial and temp oral – will b e needed; whic h order they should b e done in, and how, will dep end o n the t yp e o f data b eing mo delled and the purp ose of the study . But ( B P ) and ( GM P ) giv e one all the flexibilit y one could ask for here. 6. Applications: Kriging and N umerical W eather Prediction (NWP) These strikingly simple and v ery recen t results, the Berg-P orcu and G uella- Menegatto-P eron theorems, are of profound imp ortance in sev eral ma j or geo- statistical areas. Kriging. Kriging (na med after t he mining engineer D . G. Krig e) is the statistical tec hnique f o r probabilistically interpolat ing a (space/ space-time) field from measured observ ations. An estimated p oin t Z ( x ) is f ormed of a w eighted a v erage (con v ex com bination) of the o bserv ations ( Z ( x i )) n i =1 , with closer ob- serv ations typically giv en more w eigh t. Clearly , then, the success of the in terp olation pro cedure is dep enden t on the correct sp ecification of the spa- tial structure of the field (in practice, kno wledge of the incremen tal v a r ia nce is necessary). F or details and references, see Cress ie [Cre], Stein [Ste], and for kriging on the sphere, Y oung [Y ou]. V ariationa l Data Assimilation Cen tra l to the pro cess of constructing w eather forecasts is data assimila- tion – the a r t and science of com bining an imp erfect mo del with inaccurate 7 observ ations; see the b o oks b y Kalnay [Kal], Ev ensen [Ev e2] and Reic h and Cotter [ReiC] for o v erviews of the p ossible tec hniques here. T o initialise the num erical mo del whic h generates a weather forecast, o ne needs the b est p ossible approx imation of the current state of the atmosphere ( o r o cean, o r b oth) – in the parlance of NWP this is kno wn a s the analysis x a . Giv en a prior (or bac kground ) x b , data assimilation see ks x a | y o , where the y o is a noisy set of observ ations. Ba yes ’s theorem giv es this a pro babilistic flav our: p( x | y o ) = p( y o | x )p B ( x ) p( y o ) . W e seek t o maximis e p( x | y o ). Assuming x b = x a + ǫ B , ǫ B ∼ N(0 , B ) and y o = H ( x ) + ǫ O , ǫ O ∼ N(0 , R ) this amounts to minimising J ( x ) = ( x − x b ) T B − 1 ( x − x b ) + ( y o − H ( x )) T R − 1 ( y o − H ( x )) . ( J ) This is kno wn as the cost function of 3D-V ar data a ssimilation – see the textb o oks b y Daley [Dal] and more recen tly Kalnay [Ka l] for details here. Impro v ements in this area hav e formed one strand of the ‘quiet rev olution’ in num erical w eather prediction o v er the past decade [BauTB]. In particular, the uptak e of four-dimensional assimilation (4D-V ar) in op era t ional settings allo ws fo r increased use of satellite data – here, observ ations are allo w ed to arriv e thro ug hout a time in terv al, rather than (as in 3D- V ar) b eing assumed coinciden tal with the desired time o f analysis x a . This is esp ecially imp or- tan t for forecasting in the Southern Hemisphere, with its smaller landmass (and therefore few er traditional weather observ ation p o sts). There are sev- eral complications to discuss, w e fo cus on t hr ee: Size . With millions of observ ations ( O (10 7 )/da y) and billions of mo del v ari- ables ( O (10 9 )), b oth B and R are h uge. Indeed, the co v ariance matrix B has O (10 18 ) en tries, and is simply to o big to b e stored on a computer. In prac- tice, (an appro ximation to) B is decomp osed in to more manageable matrices. Observ ation error correlations . Un til very recen tly [SteDN] the errors in the observ ations hav e b een as- sumed to b e indep enden t – that is, the observ ation error co v aria nce matrix R has b een assumed to b e diag onal. T o satisfy this assumption the (exp ensiv e to obtain) data ma y b e thinned or ‘sup erobb ed’ (a kind of av eraging). With 8 the uptake of hig her- resolution measuremen t instrumen ts this situatio n has ev olved f r om undesirable to un t enable, and efforts are b eing made to under- stand and imple men t non-diagonal R matrices, with nota ble con tributions b y Desroziers et al. [DesBCP], and a sc ho ol at the Univ ersit y of Reading [SteDN, W alDN, W alLDN], the latter of which is applied to the Ens em ble Kalman Filter (see b elo w). Since the o bserv ations errors are Gaussian pro cesses para metrised b y S 2 , or S 2 × R , the theorems abov e give a complete c haracterisation of the possible forms of the cov aria nce structure o f these erro rs. Nonlinearit y . The op erator H ab o v e, whic h maps mo del space in to observ ation space, is assumed to b e linear. This is an unrealistic assumption in the con text of w eat her and climate. A p ossible solution is the Ensem ble Kalman Filter (see b elo w), but this still requires the prior to b e G aussian. T o mo v e forward more in ven tiv e approach es are needed – some ideas and dev elopments in this direction are discusse d by v a n Leeu w en et al. in [v a nLCR, Part 1 ], including using v a riational data assimilation to arriv e at go o d prop osal densities, and new t yp es of appro ximate particle filters whic h b orrow from the ens em ble Kalman filter. Ensemble Kalma n filter A recen t dev elopment in num erical weather prediction is the idea of a probabilistic, rather than deterministic, f o recast. “W eather forecasts to day in v o lv e an ensem ble of n umerical weathe r predictions, pro viding an inher- en t ly proba bilistic assessmen t” [BauTB, p. 48]. The ensem ble Ka lman filter (EnKF), in tro duced by Ev ensen in 19 94, is a com bination of the Kalma n filter of con trol engineering (for background on whic h see e.g. Whittle [Whi]) with Mon te Carlo metho ds: an ensem- ble of sim ulatio ns ( particles) are run, enabling one to appro ximate a true co v ariance matr ix by a sample co v ariance matrix obtained fr o m these. F or a monogr a ph treatmen t se e e.g. Ev ensen [Ev e2], and for an earlier review pap er, [Ev e1]; it in v o lv es a com bination of the Kalman filtering ideas ab o v e, Ba y esian up dating, traditional v ariational tec hniques (Euler-Lagrang e equa- tions) [Ev e2, Ch. 9], and v ariational data assimilation (3D-V ar fo r space , 4D-V a r for space-time). T he metho d pro v ed its pr a ctical w orth early in w o rk by Ev ensen and v an Leeu w en [Ev evL] o n the Agulhas Curren t (off the SE coast of S. Africa). The ensem ble approac h pro duces a spread of results, 9 or scenarios, whic h capture the uncertain t y inheren t in w eather forecasting and allow a probabilistic in terpretation b y f orecasters. The dominan t store of energy in the en vironmen t is in the o ceans; this is the energy source that driv es the w eather. EnKF has b een wide ly used for study of the o ceans; see e.g. [Ev e2, Ch. 16]. It is already finding use in atmospheric studies; its use in coupled o cean-atmosphere in teraction is crucially imp ortan t, but y et to b e fully dev elop ed. F or an application of data assimilation t o three-dimensional h ydro dynam- ics, see W olf et al. [W olSBW]. 7. Chordal and geo desic distance. A function f : R d +1 × R d +1 → R is p ositiv e definite [BerCR] if n X i,j =1 c i c j f ( x i , x j ) ≥ 0 for all finite collections x 1 , . . . , x n ∈ R d +1 and real constan ts c 1 , . . . , c n ∈ R . These f a r e the F ourier(- Stieltjes) tra nsfor ms of (p ositiv e, finite) measures, b y Bo chner’s theorem. If, further, f ( x, y ) is a function purely o f the (Eu- clidean) distance t = || x − y || b etw een its t w o argumen ts, rather than the orien tation o f x − y , w e sa y f is isotropic . Suc h functions arise b y radial- ization of t he F ourier transform; they hav e an in tegral represen tation due to Sc ho en b erg [Sch 2]: f ( t ) = Z ∞ 0 Γ  d − 1 2   2 r t  d − 1 2 J ( d − 1) / 2) ( r t ) d µ ( r ) , where µ is a proba bilit y measure on R + and J ( d − 1) / 2 is a Besse l function of the first kind (cf. Bo c hner and Chandrasekharan [Bo cC, I I.7]). The se mo dified Bessel functions arise whenev er one radializes a c haracteristic function, as in Kingman’s random w alks with spherical symmetry [Kin]. A natural question here is whether one can restrict these functions from R d +1 to the sphere S d whilst retaining p ositive definiteness. A recip e due to Y adrenk o [Y a d] allows for the construction of a p ositiv e definite function ˆ f on S d from a p ositiv e definite function f on R d +1 via the transformation ˆ f ( θ ) = f  2 sin θ 2  , θ ∈ [0 , π ] . 10 Although this construction is con v enien t, one predictably pays a price for ignoring the geometry of the sphere. Gneiting [Gne3] highlighted this ap- proac h’s sev ere restrictions – t here is a lo w er bo und of inf t> 0 1 /t sin t ≈ − 0 . 21 on ˆ f when d = 2 (the geostatistical case). Moreo v er, G neiting argues that, since sin θ ≈ θ when θ is small, this construction is do omed to conflict with spherical geometry . It is preferable, therefore, to construct our p ositiv e def- inite functions on the sphere itself, using g eo desic (or great circle) distance in preference to Euclidean distance, as w e hav e done a b ov e. 8. Complemen ts 8.1. S pheric al harm onics, spheric al function s , Ge genb auer p olynomia l s . Spherical harmonics play the role in harmonic a nalysis on the sphere that trigonometric functions (o r complex exp onen tials) pla y on the line. They ha v e b een extensiv ely studie d, in classics suc h as MacRob ert [MacR], Hob- son [Hob] and more recen tly in e.g. Muller [Mul], Andrews, Aske y and Roy [AndAR, Ch. 9]. Spherical functions ( the term ‘zonal’ used to b e added, reflecting the isotrop y restriction w e used) arise similarly , as the name suggests. But they can b e defined m uc h more generally , in the con text of symmetric spaces; see e.g. [Hel1,2], [BinMS] f o r bac kground a nd references. In the compact case, as with spheres, thes e form a countable set. These reduce, as here, to t he Gegen bauer (or fo r us, Legendre) p olynomials. In turn, the addition form ula for the Gegen bauer p olynomials reflects (and is most easily prov ed via) the action of the group S O ( d ) on the sphere S d = S O ( d + 1) /S O ( d ) a s a coset space (symmetric space, as ab ov e); see [BinMS]. The reason that the mathematics here is dominated by orthogonal p oly- nomials rests directly on the g eometry of the sphere: it arises as a quotien t of compact Lie gr o ups; harmonic analysis on these rests on the P eter-W eyl theorem, f or whic h one has the Sc hur orthog onalit y relatio ns; see e.g. [App, § 2.2.1]. 8.2. T he F unk-He cke formula and F ourier-Ge genb auer c o efficie n ts . The F unk-Hec k e formula ([Mul], [AndAR Ch. 9]) relates ( in the notation of § 2) the integral in g ov er [ − 1 , 1] in the F ourier-Gegen ba uer co efficien ts to an inte gral in f ov er t he sphere S d . It is the p ositivity of these that is the crux here. 11 8.3. S trict p ositive de finiteness . Strict p ositiv e definiteness is defined in the usual w a y: the relev an t (non- negativ e) sums are p ositive ex cept when all the co efficien ts v anish. Che n, Menegatto and Sun [CheMS ] and Barb o sa and Mene gatto [BarM] c harac- terise strict p ositiv e definiteness here: infinitely many co efficien ts a n need to b e p ositiv e (for b oth eve n and o dd n in the case of spheres, a complication reflecting antipo dal p oin ts). Similar studies ha v e b een carried out by Guella, Menegatto and Pe ron ([GueM], [GueMP2,3]). In t eresting though t hese re- sults are, they are o f limited practical imp ortance: the strictnes s criterion (infinitely man y p ositiv e co efficien ts) do es not surviv e the t r uncatio n needed for practical implemen tatio n ( § 6). 8.4. The nugget effect . The nugget effect refers to the phenomenon of a non-zero incremen tal v ariance at zero separation, initially observ ed (and named after) the large amoun t of v a riabilit y in observ atio ns of go ld n uggets in v ery small geographic areas. The existence of the n ugget – or mor e precisely , whether it is an ar t e- fact of measuremen t error or whether it is inheren t micro- scale v aria bilit y in the pro cess b eing observ ed – is a sub ject of some scien tific debate. See [Cla] for mo r e details o f this interes ting effect, and [GneGG] for an example of the n ugget effect b eing fitted in a spatio-temp ora l cov aria nce mo del. 8.5. Lagra ng ian framew orks . In fluid dynamics o ne has a choice of o ne’s frame of reference – one may either sit on the riv er ba nk and w atc h the w ater pass (the Euclidean frame- w o rk), or one may follow the b o dy of w ater in a b oa t (the Lagrangian f rame- w o rk). In geostat istical cases, when symmetry is an inappropriate assump- tion, a Lag rangian co v ar ia nce structure ma y b e more suitable, o r, as sug- gested by G neiting [Gne2], a con v ex com bination of a symmetric co v ariance and a Lagrangian one. The basic idea of a Lagrangian co v ariance is to ta k e a stationary spatial random field with cov ariance C s ( h ) and follow it along a s it mov es with (random) v elo city V . The co v ariance of this field then has the for m C ( h , t ) = E V ( C s ( h − t V )) . Gneiting et al. [GneGG] discuss p o ssible choices for V , in particular not ing that prev ailing winds may b e mo delled b y the simplest case V = v , a con- stan t. They also raise the in teresting p ossibilit y of a dynamic ve lo city V ( t ), 12 whic h w ould yield nonstationary cov ariances. This framew ork has recen tly b een extended to the S d × R case by Alegria and P o r cu [AleP1], who also discuss the dimple effect (where the field is more strongly correlated when one steps forw ards in time than than space) for tr ansp ort mo dels. 8.6. Seasonality . T aking R as the time domain o f a spatio-temp oral field is na tural enough, but do ing so restricts us to cases where there is no cyclical comp onent to av oid violating the assumption of stationarity . An in triguing approa c h b y Alegria and P orcu [AleP2] comb ines t he idea of the Guella-Menegatto-P eron multiple pro duct S d × S d ( § 4) with the w ork o f Berg and P orcu o n S d × R ( § 3) to mo del spatio-temp o r a l fields o n S d × S 1 × R – with the temp oral comp onent decompo sed in to S 1 × R the cyclical com- p onen t can b e dealt with separately in a v ery natural w a y . The approac h is complemen ted b y an extension to the Lagrangian framework, as done in the S d × R case by the same a uthors [AleP1]. 8.7. Mixtur e m o dels . All three o f the Bo c hner-Sc ho en b erg, Berg- P orcu and Guella- Menegatto- P eron theorems giv e the relev an t classes of co v ariances a s (m ultiples of ) mix - tur es (con v ex com binations), of an n th harmonic – n th degree Legendre (o r Gegen bauer) factor, times other suc h factors – separable co v ariances, in the BP and GMP cases. F or implemen tation, w e must truncate the infinite mixture to a finite one. Suc h (finite) mixtur e mo dels hav e b een extensiv ely studied in statistics, and a great deal is kno wn ab out them. Our main source is the b o ok of McLac hlan and Pee l [McLP]; see also Titterington, Smith and Mak ov [TitSM], Lindsa y [Lind]. F or lar g e data sets as here, computer imple men tation is essen tial. The main to ol here is EMMIX [McLP], a v ersion fo r mixture mo dels of the EM (exp ectation-maximization) algorithm of Dempster, Laird and Rubin [DemLR]; see also Meng and v an D yk [Men vD]. One adapts this here from mixtures of distributions to mixtures of co v ariance matrices b y replacing v ari- ational distance b y a suitable matrix norm. W e note that the relev ant co v ariance matrices are of blo ck diagonal form, reflecting the separabilit y mentioned a b o v e. Their in v ersion tog ether is th us no harder than their inv ersion separately . W e are a lso spared ha ving to mo del off-diagonal blo ck submatrices represen ting, say , space-time inte raction: this is done for us b y the Legendre-p olynomial (‘sphe rical harmonic’) structure 13 in ( B S ) , ( GM P ). In regression, o ne ty pically decides how many parameters one is prepared to use, and then tests the h yp othesis that the last one used is statistically significan t (Kolo dzieczyk’s theorem: see e.g. [BinF, Th. 6.5]). In deciding ho w man y misture comp onents (harmonics) t o retain here, one can pro ceed as there, or use a Bay esian approach, putting a pr io r on the n umber o f com- p onen ts. F or v arious choices here, see [McLP , § 4.7 ]. Mixture ideas are used in the Euclidean case by Ma [Ma] and P orcu and Mateu [P orM]. 8.8. L ar ge c ovarianc e matric es . In ( J ) of § 6, one needs to inv ert large sample-cov aria nce matrices B , R . If not sparse enough, this is a c hallenging numerical task; how ev er, there is a great deal of b oth relev an t theory a nd practical exp erience. F or the latter, w e refer to the sources cited in § 6, and the curren t practices of dat a thinning and ‘sup erobbing’ (a v eraging ov er groups of data, and inflating cov ar ia nces) in NWP . F or the former, one needs the c ondition numb er (ratio of la rgest to smallest eigen v a lues: see e.g. Wilkinson [Wil, § 4.3], Golub and V an Loa n [GolVL, Ch. 8]); in this conte xt, see e.g. Bai and Lin [BaiL], Lin, Bai and Krishnaiah [L inBK]. Among a w ealth of o ther sources, we men tion here also P ourahmadi [P ou1,2], Ledoit and W olf [LedW1,2], W on et al. [W onLKR], Bic kel and Levina [BicL], El Ka roui [ElK]. 8.9. S tatistics on spher es and spheric al r e gr ession . Statistics on the 1- sphere (circle) is treated in Fisher [Fis]. One motiv a- tion (Ch. 7) is wind dir ection. Statistics on the sphere is tr eated in W atson [W at] and F isher, L ewis and Em bleton [FisLE]. One needs a range of spher- ical analogues o f the standar d distributions in Euclidean space, including the Fisher, Fisher-Bingham (R. A. Fisher and C. Bingham) a nd Ken t distri- butions. A classical mot iv ation here w as to remane n t magnetism in ro c ks, seeking supp ort for W egener’s theory of continen tal drift. F or an application of mixture mo dels to join t sets (parallel fractures) in ro cks , see P eel et al. [P eeWM]. F or spherical regression, similarly applicable to plate tectonics, see Do wns [Dow] and the references cited there. 8.10. Schur p olynomia l s . A recen t a lg ebraic approach to this area, also motiv ated b y suc h things as the Berg-Porcu theorem and large co v ariance matrices, has b een give n by 14 Belton, Guillo t , Khare and Putinar [BelGKP]. It uses Sc hur p olynomials, and prop erties o f symmetric functions; for details, see [BelGK P] and the ref- erences cited there. Ac kno wledgemen t . The second author thanks the EPSR C Mathematics of Planet Earth CDT, based join tly in the Mathematics D epartmen t at Imp erial College, London and the Meteorology D epart ment at Reading Univ ersity , for financial sup- p ort during her do ctoral studies. The first author thanks the same sources for r ekindling his mathematical inte rest in his own do ctora l studies long ago. References [AleP1] A. Alegria and E. P orcu. The dimple problem related to space-time mo delling under a Lagrangian framew ork. arXiv:1611. 09005v1 (2016). [AleP2] A. Alegria and E. P orcu. Space-time g eostatistical mo dels with b oth linear a nd seasonal structures in the temp ora l comp onents. arXiv:1702. 01400 (2017). [AlePFM] A. Alegria, E. P orcu, R. F urrer and J. Mateu. Co v ariance func- tions for m ultiv ariate Gaussian fields ev olving temp ora lly o v er Planet Earth, arXiv:1701. 00601v1 (2017). [AndAR] G . E. Andrews, R. Ask ey and R . R oy , Sp ecial f unctions . Encycl. Math. Appl. 71 , Cam bridge Univers it y Press, 1999 . [App] D. Applebaum, Probability on compact Lie groups . Springer, 2014. [AskB] R. A. Ask ey and N. H. Bingham, Gaussian pro cesses on compact symmetric spaces. Z. W ahrsc hein. v erw. Geb. 37 ( 1 976), 127 -143. [BaiL] Z. D . Bai and Y. Q. Lin, Limit of the smallest eigen v alue of a la rge- dimensional sample cov a riance matrix. Ann. Probab. 21 (199 3 ), 1275-129 4. [BarM] V. S. Barb osa and V. A. Menegatto. Strictly p ositiv e definite ke rnels on t w o-p oin t compact ho mog eneous spaces. Math. Inequal. Appl. 19 ( 2) (2015), 743–7 56. [BarBBB] R. E. Barlo w, D. J. Ba rtholomew, J. M. Bremner and H. D . Brunk, Statistical inference under order restrictions . Wiley , 197 2. [BauTB] P . Bauer, A. Thorp e and G. Brunet. The quiet rev olution of n u- merical w eather prediction. Nature 525 (201 5), 47–5 5 . [BelGKP] A. Belton, D. G uillot, A. Khare and M. Putinar, Sc h ur p olynomi- als and matr ix p ositivity preserv ers. arXiv:1602.04777 (2016) . 15 [BerCR] C. Berg, J. P . R. Christens en a nd P . Ressel, Harmonic analysis on semigroups: Theory of p ositiv e definite and related functions . Springer, 1984. [BerP] C. Berg and E. P orcu, F rom Sc ho enberg co efficien ts to Schoenberg functions. Constructiv e Appro ximation 45 (2017), 217 – 241. [BicL] P . J. Bic k el and E. L evina, R egula r ised estimation of large co v ariance matrices. Ann. Sta t ist. 36 (2008), 188- 227. [Bin1] N. H. Bingham, Limit theorems and semigroups in pro ba bilit y t heory . PhD thesis, Univ ersit y of Cam bridge, 1969. [Bin2] N. H. Bingham, P ositiv e definite functions on spheres. Pro c. Cam- bridge Phil. So c. 73 (1 973), 145 -156. [BinF] N. H. Bing ha m and J. M. F ry , Regression: Linear mo dels in statistics . Springer, 201 0. [BinMS] N. H. Bingham, Alexsandar Mijatovi ´ c and T asmin L. Symons. Bro w- nian manifolds, negativ e ty p e and geo-temp oral co v ariances. Comm unica- tions in Sto chas tic Analysis (H. Hey er F estsc hrift) , 10 (4) (2 016), 421 – 432. [BinP] (with S. M. Pitts): Non-parametric inference for the M / G/ ∞ queue. Ann. Inst. Sta tistical Mathematics 51 (1 999), 71-97 . [Bo c1] S. Bo c hner, Monoto ne F unktionen, Stieltjessc he In tegrale und har- monisc he Analyse. Math. Annalen 108 (1933 ), 378-4 10. [Bo c2] S. Bo c hner, Hilb ert distances and p ositiv e definite functions. Ann. Math. 42 (1 941), 647 -656. [Bo c3] S. Bo chner, Review of [Sch4]. Math. Reviews 3 (1942) p.23 2 (MR000592 2 ). [Bo cC] S. Bo c hner and K. Chandrasekharan, F ourier transforms . Annals of Mathematics Studies 19, Princeton Univ ersit y Press, 19 4 9. [Bo cS] S. Bo c hner and I. J. Sc ho enberg, On p ositive definite functions on compact spaces. Bull. Amer. Math. So c. 46 (1940), 8 81. [Cla] I. Clark. Statistics or geostatistics? Sampling error or n ugget effect? Journal of the Sout hern African Institute o f Mining and Metallurgy 110 (6) (2010), 307–3 12. [CheMS] D . Chen, V. A. Menegatto and X. Sun. A necessary and sufficien t condition for strictly p ositiv e definite functions on spheres. Pro c. Ame r. Math. So c. 131 (9) (20 0 3), 2733– 2740. [Cre] N. Cressie, Statistics for spatial data , r evised ed., Wiley , 199 3. [Dal] R. D aley . Atmospheric Data Analysis . Cam bridge Univ ersit y Press, 1991. [DemLR] A. P . Dempste r, N. M. Laird a nd D. B. Rubin, Maxim um lik eli- ho o d from incomplete data via the EM algorithm (with discussion). J. R oy al 16 Statistical So ciet y B 39 , 1- 3 8. [DesBCP] G. D esroziers, L. Berre, B. Chapnic k and P . P oli. Diagnosis of observ ation, bac kground a nd analysis-error statistics in observ ation space. Q. J. R . Met. So c. 131 (2005), 3385 – 3396. [Do w] T. Down s, Spherical regression. Bio metrik a 90 (2003), 655-668 . [ElK] N. El Karo ui, Sp ectral estimation fo r large-dimensional co v a riance ma- trices using random-matrix theory . Ann. Statist. 36 (20 08), 186-19 7. [Ev e1] G. Ev ensen, The ensem ble Kalman filter: theoretical form ulation and practical implemen tation. Ocean D ynamics 53 (2003), 343- 367. [Ev e2] G. Ev ensen, Data assim ilation: The ensem ble Kalman filter . Springer, 2006. [Ev evL] G . Ev ensen and P . J. v an Leeu w en, Assimilation o f Geosat altimeter data for the Agulhas curren t using the ensem ble Kalman filter with a quasi- geostrophic mo del. Mon thly W eather Review 24 (1996), 85- 9 6. [FinHI] B. Fink enst¨ adt, L. Held and V. Isham, Statistics of spatio-temp oral systems . Chapman & Hall/ CRC, 2007. [Fis] N. I. Fisher, Statistical analysis of circular data . Cambridge Univ ersit y Press, 1993. [FisLE] N. I. Fisher, T. Lewis and B. J. J. Em bleton, Sta tistical analysis of spherical data . Cam bridge Unive rsit y Press, 1987. [Gan] R. Gangolli, P ositiv e definite k ernels o n homogeneous spaces a nd cer- tain sto c hastic pro cesses related to L ´ evy’s Brownian motion of sev eral pa- rameters. Ann. Inst. H. P oincar´ e B (NS) 3 (1967), 121- 2 26. [Gne1] T. Gneiting. Correlation functions for atmospheric data ana lysis. Q. J. R. Met. So c. 135 no. 55 9 (1999), 24 49 – 24 64. [Gne2] T. Gneiting, Non-separable, stationary co v aria nce functions for space- time data. J. Amer. Sta t. Asso c. 87 (2002), 590-600. [Gne3] T. Gneiting, Strictly and non- strictly positive definite functions on spheres. Bernoulli 19 (201 3 ), 1327 - 1349. [GneGG] T. Gneiting, M. G. Genton and P . Guttor p, Geostatistical space- time mo dels, stationar it y , separabilit y and f ull symmetry . In [FinHI], 151 - 175. [GolVL] G. H. Golub and C. F . V an Loa n, Matrix computation , 3rd ed. The Johns Hopkins Univ ersity Press, 1996. [GueM] J. C. Guella and V. A. Menegatto, Strictly p ositiv e definite k ernels on a pro duct of spheres. J. Math. Anal. Appl. 435 (201 6), 286-301 . [GueMP1] J. C. Guella, V. A. Menegatto and A. P . Pe ron, An extension of a theorem of Schoenberg to pro ducts of spheres. Banac h J. Math. Analysis 17 10 (2016), 671-685 . [GueMP2] J. C. Guella, V. A. Menegatto and A. P . P eron, Strictly p ositiv e definite kernels on a pro duct of spheres I I. SIGMA (Symmetry , In tegrabilit y and Geometry , Metho ds and Applications) 12 (2 0 16), 15p. [GueMP3] J. C. Guella, V. A. Menegatto and A. P . P eron, Strictly p ositiv e definite ke rnels on a pro duct of circles. Positivit y 21 (2017), 329 -342. [Hel1] S. Helgason, D ifferen t ial geometry a nd symmetric spaces . Academic Press, 1962. [Hel2] S. Helgason, Differen tial geometry , Lie gro ups and symmetric spaces . Academic Press, 1978. [Hob] E. W. Hobson, The theory of spherical and elllipsoidal ha rmonics . Cam bridge Univers it y Press, 1931. [HorJ] R. A. Hor n and C. R. Johnson, Matrix analysis , 2nd ed., Cam bridge Univ ersity Press, 2013. [Kal] E. Kalna y . A tmospheric mo delling, data assimilation and predictabil- it y . Cam bridge Univ ersit y Press, 200 3. [Ken] M. Kennedy , A sto c hastic pro cess asso ciated with the ultraspherical p olynomials. Pro c. Roy . Irish Acad. Sect. A 61 (1969), 61 – 89. [Kin] J. F. C. Kingman, Random w alks with spherical sym metry . Acta Math. 109 (1963), 11-53. [Lam] J. La mp erti, The a rithmetic of certain semigroups of p ositiv e op era- tors. Pro c. Cam bridge Philos. So c. 64 ( 1969), 16 1 – 166 . [LedW1] O. Ledoit and M. W olf , A w ell-conditioned estimate for large- dimensional co v ariance matrices. J. Multiv ariate Analysis 88 (2004), 36 5 - 431. [LedW2] O. Ledoit and M. W olf, No n-linear shrink age estimation of large- dimensional cov a riance matrices. Ann. Statist. 40 (2012), 1 0 24-1060 . [LinBK] Y. Q. Lin, Z. D. Bai a nd P . R. Krishnaiah, O n t he limit of the largest eigen v a lue of the large-dimensional sample co v ariance matrix. Probab. Th. Rel. Fields 78 (1988), 509-521. [Lind] B. G. Lindsa y , Mixture models: Theory , g eometry and applications . NSF-CBMS Reg. Conf. Ser. Prob. Stat. 5, IMS, 1995. [Ma] C. Ma, Spatio -temp oral co v ariance functions generated by mixtures. Math. Geolo g y 34 (2002), 651-684 . [MacR] T. M. MacRob ert, Spherical harmo nics . Meth uen, 1 9 27. [McLP] G. McLac hlan a nd D. Pe el, Finite mixture mo dels . Wiley , 2 0 00. [Men vD] X. L. Meng and D, v an Dyk, The EM algorit hm – an o ld folk song sung to a new fast tune (with discussion). J. Roy al Statistical So ciet y B 59 18 (1997), 511- 567. [MonAF] P Monestiez, D. Allard and R . F roidev aux. geoENV I I I : Geo- statistics for En vironmen tal Applications. Pro ceedings of the Third Euro- p ean Conference on G eostatistics for En vironmen ta l Applications, held in Avignon, F rance, No v em b er 2224, 20 00 . Springer, 2000. [Mul] C. Muller, Spherical har mo nics . Lecture Notes in Math. 17 (1966) , Springer. [P eeWM] D. Pe el, W. J. Whiten and G. McLac hlan, Fitting mixtures of Ken t distributions to aid in joint set iden t ification. J. Amer. Statist. Asso c. 96 (2001), 56-6 3. [P orM] E. Porcu and J. Mateu, Mixture-based mo delling for space-time data. En vironmetrics 18 (2007 ), 285-302. [P ou1] M. P ourahmadi, Cov ariance estimation: the GLM and regularization p ersp ectiv es. Statistical Science 26 (2011 ), 369-3 87. [P ou2] M. P ourahmadi, High- dimensional cov ariance matrices . Wiley , 2 0 13. [ReiC] S. Reic h a nd C. Cotter, Probabilistic forecasting and Ba y esian data assimilation , Cam bridge Univ ersit y Press, 2015. [RobWD] T. Rob ertson, F. T. W right and R. L. Dyks tra, Order-restricted statistical inference . Wiley , 19 88. [Sc h1 ] I. J. Sc ho enberg, Metric spaces and p ositive definite f unctions. T rans. Amer. Math. So c. 44 (1938), 522-536 (reprin ted in I. J. Sc ho enberg, Se- lected Papers, V ol. 1 (ed. C. de Bo or), Birkh¨ a user, 198 8, 100- 1 14). [Sc h2 ] I. J. Sc ho enberg, Metric spaces and completely monotone functions. Ann. Math. 39 (4) (19 38), 811 – 841. [Sc h3 ] I. J. Sch o enberg, On p ositiv e definite functions o n spheres. Bull. Amer. Math. So c 46 (1 940), 888. [Sc h4 ] I. J. Sch o enberg, P ositiv e definite f unctions on spheres. Duke Math. J. 9 (1 9 42), 96-108 ( Selected P ap ers 1 , 172- 185). [Ste] M. L. Stein, In terp olation of spatial data: some theory fo r kriging . Springer, 201 2. [SteDN] L. M. Stew art, S. L. Dance and N. K. Nic hols, Correlated observ a- tion errors in data assimilation. In t . J. Num. Meth. F luids 56 no. 8 (2007), 1521 – 1527. [Sze] G . Szeg˝ o, Orthogonal p olynomials , Amer. Math. So c. Collo quium Publications XXI I I, AMS, 1939. [TitSM] D. M. Titteringto n, A. F. M. Smith and U. E. Mak o v, Statistical analysis of finite mixture distributions . Wiley , 19 8 5. [V al] G. K. V allis, A tmospheric and o ceanic fluid dynamics: F undamen tals 19 and large-scale circulation . Cambridge Univ ersit y Press, 2006 . [v anLCR] P . J. v an Leeu w en, Y. Cheng and S. Reic h. Nonlinear Data As- similation . Springer, 2015. [W alDN] J. A. W aller, S. L. Dance and N. K. Nic hols. Theoretical in- sigh t in to diag nosing observ ation erro r correlations using o bserv ation- min us- bac kground and observ ation-min us-analysis statistics. Q. J. R. Me t. So c. 142 (2016), 418 – 4 3 1. [W alDLN] J. A. W aller, S. L. Dance, A. S. Lawles s a nd N. K. Nic hols. Es- timating corr elated observ ation error statistics using an ensem ble tra nsform Kalman filter. T ellus A 66 (20 1 4). [W at] G. S. W at son. Statistics on spheres . Wiley , 198 3 . [Whi] P . Whittle, Optimal con trol: Basics and b ey ond . Wiley , 199 6. [Wil] J. H. Wilkinson, The alg ebraic eigen v a lue problem . Oxford Univ erit y Press, 1965. [W olSBW] T. W olf, J. S ´ en ´ egal, L. Bertinas and H. W ack ernagel, Applicatio n of data assimilation t o three-dimensional hyd ro dynamics: The case of the Odra lago on. In [MonAF], 157–168 . [W onLKR] J.-H. W o n, J. Lim, S.-J. Kim and B. Ra ja ratnam, Condition- n um b er regularised co v ariance estimation. J. Roy al Statistical So ciety B 75 (2013), 427- 450. [Y ad] M. ˇ I. Y adrenk o, Spectral theory of random fields. T ranslation Series in Mathematics and Engineering. New Y o rk: Optimisation Softw are, 19 8 3. [Y ou] D. S. Y oung, R andom ve ctors and spatial analysis by geostatistics for geotec hnical a pplicatio ns. Mathematical Geology 19 (1987 ), 4 67-479. N. H. Bingham, Mathematics D epartmen t, Imperial College, L o ndon SW7 2AZ; n.bingham@ic.ac.uk T asmin L. Symons, Mathematics Depart men t, Imp erial College, Lo ndon SW7 2AZ; tls111@ic.ac.uk 20

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment