A Review of Nonparametric Hypothesis Tests of Isotropy Properties in Spatial Data

Submitted to Statistical Science

Zachary D. Weller and Jennifer A. Hoeting
Department of Statistics, Colorado State University

Abstract. An important aspect of modeling spatially-referenced data is appropriately specifying the covariance function of the random field. A practitioner working with spatial data is presented a number of choices regarding the structure of the dependence between observations. One of these choices is determining whether or not an isotropic covariance function is appropriate. Isotropy implies that spatial dependence does not depend on the direction of the spatial separation between sampling locations. Misspecification of isotropy properties (directional dependence) can lead to misleading inferences, e.g., inaccurate predictions and parameter estimates. A researcher may use graphical diagnostics, such as directional sample variograms, to decide whether the assumption of isotropy is reasonable. These graphical techniques can be difficult to assess, open to subjective interpretations, and misleading. Hypothesis tests of the assumption of isotropy may be more desirable. To this end, a number of tests of directional dependence have been developed using both the spatial and spectral representations of random fields. We provide an overview of nonparametric methods available to test the hypotheses of isotropy and symmetry in spatial data. We summarize test properties, discuss important considerations and recommendations in choosing and implementing a test, compare several of the methods via a simulation study, and propose a number of open research questions. Several of the reviewed methods can be implemented in R using our package spTest, available on CRAN.

MSC 2010 subject classifications: Primary 62M30; secondary 62G10.
Key words and phrases: isotropy, symmetry, nonparametric spatial covariance.

(e-mail: wellerz@stat.colostate.edu) (e-mail: jah@stat.colostate.edu)

∗ Weller's work was supported by the National Science Foundation Research Network on Statistics in the Atmospheric and Ocean Sciences (STATMOS) (DMS-1106862). Hoeting's research was supported by the National Science Foundation (AGS-1419558).

† The authors would like to thank Peter Guttorp and Alexandra Schmidt for organizing the Pan-American Advanced Study Institute (PASI) on spatiotemporal statistics in June 2014 which inspired this work.

imsart-sts ver. 2014/10/16 file: sts-isotropy6.tex date: Friday 6th November, 2015

1. INTRODUCTION

Early spatial models relied on the simplifying assumptions that the covariance function is stationary and isotropic. With the emergence of new sources of spatial data, for instance, remote sensing via satellite, climate model output, or environmental monitoring, a variety of methods and models have been developed that relax these assumptions. In the case of anisotropy, there are a number of methods for modeling both zonal anisotropy (Journel and Huijbregts, 1978, pg. 179-184; Ecker and Gelfand, 2003; Schabenberger and Gotway, 2004, pg. 152; Banerjee et al., 2014, pg. 31) and geometric anisotropy (Borgman and Chao, 1994; Ecker and Gelfand, 1999). Rapid growth of computing power has allowed the implementation and estimation of these models.

When modeling a spatial process, the specification of the covariance function will have an effect on kriging and parameter estimates and the associated uncertainty (Cressie, 1993, pg. 127-135). Sherman (2011, pg. 87-90) and Guan et al. (2004) use numerical examples to demonstrate the adverse implications of incorrectly specifying isotropy properties on kriging estimates.
Given the variety of choices available regarding the properties of the covariance function (e.g., parametric forms, isotropy, stationarity) and the effect these choices can have on inference, practitioners may seek methods to inform the selection of an appropriate covariance model.

A number of graphical diagnostics have been proposed to determine isotropy properties. Perhaps the most commonly used methods are directional semivariograms and rose diagrams (Matheron, 1961; Isaaks and Srivastava, 1989, pg. 149-154). Banerjee et al. (2014, pg. 38-40) suggest using an empirical semivariogram contour plot to assess isotropy as a more informative method than directional sample semivariograms. Another technique involves comparing empirical estimates of the covariance at different directional lags to assess symmetry for data on gridded sampling locations (Modjeska and Rawlings, 1983). One caveat of the aforementioned methods is that they can be challenging to assess, are open to subjective interpretations, and can be misleading (Guan et al., 2004) because they typically do not include a measure of uncertainty. Experienced statisticians may have intuition about the interpretation and reliability of these diagnostics, but a novice user may wish to evaluate assumptions via a hypothesis test.

Statistical hypothesis tests of second order properties can be used to supplement and reinforce intuition about graphical diagnostics and can be more objective. Like the graphical techniques, hypothesis tests have their own caveats; for example, a parametric test of isotropy demands specification of the covariance function.
A nonparametric method for testing isotropy avoids the potential problems of misspecification of the covariance function and the requirement of model estimation under both the null and alternative hypothesis, which can be computationally expensive for large datasets. Furthermore, nonparametric methods do not require the common assumption of geometric anisotropy. However, in abandoning the parametric assumptions about the covariance function, implementing a test of isotropy presents several challenges (see Section 5). A nonparametric test of isotropy or symmetry can serve as another form of exploratory data analysis that supplements graphical techniques and informs the choice of an appropriate nonparametric or parametric model. Figure 1 illustrates the process for assessing and modeling isotropy properties.

In this article we review nonparametric hypothesis tests developed to test the assumptions of symmetry and isotropy in spatial processes. We summarize tests in both the spatial and spectral domain and provide tables that enable convenient comparisons of test properties. A simulation study evaluates the empirical size and power of several of the methods and enables a direct comparison of method performance. The simulations also lead to new insights into test performance and implementation beyond those given in the original works. Finally, we include graphics that demonstrate considerations for choosing a nonparametric test and illustrate the process of determining isotropy properties.

[Figure 1 (flow chart). Recoverable box labels: Investigating Isotropy Properties; Graphical Techniques (Empirical Semivariogram Contour Plot, Rose Diagram, Directional Semivariograms); Hypothesis Tests (Parametric, Nonparametric); More Assessment of Isotropy Needed? (Yes/No); Choice of Nonparametric Test (see Figure 2); Determine Isotropy Properties; Model for Covariance Function (Nonparametric or Parametric; Isotropic or Anisotropic; Geometric or Zonal Anisotropy; Spectral or Spatial Domain); Model Estimation Under H0 and H1; Model Comparison Under H0 and H1 (e.g., LRT); Assessment of Candidate Models; Choose Final Model.]

Fig 1: A flow chart illustrating the process of determining and modeling isotropy in spatial data. The gray boxes indicate the focus of this paper.

The remainder of this article is organized as follows: Section 2 establishes notation and definitions; Section 3 details the various nonparametric hypothesis tests of isotropy and symmetry and includes Tables 1-3 which facilitate comparison between tests as well as test selection for users; Section 4 describes the simulation study comparing the various methods; Sections 5 and 6 provide discussion and conclusions. Additional details on the simulation study are specified in the Appendix.

2. NOTATION AND DEFINITIONS

Here we briefly review key definitions required for tests of isotropy.
For additional background, see Schabenberger and Gotway (2004). Let $\{Y(\mathbf{s}) : \mathbf{s} \in \mathcal{D} \subseteq \mathbb{R}^d, d > 1\}$ be a second order stationary random field (RF). Below we will assume that $d = 2$, although many of the results hold for the more general case of $d > 2$. For a spatial lag $\mathbf{h} = (h_1, h_2)$, the semivariogram function describes dependence between observations, $Y$, at spatial locations separated by lag $\mathbf{h}$ and is defined as

(2.1) $\gamma(\mathbf{h}) = \frac{1}{2}\mathrm{Var}(Y(\mathbf{s} + \mathbf{h}) - Y(\mathbf{s}))$.

The classical, moment-based estimator of the semivariogram (Matheron, 1962) is

(2.2) $\widehat{\gamma}(\mathbf{h}) = \frac{1}{2|\mathcal{D}(\mathbf{h})|} \sum [Y(\mathbf{s}) - Y(\mathbf{s} + \mathbf{h})]^2$,

where the sum is over $\mathcal{D}(\mathbf{h}) = \{\mathbf{s} : \mathbf{s}, \mathbf{s} + \mathbf{h} \in \mathcal{D}\}$ and $|\mathcal{D}(\mathbf{h})|$ is the number of elements in $\mathcal{D}(\mathbf{h})$. The set $\mathcal{D}(\mathbf{h})$ is the set of sampling location pairs that are separated by spatial lag $\mathbf{h}$. The covariance function, $C(\mathbf{h})$, is an alternative to the semivariogram for describing spatial dependence and is given by $C(\mathbf{h}) = \lim_{\|\mathbf{v}\| \to \infty} \gamma(\mathbf{v}) - \gamma(\mathbf{h})$ if the limit exists.

Let $\{\mathbf{s}_1, \ldots, \mathbf{s}_n\} \subset \mathcal{D}$ be the finite set of locations at which the random process is observed, providing the random vector $(Y(\mathbf{s}_1), \ldots, Y(\mathbf{s}_n))^T$. The sampling locations may follow one of several spatial sampling designs: gridded locations, randomly and uniformly distributed locations, or a cluster design. Some authors make the distinction between a lattice process and a geostatistical process observed on a grid (Fuentes and Reich, 2010; Schabenberger and Gotway, 2004, pg. 6-10). We do not explore this distinction further and will use the term grid throughout.

It is often of interest to infer the effect of covariates on the process, deduce dependence structure, and/or predict $Y$ with quantifiable uncertainty at new locations. To achieve these goals, the practitioner must specify the distributional properties of the spatial process.
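The classical estimator (2.2) can be sketched in a few lines for data on a unit-spaced rectangular grid. The function name, the list-of-rows data layout, and the convention that a lag is (column shift, row shift) are assumptions made for this illustration; they are not taken from the spTest package. The toy field is deterministic and chosen only so the result can be checked by hand; it is not meant to be a realization of a stationary RF.

```python
def empirical_semivariogram(y, h):
    """Classical moment-based estimate of gamma(h), as in (2.2), on a unit-spaced grid.

    y: list of rows, y[r][c] is the observation at row r, column c.
    h: integer lag (h1, h2) = (column shift, row shift); an assumed convention.
    """
    h1, h2 = h
    n_rows, n_cols = len(y), len(y[0])
    total, n_pairs = 0.0, 0
    for r in range(n_rows):
        for c in range(n_cols):
            r2, c2 = r + h2, c + h1
            # keep only sites s for which s + h is also on the grid, i.e. s in D(h)
            if 0 <= r2 < n_rows and 0 <= c2 < n_cols:
                total += (y[r][c] - y[r2][c2]) ** 2
                n_pairs += 1
    if n_pairs == 0:
        raise ValueError("no pairs of sites are separated by this lag")
    return total / (2 * n_pairs)


# Toy field: increases by one unit per column, constant within each column.
field = [[float(c) for c in range(5)] for _ in range(4)]
gamma_east = empirical_semivariogram(field, (1, 0))   # every squared difference is 1
gamma_north = empirical_semivariogram(field, (0, 1))  # every difference is 0
```

Here the two estimates differ (0.5 versus 0) even though both lags have length one, which is the kind of directional discrepancy that the tests reviewed below try to assess formally, with a measure of uncertainty attached.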
A common assumption is that the finite-dimensional joint distribution is multivariate normal (MVN), in which case we call the RF a Gaussian random field (GRF). We are interested in second order properties; thus, hereafter we assume that the mean of the RF is zero. A common simplifying assumption on the spatial dependence structure is that it is isotropic.

Definition 2.1. A second order stationary spatial process is isotropic if the semivariogram, $\gamma(\mathbf{h})$, [or equivalently, the covariance function $C(\mathbf{h})$] of the spatial process depends on the lag vector $\mathbf{h}$ only through its Euclidean length, $h = \|\mathbf{h}\|$, i.e., $\gamma(\mathbf{h}) = \gamma_0(h)$ for some function $\gamma_0(\cdot)$ of a univariate argument.

Isotropy implies that the dependence between any two observations depends only on the distance between their sampling locations and not on their relative orientation. A spatial process that is not isotropic is called anisotropic. Anisotropy is broadly categorized as either geometric or zonal (Zimmerman, 1993). In practice, if a process is assumed to be anisotropic, it is usually assumed to be geometrically anisotropic due to its precise formal and functional definition (Ecker and Gelfand, 1999). Geometric anisotropy is governed by a scaling parameter, $R$, and rotation parameter, $\theta$, and implies the anisotropy can be corrected via a linear transformation of the lag vector or, equivalently, the sampling locations (Cressie, 1993, pg. 64).

Symmetry is another directional property of the covariance (semivariogram) function, which is often used to describe the spatial variation of processes on a grid. We discuss symmetry properties here because they are implied by isotropy, and methods for testing isotropy can often be used to test symmetry.
The following definitions come from Lu and Zimmerman (2005) and Scaccia and Martin (2005), where the notation $C(h_1, h_2)$ denotes the covariance between random variables located $h_1$ columns and $h_2$ rows apart on a rectangular grid, denoted $L^2$.

Definition 2.2. A second order stationary spatial process on a grid is reflection or axially symmetric if $C(h_1, h_2) = C(-h_1, h_2)$ for all $(h_1, h_2) \in L^2$.

Definition 2.3. A second order stationary spatial process on a grid is diagonally or laterally symmetric if $C(h_1, h_2) = C(h_2, h_1)$ for all $(h_1, h_2) \in L^2$.

Definition 2.4. A second order stationary spatial process on a grid is completely symmetric if it is both reflection and laterally symmetric, i.e., $C(h_1, h_2) = C(-h_1, h_2) = C(h_2, h_1) = C(-h_2, h_1)$ for all $(h_1, h_2) \in L^2$.

Complete symmetry is a weaker property than isotropy. Isotropy requires that $C(h_1, h_2)$ depends only on $\sqrt{h_1^2 + h_2^2}$ for all $h_1, h_2$. The relationship between these properties is given by:

(2.3) $\text{isotropy} \implies \text{complete symmetry} \implies \begin{cases} \text{reflection symmetry} \\ \text{diagonal symmetry} \end{cases}$

Thus, rejecting a null hypothesis of reflection symmetry implies evidence against the assumptions of reflection symmetry, complete symmetry, and isotropy. However, failure to reject a null hypothesis of reflection symmetry does not imply an assumption of complete symmetry or isotropy is appropriate.

The aforementioned properties of isotropy and symmetry were defined in terms of examining the spatial random process in the spatial domain, where second order properties depend on the spatial separation, $\mathbf{h}$. Alternatively, a spatial random process can be represented in the spectral domain using Fourier analysis.
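The chain of implications in (2.3) can be checked numerically on a known covariance function. The sketch below is only an illustration under stated assumptions: the exponential covariance, the parameter names `ratio` and `theta` (standing in for the scaling and rotation parameters of geometric anisotropy), and the particular way the lag is rotated and stretched are all choices made here, not a parameterization given in the text. Symmetry is checked on a small finite set of integer lags.

```python
import math


def geo_aniso_cov(h1, h2, ratio=1.0, theta=0.0):
    """Exponential covariance with geometric anisotropy (illustrative parameterization).

    The lag is rotated by -theta and its second coordinate is stretched by `ratio`;
    ratio = 1 gives an isotropic model.
    """
    u = math.cos(theta) * h1 + math.sin(theta) * h2
    v = -math.sin(theta) * h1 + math.cos(theta) * h2
    return math.exp(-math.hypot(u, ratio * v))


# Finite set of integer lags on which the definitions are checked.
LAGS = [(h1, h2) for h1 in range(-3, 4) for h2 in range(-3, 4)]


def is_reflection_symmetric(cov, tol=1e-12):
    # Definition 2.2: C(h1, h2) = C(-h1, h2)
    return all(abs(cov(h1, h2) - cov(-h1, h2)) <= tol for h1, h2 in LAGS)


def is_diagonally_symmetric(cov, tol=1e-12):
    # Definition 2.3: C(h1, h2) = C(h2, h1)
    return all(abs(cov(h1, h2) - cov(h2, h1)) <= tol for h1, h2 in LAGS)


def is_completely_symmetric(cov, tol=1e-12):
    # Definition 2.4: both properties hold
    return is_reflection_symmetric(cov, tol) and is_diagonally_symmetric(cov, tol)


def isotropic(h1, h2):
    return geo_aniso_cov(h1, h2, ratio=1.0)


def axis_aligned(h1, h2):
    # anisotropy axes parallel to the coordinate axes
    return geo_aniso_cov(h1, h2, ratio=2.0, theta=0.0)


def rotated(h1, h2):
    # anisotropy axes rotated by 45 degrees
    return geo_aniso_cov(h1, h2, ratio=2.0, theta=math.pi / 4)
```

Under these assumptions the isotropic model passes all three checks, the axis-aligned anisotropic model is reflection symmetric but not diagonally symmetric, and the 45-degree model is diagonally symmetric but not reflection symmetric. This is why failing to reject one type of symmetry says little about isotropy.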
For the purposes of investigating second order properties, we are interested in the spectral representation of the covariance function, called the spectral density function and denoted $f(\boldsymbol{\omega})$, where $\boldsymbol{\omega} = (\omega_1, \omega_2)$. Under certain conditions and assumptions (Fuentes and Reich, 2010, pg. 62), the spectral density function is given by

(2.4) $f(\boldsymbol{\omega}) = \frac{1}{(2\pi)^2} \int_{\mathbb{R}^2} \exp(-i \boldsymbol{\omega}^T \mathbf{h}) C(\mathbf{h}) \, d\mathbf{h}$,

so that the covariance function, $C(\mathbf{h})$, and the spectral density function, $f(\boldsymbol{\omega})$, form a Fourier transform pair.

Properties of the covariance function imply properties of the spectral density. For example, if the covariance function is isotropic (Definition 2.1), then the spectral density (2.4) depends on $\boldsymbol{\omega}$ only through its length, $\omega = \|\boldsymbol{\omega}\|$, and we can write $f(\boldsymbol{\omega}) = f_0(\omega)$, where $f_0(\cdot)$ is called the isotropic spectral density (Fuentes, 2013). Consequently, second order properties of a second order stationary RF can be explored via either the covariance function or the spectral density function. Test statistics quantifying second order properties can be constructed using the periodogram, an estimator of (2.4) denoted by $I(\cdot)$. For a real-valued spatial process on a rectangular grid $\mathbb{Z}^2 \subset \mathbb{R}^2$, a moment-based periodogram used to estimate (2.4) is

(2.5) $I(\omega_1, \omega_2) = \frac{1}{(2\pi)^2} \sum_{h_1 = -n_1 + 1}^{n_1 - 1} \sum_{h_2 = -n_2 + 1}^{n_2 - 1} \widehat{C}(h_1, h_2) \cos(h_1 \omega_1 + h_2 \omega_2)$,

where $n_1$ and $n_2$ denote the number of rows and columns of the grid and $\widehat{C}(h_1, h_2)$ is a nonparametric estimator of the covariance function. In practice, the periodogram (2.5) is used to estimate the spectral density at the Fourier or harmonic frequencies. The frequency $\boldsymbol{\omega} = (\omega_1, \omega_2)$ is a Fourier or harmonic frequency if $\omega_j$ is a multiple of $2\pi / n_j$, $j = 1, 2$. Furthermore, the set of frequencies is limited to $\{\omega_j = 2\pi k_j / n_j, \; k_j = 1, 2, \ldots, n_j^*\}$, where $n_j^*$ is $(n_j - 1)/2$ if $n_j$ is odd and $n_j/2 - 1$ if $n_j$ is even.

3. TESTS OF ISOTROPY AND SYMMETRY

3.1 Brief History

Matheron (1961) developed one of the earliest hypothesis tests of isotropy when he used normality of sample variogram estimators to construct a $\chi^2$ test for anisotropy in mineral deposit data. Cabana (1987) tested for geometric anisotropy using level curves of random fields. Vecchia (1988) and Baczkowski and Mardia (1990) developed tests for isotropy assuming a parametric covariance function. Baczkowski (1990) also proposed a randomization test for isotropy. Despite these early works, little work on testing isotropy was published during the 1990s, although the PhD dissertation work of Lu (1994) would eventually have a noteworthy impact on the literature. Then, in the 2000s, a number of nonparametric tests of second-order properties emerged. Some of the developments used estimates of the variogram or covariogram to test symmetry and isotropy properties (Lu and Zimmerman, 2001; Guan, 2003; Guan et al., 2004, 2007; Maity and Sherman, 2012). These works generally borrowed ideas from two bodies of literature: (a) theory on the distributional and asymptotic properties of semivariogram estimators (e.g., Baczkowski and Mardia, 1987; Cressie, 1993, pg. 69-47; Hall and Patil, 1994) and (b) subsampling techniques to estimate the variance of statistics derived from spatial data (e.g., Possolo, 1991; Politis and Sherman, 2001; Sherman, 1996; Lahiri, 2003; Lahiri and Zhu, 2006). Other nonparametric methods used the spectral domain to test isotropy and symmetry (Scaccia and Martin, 2002, 2005; Lu and Zimmerman, 2005; Fuentes, 2005).
These works generally extended ideas used in the time series literature (e.g., Priestley and Rao, 1969; Priestley, 1981) to the spatial case. Methods for testing isotropy and symmetry in both the spatial and spectral domains, under the assumption of a parametric covariance function, have also been developed recently (Stein et al., 2004; Haskard, 2007; Fuentes, 2007; Matsuda and Yajima, 2009; Scaccia and Martin, 2011).

3.2 Nonparametric Methods in the Spatial Domain

A popular approach to testing second order properties was pioneered in the works of Lu (1994) and Lu and Zimmerman (2001), who leveraged the asymptotic joint normality of the sample variogram computed at different spatial lags. The subsequent works of Guan et al. (2004, 2007) and Maity and Sherman (2012) built upon these ideas and are the primary focus of this subsection. Lu (1994) details methods for testing axial symmetry. While Lu and Zimmerman (2001), Guan et al. (2004), and Maity and Sherman (2012) focus on testing isotropy, these methods can also be used to test symmetry. Finally, Bowman and Crujeiras (2013) detail a more computational approach for testing isotropy. Both Li et al. (2007, 2008b) and Jun and Genton (2012) use an approach analogous to the methods from Lu and Zimmerman (2001), Guan et al. (2004, 2007), and Maity and Sherman (2012) but for spatiotemporal data. Table 1 summarizes test properties discussed in this section and Section 3.3.

Nonparametric tests for anisotropy in the spatial domain are based on a null hypothesis that is an approximation to isotropy. Under the null hypothesis that the RF is isotropic, it follows that the values of $\gamma(\cdot)$ evaluated at any two spatial lags that have the same norm are equal, regardless of the direction of the lags.
To fully specify the most general null hypothesis of isotropy, theoretically, one would need to compare variogram values for an infinite set of lags. In practice a small number of lags are specified. Then it is possible to test a hypothesis consisting of a set of linear contrasts of the form

(3.1) $H_0 : \mathbf{A} \gamma(\cdot) = \mathbf{0}$

as a proxy for the null hypothesis of isotropy, where $\mathbf{A}$ is a full row rank matrix (Lu and Zimmerman, 2001). For example, a set of lags, denoted $\boldsymbol{\Lambda}$, commonly used in practice for gridded sampling locations with unit spacing is

(3.2) $\boldsymbol{\Lambda} = \{\mathbf{h}_1 = (1, 0), \mathbf{h}_2 = (0, 1), \mathbf{h}_3 = (1, 1), \mathbf{h}_4 = (-1, 1)\}$,

and the corresponding $\mathbf{A}$ matrix under $H_0 : \mathbf{A} \gamma(\boldsymbol{\Lambda}) = \mathbf{0}$ is

(3.3) $\mathbf{A} = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}$.

One of the first steps in detecting potential anisotropy is the choice of lags, as the test results will only hold for the particular set of lags considered (Guan et al., 2004). While this choice is subjective, there are several considerations and useful guidelines for determining the set of lags (see Section 5).

For nonparametric tests of symmetry, the null hypothesis of symmetry using (3.1) can be expressed by a countable set of contrasts for a process observed on a grid. Tests of symmetry will be subject to similar practical considerations as tests of isotropy, and practitioners testing symmetry properties will need to choose a small set of lags and form a hypothesis that is a surrogate for symmetry. For example, testing reflection symmetry of a process observed on the integer grid would require comparing estimates of $C(\cdot)$ evaluated at the lag pairs $\{(1, 0), (-1, 0)\}$, $\{(2, 0), (-2, 0)\}$, $\{(1, 1), (-1, 1)\}$, etc.

The tests in Lu and Zimmerman (2001), Guan et al. (2004, 2007), and Maity and Sherman (2012) involve estimating either the semivariogram, $\gamma(\cdot)$, or covariance, $C(\cdot)$, function at the set of chosen lags, $\boldsymbol{\Lambda}$. Denoting the set of point estimates of the semivariogram/covariance function at the given lags as $\widehat{\mathbf{G}}_n$, the true values as $\mathbf{G}$, and normalizing constant $a_n$, a central result for all three methods is that

(3.4) $a_n (\widehat{\mathbf{G}}_n - \mathbf{G}) \xrightarrow[n \to \infty]{d} MVN(\mathbf{0}, \boldsymbol{\Sigma})$,

under increasing domain asymptotics and mild moment and mixing conditions on the RF. Using the $\mathbf{A}$ matrix, an estimate of the variance-covariance matrix, $\widehat{\boldsymbol{\Sigma}}$, and $\widehat{\mathbf{G}}_n$, a quadratic form is constructed, and a p-value can be obtained from an asymptotic $\chi^2$ distribution with degrees of freedom given by the row rank of $\mathbf{A}$. The primary differences between these works are the assumed distribution of sampling locations, the shape of the sampling domain, and the estimation of $\mathbf{G}$ and $\boldsymbol{\Sigma}$. These differences are important when choosing a test that is appropriate for a particular set of data (see Tables 1 and 2 and Figure 4 for more information about these differences).

Maity and Sherman (2012) develop a test with the fewest restrictions on the shape of the sampling region and distribution of sampling locations. Their test can be used when the sampling region is any convex subset in $\mathbb{R}^d$ and the distribution of sampling locations in the region follows any general spatial sampling design. The test in Guan et al. (2004) also allows convex subsets in $\mathbb{R}^d$ and is developed for gridded and non-gridded sampling locations but requires non-gridded sampling locations to be uniformly distributed on the domain, i.e., generated by a homogeneous Poisson process. The Poisson assumption is dropped in Guan et al. (2007). Lu and Zimmerman (2001) require the domain to be rectangular and the observations to lie on a grid.

Another difference between methods is the form of the nonparametric estimator of $\mathbf{G}$.
In Lu and Zimmerman (2001), $\widehat{\mathbf{G}}_n$ is computed using the log of the classical sample semivariogram estimator (2.2). Guan et al. (2004, 2007) also use the estimator in (2.2) for gridded sampling locations, but use a kernel estimator of $\gamma(\mathbf{h})$ for non-gridded locations. Maity and Sherman (2012) use a kernel estimator of the covariance function. When smoothing over spatial lags in $\mathbb{R}^2$, the kernel is typically given as a Nadaraya-Watson (Nadaraya, 1964; Watson, 1964) product kernel, independently smoothing over horizontal and vertical lags. Common choices for the kernel are the Epanechnikov or truncated Gaussian kernels. The kernel estimators also require the choice of a bandwidth parameter, $w$. Choosing an appropriate bandwidth is one of the challenges of implementing the tests for non-gridded sampling locations, and the conclusion of the test has the potential to be sensitive to the choice of the bandwidth parameter (see Section 5 for recommendations on bandwidth selection).

Table 1: Properties of nonparametric tests of isotropy. "Isotropy" and "Symmetry" indicate the hypotheses that can be tested, "Domain" refers to the domain used to represent the RF (spatial or spectral), "Test Stat Based On" lists the nonparametric estimator used to construct the test statistic, "Distb'n" gives the limiting asymptotic distribution of the test statistic, and "GP" denotes whether the test requires data to be generated from a Gaussian process.

Test Method | Isotropy | Symmetry | Domain | Test Stat Based On | Asymptotics | Distb'n | GP
Lu and Zimmerman (2001) | yes | yes | spatial | semivariogram | inc domain | χ² | yes
Guan et al. (2004, 2007) | yes | yes | spatial | (kernel) (a) variogram | inc domain | χ² (b) | no
Scaccia and Martin (2002, 2005) | partial | yes | spectral | periodogram | inc domain | Z, t | no
Lu and Zimmerman (2005) | partial | yes | spectral | periodogram | inc domain | χ², F | no
Fuentes (2005) | partial | no | spectral | spatial periodogram | shrinking (mixed) | χ² | yes
Maity and Sherman (2012) | yes | yes | spatial | kernel covariogram | inc domain | χ² | no
Bowman and Crujeiras (2013) | yes | no | spatial | variogram | inc domain | approx χ² | yes
Van Hala et al. (2014) | yes | yes | spectral | empirical likelihood | shrinking (mixed) | χ² | no

(a) for gridded sampling locations, the estimator in (2.2) is used while a kernel variogram estimator is used for non-gridded sampling locations
(b) p-values may need to be approximated using finite sample adjustments

Table 2: Test implementation, part 1. "Subsamp" defines whether spatial subsampling procedures are needed to perform the test, "S&P Sim" denotes whether or not the author(s) of the method provide a simulation of test size and power (see also Table 3).

Test Method | Sampling Domain Shape | Sampling Design | Subsamp | S&P Sim | Software
Lu and Zimmerman (2001) | rectangular in R² | grid | no | yes (a) | no
Guan et al. (2004, 2007) | convex subsets in R^d | grid / unif (b) / non-unif (c) | yes | yes (a) | R package spTest
Scaccia and Martin (2002, 2005) | rectangular in R² | grid | no | yes (a) | no
Lu and Zimmerman (2005) | rectangular in R² | grid | no | yes | R package spTest
Fuentes (2005) | rectangular in R^d | grid | no | yes (a) | no
Maity and Sherman (2012) | convex subsets in R^d | non-unif (c) | yes | yes (a) | R package spTest
Bowman and Crujeiras (2013) | convex subsets in R^d | unif (b) | no | yes (a) | R package sm
Van Hala et al. (2014) | subsets in R^d | non-unif (c) | no | yes (a) | no

(a) simulated data are Gaussian only
(b) sampling locations must be generated by a homogeneous Poisson process, i.e., uniformly distributed on the domain
(c) sampling locations can be generated by any general sampling design
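To make the common construction behind (3.1)-(3.4) concrete, the sketch below computes a quadratic-form statistic for the lag set (3.2) and contrast matrix (3.3). It assumes the usual Wald-type form $a_n^2 (\mathbf{A}\widehat{\mathbf{G}}_n)^T (\mathbf{A}\widehat{\boldsymbol{\Sigma}}\mathbf{A}^T)^{-1} (\mathbf{A}\widehat{\mathbf{G}}_n)$, which follows from (3.4) under $H_0$; the individual papers differ in the estimator, its transformation, and any finite sample adjustment, so this is not the exact statistic of any one method. The function names and the numerical inputs are hypothetical. Because $\mathbf{A}$ has row rank 2, the p-value uses the closed-form $\chi^2_2$ survival function $\exp(-x/2)$.

```python
import math

# Contrast matrix (3.3) for the lag set (3.2): compares lags (1,0) vs (0,1)
# and lags (1,1) vs (-1,1).
A = [[1, -1, 0, 0],
     [0, 0, 1, -1]]


def matvec(m, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def contrast_cov(a, sigma):
    """Return A Sigma A^T for list-of-lists matrices."""
    k = len(sigma)
    a_sigma = [[sum(a[i][m] * sigma[m][j] for m in range(k)) for j in range(k)]
               for i in range(len(a))]
    return [[sum(a_sigma[i][m] * a[j][m] for m in range(k)) for j in range(len(a))]
            for i in range(len(a))]


def isotropy_wald_test(g_hat, sigma_hat, a_n):
    """Return (statistic, p-value) for H0: A G = 0 with the 2 x 4 matrix A above."""
    d = matvec(A, g_hat)                        # estimated contrasts A G_hat
    (p, q), (r, s) = contrast_cov(A, sigma_hat)
    det = p * s - q * r
    # quadratic form d^T (A Sigma A^T)^{-1} d via the explicit 2 x 2 inverse
    quad = (s * d[0] ** 2 - (q + r) * d[0] * d[1] + p * d[1] ** 2) / det
    stat = a_n ** 2 * quad
    # A has row rank 2; the chi-square(2) survival function is exp(-x / 2)
    return stat, math.exp(-stat / 2)


# Made-up inputs for illustration only: semivariogram estimates at the four lags,
# an identity matrix standing in for Sigma_hat, and a_n = 10.
g_hat = [1.0, 1.2, 1.5, 1.5]
sigma_hat = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
stat, p_value = isotropy_wald_test(g_hat, sigma_hat, a_n=10.0)
```

With these made-up numbers the statistic is 2 and the p-value is about 0.37. In practice $\widehat{\boldsymbol{\Sigma}}$ is the hard part, which is where the methods differ most.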
Nonparametric tests in the spatial domain also vary in the estimation of $\boldsymbol{\Sigma}$, the asymptotic variance-covariance of $\widehat{\mathbf{G}}_n$ in (3.4). Lu and Zimmerman (2001) use a plug-in estimator, which requires the choice of a parameter, $m$, that truncates the sum used in estimation. Spatial resampling methods are another approach used to estimate $\boldsymbol{\Sigma}$. The method used for spatial resampling and properties of estimators computed from spatial resampling will depend on the underlying spatial sampling design (Lahiri, 2003, pg. 281). Guan et al. (2004, 2007) use a moving window approach, creating overlapping subblocks that cover the sampling region. Maity and Sherman (2012) employ the grid based block bootstrap (GBBB) (Lahiri and Zhu, 2006). The GBBB approach divides the spatial domain into regions, then replaces each region by sampling (with replacement) a block of the sampling domain having the same shape and volume as the region, creating a spatial permutation of blocks of sampling locations. When using the resampling methods, the user must choose the window or block size, and the conclusion of the test has the potential to change based on the chosen size. Irregularly shaped sampling domains can pose a challenge in using the subsampling methods. For example, for an irregularly shaped sampling domain, many incomplete blocks may complicate the subsampling procedure. We summarize guidelines for choosing the window/block size in Section 5.

Another approach to testing isotropy in the spatial domain is given by Bowman and Crujeiras (2013), who take a more empirical and computationally intensive approach. Their methods are available in the R software (R Core Team, 2014) package sm (Bowman and Azzalini, 2014). One caveat of using the sm package is that the methods are computationally expensive, even for moderate sample sizes.
For example, a test of isotropy on 300 uniformly distributed sampling locations on a 10 × 16 sampling domain took approximately 9.5 minutes, whereas the methods from Guan et al. (2004) took 1.6 seconds, using a laptop with 8 GB of memory and a 2 GHz Intel Core i7 processor. Because of the computational costs, we do not consider this method further.

3.3 Nonparametric Methods in the Spectral Domain

For gridded sampling locations, nonparametric spectral methods have been developed for testing symmetry (Scaccia and Martin, 2002, 2005; Lu and Zimmerman, 2005) and stationarity (Fuentes, 2005), but none are designed with a primary goal of testing isotropy. Due to the difficulties presented by non-gridded sampling locations, historically there have been fewer developments using spectral methods for non-gridded sampling locations than for gridded (or lattice) data, but this is an area that has received more attention recently (see, e.g., Fuentes, 2007; Matsuda and Yajima, 2009; Bandyopadhyay et al., 2015). Despite the challenges, Van Hala et al. (2014) have proposed a nonparametric, empirical likelihood approach to test isotropy and separability for non-gridded sampling locations.

The primary motivation for using the spectral domain over the spatial domain is the simpler asymptotics in the spectral domain. Unlike estimates of the variogram or covariogram at different spatial lags, estimates of the spectral density at different frequencies via the periodogram are asymptotically independent under certain conditions (Pagano, 1971; Schabenberger and Gotway, 2004, pg. 78, 194).
Additionally, in practice, tests of symmetry in the spectral domain are generally not subject to as many choices (e.g., spatial lag set, bandwidth, block size) as those in the spatial domain.

Analogous to testing isotropy in the spatial domain by using a finite set of spatial lags, tests of symmetry in the spectral domain typically involve estimating and comparing the spectral density (2.4) at a finite set of the Fourier frequencies. For example, axial symmetry (2.2) of the covariance function implies axial symmetry of the spectral density, $f(\omega_1, \omega_2) = f(-\omega_1, \omega_2)$, which can be evaluated by comparing $I(\omega_1, \omega_2)$ to $I(-\omega_1, \omega_2)$ at a finite set of frequencies. Similarly, the null hypothesis of isotropy can be approximated by comparing periodogram (2.5) estimates at a set of distinct frequencies with the same norm (Fuentes, 2005). Although most of the current spectral methods are not directly designed to test isotropy, the hypothesis of complete symmetry can be used to reject the assumption of isotropy due to (2.3). However, certain types of anisotropy may not be detected by these tests. For example, a geometrically anisotropic process having the major axis of the ellipses of equicorrelation parallel to the $x$-axis is axially symmetric, and the anisotropy would not be detected by a test of axial symmetry.

Scaccia and Martin (2002, 2005) use the periodogram (2.5) to develop a test for axial symmetry. They propose three test statistics that are a function of the periodogram values. The first uses the average of the difference in the log of the periodogram values, $\log[I(\omega_1, \omega_2)] - \log[I(\omega_1, -\omega_2)]$. The second and third test statistics use the average of standardized periodogram differences, $[I(\omega_1, \omega_2) - I(\omega_1, -\omega_2)] / [I(\omega_1, \omega_2) + I(\omega_1, -\omega_2)]$.
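The two kinds of periodogram differences just described can be illustrated with a minimal sketch. It is not the implementation of Scaccia and Martin: the periodogram normalization below is an assumption (definition (2.5) is not reproduced in this section, although both summaries are unchanged by a constant rescaling), the function names are hypothetical, and only the averaged differences are computed, not the standardized test statistics or their p-values.

```python
import cmath
import math


def periodogram(z, j1, j2):
    """Periodogram ordinate of a rectangular grid z (list of rows) at the
    Fourier frequency (2*pi*j1/n1, 2*pi*j2/n2).

    The 1 / ((2*pi)^2 * n1 * n2) scaling is an assumed convention.
    """
    n1, n2 = len(z), len(z[0])
    w1 = 2.0 * math.pi * j1 / n1
    w2 = 2.0 * math.pi * j2 / n2
    dft = sum(
        z[s][t] * cmath.exp(-1j * (w1 * s + w2 * t))
        for s in range(n1)
        for t in range(n2)
    )
    return abs(dft) ** 2 / ((2.0 * math.pi) ** 2 * n1 * n2)


def axial_pairs(z, freq_indices):
    """Pairs (I(w1, w2), I(w1, -w2)) for each frequency index (j1, j2)."""
    return [
        (periodogram(z, j1, j2), periodogram(z, j1, -j2))
        for j1, j2 in freq_indices
    ]


def mean_log_difference(pairs):
    """Average of log I(w1, w2) - log I(w1, -w2) over the pairs."""
    return sum(math.log(a) - math.log(b) for a, b in pairs) / len(pairs)


def mean_standardized_difference(pairs):
    """Average of (I1 - I2) / (I1 + I2) over the pairs; each term is in [-1, 1]."""
    return sum((a - b) / (a + b) for a, b in pairs) / len(pairs)
```

For a grid whose values factor into a row effect times a column effect, the paired ordinates coincide and both averages are zero up to rounding, whereas a wave travelling along one diagonal pushes the standardized difference toward one.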
These test statistics are shown to asymptotically follow a standard normal or $t$ distribution via the Central Limit Theorem, and the corresponding distributions are used to obtain a p-value.

Lu and Zimmerman (2005) also use the periodogram as an estimator of the spectral density to test properties of axial and complete symmetry of processes on the integer grid, $\mathbb{Z}^2$. They use the asymptotic distribution of the periodogram to construct two potential test statistics. Both test statistics leverage the fact that, under certain conditions and at certain frequencies,

\[
\frac{2 I(\omega_1, \omega_2)}{f(\omega_1, \omega_2)} \xrightarrow[n_1, n_2 \to \infty]{\text{iid}} \chi^2_2. \tag{3.5}
\]

Under the null hypothesis of axial or complete symmetry, the spectral density takes the same value at the frequencies being compared, so (3.5) implies that ratios of periodogram values at those frequencies asymptotically follow an $F(2, 2)$ distribution. The preferred test statistic produces a p-value via a Cramér-von Mises (CvM) goodness of fit test using the appropriate set of periodogram ratios. Because rejecting a hypothesis of axial symmetry implies rejecting a hypothesis of complete symmetry, Lu and Zimmerman (2005) recommend a two-stage procedure for testing complete symmetry. At the first stage, they test the hypothesis of axial symmetry, and if the null hypothesis is not rejected, they test the hypothesis of complete symmetry. To control the overall type-I error rate at $\alpha$, the tests at each stage can be performed using a significance level of $\alpha/2$.

Leveraging the asymptotic independence of the periodogram at different frequencies, Van Hala et al. (2014) propose a spatial frequency domain empirical likelihood (SFDEL) approach that can be used for inference about spatial covariance structure. One of the applications of this method is testing isotropy.
An advantage of this method over other frequency domain approaches is that it can be used for non-gridded sampling locations. To implement the test, the user must select the set of lags and, because the sampling locations are not gridded, the number and spacing of frequencies. Van Hala et al. (2014) offer some guidelines for these choices based on simulations and theoretical considerations (e.g., the frequencies need to be asymptotically distant). Once these choices are made, Van Hala et al. (2014) maximize an empirical likelihood under a moment constraint corresponding to isotropy and show that the log-ratio of the constrained and unconstrained maximizers asymptotically follows a $\chi^2$ distribution. The SFDEL method relies on the asymptotic independence of the periodogram values, and the smallest sample size used in simulations was $n = 600$. Thus, it is not clear how the method will perform for smaller sample sizes.

Fuentes (2005) introduces a nonparametric, spatially varying spectral density to represent nonstationary spatial processes. While the method can be used to test the assumption of isotropy, the test requires a large sample size on a fine grid. For this reason and also because the test was primarily designed to test the assumption of stationarity, we do not consider it further.

4. SIMULATION STUDY

We designed a simulation study to compare the empirical size, power, and computational costs for four of the methods. For gridded sampling locations, we compare Lu and Zimmerman (2005) [hereafter, LZ] to Guan et al. (2004) [hereafter denoted as GSC, or GSC-g when we are specifically referring to the test when applied to gridded sampling locations]. For uniformly distributed sampling locations we compare Maity and Sherman (2012) [MS] to Guan et al. (2004, 2007) [GSC-u for the method used for uniformly distributed sampling locations].
We performed the tests on the same realizations of the RF. Data are simulated on rectangular grids or rectangular sampling domains because they are more realistic than square domains and simulations on rectangular domains were not previously demonstrated. We simulate Gaussian data with mean zero and exponential covariance functions with no nugget, a sill of one, and effective range values corresponding to short, medium, and long range dependence. We introduce varying degrees of geometric anisotropy via coordinate transformations governed by a rotation parameter $\theta$ and scaling parameter $R$ that define the ellipses of equicorrelation (see Figure 5 in the Appendix). The parameter $\theta$ quantifies the angle between the major axis of the ellipse and the $x$-axis (counter-clockwise rotation) while $R$ gives the ratio of the major and minor axes of the ellipse. We also performed simulations that investigate the effect of the lag set, block size, and bandwidth. Although some simulations are given in the original works, our simulations serve to provide a direct comparison of the effects of changing these values and provide further insight into how to choose them. See the Appendix for additional simulation details and results.

Table 3: Test Implementation, part 2. This table continues the list of choices and considerations for implementing a given test. "Samp Size (S/A)" indicates the minimum sample sizes used in simulations (S) and applications (A) provided by the author(s) of the method.

Test Method | Choices | Other Considerations | Samp Size (S/A)
Lu and Zimmerman (2001) | spatial lag set, truncation parameter | optimal truncation parameter | 100/112
Guan et al. (2004), gridded design | spatial lag set, window size | optimal window size, edge effects, finite sample adjustment | 400/289
Guan et al. (2004), uniform design; Guan et al. (2007), non-uniform design | spatial lag set, kernel function, bandwidth parameter, window size | optimal bandwidth & window size, edge effects, finite sample adjustment | 400/289; 500/584
Scaccia and Martin (2002, 2005) | test statistic | requires gridded sampling locations; designed to test symmetry | 121/–
Lu and Zimmerman (2005) | test statistic | requires gridded sampling locations; two-stage testing procedure, designed to test symmetry; relies on asymptotic independence | 100/–
Fuentes (2005) | kernel function, bandwidth parameters, frequency set, spatial knots | requires fine grid; designed to test stationarity | 5175/5175
Maity and Sherman (2012) | lag set $\Lambda$, kernel function, bandwidth parameter, subblock size, number of bootstrap samples | optimal bandwidth & block size | 350/584
Bowman and Crujeiras (2013) | bandwidth parameter | computationally intensive | 49/148
Van Hala et al. (2014) | lag set, number and spacing of frequencies | optimal number and spacing of frequencies, relies on asymptotic independence | 600/–

Figures 2 and 3 illustrate a subset of the simulation results comparing empirical size, power, and computational time (full results in Appendix, Tables 5 and 6). These simulations indicate that nonparametric tests for anisotropy have higher power for gridded (Figure 5) than for non-gridded (Figure 6) sampling designs. In both comparisons the methods from GSC have favorable empirical power over the competitor with a comparable empirical size. As the effective range increases, both empirical size and power tend to increase for the methods from GSC, but they tend to decrease for MS. GSC-g and LZ have similar computation time, while MS is much more computationally expensive than GSC-u. This difference is due to the bootstrapping required by MS.
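The empirical size and power reported above are proportions of rejections over simulated realizations. As a minimal sketch (the function names are hypothetical, and the standard error calculation is a standard binomial approximation added here for illustration, not a quantity reported in the simulation study), they could be computed from a list of p-values as follows.

```python
import math


def rejection_rate(p_values, alpha=0.05):
    """Proportion of simulated data sets whose p-value falls below alpha.

    With data simulated under isotropy this estimates the test size;
    with data simulated under anisotropy it estimates the power.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    return sum(p < alpha for p in p_values) / len(p_values)


def monte_carlo_se(rate, n_realizations):
    """Binomial standard error of an estimated rejection rate."""
    return math.sqrt(rate * (1.0 - rate) / n_realizations)
```

Under this approximation, a test with true size 0.05 has a Monte Carlo standard error of roughly 0.010 with 500 realizations and roughly 0.015 with 200 realizations, which gives a sense of how much an empirical size can deviate from the nominal level through simulation noise alone.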
Unsurprisingly, as the strength of anisotropy increases (measured by $R$), power increases for all the methods. For a geometrically anisotropic process, the major and minor axes of anisotropy are orthogonal. In comparing the effect of the orientation of anisotropy ($\theta$) on the methods, it is important to note that when $\theta = 0$, the major axis of the ellipse defining the geometric anisotropy is parallel to the $x$-axis and corresponds to a spatial process that is axially symmetric but not completely symmetric. When $\theta = 3\pi/8$, the major axis of the ellipse forms a 67.5-degree angle with the $x$-axis, and the spatial process is neither axially nor completely symmetric (see Figure 5 in the Appendix for contours of equal correlation used in the simulation). The original works generally only simulate data from a geometrically anisotropic process with the major axis of anisotropy forming a 45-degree angle with the $x$-axis; hence, our simulation study more carefully explores the effect of changing the orientation of geometric anisotropy. The methods from GSC exhibit higher power when $\theta = 0$ than when $\theta = 3\pi/8$. This is due to the fact that the lag set, $\Lambda$, from (3.2) used for the tests contains a pair of spatial lags that are parallel to the major and minor axes of anisotropy when $\theta = 0$, indicating that an informed choice of spatial lags improves the test's ability to detect anisotropy. This same result does not hold for MS. It is unclear whether this behavior is observed because the method uses the covariogram rather than the semivariogram, the GBBB rather than the moving window approach for estimating $\Sigma$, or perhaps both.
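The role of $\theta$ and $R$ can be made concrete with a small sketch of a geometrically anisotropic exponential correlation function. The parameterization is an assumption made for illustration: the effective range is taken as the distance along the major axis at which the correlation drops to about 0.05 (hence the factor 3), and the minor-axis component of the lag is stretched by $R$ so that the range along the minor axis is the major-axis range divided by $R$. The function name is hypothetical and the exact transformation used in the simulations is given in the Appendix, not here.

```python
import math


def aniso_exp_corr(h1, h2, theta, ratio, eff_range):
    """Correlation at lag (h1, h2) for an exponential model with
    geometric anisotropy.

    theta     : counter-clockwise angle (radians) from the x-axis to the major axis
    ratio     : ratio R of the major to the minor axis of the correlation ellipse
    eff_range : assumed effective range along the major axis
    """
    c, s = math.cos(theta), math.sin(theta)
    along_major = c * h1 + s * h2    # lag component along the major axis
    along_minor = -s * h1 + c * h2   # lag component along the minor axis
    # Stretching the minor-axis component makes correlation decay faster there.
    scaled = math.hypot(along_major, ratio * along_minor)
    return math.exp(-3.0 * scaled / eff_range)
```

With $\theta = 0$ and $R = 2$, the function returns the same value at lags $(1, 1)$ and $(-1, 1)$ (axial symmetry) but different values at $(1, 0)$ and $(0, 1)$, so a contrast of the two axis-aligned lags is informative. With $\theta = 3\pi/8$, the values at $(1, 1)$ and $(-1, 1)$ differ as well, and with $R = 1$ the function depends only on the length of the lag.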
The simulation results indicate that the LZ test has low empirical power; however, this method was developed to test symmetry properties on square grids, and the choice of a rectangular grid for our simulation study does not allow for a large number of periodogram ordinates for the second stage of the procedure for testing the complete symmetry hypothesis.

Results from simulations that investigate the effects of changing the lag set, the block size, and the bandwidth for non-gridded sampling locations are displayed in Tables 7-9, respectively, in the Appendix. For both GSC-u and MS, the lag set in (3.2) provided an empirical size close to the nominal level. Using more lags or longer lags decreased the size and power for GSC-u. This may be due to the additional uncertainty induced by estimating the covariance between the semivariance at more lags and the larger variance of semivariance estimates at longer lags. For MS, the longer lags led to an inflated size and more lags decreased the power. In this case, the GBBB may not be capturing the uncertainty in covariance estimates at longer lags with the chosen block size. The MS test was not overly sensitive to block size, with larger blocks leading to slightly higher power. MS found that an overly large block size was adverse for test size. For GSC-u, the small and normal sized windows performed at nominal size levels with comparable power, while larger windows were detrimental to test size and power. For GSC-u, we find that choosing a large window tends to lead to overestimation of the asymptotic variance-covariance matrix due to fewer blocks being used to re-estimate the semivariance. Finally, the results investigating the bandwidth selection for GSC-u indicate that choosing an overly large bandwidth inflates test size while choosing too small a bandwidth deflates test size and power. However, the results also indicate that, for the small sample size, test size and power are less negatively affected when approximating the p-value via the finite sample adjustment.

Weller (2015b) demonstrates applications of several of these methods on two real data sets. The R package spTest (Weller, 2015a) implements the tests in LZ, GSC, and MS for rectangular grids and sampling regions and is available on the Comprehensive R Archive Network (CRAN).

[Figure 2: proportion of rejections (vertical axis, 0 to 1) by method, with time for one test: GSC-g, N = 216 (1.11s); GSC-g, N = 375 (7.29s); LZ, N = 216 (1.45s); LZ, N = 375 (4.99s). Legend entries: R = 0, R = 2; θ = 0, θ = 3π/8.]

Fig 2: Empirical size and power for Guan et al. (2004) [GSC-g] and Lu and Zimmerman (2005) [LZ] for 500 realizations of a mean 0 GRF with gridded sampling locations using a nominal level of $\alpha = 0.05$. Colors and shapes indicate the type of anisotropy. Gray points correspond to the isotropic case. The results correspond to a "medium" effective range. Computational time for each method is also displayed.

[Figure 3: proportion of rejections (vertical axis, 0 to 1) by method, with time for one test: GSC-u, N = 300 (2.17s); GSC-u, N = 450 (4.44s); MS, N = 300 (83.40s); MS, N = 450 (271.22s). Legend entries: R = 0, R = 2; θ = 0, θ = 3π/8.]

Fig 3: Empirical size and power for Guan et al. (2004) [GSC-u] and Maity and Sherman (2012) [MS] for 200 realizations of a mean 0 GRF with uniformly distributed sampling locations using a nominal level of $\alpha = 0.05$. Colors and shapes indicate the type of anisotropy. Gray points correspond to the isotropic case. The results correspond to a "medium" effective range. Computational time for each method is also displayed.

5. RECOMMENDATIONS

Based on the simulation results we offer recommendations for implementation of nonparametric tests of isotropy. The flow chart in Figure 1 along with Figure 4 summarize the steps in the process. Tables 1-3 compare the tests. Table 4 summarizes the recommendations provided below.

Table 4: General Recommendations for Test Implementation. This table contains a list of general recommendations for test implementation. These guidelines will not apply in all situations and will vary based on a variety of factors including, but not limited to, the sample size, density of sampling locations, and scale of the problem. See additional discussion in Section 5.

Lag set (a), all tests: Length: shorter preferred; Orientation: Eqn (3.2); Number: 4 (2 pairs).

Test Method | Block Size | Bandwidth | P-value | min. n
Guan et al. (2004), gridded design | $n_b < n^{1/2}$ | n/a | finite sample adjustment | 150
Guan et al. (2004, 2007), uniform design | $n_b \lesssim n^{1/2}$ | $0.6 < w < 0.9$ (b) | finite sample adjustment when $n < 500$, asymptotic $\chi^2$ when $n \geq 500$ | 300
Maity and Sherman (2012) | $n_b \gtrsim n^{1/2}$ | empirical bandwidth | asymptotic $\chi^2$ | 300

(a) Prior knowledge, if available, should be used to inform the choice of lags.
(b) Our simulations suggest these bandwidth values are reasonable when using a Gaussian kernel with truncation parameter of 1.5.

[Figure 4: flow chart leading from "Nonparametric Hypothesis Test" through "Spatial Sampling Design" (gridded, uniform, or general design) and "RF representation" (spectral or spatial domain) to the candidate methods, with a link to "Model for Covariance Function (See Figure 1)".]

Fig 4: Spatial sampling design considerations for choosing a nonparametric hypothesis test of isotropy. The method we recommend for testing isotropy in each situation is given in bold, including LZ = Lu and Zimmerman (2005); SM = Scaccia and Martin (2005); GSC-g = Guan et al. (2004) for gridded sampling locations; GSC-u = Guan et al. (2004) for uniformly distributed sampling locations; MS = Maity and Sherman (2012); GSC-n = Guan et al. (2007) for non-uniform sampling locations.

In choosing a nonparametric test for isotropy, the distribution of sampling locations on the sampling domain is perhaps the most important consideration. Data on a grid simplify estimation because the semivariogram or covariogram can be estimated at spatial lags that are exactly observed separating pairs of sampling locations. A grid also allows the option of using easily implemented tests in the spectral domain. Sample size requirements for the asymptotic properties of tests using the spatial domain to approximately hold will depend on the dependence structure of the random field. GSC note that convergence of their test statistic is slow in the case of gridded sampling locations and obtain an approximate p-value via subsampling rather than the asymptotic $\chi^2$ distribution. Tests using the spectral domain rely on the asymptotic independence of periodogram values, and correlation in finite samples can lead to an inflated test size (LZ). Based on our simulations, we recommend the sample size be at least 150 for gridded sampling locations and at least 300 for non-gridded sampling locations. However, power tends to be low when the sample size is small and/or the anisotropy is weak (Figures 2 and 3).

We focus on implementation of the methods that use the spatial domain for the remainder of this section. We discuss the choice of lags, block size, and bandwidth for the tests in GSC and MS. Due to the large number of choices required to implement the tests (e.g., block size, bandwidth, kernel function, subsampling method), features of the random field (e.g., sill, range), and properties of the sampling design (e.g., density of sampling locations, shape of sampling domain), the recommendations we offer will not apply in all situations. The numerous moving parts make it challenging to develop general recommendations, especially when choosing a bandwidth.

When determining the lag set, $\Lambda$, for use in (3.1), the user needs to select (a) the norm of the lags (e.g., Euclidean distance), (b) the orientation (direction) of the lags, and (c) the number of lags. Regarding (a), short lags are preferred. Estimates of the spatial dependence at large lags may be less reliable than estimates at shorter lags because they are based on a smaller number of pairs of observations and hence more variable. Additionally, empirical and theoretical evidence (Lu and Zimmerman, 2001) indicates that values of $\gamma(\cdot)$ in two different directions generally exhibit the largest difference at a lag less than the effective range, the distance beyond which pairs of observations can be assumed to be independent. Finally, there is mathematical support that correctly specifying the covariance function at short lags is important for spatial prediction (Stein, 1988). Considering (b), if the process is anisotropic, the ideal choice of $\Lambda$ and $A$ contrasts lags with the same norm but oriented in the direction of weakest and strongest spatial correlation.
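A candidate pair of lags for a contrast can be screened with a few lines of code. The sketch below is illustrative only: the helper name is hypothetical, the lag set shown is an assumed example (two axis-aligned and two diagonal unit-step lags) rather than a quotation of (3.2), and the orthogonality check is included because the major and minor axes of a geometrically anisotropic process are orthogonal.

```python
import math


def check_contrast(lag_a, lag_b, tol=1e-9):
    """Report whether two 2-D lags have equal Euclidean norm and are orthogonal."""
    norm_a = math.hypot(lag_a[0], lag_a[1])
    norm_b = math.hypot(lag_b[0], lag_b[1])
    dot = lag_a[0] * lag_b[0] + lag_a[1] * lag_b[1]
    return {
        "equal_norm": abs(norm_a - norm_b) <= tol,
        "orthogonal": abs(dot) <= tol,
    }


# Assumed example lag set and the index pairs contrasted against each other
# (one pair per row of a contrast matrix).
lags = [(1, 0), (0, 1), (1, 1), (-1, 1)]
contrast_pairs = [(0, 1), (2, 3)]

for i, j in contrast_pairs:
    print(lags[i], lags[j], check_contrast(lags[i], lags[j]))
```

A pair such as $(1, 0)$ and $(1, 1)$ fails both checks, so a difference in estimated dependence between those two lags would mix the effect of direction with the effect of distance.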
Typically, the directions of weakest and strongest spatial correlation will be orthogonal, and thus lags contrasted using the $A$ matrix should also be orthogonal. Prior information, if available, about the underlying physical/biological process giving rise to the data can also be used to inform the orientation of the lags (Guan et al., 2004). If no prior information about potential anisotropy is available, lags oriented in the same directions as those in (3.2) are a good starting set. In regards to (c), detecting certain types of anisotropy requires a sufficient number of lags, but using a large number of lags requires a large number of observations (Guan et al., 2004). As a general guideline, we suggest using four lags to construct two contrasts.

Several tests require selection of a window or block size to estimate the variance-covariance matrix. The moving window from GSC creates overlapping subblocks of data by sliding the window over a grid placed on the region. Each of these subblocks is used to re-estimate the semivariance. The block size from MS defines the size of resampled blocks when implementing the GBBB. The GBBB permutes resampled blocks to create a new realization of the process over the entire domain. Choosing the window size in GSC requires balancing two competing goals. First, the moving window should be large enough to create subblocks that are representative of the dependence structure for the entire RF. Second, the window should be small enough to allow for a sufficient number of subblocks to re-estimate the semivariance, as these values are used to obtain an estimate of the asymptotic variance-covariance. A window that is too large or too small can potentially lead to under- or over-estimation of the asymptotic variance-covariance.
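The trade-off between window size and the number of subblocks is easy to explore numerically for gridded data. The sketch below is an assumption-laden illustration, not the procedure of GSC: it slides the window one grid cell at a time, applies no edge-effect correction (a correction would reduce the number of usable subblocks), and uses hypothetical function names.

```python
def window_origins(n_rows, n_cols, w_rows, w_cols):
    """Top-left grid indices of every overlapping w_rows x w_cols window
    obtained by sliding the window one grid cell at a time."""
    if w_rows > n_rows or w_cols > n_cols:
        return []
    return [
        (i, j)
        for i in range(n_rows - w_rows + 1)
        for j in range(n_cols - w_cols + 1)
    ]


def subblock(grid, origin, w_rows, w_cols):
    """Extract the observations in the window whose top-left cell is origin."""
    i, j = origin
    return [row[j:j + w_cols] for row in grid[i:i + w_rows]]
```

Without any correction, a window of size $w_1 \times w_2$ on an $n_1 \times n_2$ grid yields $(n_1 - w_1 + 1)(n_2 - w_2 + 1)$ overlapping subblocks, so enlarging the window quickly reduces the number of re-estimates of the semivariance that are available.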
For GSC-u, the windows must be large enough to ensure enough pairs of sampling locations are in each subblock to compute an estimate of the semivariance without having to over-smooth.

For gridded sampling locations, GSC demonstrate good empirical size and power by using moving windows with size of only $2 \times 2$. However, they find slow convergence to the asymptotic $\chi^2$ distribution, and a p-value is instead computed by approximating the distribution of the test statistic by computing its value for each of the subblocks. Hence, approximating the p-value to two decimal places will require at least 100 subblocks over the sampling region. This may not be possible in practice. For example, a $12 \times 12$ grid of sampling locations with moving windows of size $2 \times 2$ results in only 90 subblocks when correcting for edge effects. The challenge of choosing the block size in MS is subject to similar considerations as the window size in GSC. The p-value for both tests will change when performing the test with different window or block sizes, and the user may decide to run the test with different block sizes (e.g., MS).

There are a number of works on resampling spatial data to obtain an estimate of the variance of a spatial statistic (e.g., Sherman, 1996; Politis and Sherman, 2001; Lahiri, 2003; Lahiri and Zhu, 2006), but they do not directly consider variance estimation in the case of a nonparametric estimate of the semivariogram/covariogram. Denoting the number of points per block as $n_b$, Sherman (1996) proposes choosing the block size such that $n_b \approx c n^{1/2}$ for a constant, $c$, when the spatial dependence does not exhibit a large range. In a number of different applications of spatial subsampling, $c$ is typically chosen to be between 0.5 and 2 (Politis and Sherman, 2001; Guan et al., 2004, 2006). Based on our simulations, we find acceptable empirical size and power for GSC-g using small windows and approximating the p-value with the finite sample adjustment. Thus, we recommend setting $n_b < n^{1/2}$ for GSC-g. For example, we used windows with size $3 \times 2$ and $5 \times 3$ for sampling domains of $18 \times 12$ and $25 \times 15$, respectively. In the case of uniformly distributed sampling locations (see Table 8 in the Appendix), the empirical size and power from GSC-u were negatively affected by a large moving window size; hence, we recommend setting $c = 1$ and choosing $n_b \lesssim n^{1/2}$. For the MS test, a small block size negatively affected the empirical size and power; thus, we recommend choosing $n_b \gtrsim n^{1/2}$ for this test.

Between the choices of a lag set, block size, and bandwidth, choosing an appropriate bandwidth to smooth over observed spatial lags for non-gridded sampling locations is the most challenging. For GSC-u the user needs to choose the form of the smoothing kernel as well as the bandwidth for both the entire grid and the subblocks, while MS use an Epanechnikov kernel and empirical bandwidth based on a user-specified tuning parameter. If the selected bandwidth is too large, then over-smoothing occurs. In over-smoothing, there is very little filtering of the lag distance and direction. The lack of filtering produces similar estimates of the spatial dependence at lags with different directions and distances. If the selected bandwidth is too small, then there is very little smoothing, and estimates of the spatial dependence are based on a small number of pairs of sampling locations and thus highly variable. Considering the aforementioned effects of the bandwidth, the bandwidth should decrease as $n$ increases under the usual increasing domain asymptotics. For example, simulations (not included) indicated a bandwidth of $w = 0.65$ maintains nominal size when $n = 950$, but leads to deflated test size and power when $n = 400$ on a smaller domain. García-Soidán et al. (2004), García-Soidán (2007), and Kim and Park (2012) develop theoretically optimal bandwidths for nonparametric semivariogram estimation, but these works are not applicable here because they focus on the isotropic case and require an estimate of the second derivative of the variogram. We have found that the empirical bandwidth used by MS tends to produce nominal size (see Table 6). For GSC-u we find the most consistent results with a bandwidth in the range of $0.60 < w < 0.90$ when using a normal kernel truncated at 1.5, but these values will change when a different truncation value or kernel function is employed. For small sample sizes ($n < 500$), our simulations demonstrate that test size and power are less affected by the choice of bandwidth when the p-value is approximated using a finite sample adjustment, indicating poor convergence to the asymptotic $\chi^2$ distribution. Thus, the user should consider using the finite sample adjustment for non-gridded sampling locations when the sample size is small and there are at least 100 subblocks. While it is challenging to choose a bandwidth for GSC-u and the p-value of the test is sensitive to this parameter, the method exhibits nominal size and substantially higher power than MS when the bandwidth is chosen appropriately.

6. CONCLUSIONS

There are several important avenues of future research. Methods to more formally characterize the optimal block size and bandwidth parameters for the tests in the spatial domain would enhance the applicability of the tests. The performance of the tests for non-gridded data in Guan et al. (2004) and Maity and Sherman (2012) is sensitive to these choices, and their optimality remains an open question. Zhang et al. (2014) develop a nonparametric method for estimating the asymptotic variance-covariance matrix of statistics derived from spatial data that avoids choosing tuning parameters, which could simplify test implementation. A second area of future work is further development of nonparametric tests of isotropy for gridded and non-gridded data in the spectral domain. A third area of further investigation is to compare nonparametric to parametric methods for testing isotropy, e.g., Scaccia and Martin (2011). A final area of future work is development of a formal definition and more careful quantification of power of the tests. For example, the degree of geometric anisotropy could be quantified using different characteristics of the covariance function, including the ratio of the major and minor axes of the ellipse, degree of rotation of the ellipse from the coordinate axes, and range of the process. Furthermore, it is important to consider the effects of density and design of sampling locations, sample size, and the amount of noise (nugget and sill) in the observations on a test's ability to detect anisotropy.

There is a volume of work on tests for isotropy in other areas of spatial statistics. Methods for detecting anisotropy in spatial point process data have been developed, e.g., Schabenberger and Gotway (2004, pg. 200-205), Guan (2003), Guan et al. (2006), and Nicolis et al. (2010). For multivariate spatial data, Jona-Lasinio (2001) proposes a test for isotropy. Gneiting et al.
(2007) provide a review of potential second order assumptions and models for spatiotemporal geostatistical data, and a number of tests for second order properties of spatiotemporal data have been developed, e.g., Fuentes (2006), Li et al. (2007), Park and Fuentes (2008), Shao and Li (2009), Jun and Genton (2012). Li et al. (2008a) construct a test of the covariance structure for multivariate spatiotemporal data. Tests for isotropy have also been developed in the computer science literature (e.g., Molina and Feito, 2002; Chorti and Hristopulos, 2008; Spiliopoulos et al., 2011; Thon et al., 2015).

Appropriately specifying the second order properties of the random field is an important step in modeling spatial data, and a number of models have been developed to capture anisotropy in spatial processes. Graphical tools, such as directional sample semivariograms, are commonly used to evaluate the assumption of isotropy, but these diagnostics can be misleading and open to subjective interpretation. We have presented and reviewed a number of procedures that can be used to more objectively test hypotheses of isotropy and symmetry without assuming a parametric form for the covariance function. These tests may be helpful for a novice user deciding on an appropriate spatial model. In abandoning parametric assumptions, these hypothesis testing procedures are subject and sensitive to choices regarding smoothing parameters, subsampling procedures, and finite sample adjustments. The test that is most appropriate for a set of data will largely depend on the sampling design. Additionally, there are trade-offs between the empirical power demonstrated by the tests and the number of choices the user must make to implement the tests (e.g., between Guan et al. (2004) and Maity and Sherman (2012)).
We have offered recommendations regarding the various choices of method and their implementation and have made the tests available in the spTest software. Because of the sensitivity of the tests to the various choices, we believe that graphical techniques and nonparametric hypothesis tests should be used in a complementary role. Graphical techniques can provide an initial indication of isotropy properties and inform sensible choices for a hypothesis test, e.g., in choosing the spatial lag set, while hypothesis tests can affirm intuition about graphical techniques.

REFERENCES

Baczkowski, A. (1990). A test of spatial isotropy. In Compstat, pages 277–282. Springer.
Baczkowski, A. and Mardia, K. (1987). Approximate lognormality of the sample semi-variogram under a Gaussian process. Communications in Statistics-Simulation and Computation, 16(2):571–585.
Baczkowski, A. and Mardia, K. (1990). A test of spatial symmetry with general application. Communications in Statistics-Theory and Methods, 19(2):555–572.
Bandyopadhyay, S., Lahiri, S. N., Nordman, D. J., et al. (2015). A frequency domain empirical likelihood method for irregularly spaced spatial data. The Annals of Statistics, 43(2):519–545.
Banerjee, S., Carlin, B. P., and Gelfand, A. E. (2014). Hierarchical modeling and analysis for spatial data. CRC Press.
Borgman, L. and Chao, L. (1994). Estimation of a multidimensional covariance function in case of anisotropy. Mathematical Geology, 26(2):161–179.
Bowman, A. W. and Azzalini, A. (2014). R package sm: nonparametric smoothing methods (version 2.2-5.4). University of Glasgow, UK and Università di Padova, Italia.
Bowman, A. W. and Crujeiras, R. M. (2013). Inference for variograms. Computational Statistics & Data Analysis, 66:19–31.
Cabana, E. (1987). Affine processes: a test of isotropy based on level sets. SIAM Journal on Applied Mathematics, 47(4):886–891.
Chorti, A. and Hristopulos, D. T. (2008). Nonparametric identification of anisotropic (elliptic) correlations in spatially distributed data sets. Signal Processing, IEEE Transactions on, 56(10):4738–4751.
Cressie, N. (1993). Statistics for Spatial Data: Wiley Series in Probability and Statistics. Wiley-Interscience New York.
Ecker, M. D. and Gelfand, A. E. (1999). Bayesian modeling and inference for geometrically anisotropic spatial data. Mathematical Geology, 31(1):67–83.
Ecker, M. D. and Gelfand, A. E. (2003). Spatial modeling and prediction under stationary non-geometric range anisotropy. Environmental and Ecological Statistics, 10(2):165–178.
Fuentes, M. (2005). A formal test for nonstationarity of spatial stochastic processes. Journal of Multivariate Analysis, 96(1):30–54.
Fuentes, M. (2006). Testing for separability of spatial–temporal covariance functions. Journal of Statistical Planning and Inference, 136(2):447–466.
Fuentes, M. (2007). Approximate likelihood for large irregularly spaced spatial data. Journal of the American Statistical Association, 102(477):321–331.
Fuentes, M. (2013). Spectral methods. Wiley StatsRef: Statistics Reference Online.
Fuentes, M. and Reich, B. (2010). Spectral domain. Handbook of Spatial Statistics, pages 57–77.
García-Soidán, P. (2007). Asymptotic normality of the Nadaraya–Watson semivariogram estimators. Test, 16(3):479–503.
García-Soidán, P. H., Febrero-Bande, M., and González-Manteiga, W. (2004). Nonparametric kernel estimation of an isotropic variogram. Journal of Statistical Planning and Inference, 121(1):65–92.
Gneiting, T., Genton, M., and Guttorp, P. (2007). Geostatistical space-time models, stationarity, separability and full symmetry. Statistical Methods for Spatio-Temporal Systems, pages 151–175.
Guan, Y., Sherman, M., and Calvin, J. A. (2004). A nonparametric test for spatial isotropy using subsampling. Journal of the American Statistical Association, 99(467):810–821.
Guan, Y., Sherman, M., and Calvin, J. A. (2006). Assessing isotropy for spatial point processes. Biometrics, 62(1):119–125.
Guan, Y., Sherman, M., and Calvin, J. A. (2007). On asymptotic properties of the mark variogram estimator of a marked point process. Journal of Statistical Planning and Inference, 137(1):148–161.
Guan, Y. T. (2003). Nonparametric methods of assessing spatial isotropy. PhD thesis, Texas A&M University.
Hall, P. and Patil, P. (1994). Properties of nonparametric estimators of autocovariance for stationary random fields. Probability Theory and Related Fields, 99(3):399–424.
Haskard, K. A. (2007). An anisotropic Matérn spatial covariance model: REML estimation and properties. PhD thesis, University of Adelaide.
Irvine, K. M., Gitelman, A. I., and Hoeting, J. A. (2007). Spatial designs and properties of spatial correlation: effects on covariance estimation. Journal of Agricultural, Biological, and Environmental Statistics, 12(4):450–469.
Isaaks, E. H. and Srivastava, R. M. (1989). Applied geostatistics, volume 2. Oxford University Press New York.
Jona-Lasinio, G. (2001). Modeling and exploring multivariate spatial variation: A test procedure for isotropy of multivariate spatial data. Journal of Multivariate Analysis, 77(2):295–317.
Journel, A. G. and Huijbregts, C. J. (1978). Mining geostatistics. Academic Press.
Jun, M. and Genton, M. G. (2012). A test for stationarity of spatio-temporal random fields on planar and spherical domains. Statistica Sinica, 22(4):1737.
Kim, T. Y. and Park, J.-S. (2012). On nonparametric variogram estimation. Journal of the Korean Statistical Society, 41(3):399–413.
Lahiri, S. and Zhu, J. (2006). Resampling methods for spatial regression models under a class of stochastic designs. The Annals of Statistics, 34(4):1774–1813.
Lahiri, S. N. (2003). Resampling methods for dependent data. Springer Science & Business Media.
Li, B., Genton, M. G., and Sherman, M. (2007). A nonparametric assessment of properties of space–time covariance functions. Journal of the American Statistical Association, 102(478):736–744.
Li, B., Genton, M. G., and Sherman, M. (2008a). Testing the covariance structure of multivariate random fields. Biometrika, 95(4):813–829.
Li, B., Genton, M. G., Sherman, M., et al. (2008b). On the asymptotic joint distribution of sample space–time covariance estimators. Bernoulli, 14(1):228–248.
Lu, H. and Zimmerman, D. (2001). Testing for isotropy and other directional symmetry properties of spatial correlation. Preprint.
Lu, H.-C. (1994). On the distributions of the sample covariogram and semivariogram and their use in testing for isotropy. PhD thesis, University of Iowa.
Lu, N. and Zimmerman, D. L. (2005). Testing for directional symmetry in spatial dependence using the periodogram. Journal of Statistical Planning and Inference, 129(1):369–385.
Maity, A. and Sherman, M. (2012). Testing for spatial isotropy under general designs. Journal of Statistical Planning and Inference, 142(5):1081–1091.
Matheron, G. (1961). Precision of exploring a stratified formation by boreholes with rigid spacing-application to a bauxite deposit. In International Symposium of Mining Research, University of Missouri, volume 1, pages 407–22.
Matheron, G. (1962). Traité de géostatistique appliquée. Editions Technip.
Matsuda, Y. and Yajima, Y. (2009). Fourier analysis of irregularly spaced data on R^d. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(1):191–217.
Modjeska, J. S. and Rawlings, J. (1983). Spatial correlation analysis of uniformity data. Biometrics, pages 373–384.
Molina, A. and Feito, F. R. (2002). A method for testing anisotropy and quantifying its direction in digital images. Computers & Graphics, 26(5):771–784.
Nadaraya, E. A. (1964). On estimating regression. Theory of Probability & Its Applications, 9(1):141–142.
Nicolis, O., Mateu, J., and D'Ercole, R. (2010). Testing for anisotropy in spatial point processes. In Proceedings of the Fifth International Workshop on Spatio-Temporal Modelling, pages 1990–2010.
Pagano, M. (1971). Some asymptotic properties of a two-dimensional periodogram. Journal of Applied Probability, 8(4):841–847.
Park, M. S. and Fuentes, M. (2008). Testing lack of symmetry in spatial–temporal processes. Journal of Statistical Planning and Inference, 138(10):2847–2866.
Politis, D. N. and Sherman, M. (2001). Moment estimation for statistics from marked point processes. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 63(2):261–275.
Possolo, A. (1991). Subsampling a random field. Lecture Notes-Monograph Series, pages 286–294.
Priestley, M. and Rao, T. S. (1969). A test for non-stationarity of time-series. Journal of the Royal Statistical Society. Series B (Methodological), pages 140–149.
Priestley, M. B. (1981). Spectral analysis and time series. Academic Press.
R Core Team (2014). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Scaccia, L. and Martin, R. (2002). Testing for simplification in spatial models. In Compstat, pages 581–586. Springer.
Scaccia, L. and Martin, R. (2005). Testing axial symmetry and separability of lattice processes. Journal of Statistical Planning and Inference, 131(1):19–39.
Scaccia, L. and Martin, R. (2011). Model-based tests for simplification of lattice processes. Journal of Statistical Computation and Simulation, 81(1):89–107.
Schabenberger, O. and Gotway, C. A. (2004). Statistical methods for spatial data analysis. CRC Press.
Shao, X. and Li, B. (2009). A tuning parameter free test for properties of space–time covariance functions. Journal of Statistical Planning and Inference, 139(12):4031–4038.
Sherman, M. (1996). Variance estimation for statistics computed from spatial lattice data. Journal of the Royal Statistical Society. Series B (Methodological), pages 509–523.
Sherman, M. (2011). Spatial statistics and spatio-temporal data: covariance functions and directional properties. John Wiley & Sons.
Spiliopoulos, I., Hristopulos, D. T., Petrakis, M., and Chorti, A. (2011). A multigrid method for the estimation of geometric anisotropy in environmental data from sensor networks. Computers & Geosciences, 37(3):320–330.
Stein, M. L. (1988). Asymptotically efficient prediction of a random field with a misspecified covariance function. The Annals of Statistics, 16(1):55–63.
Stein, M. L., Chi, Z., and Welty, L. J. (2004). Approximating likelihoods for large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 66(2):275–296.
Thon, K., Geilhufe, M., and Percival, D. B. (2015). A multiscale wavelet-based test for isotropy of random fields on a regular lattice. Image Processing, IEEE Transactions on, 24(2):694–708.
Van Hala, M., Bandyopadhyay, S., Lahiri, S. N., and Nordman, D. J. (2014). A frequency domain empirical likelihood for estimation and testing of spatial covariance structure. Preprint.
Vecchia, A. V. (1988). Estimation and model identification for continuous spatial processes. Journal of the Royal Statistical Society. Series B (Methodological), pages 297–312.
Watson, G. S. (1964). Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A, pages 359–372.
Weller, Z. (2015a). spTest: Nonparametric Hypothesis Tests of Isotropy and Symmetry. R package version 0.2.2.
Weller, Z. D. (2015b). spTest: an R package implementing nonparametric tests of isotropy. Submitted to Journal of Statistical Software, available on arXiv.
Zhang, X., Li, B., and Shao, X. (2014). Self-normalization for spatial data. Scandinavian Journal of Statistics, 41(2):311–324.
Zimmerman, D. L. (1993). Another look at anisotropy in geostatistics. Mathematical Geology, 25(4):453–470.

APPENDIX: SIMULATION STUDY DETAILS AND FURTHER RESULTS

We define the isotropic exponential covariance function as

$$
C(h) =
\begin{cases}
\sigma^2 \exp(-\phi h) & \text{if } h > 0, \\
\tau^2 + \sigma^2 & \text{otherwise,}
\end{cases}
\tag{6.1}
$$

where $h = \lVert s_i - s_j \rVert$ is the distance between sites $s_i$ and $s_j$ (Irvine et al., 2007). The corresponding semivariogram is $\gamma(h) = (\tau^2 + \sigma^2) - \sigma^2 \exp(-\phi h)$ for $h > 0$, where $\tau^2$ is the nugget, $\tau^2 + \sigma^2$ is the sill, and the effective range, $\xi$, the distance beyond which the correlation between observations is less than 0.05, is

$$
\xi = -\frac{1}{\phi} \log\left( 0.05 \, \frac{\tau^2 + \sigma^2}{\sigma^2} \right).
$$

Simulations in Section 4 were performed using the exponential covariance function (6.1) with a partial sill, $\sigma^2$, of 1 and no nugget, $\tau^2 = 0$. We also performed simulations using different nugget values (results not included). As expected, introducing a nugget had an adverse effect on empirical test size and power. For the no nugget simulations, effective ranges, $\xi$, for isotropic processes were chosen to be 3, 6, and 12, corresponding to short, medium, and long range dependence. Geometric anisotropy was introduced by transforming the sampling locations according to a scaling parameter, $R$, and a rotation parameter, $\theta$.
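To make the covariance model concrete, the following is a minimal Python sketch of (6.1) and the effective-range formula. It is not part of spTest, and the function names (`exp_covariance`, `effective_range`, `phi_for_range`) are illustrative assumptions. Under the no-nugget setting ($\sigma^2 = 1$, $\tau^2 = 0$), it recovers the decay parameter $\phi$ implied by each effective range and checks that the correlation at distance $\xi$ is 0.05.

```python
import math


def exp_covariance(h, sigma2=1.0, tau2=0.0, phi=1.0):
    """Isotropic exponential covariance (6.1): partial sill sigma2, nugget tau2."""
    if h > 0:
        return sigma2 * math.exp(-phi * h)
    return tau2 + sigma2


def effective_range(phi, sigma2=1.0, tau2=0.0, cutoff=0.05):
    """Distance at which the correlation C(h) / C(0) falls to `cutoff`."""
    return -math.log(cutoff * (tau2 + sigma2) / sigma2) / phi


def phi_for_range(xi, sigma2=1.0, tau2=0.0, cutoff=0.05):
    """Invert effective_range: the decay parameter phi giving effective range xi."""
    return -math.log(cutoff * (tau2 + sigma2) / sigma2) / xi


if __name__ == "__main__":
    # Short, medium, and long range dependence with sigma2 = 1 and no nugget.
    for xi in (3, 6, 12):
        phi = phi_for_range(xi)
        corr_at_xi = exp_covariance(xi, phi=phi) / exp_covariance(0.0, phi=phi)
        print(f"xi={xi}: phi={phi:.4f}, correlation at xi={corr_at_xi:.4f}")
```

With a nonzero nugget the same functions apply, provided $0.05(\tau^2 + \sigma^2)/\sigma^2 < 1$ so that the logarithm is negative and the effective range is positive.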
Given an $(R, \theta)$ pair, the coordinates $(x, y)$ are transformed to the "anisotropic" coordinates, $(x_a, y_a)$, via

$$
(x_a, y_a) = (x, y)
\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{R} \end{pmatrix}.
$$

A realization from the anisotropic process is then created by simulating using the distance matrix from the transformed coordinates and placing the observed values at their corresponding untransformed sampling locations. Figure 5 shows the isotropic exponential correlogram corresponding to $\tau^2 = 1$ and $\xi = 6$ and contours of equicorrelation corresponding to the $(R, \theta)$ values used in the simulation study. Note that a larger value of $R$ corresponds to a more anisotropic process.

[Fig 5: Correlogram and contours of equal correlation for the covariance models used in the simulation study. Panels: the correlogram (correlation against Euclidean distance) and contours of equal correlation (0.65) against x lag and y lag for each (R, θ) pair, with effective range 6 and geometric anisotropy.]

For the simulations comparing the GSC-g and LZ (Lu and Zimmerman, 2005) tests in Table 5, data were simulated on a subset of the integer grid, $\mathbb{Z}^2$. The p-values for the GG test were approximated using a finite sample statistic (Guan et al., 2004), and we used the lag set in (3.2) and A matrix in (3.3). For the results involving the LZ test, a test of complete symmetry was performed as an approximation to the null hypothesis of isotropy. The p-values for the LZ test were obtained using the CvM* statistic. A nominal level of $\alpha = 0.05$ was maintained by first testing reflection symmetry at $\alpha = 0.025$ and then testing complete symmetry at $\alpha = 0.025$ if the hypothesis of reflection symmetry was not rejected. For the GG test, the moving window dimensions were 3 × 2 (width, height) and 5 × 3 for the parent grids of 18 × 12 and 25 × 15, respectively.

For the simulations in Table 6 comparing the GU (Guan et al., 2004) and MS (Maity and Sherman, 2012) tests, data were simulated at random, uniformly distributed sampling locations on 10 × 16 and 10 × 20 sampling domains. The lag set, $\Lambda$, used for both tests is given in (3.2) with A matrix (3.3), and the p-values for both methods were obtained using the asymptotic $\chi^2_2$ distribution. For semivariogram estimates in GU, we use independent (product) Gaussian (normal) kernels with a truncation parameter of 1.5. The bandwidth for the Gaussian kernel for smoothing over lags on the entire field and on moving windows was chosen as $w = 0.75$. We used the empirical bandwidth and the product Epanechnikov kernel given in Maity and Sherman (2012) to implement the MS test. For both tests, a grid with spacing of 1 was laid on the sampling region. Using this grid, the moving window dimensions for the GU test were 4 × 2 and the block size for the MS test was 4 × 2. For the MS test, $B = 100$ resamples using the GBBB were used to estimate the asymptotic variance-covariance matrix.

For the results in Tables 7-9, we simulated mean 0, Gaussian RFs with exponential covariance function with no nugget, a sill of one, and medium effective range ($\xi = 6$). Sampling locations were generated randomly and uniformly over a 16 × 10 sampling domain. We use the lag set and A matrix from (3.2) and (3.3), respectively, unless otherwise noted. All tests were performed using a nominal level of $\alpha = 0.05$. For the GU tests, we use product Gaussian kernels with a truncation parameter of 1.5.
For the MS tests, we use the default product Epanechnikov kernels with empirical bandwidth specified in Maity and Sherman (2012).

The simulation results in Table 7 demonstrate the effects of changing the set of lags for the GU and MS tests. For these simulations, the lag set labeled "normal" corresponds to the lag set given in (3.2). The lag set labeled "long" represents the lags in (3.2) multiplied by 2.5. Finally, the lag set labeled "more" stands for the lags in (3.2) with the additional pair of lags $\{h_5 = (1.132, 0.469),\ h_6 = (-0.469, 1.132)\}$. The lags $h_5$ and $h_6$ are a pair of lags that create approximately 22.5° and 112.5° angles, respectively, with the $x$-axis (counter-clockwise rotation) and have Euclidean length of approximately 1.22. These were chosen to supplement the lag pairs $(h_1, h_2)$, which have unit length and create 0° and 90° angles with the $x$-axis, and $(h_3, h_4)$, which have length $\sqrt{2} \approx 1.41$ and create 45° and 135° angles with the $x$-axis. The lag sets are plotted in Figure 6. The A matrix for the "more" lag set was constructed as in (3.3), where orthogonal lags are contrasted. The p-values were calculated using the asymptotic $\chi^2$ distribution with degrees of freedom based on the number of pairs of lags contrasted. For the GU method, we used a bandwidth of 0.75. The moving window dimensions were 4 × 2. For the MS method, we chose block dimensions of 4 × 2 and used $B = 75$ resamples using the GBBB to estimate the asymptotic variance-covariance matrix.

Table 8 demonstrates the effects of changing the block size for the GU and MS tests. For these simulations, the labels "small", "normal", and "large" correspond to moving windows/blocks of size 3 × 2, 4 × 2, and 5 × 3, respectively. Because we simulated $n = 300$ uniformly distributed sampling locations on a 16 × 10 domain, we expect 1.875 sampling locations per unit area. Thus, we expect $n_b$ = 11.3, 15, and 28.1 points per block for the small, normal, and large block sizes, respectively. We find that the methods tend to have nominal size when $n_b \approx n^{1/2} = 17.3$. For both tests, we used the lags in (3.2), and the blocks are defined by a grid with spacing 0.5 placed on the sampling region (i.e., a 4 × 2 window is achieved by setting the window dimensions to 8 × 4 in the spTest software). We performed the tests using a nominal level of $\alpha = 0.05$, and the p-values were calculated using the asymptotic $\chi^2$ distribution. For the GU method, we used a bandwidth of 0.75. For the MS method, we used $B = 100$ resamples using the GBBB to estimate the asymptotic variance-covariance matrix.

Finally, Table 9 demonstrates the effects of changing the bandwidth for the GU test. We use bandwidths of $w$ = 0.65, 0.75, and 0.85. The p-values are calculated using both the asymptotic $\chi^2$ distribution and a finite sample adjustment similar to the one used by Guan et al. (2004) for gridded sampling locations.

[Fig 6: The lag sets used for the simulations in Table 7. Panels: Normal, Long, More; axes: x lag, y lag.]

Table 5. Empirical size and power for Guan et al. (2004) [denoted GG] and Lu and Zimmerman (2005) [denoted LZ] for 500 realizations of a mean 0 GRF with gridded sampling locations using a nominal level of α = 0.05. Computational time for each method is also included.

(a) Sample size of n = 216 gridded sampling locations.
18 cols × 12 rows grid

| R  | θ    | Method | Effective range 3 | Effective range 6 | Effective range 12 |
|----|------|--------|------|------|------|
| 0  | 0    | GG     | 0.05 | 0.07 | 0.05 |
| 0  | 0    | LZ     | 0.04 | 0.04 | 0.08 |
| √2 | 0    | GG     | 0.32 | 0.42 | 0.43 |
| √2 | 0    | LZ     | 0.06 | 0.11 | 0.12 |
| 2  | 0    | GG     | 0.91 | 0.92 | 0.94 |
| 2  | 0    | LZ     | 0.14 | 0.13 | 0.15 |
| √2 | 3π/8 | GG     | 0.27 | 0.31 | 0.34 |
| √2 | 3π/8 | LZ     | 0.14 | 0.12 | 0.13 |
| 2  | 3π/8 | GG     | 0.77 | 0.85 | 0.86 |
| 2  | 3π/8 | LZ     | 0.29 | 0.33 | 0.33 |

Computational time for 1 test: GG 1.11 seconds; LZ 1.45 seconds.

(b) Sample size of n = 375 gridded sampling locations.

25 cols × 15 rows grid

| R  | θ    | Method | Effective range 3 | Effective range 6 | Effective range 12 |
|----|------|--------|------|------|------|
| 0  | 0    | GG     | 0.05 | 0.06 | 0.07 |
| 0  | 0    | LZ     | 0.06 | 0.07 | 0.07 |
| √2 | 0    | GG     | 0.63 | 0.61 | 0.69 |
| √2 | 0    | LZ     | 0.07 | 0.09 | 0.10 |
| 2  | 0    | GG     | 0.98 | 0.99 | 0.99 |
| 2  | 0    | LZ     | 0.14 | 0.16 | 0.15 |
| √2 | 3π/8 | GG     | 0.47 | 0.55 | 0.55 |
| √2 | 3π/8 | LZ     | 0.16 | 0.19 | 0.18 |
| 2  | 3π/8 | GG     | 0.97 | 0.99 | 0.98 |
| 2  | 3π/8 | LZ     | 0.37 | 0.43 | 0.45 |

Computational time for 1 test: GG 7.29 seconds; LZ 4.99 seconds.

Table 6. Empirical size and power for Guan et al. (2004) [denoted GU] and Maity and Sherman (2012) [denoted MS] for 200 realizations of a mean 0 GRF with uniformly distributed sampling locations using a nominal level of α = 0.05. Computational time for each method is also included.

(a) Sample size of n = 300 uniformly distributed sampling locations.

10 height × 16 width domain

| R  | θ    | Method | Effective range 3 | Effective range 6 | Effective range 12 |
|----|------|--------|------|------|------|
| 0  | 0    | GU     | 0.02 | 0.04 | 0.05 |
| 0  | 0    | MS     | 0.04 | 0.05 | 0.04 |
| √2 | 0    | GU     | 0.15 | 0.20 | 0.27 |
| √2 | 0    | MS     | 0.10 | 0.09 | 0.08 |
| 2  | 0    | GU     | 0.43 | 0.57 | 0.62 |
| 2  | 0    | MS     | 0.21 | 0.16 | 0.15 |
| √2 | 3π/8 | GU     | 0.12 | 0.13 | 0.16 |
| √2 | 3π/8 | MS     | 0.08 | 0.07 | 0.04 |
| 2  | 3π/8 | GU     | 0.37 | 0.51 | 0.51 |
| 2  | 3π/8 | MS     | 0.27 | 0.23 | 0.21 |

Computational time for 1 test: GU 2.17 seconds; MS 83.40 seconds.

(b) Sample size of n = 450 uniformly distributed sampling locations.

10 height × 20 width domain

| R  | θ    | Method | Effective range 3 | Effective range 6 | Effective range 12 |
|----|------|--------|------|------|------|
| 0  | 0    | GU     | 0.00 | 0.04 | 0.05 |
| 0  | 0    | MS     | 0.05 | 0.07 | 0.03 |
| √2 | 0    | GU     | 0.15 | 0.22 | 0.23 |
| √2 | 0    | MS     | 0.07 | 0.06 | 0.07 |
| 2  | 0    | GU     | 0.57 | 0.68 | 0.75 |
| 2  | 0    | MS     | 0.32 | 0.18 | 0.14 |
| √2 | 3π/8 | GU     | 0.09 | 0.18 | 0.21 |
| √2 | 3π/8 | MS     | 0.12 | 0.06 | 0.08 |
| 2  | 3π/8 | GU     | 0.55 | 0.58 | 0.65 |
| 2  | 3π/8 | MS     | 0.37 | 0.23 | 0.21 |

Computational time for 1 test: GU 4.44 seconds; MS 162.35 seconds.

Table 7. Effects of changing the lag set. Empirical size and power for Guan et al. (2004) [denoted GU] and Maity and Sherman (2012) [MS] for 100 realizations of a mean 0 GRF with n = 400 uniformly distributed sampling locations. The label "normal" corresponds to the lag set in (3.2), while "long" represents using longer lags, and "more" denotes using more lags (see Figure 6).

16 width × 10 height domain

| R  | θ    | Method | Lag set: normal | Lag set: long | Lag set: more |
|----|------|--------|------|------|------|
| 0  | 0    | GU     | 0.02 | 0.00 | 0.01 |
| 0  | 0    | MS     | 0.03 | 0.14 | 0.03 |
| √2 | 3π/8 | GU     | 0.19 | 0.07 | 0.16 |
| √2 | 3π/8 | MS     | 0.11 | 0.24 | 0.07 |
| 2  | 3π/8 | GU     | 0.56 | 0.17 | 0.40 |
| 2  | 3π/8 | MS     | 0.27 | 0.33 | 0.21 |

Table 8. Effects of changing the window/block size. Empirical size and power for Guan et al. (2004) [denoted GU] and Maity and Sherman (2012) [MS] for 200 realizations of a mean 0 GRF with n = 300 uniformly distributed sampling locations. The label "normal" corresponds to the window/block size of 4 × 2, while "small" represents using a smaller window, and "large" denotes using a larger window.

16 width × 10 height domain

| R  | θ | Method | Window/block: small | Window/block: normal | Window/block: large |
|----|---|--------|------|------|------|
| 0  | 0 | GU     | 0.06 | 0.04 | 0.01 |
| 0  | 0 | MS     | 0.03 | 0.04 | 0.02 |
| √2 | 0 | GU     | 0.17 | 0.17 | 0.08 |
| √2 | 0 | MS     | 0.06 | 0.09 | 0.09 |
| 2  | 0 | GU     | 0.56 | 0.53 | 0.22 |
| 2  | 0 | MS     | 0.17 | 0.17 | 0.18 |

Table 9. Effects of changing bandwidth. Empirical size and power for Guan et al. (2004) [denoted GU] for 100 realizations of a mean 0 GRF with n = 400 uniformly distributed sampling locations using a nominal level of α = 0.05.

(a) P-value: asymptotic χ² distribution

16 width × 10 height domain

| R  | θ    | Bandwidth | Effective range 3 | Effective range 6 | Effective range 12 |
|----|------|-----------|------|------|------|
| 0  | 0    | 0.65      | 0.00 | 0.00 | 0.00 |
| 0  | 0    | 0.75      | 0.03 | 0.06 | 0.04 |
| 0  | 0    | 0.85      | 0.06 | 0.11 | 0.16 |
| √2 | 3π/8 | 0.65      | 0.01 | 0.01 | 0.08 |
| √2 | 3π/8 | 0.75      | 0.08 | 0.14 | 0.24 |
| √2 | 3π/8 | 0.85      | 0.14 | 0.27 | 0.35 |
| 2  | 3π/8 | 0.65      | 0.21 | 0.22 | 0.25 |
| 2  | 3π/8 | 0.75      | 0.50 | 0.54 | 0.67 |
| 2  | 3π/8 | 0.85      | 0.70 | 0.73 | 0.81 |

(b) P-value: finite sample

16 width × 10 height domain

| R  | θ    | Bandwidth | Effective range 3 | Effective range 6 | Effective range 12 |
|----|------|-----------|------|------|------|
| 0  | 0    | 0.65      | 0.02 | 0.03 | 0.02 |
| 0  | 0    | 0.75      | 0.03 | 0.06 | 0.06 |
| 0  | 0    | 0.85      | 0.07 | 0.10 | 0.09 |
| √2 | 3π/8 | 0.65      | 0.05 | 0.06 | 0.20 |
| √2 | 3π/8 | 0.75      | 0.09 | 0.18 | 0.29 |
| √2 | 3π/8 | 0.85      | 0.11 | 0.24 | 0.31 |
| 2  | 3π/8 | 0.65      | 0.37 | 0.38 | 0.53 |
| 2  | 3π/8 | 0.75      | 0.55 | 0.58 | 0.69 |
| 2  | 3π/8 | 0.85      | 0.63 | 0.64 | 0.76 |
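The coordinate transformation used in this appendix to introduce geometric anisotropy can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code used for the simulation study: the function names (`to_anisotropic`, `aniso_correlation`) are hypothetical, the covariance is the no-nugget exponential model with unit sill, and the isotropic case is represented here by R = 1 (the tables label it R = 0, which cannot be substituted into 1/R).

```python
import math


def to_anisotropic(x, y, R, theta):
    """Map (x, y) to (x_a, y_a) = (x, y) * Rot(theta) * diag(1, 1/R)."""
    # Row vector times the rotation matrix [[cos, sin], [-sin, cos]].
    xr = x * math.cos(theta) - y * math.sin(theta)
    yr = x * math.sin(theta) + y * math.cos(theta)
    # diag(1, 1/R) leaves the first rotated coordinate alone and shrinks the second.
    return xr, yr / R


def aniso_correlation(s_i, s_j, R, theta, phi):
    """Exponential correlation (no nugget) at the distance between transformed sites."""
    xi, yi = to_anisotropic(s_i[0], s_i[1], R, theta)
    xj, yj = to_anisotropic(s_j[0], s_j[1], R, theta)
    h = math.hypot(xi - xj, yi - yj)
    return math.exp(-phi * h)


if __name__ == "__main__":
    phi = -math.log(0.05) / 6  # medium effective range, xi = 6
    origin = (0.0, 0.0)
    for R, theta in [(1.0, 0.0), (math.sqrt(2), 0.0), (2.0, 0.0),
                     (math.sqrt(2), 3 * math.pi / 8), (2.0, 3 * math.pi / 8)]:
        along_x = aniso_correlation(origin, (1.0, 0.0), R, theta, phi)
        along_y = aniso_correlation(origin, (0.0, 1.0), R, theta, phi)
        print(f"R={R:.3f}, theta={theta:.3f}: "
              f"corr at lag (1,0)={along_x:.3f}, at lag (0,1)={along_y:.3f}")
```

Evaluating `aniso_correlation` over all pairs of sampling locations gives the correlation matrix from which a realization would be simulated; the simulated values are then placed at the original, untransformed locations, as described in the appendix text.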
