Generalized Spatial and Spatiotemporal Autoregressive Conditional Heteroscedasticity

Philipp Otto∗
Department of Quantitative Methods, European University Viadrina, Frankfurt (Oder), Germany
and
Wolfgang Schmid
Department of Quantitative Methods, European University Viadrina, Frankfurt (Oder), Germany
and
Robert Garthoff
Statistisches Landesamt des Freistaates Sachsen, Kamenz, Germany

September 5, 2016

Abstract

In this paper, we introduce a new spatial model that incorporates heteroscedastic variance depending on neighboring locations. The proposed process is regarded as the spatial equivalent to the temporal autoregressive conditional heteroscedasticity (ARCH) model. We show additionally how the introduced spatial ARCH model can be used in spatiotemporal settings. In contrast to the temporal ARCH model, in which the distribution is known given the full information set of the prior periods, the distribution is not straightforward in the spatial and spatiotemporal setting. However, it is possible to estimate the parameters of the model using the maximum-likelihood approach. Via Monte Carlo simulations, we demonstrate the performance of the estimator for a specific spatial weighting matrix. Moreover, we combine the known spatial autoregressive model with the spatial ARCH model assuming heteroscedastic errors. Finally, the proposed autoregressive process is illustrated using an empirical example. Specifically, we model lung cancer mortality in 3108 U.S. counties and compare the introduced model with two benchmark approaches.

Keywords: lung cancer mortality, SARspARCH model, spatial ARCH model, variance clusters.

∗ Corresponding author (email: potto@europa-uni.de)

1 Introduction

Various specifications of spatial autoregressive models have been proposed in past and current literature (cf. Anselin 2010).
In particular, the spatial models introduced by Whittle (1954) were extended to incorporate external regressors (see, e.g., Elhorst 2010 for an overview) and autocorrelated residuals (e.g., Fingleton 2008a). Currently, these spatial models are widely implemented in statistical software packages, such that it is simple to model spatial clusters of high and low observations. Consequently, a wide range of applications can be found in empirical research, including econometrics (e.g., Holly et al. 2010, Fingleton 2008b), biometrics (e.g., Shinkareva et al. 2006, Ho et al. 2005, MacNab & Dean 2001) and environmetrics (e.g., Fassò & Finazzi 2011, Fassò et al. 2007, Fuentes 2001).

However, spatial models that assume spatially dependent second-order moments, such as the well-known autoregressive conditional heteroscedasticity (ARCH) and generalized ARCH (GARCH) models in time series analysis proposed by Engle (1982) and Bollerslev (1986), have not been previously discussed. Borovkova & Lopuhaa (2012) and Caporin & Paruolo (2006) introduced a temporal GARCH model, which includes temporal lags influenced by neighboring observations. Regarding the two-dimensional setting, Bera & Simlai (2004) suggested a special type of a spatial ARCH model, the SARCH(1) process, that results from employing the information matrix (IM) test statistic in a simple spatial autoregressive (SAR) model. Furthermore, Noiboar & Cohen (2005) and Noiboar & Cohen (2007) introduced a multidimensional GARCH process to detect image anomalies. However, present extensions consider only special approaches, and no general model has been presented. Moreover, there is no strict analytical analysis of the introduced models, and it appears that generalization of an ARCH or GARCH model to the multidimensional setting is not straightforward.
To motivate the need for a spatial ARCH model, we consider the following empirical example. In Figure 1, the population density of all U.S. counties excluding Alaska and Hawaii ($n = 3108$) is plotted on the map. The data are from the 2010 census. Obviously, there are clusters of high population density around metropolitan areas and clusters of low population density elsewhere. This behavior can be modeled by a spatial autoregressive process; i.e., the observations are assumed to be influenced by their neighbors. The dependence between the observations can be modeled via a so-called spatial weighting matrix $\mathbf{W}$. Moreover, a simple spatial autoregressive process includes an autoregressive parameter $\lambda$. Fitting the U.S. census data using this type of process leads to a model with a positive spatial correlation of $\hat{\lambda} = 0.8578$; i.e., the process identifies clusters of high and low values. This finding is not surprising. However, if we focus on the estimated residuals of the process, we observe that they are not homoscedastic but rather exhibit clusters of high and low variances, whereas the mean of the residuals is zero and not spatially autocorrelated. This means that we observe clustering behavior in the conditional spatial variances but not in the conditional means.

For spatial autoregressive processes, the conditional variance is also not constant over space. However, the conditional variance of each location is independent of the variance of the surrounding locations; it depends only on the spatial weights. Thus, a new nonlinear approach, a spatial process for conditional heteroscedasticity, is needed to achieve the required flexibility of the model. To illustrate these variance clusters, we computed for each county the sample standard deviation of the residuals lying within a radius of 500 km (310.686 mi).
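The local standard deviation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `local_residual_sd` is hypothetical, the coordinates are taken to be planar and expressed in the same unit as the radius (for county centroids a great-circle distance would be more natural), and each location's own residual is included in its neighborhood.

```python
import math
import statistics

def local_residual_sd(coords, resid, radius):
    """Sample SD of the residuals at all locations within `radius` of each location.

    Assumption: planar (x, y) coordinates in the same unit as `radius`.
    """
    out = []
    for xi, yi in coords:
        # residuals of all locations inside the disc, including the location itself
        nearby = [r for (xj, yj), r in zip(coords, resid)
                  if math.hypot(xi - xj, yi - yj) <= radius]
        # the sample standard deviation needs at least two residuals
        out.append(statistics.stdev(nearby) if len(nearby) > 1 else float("nan"))
    return out
```

Repeating the computation for several radii is a simple way to check that the observed clusters do not hinge on one particular choice of distance.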
In Figure 1, the conditional sample standard deviation is visualized on the map (below, left) and by means of a simple histogram (below, right). Obviously, we observe two major clusters of the residuals' variance: the variance is higher in the Eastern United States compared with the Western United States. Moreover, these two clusters are also obvious in the histogram. The estimated variance of the error process is $\hat{\sigma}_\xi^2 = 0.9104$. Certainly, the appearance of these clusters depends on the choice of the distance used for the calculation of the sample standard deviation. However, when other distances are used, we observe the same behavior.

A further aspect that should be mentioned is that for spatial kriging, the underlying spatial process is usually assumed to be stationary and isotropic; i.e., the covariance between two observations depends only on the distance between these observations, not on the location of the observations (see, e.g., Cressie 1993). Thus, Sampson & Guttorp (1992), Fuentes (2001), Fuentes (2002), and Schmidt & O'Hagan (2003), among others, have introduced various approaches to treat nonstationary spatial processes. For such processes, the spatial covariance matrix depends on both the location of each observation and the distances between all locations. Moreover, Ombao et al. (2008) and Stroud et al. (2001) discussed nonstationary spatiotemporal models. Whereas the focus of these approaches is mostly to obtain accurate temporal forecasts or spatial interpolations, the so-called kriging, our paper aims to propose a spatial process with similar properties to the temporal ARCH process proposed by Engle (1982), i.e., conditional heteroscedasticity.
Hence, the entries of the spatial covariance matrix depend not only on the location and the distance between observations, as in nonstationary spatial processes, but also on the variance of locations nearby. In particular, we compare our spatial ARCH model and the temporal ARCH model with respect to important properties. Moreover, we illustrate the use of the spatial ARCH model as a residual process for spatial modeling of lung cancer mortality in the U.S. counties.

Figure 1: Population density of the U.S. counties (excluding Alaska and Hawaii) in 2010 (above). The darker the color, the higher the population density in the respective area. The sample standard deviation of the estimated residuals within a radius of 500 km (310.686 mi) of the fitted spatial autoregressive process ($\hat{\lambda} = 0.8578$, $\hat{\sigma}_\xi^2 = 0.9104$) is shown below.

The remainder of the paper is structured as follows. In the next section, we introduce the spatial ARCH model. Moreover, we derive important properties of the process. In Section 2.2, two specifications of the spatial weighting matrix suitable for empirical research are discussed. Furthermore, we present some results regarding statistical inference, and we discuss an estimation procedure based on the maximum-likelihood principle. In an empirical study, we demonstrate how our results can be applied. Moreover, the results of various simulation studies are reported to yield better insight into the behavior of the spatial ARCH process. Finally, Section 6 concludes the paper and provides some discussion of possible extensions and generalizations of the process.
2 Spatial and Spatiotemporal Autoregressive Conditional Heteroscedasticity

Assume that $\{Y(s) \in \mathbb{R} : s \in D_s\}$ is a univariate spatial stochastic process, where $D_s$ is a subset of the $q$-dimensional set of real numbers $\mathbb{R}^q$, the $q$-dimensional set of integers $\mathbb{Z}^q$, or the Cartesian product $\mathbb{R}^v \times \mathbb{Z}^l$ with $v + l = q$. Regarding the first case, a continuous process is present if a $q$-dimensional rectangle of positive volume in $D_s$ exists (cf. Cressie & Wikle 2011). Considering the second case, the resulting process is a spatial lattice process. Moreover, spatiotemporal settings are covered regarding the $q$-dimensional set of integers and the product set $\mathbb{R}^v \times \mathbb{Z}^l$ because the temporal dimension can be considered as one dimension of the $q$-dimensional space. For instance, a spatiotemporal lattice process with two spatial dimensions would lie in the set of three-dimensional integers.

2.1 Definition and Properties

Let $s_1, \ldots, s_n$ denote all locations and $\boldsymbol{Y}$ be the vector of observations $(Y(s_i))_{i=1,\ldots,n}$. The commonly applied spatial autoregressive model assumes that the conditional variance of $Y(s_i)$ depends only on the spatial weighting matrix (cf. Cressie 1993, Cressie & Wikle 2011), not the observations at the neighboring locations. This approach is extended assuming that the conditional variance can vary over space, resulting in clusters of high and low variance. Analogous to the ARCH time series model of Engle (1982), the vector of observations is given by

$$\boldsymbol{Y} = \operatorname{diag}(\boldsymbol{h})^{1/2} \boldsymbol{\varepsilon}, \tag{1}$$

where $\boldsymbol{\varepsilon} = (\varepsilon(s_1), \ldots, \varepsilon(s_n))'$ is assumed to be an independent and identically distributed random error with $E(\boldsymbol{\varepsilon}) = \boldsymbol{0}$ and $Cov(\boldsymbol{\varepsilon}) = \mathbf{I}$, and $\mathbf{I}$ denotes the identity matrix. Furthermore, the vector $\boldsymbol{h} = (h_i)_{i=1,\ldots,n}$ is specified as

$$\boldsymbol{h} = (h(s_i))_{i=1,\ldots,n} = \boldsymbol{\alpha} + \mathbf{W} \operatorname{diag}(\boldsymbol{Y}) \boldsymbol{Y}, \tag{2}$$

where $\operatorname{diag}(\boldsymbol{a})$ denotes a diagonal matrix with the entries of $\boldsymbol{a}$ on the diagonal.
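For a given vector of observations, the right-hand side of (2) is straightforward to evaluate. The following Python sketch is an illustration only; the function names `sparch_h` and `implied_errors` and the toy numbers are assumptions made here and are not taken from the paper.

```python
import math

def sparch_h(alpha, W, y):
    """Evaluate h = alpha + W diag(y) y, i.e. h_i = alpha_i + sum_v w_iv * y_v**2."""
    n = len(y)
    return [alpha[i] + sum(W[i][v] * y[v] ** 2 for v in range(n)) for i in range(n)]

def implied_errors(alpha, W, y):
    """Invert (1) for given y: eps_i = y_i / sqrt(h_i); requires h_i > 0."""
    return [y_i / math.sqrt(h_i) for y_i, h_i in zip(y, sparch_h(alpha, W, y))]

# Toy example with n = 2 locations (values chosen for illustration only).
alpha = [1.0, 1.0]
W = [[0.0, 0.5],
     [0.2, 0.0]]            # nonnegative weights, zero main diagonal
y = [2.0, 3.0]
h = sparch_h(alpha, W, y)   # [1 + 0.5 * 9, 1 + 0.2 * 4]
```

Note that this direction (from observations to $\boldsymbol{h}$ and the errors) is explicit, whereas generating $\boldsymbol{Y}$ from $\boldsymbol{\varepsilon}$ requires solving the system formed by (1) and (2) jointly.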
Using the Hadamard product, denoted by $\circ$, equation (2) can be rewritten as

$$\boldsymbol{h} = \boldsymbol{\alpha} + \mathbf{W} (\boldsymbol{Y} \circ \boldsymbol{Y}).$$

The $n \times n$ matrix $\mathbf{W}$ consists of spatial weights. The elements of $\mathbf{W}$ are assumed to be non-stochastic, nonnegative and zero on the main diagonal to prevent observations from influencing themselves. Moreover, each component of the vector $\boldsymbol{\alpha} = (\alpha_i)_{i=1,\ldots,n}$ is assumed to be nonnegative. Hence, the $i$-th entry of $\boldsymbol{h}$ at location $s_i$ can be written as

$$h(s_i) = \alpha_i + \sum_{v=1}^{n} w_{iv} Y(s_v)^2,$$

where $w_{iv}$ refers to the $iv$-th entry of $\mathbf{W}$ and $w_{ii} = 0$ for $i = 1, \ldots, n$. Thus, $h(s_i)$ does not seem to depend on $Y(s_i)^2$. However, because $h(s_i)$ depends on $Y(s_j)$, $j \neq i$, and these quantities depend on $Y(s_i)$ via $h(s_j)$, this is not the case. We shall discuss this point later in more detail. Regarding this specification of $\boldsymbol{h}$, we refer to the resulting process as the spatial ARCH model (spARCH). If $\boldsymbol{\alpha} = \boldsymbol{1}_n$, where $\boldsymbol{1}_n$ is the $n$-dimensional vector of ones, and $\mathbf{W} = \mathbf{0}$, the resulting process coincides with the spatial white noise process.

The abovementioned spatiotemporal process could be modeled by defining the locations $s = (s_s, t)'$, where $s_s$ is the spatial location and $t \in \mathbb{Z}$ represents the point of time. For spatiotemporal settings, one must assume additionally that the weights of the locations $(s_s, t)$ and $(s_{\tilde{s}}, \tilde{t})$ are zero if $\tilde{t} \geq t$. In the following Section 2.2, we demonstrate how the weighting matrix must be defined for several temporal and spatiotemporal settings that have been proposed in the literature. To express the model in a more convenient manner, the time point $t$ can also be written as an index. For that reason, the number of included temporal lags is denoted by $p$, and the set of all spatial locations is $\{s_1, \ldots, s_n\}$.
Thus, the process can be specified as

$$Y_t(s_i) = \sqrt{h_t(s_i)}\, \varepsilon_t(s_i) \quad \text{and} \quad h_t(s_i) = \alpha_i + \sum_{v=1}^{n} \sum_{\tau=0}^{p} w_{\tau,iv} Y_{t-\tau}(s_v)^2.$$

It is worth noting that the spatial weighting parameters $w_{\tau,iv}$ might depend on the temporal lag. For $\tau = 0$, $w_{\tau,iv}$ describes the instantaneous spatial effect. Furthermore, one can rewrite the equation in matrix notation; that is,

$$\boldsymbol{Y}_t = \operatorname{diag}(\boldsymbol{h}_t)^{1/2} \boldsymbol{\varepsilon}_t, \qquad \boldsymbol{h}_t = \boldsymbol{\alpha} + \sum_{\tau=0}^{p} \mathbf{W}_\tau \operatorname{diag}(\boldsymbol{Y}_{t-\tau}) \boldsymbol{Y}_{t-\tau}.$$

In the following paragraphs, we omit the index $t$. The weighting matrix $\mathbf{W}$ may depend on additional parameters. Possible choices for $\mathbf{W}$ include, e.g.,

$$\mathbf{W} = \rho \tilde{\mathbf{W}}, \qquad \mathbf{W} = \operatorname{diag}(\rho_1, \ldots, \rho_n) \tilde{\mathbf{W}}, \qquad \mathbf{W} = \operatorname{diag}(\rho_1, \ldots, \rho_1, \ldots, \rho_r, \ldots, \rho_r) \tilde{\mathbf{W}},$$

with a known weighting matrix $\tilde{\mathbf{W}}$, or

$$\mathbf{W} = \rho \left( \lambda^{\|s_i - s_j\|} \right)_{i,j=1,\ldots,n}, \qquad \mathbf{W} = \left( K(\|s_i - s_j\|; \theta) \right)_{i,j=1,\ldots,n}$$

with a decreasing function $K : [0, \infty) \to [0, \infty)$. Here, $\|\cdot\|$ stands for the vector norm. In Section 2.2, we discuss some special weighting matrices in more detail.

Next, we focus on the conditions on the parameters such that the process is well defined. Initially, it is analyzed whether $\boldsymbol{Y}$ is uniquely determined by $\boldsymbol{\varepsilon}$. Let

$$\boldsymbol{\eta} = \begin{pmatrix} \alpha_1 \varepsilon(s_1)^2 + \varepsilon(s_1)^2 \sum_{v=1}^{n} w_{1v} \varepsilon(s_v)^2 \alpha_v \\ \alpha_2 \varepsilon(s_2)^2 + \varepsilon(s_2)^2 \sum_{v=1}^{n} w_{2v} \varepsilon(s_v)^2 \alpha_v \\ \vdots \\ \alpha_n \varepsilon(s_n)^2 + \varepsilon(s_n)^2 \sum_{v=1}^{n} w_{nv} \varepsilon(s_v)^2 \alpha_v \end{pmatrix}$$

and

$$\mathbf{A} = \operatorname{diag}\left( \varepsilon(s_1)^2, \ldots, \varepsilon(s_n)^2 \right) \mathbf{W}, \qquad \boldsymbol{Y}^{(2)} = \left( Y(s_1)^2, \ldots, Y(s_n)^2 \right)'.$$

Theorem 1. Suppose that

$$\det\left( \mathbf{I} - \mathbf{A}^2 \right) \neq 0. \tag{3}$$

Then, there is one and only one $Y(s_1), \ldots, Y(s_n)$ that corresponds to each $\varepsilon(s_1), \ldots, \varepsilon(s_n)$. It holds that

$$\boldsymbol{Y}^{(2)} = \left( \mathbf{I} - \mathbf{A}^2 \right)^{-1} \boldsymbol{\eta}, \qquad \boldsymbol{h} = \boldsymbol{\alpha} + \mathbf{W} \left( \mathbf{I} - \mathbf{A}^2 \right)^{-1} \boldsymbol{\eta}, \qquad \text{and} \qquad \boldsymbol{Y} = \operatorname{diag}(\boldsymbol{h})^{1/2} \boldsymbol{\varepsilon}. \tag{4}$$

Because of the complex dependence structure, i.e., $Y(s_i)$ depends on $Y(s_j)$ for all $i, j = 1, \ldots, n$ and vice versa, it turns out that the components of $\boldsymbol{Y}^{(2)}$ are not necessarily nonnegative; thus, the square root of $h(s_i)$ might not exist. The choice of the weighting matrix $\mathbf{W}$ affects whether all elements of the squared observations $\boldsymbol{Y}^{(2)}$ are greater than or equal to zero. Moreover, this condition also depends on the realizations of the error vector $\boldsymbol{\varepsilon}$. Therefore, we further analyze the required condition such that the components of $\boldsymbol{Y}^{(2)}$ are nonnegative.

Theorem 2. Suppose that $\boldsymbol{\alpha} \geq 0$, $w_{ij} \geq 0$ for all $i, j = 1, \ldots, n$, $w_{ii} = 0$ for all $i = 1, \ldots, n$ and that $\det(\mathbf{I} - \mathbf{A}^2) \neq 0$. If all elements of the matrix $(\mathbf{I} - \mathbf{A}^2)^{-1}$ are nonnegative, then all components of $\boldsymbol{Y}^{(2)}$ are nonnegative; i.e., $Y(s_i)^2 \geq 0$ for $i = 1, \ldots, n$. Moreover, $h(s_i) \geq 0$ for $i = 1, \ldots, n$.

In general, it seems to be difficult to check whether the condition given in Theorem 2 is fulfilled because it depends on both the weighting matrix $\mathbf{W}$ and the error vector $\boldsymbol{\varepsilon}$. However, in the important case in which $\mathbf{W}$ is an upper or lower triangular matrix, the condition is always satisfied.

Lemma 1. Suppose that $\boldsymbol{\alpha} \geq 0$, $w_{ij} \geq 0$ for all $i, j = 1, \ldots, n$ and $w_{ij} = 0$ for $1 \leq i \leq j \leq n$. All elements of the matrix $(\mathbf{I} - \mathbf{A}^2)^{-1}$ are then nonnegative.

Triangular matrices of spatial weights are of high practical relevance because the resulting spatial process can be regarded as an oriented process. This means that the process evolves in a certain direction. In the case of a lower triangular matrix $\mathbf{W}$, the location $s_1$ is regarded as the origin of the process.

For the case of an arbitrary weighting matrix $\mathbf{W}$, we need a criterion that can be more easily checked than that of Theorem 2. Another possibility is given in the next lemma.

Lemma 2. Suppose that $\boldsymbol{\alpha} \geq 0$, $w_{ij} \geq 0$ for all $i, j = 1, \ldots, n$ and $w_{ij} = 0$ for $i = j$.
If

$$\lim_{k \to \infty} \mathbf{A}^{2k} = \mathbf{0}, \tag{5}$$

then all elements of the matrix $(\mathbf{I} - \mathbf{A}^2)^{-1}$ are nonnegative.

It is worth noting that if $\|\cdot\|$ denotes some induced matrix norm, then (5) is fulfilled if $\|\mathbf{A}^2\| < 1$ (cf. Theorem 18.2.19 of Harville (2008)). To take a closer look at the condition in Lemma 2, we consider two simple examples.

Example 1. Initially, the simple spARCH process for $n = 2$ is considered in more detail, which means that the process has exactly two observations $Y(s_1)$ and $Y(s_2)$ at the two locations $s_1$ and $s_2$. Simple calculations show that

$$Y(s_i)^2 = \begin{cases} \varepsilon(s_1)^2 \, \dfrac{\alpha_1 + \alpha_2 w_{12} \varepsilon(s_2)^2}{1 - w_{12} w_{21} \varepsilon(s_1)^2 \varepsilon(s_2)^2} & \text{for } i = 1 \\[2ex] \varepsilon(s_2)^2 \, \dfrac{\alpha_2 + \alpha_1 w_{21} \varepsilon(s_1)^2}{1 - w_{12} w_{21} \varepsilon(s_1)^2 \varepsilon(s_2)^2} & \text{for } i = 2. \end{cases}$$

These quantities are nonnegative if and only if $\boldsymbol{\alpha} \geq 0$, $w_{12} \geq 0$, $w_{21} \geq 0$ and

$$\varepsilon(s_1)^2 \varepsilon(s_2)^2 < \frac{1}{w_{12} w_{21}}. \tag{6}$$

Consequently, $h(s_2) = \alpha_2 + w_{21} Y(s_1)^2 \geq 0$, and by analogy, $h(s_1) \geq 0$. Thus, all quantities are well defined. Choosing in Lemma 2 the norm $\|\mathbf{B}\|_1 = \max_j \sum_i |b_{ij}|$, it can be observed that the aforementioned condition is equivalent to condition (6).

This result shows that the support of $\varepsilon(s_i)^2$ must be bounded; otherwise, problems arise with the interpretation of the model quantities. Hence, condition (5) must also be understood in this manner: the support of the error quantities must be bounded in a certain way. In the general case, the condition on the induced norm is more difficult to check. Therefore, we consider in the next example that the support of the error term is compact.

Example 2. Suppose that $\varepsilon(s_i)$ is taking values on a finite support. Let $|\varepsilon(s_i)| \leq a$ for all $i = 1, \ldots, n$. Moreover, we will utilize the norm $\|\mathbf{B}\|_1 = \max_j \sum_i |b_{ij}|$. Now,

$$\mathbf{A}^2 = \left( \varepsilon(s_i)^2 \sum_{v=1}^{n} w_{iv} w_{vj} \varepsilon(s_v)^2 \right)_{i,j=1,\ldots,n}$$

and

$$\|\mathbf{A}^2\|_1 = \max_{1 \leq j \leq n} \sum_{i=1}^{n} \varepsilon(s_i)^2 \sum_{v=1}^{n} w_{iv} w_{vj} \varepsilon(s_v)^2 \leq a^4 \max_{1 \leq j \leq n} \sum_{i=1}^{n} \sum_{v=1}^{n} w_{iv} w_{vj} = a^4 \|\mathbf{W}^2\|_1.$$

Thus, the norm is less than 1 if $a < 1 / \sqrt[4]{\|\mathbf{W}^2\|_1}$. It is worth noting that for $n = 2$, we obtain the bound of Example 1.

We observe a tradeoff between the weighting coefficients and the parameter $a$. To be precise, if the weighting coefficients increase, one would expect that the spatial autocorrelation of the squared observations would increase by the same magnitude. However, increasing values of the elements in $\mathbf{W}$ imply smaller values of $a$, which reduces the extent of the spatial autocorrelation. We focus on this issue in more detail in Section 5.

Below, the probability structure of $\boldsymbol{Y}$ is derived. Suppose that the assumptions of Theorem 2 are satisfied, with $\boldsymbol{\alpha} > 0$, and that $\boldsymbol{\varepsilon}$ is continuous with density function $f_{\boldsymbol{\varepsilon}}$. Let $h_i = \alpha_i + \sum_{v=1, v \neq i}^{n} w_{iv} y_v^2$. Applying the transformation rule for random vectors (e.g., Bickel & Doksum 2015), we obtain a density of $\boldsymbol{Y} = \operatorname{diag}(\boldsymbol{h})^{1/2} \boldsymbol{\varepsilon} = f(\boldsymbol{\varepsilon})$. Note that the transformation is one-to-one because if $\boldsymbol{Y} = \operatorname{diag}(\boldsymbol{h})^{1/2} \boldsymbol{\varepsilon} = f(\boldsymbol{\varepsilon}) = \tilde{\boldsymbol{Y}} = \operatorname{diag}(\tilde{\boldsymbol{h}})^{1/2} \tilde{\boldsymbol{\varepsilon}} = f(\tilde{\boldsymbol{\varepsilon}})$, it follows that $\boldsymbol{h} = \boldsymbol{h}(\boldsymbol{Y}) = \boldsymbol{h}(\tilde{\boldsymbol{Y}}) = \tilde{\boldsymbol{h}}$ and thus $\boldsymbol{\varepsilon} = \tilde{\boldsymbol{\varepsilon}}$ because $h(s_i) > 0$. We obtain that

$$f_{\boldsymbol{Y}}(\boldsymbol{y}) = f_{(Y(s_1), \ldots, Y(s_n))}(y_1, \ldots, y_n) = f_{(\varepsilon(s_1), \ldots, \varepsilon(s_n))}\left( \frac{y_1}{\sqrt{h_1}}, \ldots, \frac{y_n}{\sqrt{h_n}} \right) \left| \det\left( \left( \frac{\partial\, y_j / \sqrt{h_j}}{\partial y_i} \right)_{i,j=1,\ldots,n} \right) \right|. \tag{7}$$

Because

$$\frac{\partial\, y_j / \sqrt{h_j}}{\partial y_i} = \begin{cases} 1 / \sqrt{h_j} & \text{for } i = j \\[1ex] -\dfrac{y_i y_j}{h_j^{3/2}} w_{ji} & \text{for } i \neq j, \end{cases}$$

it follows, by factoring $y_i$ out of the $i$-th row and $y_j / h_j^{3/2}$ out of the $j$-th column, that

$$\left| \det\left( \left( \frac{\partial\, y_j / \sqrt{h_j}}{\partial y_i} \right)_{i,j=1,\ldots,n} \right) \right| = \prod_{i=1}^{n} \frac{y_i^2}{h_i^{3/2}} \cdot \left| \det\left( \operatorname{diag}\left( \frac{h_1}{y_1^2}, \ldots, \frac{h_n}{y_n^2} \right) - \mathbf{W}' \right) \right|.$$

The determinant of the sum of a diagonal matrix and an arbitrary matrix can be calculated as described in Theorem 13.7.3 of Harville (2008).
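Theorem 1 and the sufficient condition of Example 2 lend themselves to a small numerical check. The Python sketch below is an illustration under stated assumptions, not the authors' implementation. Instead of inverting $\mathbf{I} - \mathbf{A}^2$ directly, it iterates the fixed-point relation $\boldsymbol{Y}^{(2)} = \operatorname{diag}(\varepsilon(s_1)^2, \ldots, \varepsilon(s_n)^2)(\boldsymbol{\alpha} + \mathbf{W} \boldsymbol{Y}^{(2)})$, which follows from (1) and (2) and converges to the solution in (4) whenever the powers of $\mathbf{A}$ tend to zero, as required by (5). The function names `solve_sparch` and `error_bound`, the tolerance and the iteration limit are hypothetical choices.

```python
import math

def solve_sparch(alpha, W, eps, tol=1e-12, max_iter=10_000):
    """Return (y, h) for given errors eps by fixed-point iteration on Y^(2)."""
    n = len(eps)
    e2 = [e * e for e in eps]

    def step(y2):
        # one application of  y2 -> diag(eps^2) (alpha + W y2)
        return [e2[i] * (alpha[i] + sum(W[i][v] * y2[v] for v in range(n)))
                for i in range(n)]

    y2 = [e2[i] * alpha[i] for i in range(n)]   # solution for W = 0 as start value
    for _ in range(max_iter):
        new = step(y2)
        converged = max(abs(a - b) for a, b in zip(new, y2)) < tol
        y2 = new
        if converged:
            break
    else:
        raise ValueError("no convergence; condition (5) may be violated")
    h = [alpha[i] + sum(W[i][v] * y2[v] for v in range(n)) for i in range(n)]
    y = [math.sqrt(h[i]) * eps[i] for i in range(n)]
    return y, h

def error_bound(W):
    """Bound 1 / ||W^2||_1^(1/4) of Example 2 (maximum column sum norm)."""
    n = len(W)
    W2 = [[sum(W[i][v] * W[v][j] for v in range(n)) for j in range(n)]
          for i in range(n)]
    norm1 = max(sum(abs(W2[i][j]) for i in range(n)) for j in range(n))
    return math.inf if norm1 == 0 else norm1 ** -0.25
```

For $n = 2$ with $w_{12} = w_{21} = 0.5$ and $\alpha_1 = \alpha_2 = 1$, the errors $(0.5, -1)$ give $Y(s_1)^2 = 0.4$ and $Y(s_2)^2 = 1.2$, in agreement with the closed form of Example 1, and `error_bound` returns $\sqrt{2}$, the value implied by (6) when both errors are bounded by $a$.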
In the special case of Example 1 ($n = 2$), we obtain that

$$f_{(Y(s_1), Y(s_2))}(y_1, y_2) = \frac{\alpha_1 \alpha_2 + \alpha_1 w_{21} y_1^2 + \alpha_2 w_{12} y_2^2}{(\alpha_1 + w_{12} y_2^2)^{3/2} (\alpha_2 + w_{21} y_1^2)^{3/2}} \, f_{(\varepsilon(s_1), \varepsilon(s_2))}\left( \frac{y_1}{\sqrt{\alpha_1 + w_{12} y_2^2}}, \frac{y_2}{\sqrt{\alpha_2 + w_{21} y_1^2}} \right). \tag{8}$$

Our next aim is to develop statements about the moments of $Y(s_i)$. To accomplish this, we shall assume that the error quantities are symmetric. There are various possibilities of defining symmetry for multivariate distributions (cf. Serfling 2006). Here, we consider sign-symmetric multivariate distributions.

Theorem 3. Suppose that the assumptions of Theorem 2 are satisfied and that the distribution of $\boldsymbol{\varepsilon}$ is sign-symmetric; i.e.,

$$\boldsymbol{\varepsilon} \stackrel{d}{=} \left( (-1)^{v_1} \varepsilon(s_1), \ldots, (-1)^{v_n} \varepsilon(s_n) \right) \quad \text{for all } v_1, \ldots, v_n \in \{0, 1\}.$$

It then holds that the distribution of $\boldsymbol{Y}$ is sign-symmetric as well.

It is important to note that $(\varepsilon(s_1), \ldots, \varepsilon(s_n))$ is sign-symmetric if the random variables $\varepsilon(s_1), \ldots, \varepsilon(s_n)$ are independent and if $\varepsilon(s_i)$ is symmetric about zero for all $i = 1, \ldots, n$.

Next, we want to discuss the conditions under which the moments of $\boldsymbol{Y}^{(2)}$ exist. Using symmetry, it is proved that all odd moments are zero if the error variable is symmetric. First, it is assumed that the weighting matrix $\mathbf{W}$ is a triangular matrix.

Lemma 3. Suppose that the assumptions of Theorem 2 are satisfied. Let $n \geq 3$, $r \in \mathbb{N}$ and suppose that $E\left( \varepsilon(s_i)^{8r[(n-1)/2]} \right) < \infty$ for all $i = 1, \ldots, n$. Let $w_{ij} \geq 0$ for $i, j = 1, \ldots, n$ and $w_{ij} = 0$ for $1 \leq i \leq j \leq n$; then, it holds that

a) $E(Y(s_i)^{2r}) < \infty$ for all $i = 1, \ldots, n$.

b) If $\boldsymbol{\varepsilon}$ is additionally sign-symmetric, then $E(Y(s_i)^{2v-1}) = 0$ and $E(Y(s_i)^{2v-1} \mid Y(s_j), j = 1, \ldots, n, j \neq i) = 0$ for $v = 1, \ldots, r$, $i = 1, \ldots, n$.

Below, we focus on the moments of the process in the case of an arbitrary weighting matrix.
Theorem 4. Suppose that the assumptions of Theorem 3 are satisfied. Let $\|\cdot\|$ denote some induced matrix norm. Let $r \in \mathbb{N}$, and suppose that $E(\varepsilon(s_i)^{2r}) < \infty$ for all $i = 1, \ldots, n$.

a) If there exists a constant $\lambda > 0$ such that $\|(\mathbf{I} - \mathbf{A}^2)^{-1}\| \leq \lambda$, then it holds that

a1) $E(Y(s_i)^{2r}) < \infty$ for all $i = 1, \ldots, n$.

a2) $E(Y(s_i)^{2v-1}) = 0$ and $E(Y(s_i)^{2v-1} \mid Y(s_j), j = 1, \ldots, n, j \neq i) = 0$ for $v = 1, \ldots, r$, $i = 1, \ldots, n$.

b) If there exists $0 < \lambda < 1$ such that $\|\mathbf{A}^2\| \leq \lambda < 1$, then $\|(\mathbf{I} - \mathbf{A}^2)^{-1}\|$ is bounded.

The moments of $\varepsilon(s_i)$ are of course bounded if we assume that the support of $\varepsilon(s_i)$ is bounded.

It is worth noting that one important property of the classical, temporal GARCH approach is not fulfilled for each specification of $\mathbf{W}$. Generally, it does not hold that $h(s_1)$ is equal to $E(Y(s_1)^2 \mid Y(s_2), \ldots, Y(s_n))$. To prove this, we consider the simple case of Example 1 ($n = 2$). In that case,

$$E(Y(s_1)^2 \mid Y(s_2)) = \left( \alpha_1 + w_{12} Y(s_2)^2 \right) E(\varepsilon(s_1)^2 \mid Y(s_2)).$$

The problem lies in the fact that $\varepsilon(s_1)$ and $Y(s_2)$ are not independent; thus, $E(\varepsilon(s_1)^2 \mid Y(s_2))$ does not have to be equal to $E(\varepsilon(s_1)^2)$. If the conditions of Theorem 4 are fulfilled and if $\varepsilon(s_1)$ and $\varepsilon(s_2)$ are independent, it follows with (8) that

$$E(Y(s_1)^2 \mid Y(s_2) = y_2) = \frac{1}{f_{Y(s_2)}(y_2)} \int_{-\infty}^{\infty} y_1^2 \, \frac{\alpha_1 \alpha_2 + \alpha_1 w_{21} y_1^2 + \alpha_2 w_{12} y_2^2}{(\alpha_1 + w_{12} y_2^2)^{3/2} (\alpha_2 + w_{21} y_1^2)^{3/2}} \, f_{\varepsilon(s_1)}\left( \frac{y_1}{\sqrt{\alpha_1 + w_{12} y_2^2}} \right) f_{\varepsilon(s_2)}\left( \frac{y_2}{\sqrt{\alpha_2 + w_{21} y_1^2}} \right) dy_1.$$

In Figure 2, the conditional expectation of $Y(s_1)^2$ given $Y(s_2)$ is plotted together with $h(s_1)$ for two different specifications of $\mathbf{W}$ and $\boldsymbol{\alpha}$. Evidently, $E(Y(s_1)^2 \mid Y(s_2) = y_2)$ differs from $h(s_1)$, and the difference is greater the larger the chosen elements of $\mathbf{W}$ are.
Certainly, the difference between the conditional expectation and $h(s_1)$ vanishes for $\mathbf{W} = \mathbf{0}$. However, we find that this classical property of an ARCH process, namely, that the conditional expectation of the squared observation given the preceding observations equals $h$, is fulfilled in the case of a triangular weighting matrix. To summarize, the conditional variance given the neighboring observations depends on these neighboring observations. It is important to note that this is not the case for linear spatial models (cf. Cressie 1993). Hence, the new model is much more flexible.

Figure 2: Conditional expectation of $Y(s_1)^2$ given $y_2$ for $n = 2$, where $E(Y(s_1)^2 \mid Y(s_2) = y_2)$ is plotted as a solid line and $h(s_1)$ as a dashed line. Shown for the two specifications $w_{12} = w_{21} = 0.1$, $\alpha_1 = \alpha_2 = 2$ and $w_{12} = w_{21} = 0.5$, $\alpha_1 = \alpha_2 = 1$ (horizontal axis: $y_2$).

Theorem 5. Suppose that the assumptions of Theorem 2 are satisfied. Let $w_{ij} \geq 0$ for $i, j = 1, \ldots, n$ and $w_{ij} = 0$ for $1 \leq i \leq j \leq n$. Suppose that $E\left( \varepsilon(s_i)^{8[(n-1)/2]} \right) < \infty$ for all $i = 1, \ldots, n$ and let $\varepsilon(s_1), \ldots, \varepsilon(s_n)$ be independent. It then holds for each $k \in \{1, \ldots, n\}$ that

$$E(Y(s_k)^2 \mid Y(s_j), j = 1, \ldots, k - 1) = h_k.$$

In principle, it is not necessary that the matrix $\mathbf{W}$ of spatial weights be a triangular matrix, but there should exist a permutation matrix $\mathbf{P}$ such that $\ddot{\mathbf{W}} = \mathbf{P} \mathbf{W} \mathbf{P}'$ is triangular. In this case, the observations also must be permuted; i.e., the permuted vector of observations is $\ddot{\boldsymbol{Y}} = \mathbf{P} \boldsymbol{Y}$. Furthermore, one may see that

$$E(Y(s_k)^2 \mid Y(s_j), j = k + 1, \ldots, n) = h_k$$

if the weighting matrix $\mathbf{W}$ is a strictly upper triangular matrix. In the following section, we take a closer look at two different specifications of the weighting matrix $\mathbf{W}$.
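For a strictly lower triangular $\mathbf{W}$, as in Lemma 1 and Theorem 5, $h(s_i)$ involves only $Y(s_1), \ldots, Y(s_{i-1})$, so a realization can be generated recursively and no bound on the support of the errors is needed. The following Python sketch illustrates this recursion; the function name `simulate_oriented` and the use of standard normal errors in the usage lines are assumptions made for illustration.

```python
import math
import random

def simulate_oriented(alpha, W, eps):
    """Recursive construction of (y, h) for a strictly lower triangular W.

    h_i uses only y_1, ..., y_{i-1}; entries of W on or above the
    main diagonal are ignored, which enforces the triangular structure.
    """
    y, h = [], []
    for i, e in enumerate(eps):
        h_i = alpha[i] + sum(W[i][v] * y[v] ** 2 for v in range(i))
        h.append(h_i)
        y.append(math.sqrt(h_i) * e)
    return y, h

# Usage with standard normal errors (unbounded support is unproblematic here).
random.seed(1)
W = [[0.0, 0.0, 0.0],
     [0.5, 0.0, 0.0],
     [0.2, 0.3, 0.0]]
eps = [random.gauss(0.0, 1.0) for _ in range(3)]
y, h = simulate_oriented([1.0, 1.0, 1.0], W, eps)
```

Because $\varepsilon(s_k)$ is independent of $Y(s_1), \ldots, Y(s_{k-1})$ in this construction and has unit variance, the recursion also makes the statement of Theorem 5 plausible: conditional on the preceding observations, $Y(s_k)^2 = h_k \varepsilon(s_k)^2$ has expectation $h_k$.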
2.2 Choice of the Weighting Matrix W

In this section, we suggest two different specifications of the matrix $\mathbf{W}$ of spatial weights to adapt the process to various situations. In particular, the second matrix is a triangular matrix; i.e., for this specification, the support of $\boldsymbol{\varepsilon}$ need not be bounded, and Theorem 5 can be applied.

First, we present a possible method to model more than one lag in space. Assume that the set

$$\zeta(\delta, s_i) = \{ j : \|s_i - s_j\| \in (\delta - c, \delta] \}$$

consists of all locations $j$ for which the distance from location $s_i$ is between $\delta - c$ and $\delta$. The distance is measured by some predefined metric $\|a - b\|$ on the considered space induced by an arbitrary norm $\|\cdot\|$. The spatial lag constant $c$ is equivalent to the time period of one lag in the temporal setting, which could be one day, week, year, et cetera. In the spatial setting, the constant $c$ must be chosen according to specific requirements of the process, e.g., 1 µm to 1 mm (microbiology), 1 cm to 1 m (materials science) or 1 km to 100 km (macroeconomics). Finally, the weighting matrix $\mathbf{W}$ is based on an arbitrarily chosen matrix $\tilde{\mathbf{W}}$ fulfilling the assumptions introduced above, such as the binary contiguity matrix, nearest-neighbor matrix, or inverse-distance matrix (cf. Elhorst 2010). The elements of $\mathbf{W}$ can be specified as

$$w_{ij} = \begin{cases} \tilde{w}_{ij} \sum_{k=1}^{p} \rho_k \mathbb{1}_{\zeta(kc, s_i)}(j) & \text{for } i \neq j \\ 0 & \text{for } i = j \end{cases} \qquad \forall\, i, j = 1, \ldots, n, \tag{9}$$

where $\mathbb{1}_A$ is the indicator function on the set $A$. Hence, two locations $s_i$ and $s_j$ are weighted by $\rho_1 \tilde{w}_{ij}$ if they are first lag neighbors; i.e., the distance between $s_i$ and $s_j$ lies between zero and $c$. Moreover, these two locations are weighted by $\rho_2 \tilde{w}_{ij}$ if the distance is between $c$ and $2c$. In this manner, as many as $p \in \{1, 2, \ldots, \lceil c^{-1} \max_{ij} \|s_i - s_j\| \rceil\}$ spatial lags can be included in the process.
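The construction in (9) can be sketched as follows. This Python fragment is an illustration only: the function name `sparch_p_weights` is hypothetical, the Euclidean norm is assumed for the distance, and the lag index is obtained as $k = \lceil \|s_i - s_j\| / c \rceil$, which is equivalent to $\|s_i - s_j\| \in ((k-1)c, kc]$ up to floating-point rounding at the ring boundaries.

```python
import math

def sparch_p_weights(coords, W_tilde, rho, c):
    """Weights w_ij = w~_ij * rho_k if s_j lies in the k-th distance ring around s_i.

    coords  : list of coordinate tuples (all of the same dimension)
    W_tilde : known n x n base weighting matrix
    rho     : [rho_1, ..., rho_p], one parameter per spatial lag
    c       : spatial lag constant (ring width)
    """
    n, p = len(coords), len(rho)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue                      # zero main diagonal
            d = math.dist(coords[i], coords[j])
            k = math.ceil(d / c)              # d in ((k-1)c, kc]  <=>  lag k
            if 1 <= k <= p:
                W[i][j] = rho[k - 1] * W_tilde[i][j]
    return W
```

Pairs of locations farther apart than $pc$, as well as distinct locations with distance zero, receive weight zero, in line with the definition of $\zeta(kc, s_i)$.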
We refer to this specification of $\mathbf{W}$ as the spatial ARCH process of order $p$ (spARCH($p$)). Because the matrix $\tilde{\mathbf{W}}$ is assumed to be known, it remains to estimate only $p$ spatial autoregressive parameters $\rho_1, \ldots, \rho_p$.

Second, an example to model processes with some direction is presented. For instance, oriented processes could spread from some center/origin into every direction of the considered space (e.g., epidemiology or disease mapping), or the process could evolve in one direction, e.g., from north to south (e.g., ocean currents or wind speed). In particular, we focus on the first case of an oriented process. Therefore, assume that there is some known origin $s_0$ of the spatial process. It is worth noting that the origin could also be estimated. Without loss of generality, one can order the locations $s_1, \ldots, s_n$ with respect to the distance from the center. Thus,

$$0 < \|s_1 - s_0\| \leq \|s_2 - s_0\| \leq \ldots \leq \|s_n - s_0\|.$$

Assuming additionally that each location is influenced only by the locations closer to the center leads to an upper triangular representation of $\mathbf{W}$; i.e.,

$$w_{ij} = \begin{cases} \tilde{w}_{ij} & \text{for } \|s_i - s_0\| < \|s_j - s_0\| \\ 0 & \text{otherwise} \end{cases} = \begin{cases} \tilde{w}_{ij} & \text{for } i < j \\ 0 & \text{for } i \geq j. \end{cases} \tag{10}$$

Both examples of $\mathbf{W}$ are illustrated in Figure 3. First, the proposed spARCH($p$) is illustrated in the left-hand figure a) for $p = 5$. The positive weights of the $i$-th row of $\mathbf{W}$ are drawn with filled circles; i.e., the variances of the observations at all locations drawn with filled circles influence the variance of the observation at location $i$, colored in red. The spatial lag constant $c$ is assumed to be 1, and the distance between the locations is measured using the Euclidean norm. Second, we illustrate the proposed oriented process on the right-hand side of Figure 3.
The point of origin $s_0$ is drawn as a star, such that one may see that only locations closer to $s_0$ have an influence on the location $i$. Regarding both cases a) and b), we choose $\tilde{\mathbf{W}}$ as the $q$-nearest-neighbor matrix, where $q = 50$.

Figure 3: Representation of the positive elements in $\mathbf{W}$ (colored in grey, filled dots) for some location $i$ (colored in red) regarding a) a spARCH(5) process and b) an oriented process with center $s_0$. All locations that influence the red location $i$ are colored in grey, whereas all other locations having no influence on $i$ are drawn as empty circles. Matrix $\tilde{\mathbf{W}}$ is chosen as the binary matrix of the 50 nearest neighbors. Moreover, the 200 locations $s = (s_1, s_2)$ result from a continuous process in two-dimensional space. (Both panels: axes $s_1$ and $s_2$, each ranging from −10 to 10.)

Finally, we provide the link to classical heteroscedastic time-series models and other propositions of spatial ARCH models in Table 1. In particular, we show how the parameters and the spatial weighting matrix must be chosen to transfer the introduced model to the classical ARCH($p$) process proposed by Engle (1982). It is worth noting that the support of the error distribution does not have to be bounded because $\mathbf{W}$ is triangular.

Table 1: Summary of several covered settings.

| Model | $q$ | $D_s$ | $\mathbf{W}$ | triangular |
|---|---|---|---|---|
| time-series models | | | | |
| ARCH(1), Engle (1982) | 1 | $\mathbb{Z}$ | $\left( \alpha_1 \mathbb{1}_{\{s_i - s_j = 1\}} \right)_{i,j=1,\ldots,n}$ | ✓ |
| ARCH($p$), Engle (1982) | 1 | $\mathbb{Z}$ | $\left( \sum_{k=1}^{p} \alpha_k \mathbb{1}_{\{s_i - s_j = k\}} \right)_{i,j=1,\ldots,n}$ | ✓ |
| spatiotemporal models | | | | |
| spatial ARCH, Borovkova & Lopuhaa (2012) | 1 | $\mathbb{Z}$ | $\left( (a_{1,i} + a_{2,i} w_{ij}) \mathbb{1}_{\{s_i - s_j = 1\}} \right)_{i,j=1,\ldots,n}$ | ✓ |
| spatial models | | | | |
| SARCH(1), Bera & Simlai (2004) | 2, 3 | $\mathbb{Z}^q$, $\mathbb{R}^q$ | $\left( \alpha_1 w_{ij}^2 \right)_{i,j=1,\ldots,n}$ | |
| new propositions (multidimensional) | | | | |
| spARCH($p$) | $\geq 1$ | $\mathbb{Z}^q$, $\mathbb{R}^q$ | cf. eq. (9) | |
| oriented | $\geq 1$ | $\mathbb{Z}^q$, $\mathbb{R}^q$ | cf. eq. (10) | ✓ |

3 Statistical Inference

So far, the weighting matrix $\mathbf{W}$ has mostly been chosen to be an arbitrary matrix with nonnegative elements and zeros on the main diagonal. To ensure that $h(s_i)$ and $Y(s_i)^2$ are nonnegative, the weights must fulfill an additional condition as shown, e.g., in Theorem 2 and Example 2. These conditions connect the weights with the support of $\varepsilon(s_i)$. In applications, the weighting matrix $\mathbf{W}$ may depend on additional parameters as discussed earlier. First, we consider the model

$$h(s_i) = \alpha + \rho \sum_{v=1}^{i-1} \tilde{w}_{iv} Y(s_v)^2, \qquad i = 1, \ldots, n,$$

with $\tilde{w}_{iv} \geq 0$ for $i, v = 1, \ldots, n$ and $\tilde{w}_{iv} = 0$ for $1 \leq i \leq v \leq n$. Thus, $\mathbf{W}$ is chosen as a lower triangular matrix. It is assumed that $\alpha > 0$ and $\rho > 0$. Suppose that $\varepsilon(s_1), \ldots$
, ε ( s n ) are indep enden t and iden tically distributed. Let f ε denote its densit y function and let f ε b e diﬀeren tiable. Moreov er, let y = ( y 1 , . . . , y n ) 0 b e the v ector of observ ations and h i = h ( s i ; y ). Using (7), the density of Y is given b y f Y ( y ) = n Y i =1  f ε  y i √ h i  1 √ h i  = n Y i =1 f Y ( s i ) | Y ( s i − 1 ) ,...,Y ( s 1 ) ( y i | y i − 1 , . . . , y 1 ) and log( f Y ( y )) = n X i =1  log  f ε  y i √ h i  − 1 2 log( h i )  . Let ˜ f = f 0 ε /f ε . Putting the partial deriv atives of log ( f Y ( y ; α, ρ )) with resp ect to α and ρ equal to zero, we obtain the estimators ˆ α and ˆ ρ that satisfy n X i =1 1 ˆ α + ˆ ρA i = − n X i =1 y i ( ˆ α + ˆ ρA i ) 3 / 2 ˜ f  y i √ ˆ α + ˆ ρA i  , (11) n X i =2 A i ˆ α + ˆ ρA i = − n X i =1 A i y i ( ˆ α + ˆ ρA i ) 3 / 2 ˜ f  y i √ ˆ α + ˆ ρA i  (12) with A i = P i − 1 v =1 ˜ w iv y 2 v for i = 1 , . . . , n . If the corresp onding information matrix B n is p ositiv e deﬁnite, then the results of Crowder (1976) can b e applied. It follo ws that there is a unique solution of (11) and (12). The estimators ˆ α and ˆ ρ are consisten t, and ( ˆ α, ˆ ρ ) is approximately distributed as N 2 ( 0 , B − 1 n ). This result can be used for testing the h yp otheses on the parameters α and ρ . F or instance, assuming f ε to b e the standard normal distribution. Then, it follo ws that ˜ f ( x ) = − x , and the information matrix is given b y B n = − E      − 1 2 n P i =1 1 ( α + ρA i ) 2 + n P i =1 y 2 i ( α + ρA i ) 3 − 1 2 n P i =1 A i ( α + ρA i ) 2 + n P i =1 A i y 2 i ( α + ρA i ) 3 − 1 2 n P i =1 A i ( α + ρA i ) 2 + n P i =1 A i y 2 i ( α + ρA i ) 3 − 1 2 n P i =1 A 2 i ( α + ρA i ) 2 + n P i =1 A 2 i y 2 i ( α + ρA i ) 3      . These results can b e easily extended to more general mo dels, such as the approach de- scrib ed in (9). 
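As an illustration of the Gaussian case, the following minimal sketch evaluates the log-likelihood and its score for the lower triangular model and simulates data recursively. It is not the authors' implementation; the banded weighting matrix `banded_lower`, the sample size, and the parameter values are assumptions chosen only for the example. With $\tilde{f}(x) = -x$, a zero of `score` corresponds to the conditions (11) and (12).

```python
import numpy as np

def banded_lower(n, k):
    """Hypothetical W-tilde: strictly lower triangular and row-standardized,
    with equal weights on the k preceding locations."""
    W = np.zeros((n, n))
    for i in range(1, n):
        lo = max(0, i - k)
        W[i, lo:i] = 1.0 / (i - lo)
    return W

def simulate(alpha, rho, W_tilde, rng):
    """Draw Y recursively; h_i depends only on Y(s_1), ..., Y(s_{i-1})."""
    n = W_tilde.shape[0]
    eps = rng.standard_normal(n)
    y = np.zeros(n)
    for i in range(n):
        h_i = alpha + rho * (W_tilde[i, :i] @ y[:i] ** 2)
        y[i] = np.sqrt(h_i) * eps[i]
    return y

def loglik(alpha, rho, y, W_tilde):
    """Gaussian log-likelihood: sum of log f_eps(y_i / sqrt(h_i)) - log(h_i) / 2."""
    A = W_tilde @ y ** 2                  # A_i = sum_{v<i} w_iv * y_v^2
    h = alpha + rho * A
    return np.sum(-0.5 * np.log(2.0 * np.pi) - 0.5 * y ** 2 / h - 0.5 * np.log(h))

def score(alpha, rho, y, W_tilde):
    """Partial derivatives of loglik with respect to alpha and rho."""
    A = W_tilde @ y ** 2
    h = alpha + rho * A
    common = 0.5 * y ** 2 / h ** 2 - 0.5 / h
    return np.array([np.sum(common), np.sum(A * common)])

rng = np.random.default_rng(1)
W_tilde = banded_lower(400, 4)
y = simulate(1.0, 0.5, W_tilde, rng)
print(score(1.0, 0.5, y, W_tilde))        # score at the true parameter values
```

Passing `loglik` (with the sign flipped) to any numerical optimizer under the constraints $\alpha > 0$ and $\rho > 0$ would then give the maximum-likelihood estimates; the optimizer itself is left out of the sketch.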
Moreover, we have focused on lower triangular matrices so far, but all of the results presented above also hold for upper triangular matrices.

Next, we consider a model in which the weighting matrix is neither a lower nor an upper triangular matrix. Let
\[
h(s_i) = \alpha + \rho \sum_{v=1}^{n} \tilde{w}_{iv} Y(s_v)^2 , \quad i = 1, \ldots, n ,
\]
with $\tilde{w}_{iv} \ge 0$ for $i, v = 1, \ldots, n$ and $\tilde{w}_{ii} = 0$ for $1 \le i \le n$. It is assumed that $\alpha > 0$ and $\rho > 0$. For these settings, the determinant
\[
\left| \det \left( \left( \frac{\partial \, y_j / \sqrt{h_j}}{\partial y_i} \right)_{i,j=1,\ldots,n} \right) \right|
\]
must be computed. Because $\partial h_j / \partial y_i = 2 \rho \tilde{w}_{ji} y_i$, the entries of this matrix are $\delta_{ij} h_j^{-1/2} - \rho \tilde{w}_{ji} y_i y_j h_j^{-3/2}$, where $\delta_{ij}$ equals 1 for $i = j$ and 0 otherwise. For practical applications, it is much easier to compute the logarithm of the determinant; i.e.,
\[
\log \left| \det \left( \left( \frac{\partial \, y_j / \sqrt{h_j}}{\partial y_i} \right)_{i,j=1,\ldots,n} \right) \right| = \sum_{i=1}^{n} \left( 2 \log |y_i| - \frac{3}{2} \log h_i \right) + \sum_{i=1}^{n} \log |\lambda_i| ,
\]
where $\lambda_i$ is the $i$-th eigenvalue of
\[
\operatorname{diag}\left( \frac{h_1}{y_1^2}, \ldots, \frac{h_n}{y_n^2} \right) - \rho \tilde{W}' .
\]
In addition, it is important to note that the weighting matrix is usually sparse, and there are positive weights only up to the $k$-th subdiagonal, where $k = \max\{ |i - j| : w_{ij} > 0 \}$. If the locations are well ordered (e.g., by the distance to an arbitrarily chosen location), $k$ is much smaller than $n$.

4 Applications

In the following section, the focus is on applications of the suggested spatial ARCH model. In particular, we extend the well-known spatial autoregressive process by assuming conditional heteroscedastic residuals. Finally, the parameters of such a model are estimated for a real data example. In the ensuing Section 5, we analyze the performance of the estimators in more detail by reporting the results of an extensive simulation study.

4.1 Spatial Autoregressive Process with Conditional Heteroscedastic Residuals: SARspARCH

For the definition of the spatial autoregressive process, we must introduce a further matrix $B$ of spatial weights. This matrix $B$ could differ from the aforementioned weighting matrix $W$.
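The log-determinant above can be checked numerically. The sketch below is an illustration under stated assumptions rather than the authors' code: it assumes $W = \rho \tilde{W}$, uses a hypothetical ring-shaped weighting matrix (`ring_weights`), and writes the Jacobian entries directly as $\delta_{ij} h_j^{-1/2} - \rho \tilde{w}_{ji} y_i y_j h_j^{-3/2}$. It then compares a direct evaluation of the log-determinant with the eigenvalue form, in which the relevant matrix is $\operatorname{diag}(h_i / y_i^2) - \rho \tilde{W}'$.

```python
import numpy as np

def ring_weights(n):
    """Hypothetical non-triangular W-tilde: each location has two neighbours on a ring."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W

def jacobian(y, alpha, rho, W_tilde):
    """J[i, j] = d(y_j / sqrt(h_j)) / d y_i with h = alpha + rho * W_tilde @ y**2."""
    h = alpha + rho * W_tilde @ y ** 2
    return np.diag(h ** -0.5) - rho * W_tilde.T * np.outer(y, y * h ** -1.5)

def logdet_direct(y, alpha, rho, W_tilde):
    """log|det J| computed from the full Jacobian."""
    return np.linalg.slogdet(jacobian(y, alpha, rho, W_tilde))[1]

def logdet_eigen(y, alpha, rho, W_tilde):
    """log|det J| from the eigenvalues of diag(h / y^2) - rho * W_tilde'."""
    h = alpha + rho * W_tilde @ y ** 2
    lam = np.linalg.eigvals(np.diag(h / y ** 2) - rho * W_tilde.T)
    return np.sum(2.0 * np.log(np.abs(y)) - 1.5 * np.log(h)) + np.sum(np.log(np.abs(lam)))

rng = np.random.default_rng(0)
W_tilde = ring_weights(25)
y = rng.standard_normal(25)
print(logdet_direct(y, 1.0, 0.4, W_tilde), logdet_eigen(y, 1.0, 0.4, W_tilde))
```

For a sparse, banded $\tilde{W}$, a production implementation would presumably exploit the band structure instead of forming dense matrices; the dense version is kept here only for readability.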
However, it is also assumed that $B$ is non-stochastic and nonnegative with zeros on the main diagonal. Furthermore, let $\lambda$ denote the spatial autoregressive coefficient and $\mu$ be the mean parameter. The model is then defined as follows:
\[
Y = \mu 1 + \lambda B Y + \xi , \quad \text{i.e.,} \quad Y = (I - \lambda B)^{-1} (\mu 1 + \xi) . \tag{13}
\]
The vector of disturbances $\xi = (\xi_1, \ldots, \xi_n)'$ follows a spatial ARCH process according to the suggested model in (1). Consequently, the error process is given by
\[
\xi = \operatorname{diag}(h)^{1/2} \varepsilon \quad \text{and} \quad h = \alpha + W \operatorname{diag}(\xi) \xi . \tag{14}
\]

In Figure 4, we plot four different simulated spatial models to illustrate the behavior of these processes and to compare their properties. Moreover, the respective spatial autocorrelation functions (ACF) are shown in Figure 5. For the simulation, the spatial domain is assumed to be a lattice; i.e., $D_s = \{ (i, j) \in \mathbb{Z}^2 : i, j = 1, \ldots, d \}$. In plot (a), the innovations $\varepsilon$ truncated on the interval $[-a, a]$ are shown. The respective bound $a$ results from the choice of the weighting matrix $W$. In particular, this matrix is assumed to be the product of the parameter $\rho$ and a known weighting matrix $\tilde{W}$; i.e., $W = \rho \tilde{W}$. This setting was discussed in Section 2.2 as a spARCH(1) model, where $c$ equals 1, and the considered metric is induced by the maximum norm. Moreover, the known matrix $\tilde{W}$ of spatial weights is a classical row-standardized Rook's contiguity matrix, and $\alpha = \alpha_0 1$. In the next plot (b) of Figure 4, the spatial autoregressive process with white noise $\varepsilon$ is plotted. The simulation shows the classical behavior of a spatial autoregressive process; i.e., one can observe clusters of high and low values. The proposed spARCH model and SARspARCH model are presented in the second row of Figure 4. On the left-hand side, the simulation of the spatial ARCH process $\xi$ is shown. Obviously, that process differs from the white noise process in (a).
The clusters of high and low variance are characterized by the luminance of the colors. Thus, the variance is low in areas where the observations have a light color, and the variance is high in areas of deep colors. This is supported by the ACF in Figure 5, where the squared observations are positively correlated. The simulation of the SARspARCH process according to (13) and (14) yields the last image (d) in Figure 4.

Finally, we briefly discuss the spatial autocorrelation function of the spatial ARCH process. In Figure 6, we plot a simulation of an oriented spatial ARCH process; i.e., the weighting matrix is triangular. Moreover, the spatial autocorrelation function is plotted for the observations $Y$ and the squared observations $Y^{(2)}$. More precisely, the autocorrelation function reports Moran's $I$ for different spatial lags; i.e., the first-order spatial lag consists of the directly neighboring locations, the second-order lag consists of all neighbors of these first-lag neighbors, and so forth. As expected, the observations $Y$ are not spatially autocorrelated, whereas the squared observations $Y^{(2)}$ exhibit a positive autocorrelation, which decreases with increasing order of the spatial lag.

[Figure 4, panels (a) to (d): simulated processes on a $50 \times 50$ lattice with axes $s_1$ and $s_2$.]

Figure 4: Simulated spatial white noise process (a) truncated on $[-a, a]$, spatial autoregressive process (b), spatial ARCH process (c), spatial autoregressive process with spatial ARCH errors (d), where $d = 50$, $\lambda = 0.8$, $\rho = 0.5$ ($a = 1.334$), $\mu = 0$, $\alpha_0 = 0.1$ and $\sigma_\varepsilon^2 = 1$.

[Figure 5, panels (a) to (d): Moran's $I$ plotted against the spatial lags 1 to 10 for $Y$ and $Y^{(2)}$.]

Figure 5: Spatial autocorrelation function of the simulated spatial white noise process (a) truncated on $[-a, a]$, spatial autoregressive process (b), spatial ARCH process (c), spatial autoregressive process with spatial ARCH errors (d) plotted in Figure 4.

4.2 Real Data Example: Cancer Mortality Rates

In this section, we illustrate the proposed process using an empirical example. To this end, we analyze the 5-year average mortality (2008–2012) caused by cancer of the lungs or bronchus provided by the Centers for Disease Control and Prevention (U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute (2015)). The death rates are age-adjusted to the 2000 U.S. standard population (cf. CDC (2015)). The spatial domain comprises all U.S. counties excluding Alaska and Hawaii, i.e., 3108 counties. Moreover, we do not distinguish in terms of race, sex, and age.

In Figure 7, we show the mortality for lung cancer and the main covariates: particulate matter PM$_{2.5}$, the percentage of smokers in 2012, and the personal income per capita. In addition to these regressors, we include the amounts of nitrogen dioxide (NO$_2$), sulfur dioxide (SO$_2$), particulate matter PM$_{10}$, carbon monoxide (CO), and ozone (O$_3$) as regressors. Many studies have demonstrated that particulate matter is carcinogenic (cf. Raaschou-Nielsen et al. (2013), Cohen & Pope (1995)). Conversely, there is no association between traffic intensity, which results in a high amount of nitrogen dioxide, and the risk of lung cancer, as Raaschou-Nielsen et al. (2013) noted.

All environmental data are annual averages (2012) recorded at the ground level by the United States Environmental Protection Agency (EPA). The measurement stations are plotted in the respective maps in Figure 7. Moreover, the data used as regressors are computed by spatial interpolation, in particular, inverse-distance-based kriging. Finally, we include covariates describing the health and economic status in each county, namely, the percentage of smokers in 2012 and the personal income per capita recorded by the CDC (Chronic Disease and Health Promotion Data & Indicators) and the U.S. Department of Commerce, Bureau of Economic Analysis, respectively. The environmental covariates and the percentage of smokers are included in our analysis because they are the main drivers that cause a higher risk of lung cancer. Moreover, we include personal income to adjust for possible effects, such as better access to health care, early diagnosis/recognition, and screening.

The model given by (13) and (14) is estimated using the maximum likelihood approach.

[Figure 6: simulated process on a $70 \times 70$ lattice with axes $s_1$ and $s_2$, and Moran's $I$ plotted against the spatial lags 1 to 10 for $Y$ and $Y^{(2)}$.]

Figure 6: Simulated oriented spatial ARCH process in the two- and three-dimensional view (above) and the spatial autocorrelation function of the simulated observations and squared observations (below).

Figure 7: Mortality caused by cancer of the lungs or bronchus in U.S. counties (above left) and main covariates: annual average of PM$_{2.5}$ in 2012 (above right), percentage of smokers in 2012 (below left), and personal income per capita in thousand U.S. dollars (below right).
The measurement stations of the covariates PM$_{2.5}$ and the percentage of smokers are indicated on the maps via empty circles.

To include the covariates, the intercept $\mu 1$ is replaced by $X \beta$, where $X$ is the matrix of regressors with the first column $1$. Because of the specific setting, we additionally incorporate two matrices of spatial weights for the autoregressive part; i.e., the model equation is given by
\[
Y = X \beta + (\lambda_1 B_1 + \lambda_2 B_2) Y + \xi , \qquad \xi = \operatorname{diag}(h)^{1/2} \varepsilon \quad \text{with} \quad h = \alpha 1 + \rho \tilde{W} \operatorname{diag}(\xi) \xi .
\]
In particular, we estimate the spatial autoregressive part of the model using the well-known quasi-maximum-likelihood estimator with Gaussian errors $\xi$ (e.g., Lee 2004). Moreover, the spARCH parameters are included in the logarithmic likelihood function (cf. Section 3); i.e., all parameters are estimated in one step. The spatial weighting matrix of the spARCH process is chosen to be a non-triangular matrix $W = \rho \tilde{W}$. In particular, $\tilde{W}$ is defined as the row-standardized Queen's contiguity matrix for all spatial lags up to order 5; i.e.,
\[
\tilde{W} = \operatorname{diag}\left( \left( \sum_{k=1}^{5} B_k \right) 1_n \right)^{-1} \left( \sum_{k=1}^{5} B_k \right)
\]
with $B_k$ denoting the row-standardized binary contiguity matrix of the $k$-th-lag neighbors. Thus, the weighting matrix $B_1$ is a classical row-standardized Queen's contiguity matrix of the first-lag neighbors, and matrix $B_2$ is the row-standardized contiguity matrix of the second-lag neighbors.

In Table 2, we summarize the results of three models: a simple linear regression model, the SAR model, and the SARspARCH model. Moreover, we report Moran's $I$ statistics and the $p$-values for testing the null hypothesis of the absence of spatial autocorrelation. All variables are log-transformed; thus, the estimates must be interpreted in elasticity terms, keeping the positive spatial correlation in mind (cf. LeSage 2008). We selected the regressors by minimizing the Akaike information criterion.
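The construction of $\tilde{W}$ from the lag matrices $B_1, \ldots, B_5$ can be sketched as follows. The county data are not reproduced here, so the example assumes a regular $12 \times 12$ lattice as a stand-in for the county map; the helper names (`queen_adjacency`, `lag_matrices`, `row_standardize`) are illustrative and not taken from the paper. The $k$-th-lag neighbors are interpreted as the locations at graph distance exactly $k$ under first-order Queen's contiguity.

```python
import numpy as np

def queen_adjacency(d):
    """Binary first-order Queen's contiguity matrix on a d x d lattice."""
    n = d * d
    C = np.zeros((n, n))
    for r in range(d):
        for c in range(d):
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr, cc = r + dr, c + dc
                    if (dr or dc) and 0 <= rr < d and 0 <= cc < d:
                        C[r * d + c, rr * d + cc] = 1.0
    return C

def lag_matrices(C, max_lag):
    """Binary matrices of the k-th-lag neighbours (graph distance exactly k)."""
    n = C.shape[0]
    reached = np.eye(n, dtype=bool)       # locations already assigned to a lower lag
    frontier = np.eye(n, dtype=bool)      # neighbours found at the previous lag
    lags = []
    for _ in range(max_lag):
        nxt = (frontier.astype(float) @ C > 0) & ~reached
        lags.append(nxt.astype(float))
        reached |= nxt
        frontier = nxt
    return lags

def row_standardize(M):
    """Divide each row by its sum; rows without neighbours stay zero."""
    s = M.sum(axis=1, keepdims=True)
    return np.divide(M, s, out=np.zeros_like(M), where=s > 0)

C = queen_adjacency(12)
B = [row_standardize(L) for L in lag_matrices(C, 5)]   # B_1, ..., B_5
S = sum(B)
W_tilde = row_standardize(S)       # diag(S 1_n)^{-1} S
print(W_tilde.shape, W_tilde.sum(axis=1)[:3])
```

On this lattice every location has neighbors at all five lags, so each row of $\sum_k B_k$ sums to 5 and $\tilde{W}$ is simply that sum divided by 5; for irregular maps the row sums can differ, which is why the sketch standardizes explicitly.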
It is unsurprising that the covariate describing the behavioral aspect, namely, the percentage of smokers, has a large, positive impact on the mortality caused by lung cancer. Moreover, regarding the environmental covariates, we observe only positive effects of the amount of nitrogen dioxide and PM$_{2.5}$. However, it is important to distinguish between cancer incidence and cancer mortality. Hence, it is not surprising that we found different effects in terms of cancer mortality compared with the results of Raaschou-Nielsen et al. (2013) and Cohen & Pope (1995).

In all, it is interesting to compare the results of the linear regression model and the models that account for spatial dependence. All estimated parameters of the regression model are larger in absolute value than the estimated coefficients of the SAR model. For the SARspARCH model, the coefficients are again smaller in absolute terms (e.g., percentage of smokers and all environmental effects), and several coefficients are omitted due to the Akaike information criterion (e.g., nitrogen dioxide, ozone). Hence, the spatial autocorrelation of the residual variance also affects the estimated coefficients and, therefore, the interpretation of the impact of the regressors. Thus, it would be interesting to analyze the impact of spatial heteroscedasticity on the estimated coefficients of an SAR model in more detail in future studies.

Moreover, the spatial autocorrelation of the residuals and the squared residuals is worth noting. Whereas Moran's $I$ of the residuals does not differ significantly from zero for both the SAR and SARspARCH models, the squared residuals are positively correlated for the SAR model. Consequently, the residual variance exhibits spatial clusters, and the residuals cannot result from a spatial white noise process.
However, by applying the proposed spARCH model to the residuals, it is possible to remove the spatial autocorrelation of the squared residuals. For the SARspARCH model, neither the residuals nor the squared residuals are correlated.

Table 2: Estimated coefficients and summary statistics of a simple regression model as a benchmark and of the SAR and SARspARCH models for the mortality caused by lung cancer.

                        Linear Regression                 SAR                               SARspARCH
                        Estimate   Std. Error p-Value     Estimate   Std. Error p-Value     Estimate   Std. Error p-Value
Intercept               -10.8575   0.9964     0.0000      -4.3157    0.9347     0.0000      -0.0059    0.1629     0.9712
Environmental Covariates
  PM_10                 -1.1734    0.0874     0.0000      -0.3584    0.0844     0.0000      -0.2641    0.0397     0.0000
  PM_2.5                2.1193     0.1427     0.0000      0.6162     0.1402     0.0000      0.5365     0.0632     0.0000
  SO_2                  0.1210     0.0422     0.0042      -          -          -           -          -          -
  NO_2                  0.6217     0.0799     0.0000      0.2731     0.0732     0.0002      -          -          -
  O_3                   -2.4133    0.2489     0.0000      -0.8082    0.2251     0.0003      -          -          -
  CO                    -0.4041    0.1151     0.0005      -0.1759    0.1026     0.0863      -          -          -
Behavioral Covariates
  Tobacco Use           1.2859     0.1677     0.0000      0.6090     0.1381     0.0000      0.3188     0.0665     0.0000
Economic Covariates
  Personal Income       -          -          -           -          -          -           -          -          -
Spatial Coefficients
  lambda_1                                                0.2449     0.0262     0.0000      0.2624     0.0278     0.0000
  lambda_2                                                0.4431     0.0327     0.0000      0.3888     0.0400     0.0000
  sigma^2_xi                                              0.4628     0.0119     0.0000
  alpha                                                                                     0.0601     0.0015     0.0000
  rho                                                                                       0.6680     0.0161     0.0000
Summary Statistics
  Moran's I, xi         0.2203     0.0106     0.0000      -0.0114    0.0106     0.2966
  Moran's I, xi^(2)     0.3331     0.0106     0.0000      0.3212     0.0106     0.0000
  Moran's I, eps                                                                            0.0075     0.0106     0.4565
  Moran's I, eps^(2)                                                                        0.0067     0.0106     0.4852
  AIC                   7091.062                          6560.509                          2484.686

5 Simulation Studies

The following section focuses on insights that we gained via extensive Monte Carlo simulation studies. Initially, we analyze the impact of the bounded support of the error distribution for the case of a non-triangular weighting matrix.
Furthermore, we demonstrate how the parameters of the suggested spatial ARCH model can be estimated and illustrate the behavior of the estimators for finite samples. For all Monte Carlo simulations, we simulated the process as a two-dimensional lattice process; i.e., $D_s = \{ s = (s_1, s_2)' \in \mathbb{Z}^2 : 0 \le s_1, s_2 \le d \}$. Hence, the number of observations $n$ is equal to $d^2$. Moreover, all simulations are performed for $10^5$ replications.

For the first simulation study, we use a common row-standardized Rook contiguity matrix. Consequently, the weighting matrix $\tilde{W}$ is set equal to the row-standardized Rook contiguity matrix $R_1 = (r_{1,ij})_{i,j=1,\ldots,n}$, with
\[
r_{1,ij} =
\begin{cases}
1 & \text{if } \|s_i - s_j\|_1 = 1 \\
0 & \text{otherwise}
\end{cases}
.
\]
Furthermore, we include the parameter $\rho$ such that the weighting matrix is given by $W = \rho \tilde{W}$. Hence, the matrix $W$ is not triangular; thus, the support of the error distribution must be compact. Therefore, the residuals are simulated from a standard normal distribution truncated on the interval $[-a, a]$. The parameter $\alpha$ is chosen to be $5 \cdot 1_n$. Finally, we simulated the process for different values of $\rho$ and calculated Moran's $I$ statistic of the squared observations to measure the extent of the spatial autocorrelation of the conditional variance (cf. Moran 1950).

In Figure 8, we plot Moran's $I$ and the resulting asymptotic 95% confidence intervals of $I$ for different values of $\rho$. Obviously, the support does not have to be constrained for $\rho = 0$. However, this support decreases with increasing values of $\rho$. If $\rho = 1$, the parameter $a$ is equal to 0.968. Moreover, we observe that the growth rate of $I$ decreases with increasing spatial weight. This trend can be explained by the compact support of the residuals. Because there cannot be innovations $\varepsilon(s_i)$ that are large in absolute terms, large spatial clusters of high or low variance cannot occur either.
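Moran's $I$ for a given weighting matrix can be computed in a few lines. The sketch below uses the standard definition $I = \frac{n}{\sum_{i,j} w_{ij}} \cdot \frac{z' W z}{z' z}$ with $z$ the centered observations and a row-standardized Rook contiguity matrix on a small lattice; the lattice size and the white-noise input are assumptions made for illustration, and the truncation of the innovations and the $10^5$ replications of the study are not reproduced.

```python
import numpy as np

def rook_weights(d):
    """Row-standardized Rook contiguity matrix on a d x d lattice (d >= 2)."""
    n = d * d
    W = np.zeros((n, n))
    for r in range(d):
        for c in range(d):
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < d and 0 <= cc < d:
                    W[r * d + c, rr * d + cc] = 1.0
    return W / W.sum(axis=1, keepdims=True)

def morans_i(x, W):
    """Moran's I of the values x under the spatial weights W."""
    z = x - x.mean()
    return len(x) / W.sum() * (z @ W @ z) / (z @ z)

d = 20
W = rook_weights(d)
rng = np.random.default_rng(0)
y = rng.standard_normal(d * d)            # spatial white noise as a placeholder process
print(morans_i(y, W), morans_i(y ** 2, W))
```

Applying `morans_i` to the squared observations of a simulated spARCH process, instead of white noise, gives the quantity that the simulation study tracks as a function of $\rho$.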
[Figure 8: Moran's $I$ (left axis) and the bound $a$ (right axis) plotted against $\rho$ from 0 to 2 for $Y$ and $Y^{(2)}$.]

Figure 8: Moran's $I$ of the observations $Y$ and the squared observations $Y^{(2)}$, including the asymptotic 95% confidence intervals of $I$ for $\rho \in \{0, 0.05, \ldots, 2\}$. Moreover, the resulting bound $a$ is plotted as a bold, black line.

Furthermore, we analyzed the performance of the proposed maximum-likelihood estimator in detail. For this simulation study, an oriented spatial ARCH process is considered; i.e., the weighting matrix $W$ is strictly triangular. We again specify the matrix $\tilde{W}$ as a row-standardized binary weighting matrix $R_2 = (r_{2,ij})_{i,j=1,\ldots,n}$ with
\[
r_{2,ij} =
\begin{cases}
1 & \text{if } \|s_i - s_j\|_2 \le \sqrt{2} \ \wedge \ \|s_i - s_0\|_2 < \|s_j - s_0\|_2 \\
0 & \text{otherwise}
\end{cases}
\]
and $s_0 = \left( \lfloor \tfrac{d}{2} \rfloor, \lfloor \tfrac{d}{2} \rfloor \right)'$. Consequently, any location $s_i$ is influenced by locations that lie within a distance of $\sqrt{2}$ from $s_i$ and that are closer to the origin $s_0$. The central location $s_0$ is chosen to be in the middle of the two-dimensional lattice $D_s$. To evaluate the performance of the estimators, we consider the simple spARCH(1) model with
\[
Y = \operatorname{diag}(h)^{1/2} \varepsilon , \qquad h = \alpha 1_n + \rho \tilde{W} \operatorname{diag}(Y) Y .
\]

In Figure 9, the performance of the estimators for both parameters $\alpha$ and $\rho$ is visualized using kernel density estimates. The process was simulated for $d \in \{10, 20, 50\}$ with $10^5$ replications. Moreover, we considered all combinations of the true parameters $\alpha \in \{0.5, 1, 2, 5\}$ and $\rho \in \{0, 0.2, 0.6, 0.9\}$; i.e., the simulation study was performed for 48 settings.

Regarding the first plot (Ia) in Figure 9, one might see that the true parameter $\alpha$ is slightly underestimated if $\rho = 0$, although the density of the estimated $\hat{\alpha}$ is sharper.
For increasing values of $\rho$, the densities become less sharp, but the estimates are unbiased. Furthermore, we analyzed the performance of the estimators for an increasing number of observations and fixed $\rho = 0.2$ (see plot (IIa)). Interestingly, the smaller values of $\alpha$ are estimated more precisely than larger values of $\alpha$. However, all estimators seem to be unbiased and consistent. For the estimator of $\rho$, the performance does not depend on the magnitude of the spatial autocorrelation in the variance (see plot (IIb)). All density estimates are equally shaped. However, one might observe that the estimator works poorly if the number of observations and the parameter $\rho$ are small. For $\rho = 0.2$ and $d = 10$, $\hat{\rho}$ is more often close to zero than to the correct value of 0.2. If either $\rho$ or $d$ increases, the bias vanishes. In all settings, the absence of dependence in the variance, i.e., $\rho = 0$, is estimated better than the presence of spatial clusters in the variance, i.e., $\rho > 0$. Moreover, the estimation of $\rho$ is independent of the coefficient $\alpha$. Please note that the curves for the different values of $\alpha$ in plot (Ib) are identical because the random seed was set to the number of replicates for each setting; i.e., the innovations $\varepsilon$ are identical for each setting.

[Figure 9, panels (Ia), (IIa), (Ib), (IIb): kernel densities of $\hat{\alpha}$ for $d = 20$ and $\rho \in \{0, 0.2, 0.9\}$ (Ia), of $\hat{\alpha}$ for $\rho = 0.2$ and $d \in \{10, 20, 50\}$ (IIa), of $\hat{\rho}$ for $d = 20$ and $\alpha \in \{0.5, 2, 5\}$ (Ib), and of $\hat{\rho}$ for $\alpha = 1$ and $d \in \{10, 20, 50\}$ (IIb).]

Figure 9: Kernel density estimates of (a) $\hat{\alpha}$ and (b) $\hat{\rho}$ for (I) a constant number of observations ($d = 20$) and (II) an increasing number of observations. The true values of the parameters are $\alpha \in \{0.5, 1, 2, 5\}$ and $\rho \in \{0, 0.2, 0.6, 0.9\}$ for the different settings.
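One replication of the oriented spARCH(1) simulation can be sketched as below. This is an illustration under several assumptions, not the code used for the study: the lattice is taken as $\{0, \ldots, d-1\}^2$ so that $n = d^2$, the locations are sorted by their distance to $s_0$, and the weights follow the verbal description (each location is driven by neighbors within distance $\sqrt{2}$ that are closer to $s_0$), which makes the reordered matrix strictly lower triangular. Reversing the inequality gives the mirrored, upper triangular ordering.

```python
import numpy as np

def oriented_weights(d):
    """Row-standardized oriented weights on a d x d lattice, ordered by distance to s_0."""
    coords = np.array([(s1, s2) for s1 in range(d) for s2 in range(d)], dtype=float)
    s0 = np.array([d // 2, d // 2], dtype=float)
    dist0 = np.linalg.norm(coords - s0, axis=1)
    order = np.argsort(dist0, kind="stable")          # closest location first
    coords, dist0 = coords[order], dist0[order]
    pair = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=2)
    # r_ij = 1 if s_j lies within sqrt(2) of s_i and is strictly closer to s_0
    R = ((pair <= np.sqrt(2.0) + 1e-12) & (dist0[None, :] < dist0[:, None])).astype(float)
    rows = R.sum(axis=1, keepdims=True)
    W = np.divide(R, rows, out=np.zeros_like(R), where=rows > 0)
    return W, coords

def simulate_oriented(alpha, rho, W, rng):
    """Recursive draw: h_i = alpha + rho * sum_{v<i} w_iv * Y_v^2, Y_i = sqrt(h_i) * eps_i."""
    n = W.shape[0]
    eps = rng.standard_normal(n)
    y, h = np.zeros(n), np.zeros(n)
    for i in range(n):
        h[i] = alpha + rho * (W[i, :i] @ y[:i] ** 2)
        y[i] = np.sqrt(h[i]) * eps[i]
    return y, h

W, coords = oriented_weights(20)
rng = np.random.default_rng(0)
y, h = simulate_oriented(1.0, 0.6, W, rng)
print(y[:3], h[:3])
```

Because the system is triangular, no truncation of the innovations is needed, and each replication costs a single pass over the ordered locations.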
6 Discussion

Finally, we discuss possible extensions of the model in this section and conclude the paper by summarizing the main findings. One possible extension of the proposed spARCH process would be to consider a generalized version analogous to the GARCH process introduced by Bollerslev (1986). For the spatial ARCH process, we defined the conditional spatial variance by (2); i.e.,
\[
h = \alpha + W_1 \operatorname{diag}(Y) Y .
\]
Adding a weighting matrix $W_2$ for $h$ leads to
\[
h = \alpha + W_1 \operatorname{diag}(Y) Y + W_2 h ,
\]
which is equivalent to
\[
h = (I - W_2)^{-1} (\alpha + W_1 \operatorname{diag}(Y) Y) .
\]
The weighting matrix $W_2$ consists of the weights for the spatial moving average part, and it can be chosen analogously to matrix $W_1$. Of course, the matrix must be non-stochastic with zeros on the diagonal, and the determinant of $(I - W_2)$ must not be zero. For the abovementioned process, the $i$-th component of $h$ is given by
\[
h(s_i) = \alpha_i + \sum_{v=1}^{n} w_{1,iv} Y(s_v)^2 + \sum_{v=1}^{n} w_{2,iv} h(s_v) .
\]
Consequently, this spatial GARCH process incorporates a spatial autoregressive and moving-average part in the conditional variance. However, the moments of this process are not straightforward; thus, this process should be considered in more detail in the future. A further possible extension would be a multivariate spatial process with conditional heteroscedasticity; i.e., we do not observe a univariate random variable at each location but rather a vector of observations.

For the introduced spatial ARCH process, we derived the required conditions such that the process is well defined. In particular, certain assumptions regarding the convergence of $A^{2k}$ are necessary if the weighting matrix is not triangular. Furthermore, we analyzed the moments of this new spatial model and proposed an estimation strategy based on the maximum likelihood approach. Via extensive simulation studies, the performance of this estimator is illustrated.
To focus on empirical problems, we discussed possible spatial weighting schemes in detail. Moreover, we introduced a spatial autoregressive process with heteroscedastic residuals (SARspARCH). In particular, we applied this process to the cancer death rate in all U.S. counties except Alaska and Hawaii. For this empirical example, we included environmental, economic, and health-behavioral covariates. Comparing the estimation results of a spatial autoregressive (SAR) process and the proposed SARspARCH process, one might observe that the regression coefficients differ. In particular, the estimated effect of the percentage of smokers is larger if we do not account for heteroscedastic residuals. Whereas the estimated coefficient equals 0.61 for the SAR model, the estimate is 0.32 for the SARspARCH model.

In the future, it would be interesting to analyze whether the estimators of an SAR process are biased if the variance of the residuals exhibits spatial clusters. Moreover, the sensitivity of our process and the introduced maximum likelihood estimator should be analyzed in more detail with respect to the choice of the weighting matrices. In particular, the focus should be on the assumption of an oriented process, i.e., on the case in which the assumed weighting matrix is strictly triangular although the process is not oriented. Moreover, the performance of the likelihood estimator of the parameters of an SAR model under spatial conditional heteroscedasticity should be critically examined, as we noted above.

Appendix

A Proofs

Proof of Theorem 1. We observe that for $i \in \{1, \ldots, n\}$
\[
\begin{aligned}
Y(s_i)^2 &= \varepsilon(s_i)^2 h(s_i) = \alpha_i \varepsilon(s_i)^2 + \varepsilon(s_i)^2 \sum_{v=1}^{n} w_{iv} \underbrace{Y(s_v)^2}_{= h(s_v) \varepsilon(s_v)^2} \\
&= \alpha_i \varepsilon(s_i)^2 + \varepsilon(s_i)^2 \sum_{v=1}^{n} \alpha_v w_{iv} \varepsilon(s_v)^2 + \varepsilon(s_i)^2 \sum_{v=1}^{n} w_{iv} \varepsilon(s_v)^2 \sum_{j=1}^{n} w_{vj} Y(s_j)^2 .
\end{aligned}
\tag{15}
\]
Equation (15) can be rewritten in matrix notation as follows:
\[
\eta = \left( I - A^2 \right) Y^{(2)} .
\]
The system of linear equations has a unique solution if (3) is fulfilled. Thus, $Y(s_1)^2, \ldots, Y(s_n)^2$ are uniquely defined by $\varepsilon(s_1)^2, \ldots, \varepsilon(s_n)^2$. Because
\[
h = \alpha + W (I - A^2)^{-1} \eta
\]
and $Y = \operatorname{diag}(h)^{1/2} \varepsilon$, the result follows.

Proof of Theorem 2. The result is obvious because all elements of $\eta$ are nonnegative.

Proof of Lemma 1. If $W$ is a lower triangular matrix, it is nilpotent because $W^n = 0$. The same holds for the matrix $A$; i.e., $A^n = 0$. Because $\operatorname{rk}(I - A^2) = n$ and $(I - A)^{-1} = I + A + \ldots + A^{n-1}$, it follows that
\[
(I - A^2)^{-1} = I + A^2 + \ldots + A^{2[(n-1)/2]} . \tag{16}
\]
All elements of $A$ are nonnegative; thus, the result follows straightforwardly.

Proof of Lemma 2. We make use of Theorem 18.2.16 of Harville (2008). Thus, if $\lim_{k \to \infty} A^{2k} = 0$, it follows that $\det(I - A^2) \ne 0$. Consequently, Theorem 1 can be applied. Moreover, it holds that
\[
\left( I - A^2 \right)^{-1} = \sum_{v=0}^{\infty} (A^2)^v .
\]
Because all elements of $A$ are nonnegative, it follows that all components of the matrix $(I - A^2)^{-1}$ are also nonnegative.

Proof of Theorem 3. We utilize (4) and obtain
\[
h = \alpha + W (I_n - A^2)^{-1} \eta = k(\varepsilon(s_1)^2, \ldots, \varepsilon(s_n)^2) .
\]
Consequently,
\[
\begin{aligned}
Y &= \operatorname{diag}\big( k(\varepsilon(s_1)^2, \ldots, \varepsilon(s_n)^2) \big)^{1/2} (\varepsilon(s_1), \ldots, \varepsilon(s_n))' \\
&\overset{d}{=} \operatorname{diag}\big( k(((-1)^{v_1} \varepsilon(s_1))^2, \ldots, ((-1)^{v_n} \varepsilon(s_n))^2) \big)^{1/2} ((-1)^{v_1} \varepsilon(s_1), \ldots, (-1)^{v_n} \varepsilon(s_n))' \\
&= ((-1)^{v_1} Y(s_1), \ldots, (-1)^{v_n} Y(s_n))' .
\end{aligned}
\]
Thus, the result is proved.

Proof of Lemma 3. First, let $\| \cdot \|$ be an arbitrary induced matrix norm. Because
\[
\| Y^{(2)} \|^r \le \| (I - A^2)^{-1} \|^r \, \| \eta \|^r ,
\]
it follows that
\[
E(\| Y^{(2)} \|^r) \le \sqrt{ E(\| (I - A^2)^{-1} \|^{2r}) \, E(\| \eta \|^{2r}) } .
\]
In (16), it is shown that
\[
(I - A^2)^{-1} = I + A^2 + \ldots + A^{2[(n-1)/2]} .
\]
Consequen tly , || ( I − A 2 ) − 1 || ≤ || I || + || A 2 || + . . . + || A 2[( n − 1) / 2] || ≤ [( n − 1) / 2] X v =0 || A || 2 v ≤ [( n − 1) / 2] X v =0 || diag( ε ( s 1 ) 2 , . . . , ε ( s n ) 2 ) || 2 v || W || 2 v and by Jensen’s inequalit y , || ( I − A 2 ) − 1 || 2 r ≤  n − 1 2  + 1  2 r − 1 [( n − 1) / 2] X v =0 || diag( ε ( s 1 ) 2 , . . . , ε ( s n ) 2 ) || 4 rv || W || 4 rv . This leads to E ( || ( I − A 2 ) − 1 || 2 r ) ≤  n − 1 2  + 1  2 r − 1 [( n − 1) / 2] X v =0 || W || 4 rv E  || diag( ε ( s 1 ) 2 , . . . , ε ( s n ) 2 ) || 4 rv  . T aking the norm || . || 1 , we obtain that E  || diag( ε ( s 1 ) 2 , . . . , ε ( s n ) 2 ) || 4 rv 1  = max 1 ≤ i ≤ n E ( ε ( s i ) 8 rv ) . This sho ws that for the existence of the upp er b ound, it is required that E ( ε ( s i ) 8 r [( n − 1) / 2] ) m ust exist. F or the existence of E ( || η || 2 r ), it is suﬃcient that E ( ε ( s i ) 4 r ) exists. Regarding b), one can see that E ( Y ( s i ) 2 v − 1 ) = 0 because the distribution is symmetric, and the momen ts exist. ( Y ( s 1 ) , . . . , Y ( s n )) and ( − Y ( s 1 ) , Y ( s 2 ) , . . . , Y ( s n )) hav e the same distribution. Th us, it follo ws that E ( Y ( s 1 ) 2 v − 1 | Y ( s 2 ) , . . . , Y ( s n )) = E ( − Y ( s 1 ) 2 v − 1 | Y ( s 2 ) , . . . , Y ( s n )) . Consequen tly , this quantit y is equal to zero. Pr o of of The or em 4. Now, ||  Y ( s 1 ) 2 , . . . , Y ( s n ) 2  || ≤ || ( I − A 2 ) − 1 || || η || ≤ λ || η || . Cho osing the norm || · || 2 , we see that the 2 r -th moment is ﬁnite. 36 The pro of of part a 2 ) follows as in the ab ov e lemma. T o prov e b), w e apply the representation given in the pro of of Theorem 2; i.e., Y (2) = ∞ X v =0 A 2 v η . No w, ||  Y ( s 1 ) 2 , . . . , Y ( s n ) 2  || ≤ ∞ X v =0 || A 2 || v || η || ≤ 1 1 − λ || η || . This completes the pro of. Pr o of of The or em 5. Because W is a strictly triangular matrix, it follows that det   ∂ y j / √ h j ∂ y i ! i,j =1 ,...,n   = 1 n Q j =1 p h j . 
with $h_j = \alpha_j + \sum_{v=1}^{j-1} w_{jv} Y(s_v)^2$. Let $Y_k = (Y(s_1), \ldots, Y(s_k))'$. Then,
\[
f_{Y_k}(y) = \prod_{j=1}^{k} \frac{1}{\sqrt{h_j}} \, f_{\varepsilon(s_j)}\left( \frac{y_j}{\sqrt{h_j}} \right) .
\]
Thus,
\[
\begin{aligned}
E\left( Y(s_k)^2 \mid Y(s_j), \, j = 1, \ldots, k-1 \right)
&= \frac{1}{f_{Y(s_1), \ldots, Y(s_{k-1})}(y_1, \ldots, y_{k-1})} \int_{-\infty}^{\infty} y_k^2 \prod_{j=1}^{k} \frac{1}{\sqrt{h_j}} \, f_{\varepsilon(s_j)}\left( \frac{y_j}{\sqrt{h_j}} \right) \mathrm{d}y_k \\
&= \int_{-\infty}^{\infty} y_k^2 \, \frac{1}{\sqrt{h_k}} \, f_{\varepsilon(s_k)}\left( \frac{y_k}{\sqrt{h_k}} \right) \mathrm{d}y_k .
\end{aligned}
\]
Because it is assumed that $\mathrm{Var}(\varepsilon(s_k)) = 1$ for all $k$, it follows that
\[
E\left( Y(s_k)^2 \mid Y(s_j), \, j = 1, \ldots, k-1 \right) = h_k .
\]

References

Anselin, L. (2010), 'Thirty years of spatial econometrics', Papers in Regional Sciences 89, 3–25.

Bera, A. K. & Simlai, P. (2004), Testing for Spatial Dependence and a Formulation of Spatial ARCH (SARCH) Model with Applications, Technical report, Working paper University of Illinois.

Bickel, P. J. & Doksum, K. A. (2015), Mathematical Statistics: Basic Ideas and Selected Topics, Vol. 117, CRC Press.

Bollerslev, T. (1986), 'Generalized autoregressive conditional heteroskedasticity', Journal of Econometrics 31(3), 307–327.

Borovkova, S. & Lopuhaa, R. (2012), 'Spatial GARCH: A Spatial Approach to Multivariate Volatility Modeling', Available at SSRN 2176781.

Caporin, M. & Paruolo, P. (2006), 'GARCH models with spatial structure', SIS Statistica pp. 447–450.

Cohen, A. J. & Pope, C. A. (1995), 'Lung cancer and air pollution.', Environmental Health Perspectives 103(Suppl 8), 219.

Cressie, N. (1993), Statistics for spatial data, Wiley. URL: https://books.google.de/books?id=4LdCgAAQBAJ

Cressie, N. & Wikle, C. K. (2011), Statistics for spatio-temporal data, Wiley.

Crowder, M. J. (1976), 'Maximum likelihood estimation for dependent observations', Journal of the Royal Statistical Society. Series B (Methodological) pp. 45–53.

Elhorst, J. P. (2010), 'Applied spatial econometrics: raising the bar', Spatial Economic Analysis 5(1), 9–28.

Engle, R. F. (1982), 'Autoregressive conditional heteroscedasticity with estimates of the variance of united kingdom inflation', Econometrica: Journal of the Econometric Society pp. 987–1007.

Fassò, A., Cameletti, M. & Nicolis, O. (2007), 'Air quality monitoring using heterogeneous networks', Environmetrics 18(3), 245–264.

Fassò, A. & Finazzi, F. (2011), 'Maximum likelihood estimation of the dynamic coregionalization model with heterotopic data', Environmetrics 22(6), 735–748.

Fingleton, B. (2008a), 'A generalized method of moments estimator for a spatial model with moving average errors, with application to real estate prices', Empirical Economics 34, 35–57.

Fingleton, B. (2008b), 'A generalized method of moments estimator for a spatial panel model with an endogenous spatial lag and spatial moving average errors', Spatial Economic Analysis 3(1), 27–44.

Fuentes, M. (2001), 'A high frequency kriging approach for non-stationary environmental processes', Environmetrics 12(5), 469–483.

Fuentes, M. (2002), 'Spectral methods for nonstationary spatial processes', Biometrika 89(1), 197–210.

Harville, D. A. (2008), Matrix algebra from a statistician's perspective, Vol. 1, Springer.

Ho, M.-H. R., Ombao, H. & Shumway, R. (2005), 'A state-space approach to modelling brain dynamics', Statistica Sinica pp. 407–425.

Holly, S., Pesaran, M. H. & Yamagata, T. (2010), 'A spatio-temporal model of house prices in the USA', Journal of Econometrics 158, 160–173.

Lee, L.-F. (2004), 'Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models', Econometrica 72(6), 1899–1925.

LeSage, J. P. (2008), 'An introduction to spatial econometrics', Revue d'économie industrielle (3), 19–44.

MacNab, Y. C. & Dean, C. (2001), 'Autoregressive spatial smoothing and temporal spline smoothing for mapping rates', Biometrics 57(3), 949–956.

Moran, P. A. P. (1950), 'Notes on continuous stochastic phenomena', Biometrika 37, 17–23.

Noiboar, A. & Cohen, I. (2005), Two-dimensional garch model with application to anomaly detection, in '13th European Signal Processing Conf., Istanbul, Turkey', IEEE, pp. 1–4.

Noiboar, A. & Cohen, I. (2007), 'Anomaly detection based on wavelet domain garch random field modeling', Geoscience and Remote Sensing, IEEE Transactions on 45(5), 1361–1373.

Ombao, H., Shao, X., Rykhlevskaia, E., Fabiani, M. & Gratton, G. (2008), 'Spatio-spectral analysis of brain signals', Statistica Sinica pp. 1465–1482.

Raaschou-Nielsen, O., Andersen, Z. J., Beelen, R., Samoli, E., Stafoggia, M., Weinmayr, G., Hoffmann, B., Fischer, P., Nieuwenhuijsen, M. J., Brunekreef, B. et al. (2013), 'Air pollution and lung cancer incidence in 17 european cohorts: prospective analyses from the european study of cohorts for air pollution effects (escape)', The Lancet Oncology 14(9), 813–822.

Sampson, P. D. & Guttorp, P. (1992), 'Nonparametric estimation of nonstationary spatial covariance structure', Journal of the American Statistical Association 87(417), 108–119.

Schmidt, A. M. & O'Hagan, A. (2003), 'Bayesian inference for non-stationary spatial covariance structure via spatial deformations', Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65(3), 743–758.

Serfling, R. J. (2006), 'Multivariate symmetry and asymmetry', Encyclopedia of Statistical Sciences.

Shinkareva, S. V., Ombao, H. C., Sutton, B. P., Mohanty, A. & Miller, G. A. (2006), 'Classification of functional brain images with a spatio-temporal dissimilarity map', NeuroImage 33(1), 63–71.

Stroud, J. R., Müller, P. & Sansó, B. (2001), 'Dynamic models for spatiotemporal data', Journal of the Royal Statistical Society: Series B (Statistical Methodology) 63(4), 673–689.

U.S. Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute (2015), 'United States Cancer Statistics 1999-2012 Incidence and Mortality Web-based Report'.

Whittle, P. (1954), 'On stationary processes in the plane', Biometrika pp. 434–449.
