Class-Specific Tests of Spatial Segregation Based on Nearest Neighbor Contingency Tables
The spatial interaction between two or more classes (or species) has important consequences in many fields and might cause multivariate clustering patterns such as segregation or association. Segregation occurs when members of a class tend to be found near members of the same class (i.e., conspecifics), while association occurs when members of a class tend to be found near members of the other class or classes. These patterns can be tested using a nearest neighbor contingency table (NNCT). The null hypothesis is randomness in the nearest neighbor (NN) structure, which may result from, among other patterns, random labeling (RL) or complete spatial randomness (CSR) of points from two or more classes (henceforth called CSR independence). In this article, we consider Dixon’s class-specific tests of segregation and introduce a new class-specific test, which is a new decomposition of Dixon’s overall chi-square segregation statistic. We demonstrate that the tests we consider provide information on different aspects of the spatial interaction between the classes and that they are conditional under the CSR independence pattern, but not under the RL pattern. We analyze the distributional properties and prove the consistency of these tests, and we compare their empirical significance levels (Type I error rates) and empirical power estimates using Monte Carlo simulations. We demonstrate that the new class-specific tests perform comparably to the currently available NNCT-based tests in terms of Type I error and power estimates. For illustrative purposes, we use three example data sets. We also provide guidelines for using these tests.
💡 Research Summary
This paper addresses the problem of testing spatial interaction between two or more classes (or species) by using nearest‑neighbor contingency tables (NNCTs). When points of the same class tend to be near each other the pattern is called segregation; when points of different classes tend to be near each other the pattern is called association. Both phenomena are of interest in ecology, epidemiology, criminology and many other fields. The null hypothesis is that the nearest‑neighbor (NN) structure is random. Randomness can arise either from random labeling (RL) – where the locations are fixed and class labels are permuted – or from complete spatial randomness (CSR) independence – where each class is generated by an independent homogeneous Poisson process.
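As a rough illustration of the object these tests operate on, the sketch below builds an NNCT from point coordinates and integer class labels. The function name `nnct` and the brute-force distance computation are assumptions made for this example, not code from the paper; entry (i, j) counts base points of class i whose nearest neighbor belongs to class j.

```python
import numpy as np

def nnct(points, labels, k):
    """Build a k x k nearest neighbor contingency table (illustrative sketch).

    points: (n, d) array of coordinates
    labels: (n,) array of integer class labels in 0..k-1
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    # Pairwise Euclidean distances (brute force; fine for small n).
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)   # a point is not its own neighbor
    nn = dist.argmin(axis=1)         # index of each point's nearest neighbor
    table = np.zeros((k, k), dtype=int)
    # Row = class of the base point, column = class of its nearest neighbor.
    np.add.at(table, (labels, labels[nn]), 1)
    return table

# Two well-separated pairs of points.
pts = [[0, 0], [0, 1], [10, 10], [10, 11]]
segregated = nnct(pts, [0, 0, 1, 1], k=2)   # each pair shares a class
associated = nnct(pts, [0, 1, 0, 1], k=2)   # each pair mixes the classes
```

In the first labeling all counts fall on the diagonal (segregation); in the second they fall off the diagonal (association).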
Dixon (1994) introduced an overall χ² test based on the NNCT and also a class‑specific version (C_i) that examines each row (or column) separately. However, C_i does not simultaneously account for the row‑wise and column‑wise aspects of the NN relationship, which limits its ability to detect more complex interaction patterns. The authors therefore propose a new class‑specific statistic (S_i) that decomposes Dixon’s overall χ² statistic into two components: a row‑specific and a column‑specific term. Both components follow a χ² distribution with (k‑1)² degrees of freedom under CSR independence (conditional on the observed marginal totals) and follow a non‑conditional distribution under RL.
The paper proves several theoretical properties. First, under CSR independence the NNCT is conditionally multinomial given the row and column totals, which makes both C_i and S_i asymptotically χ². Under RL the NN structure depends only on the fixed point pattern, so the statistics are unconditional. Second, both tests are consistent: as the sample size grows, the probability of correctly rejecting a false null hypothesis tends to one. Third, the authors derive the exact expectations and variances needed to compute the test statistics for any number of classes k.
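For orientation, the expected cell counts under RL have a simple closed form in Dixon's framework: E[N_ii] = n_i(n_i − 1)/(n − 1) and E[N_ij] = n_i n_j/(n − 1) for i ≠ j, where n_i is the size of class i and n is the total number of points. The sketch below computes only these expectations; the function name is hypothetical, and the variances and covariances needed for the full test statistics are not reproduced here.

```python
import numpy as np

def expected_nnct_rl(class_sizes):
    """Expected NNCT cell counts under random labeling (illustrative sketch).

    class_sizes: sequence of class sizes n_1, ..., n_k
    """
    n_i = np.asarray(class_sizes, dtype=float)
    n = n_i.sum()
    # Off-diagonal cells: n_i * n_j / (n - 1).
    expected = np.outer(n_i, n_i) / (n - 1)
    # Diagonal cells: n_i * (n_i - 1) / (n - 1).
    np.fill_diagonal(expected, n_i * (n_i - 1) / (n - 1))
    return expected

expected = expected_nnct_rl([3, 2])
```

Each row of the result sums to the corresponding class size, since every base point has exactly one nearest neighbor.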
A comprehensive Monte Carlo simulation study evaluates empirical Type I error rates and power. Scenarios include pure segregation, pure association, mixed patterns, and varying class proportions. Results show that both C_i and S_i maintain the nominal 5% significance level across a wide range of sample sizes (n ≥ 50). In terms of power, the new S_i statistic often outperforms Dixon’s C_i when the pattern is asymmetric, for example when one class is strongly segregated while another is strongly associated with the rest. In purely symmetric situations the two tests have virtually identical power.
Three real data sets illustrate the practical use of the methods. (1) A forest data set from New Jersey containing three tree species. The NNCT reveals that oak trees are highly segregated, maple trees show moderate association with pine, and the row‑specific and column‑specific S_i values differ, highlighting the asymmetric relationships. (2) A cancer‑incidence data set with two tumor types. One tumor type exhibits strong within‑type clustering, while the other is more likely to appear near the first type, a pattern captured only by the column‑specific component of S_i. (3) A crime‑location data set distinguishing violent crimes from property crimes. Violent crimes are clustered, but also tend to occur near property crimes, again a mixed pattern that S_i detects more clearly than C_i.
The authors conclude with practical guidelines. Before applying NNCT tests, edge effects should be mitigated (e.g., torus correction or buffer zones). For small samples, permutation or bootstrap methods are recommended to obtain accurate p‑values. Choice of test depends on the research question: use Dixon’s C_i when the interest is solely in the degree of segregation of a particular class; use the new S_i when one wishes to explore both row‑wise (how a class serves as a base point) and column‑wise (how a class serves as a neighbor) interactions simultaneously. Interpretation of asymmetric results is emphasized: a significant row‑specific S_i but non‑significant column‑specific S_i indicates that a class clusters with its own members but does not attract other classes as neighbors.
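The small-sample recommendation can be sketched as a Monte Carlo permutation test under RL: the locations, and hence the nearest neighbor pairs, stay fixed while the class labels are shuffled. To keep the example short it uses the raw self-neighbor count N_ii as the test statistic, which is a simplification and not Dixon’s C_i or the new S_i; the function names, the number of permutations, and the seed are all assumptions.

```python
import numpy as np

def nn_indices(points):
    """Index of each point's nearest neighbor (brute force)."""
    points = np.asarray(points, dtype=float)
    diff = points[:, None, :] - points[None, :, :]
    dist = (diff ** 2).sum(axis=-1)
    np.fill_diagonal(dist, np.inf)   # exclude self-matches
    return dist.argmin(axis=1)

def rl_permutation_pvalue(points, labels, cls, n_perm=999, seed=0):
    """One-sided Monte Carlo p-value for segregation of class `cls` under RL.

    Statistic (simplified): number of class-`cls` points whose nearest
    neighbor is also of class `cls`.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    nn = nn_indices(points)          # fixed: RL permutes labels, not locations

    def self_count(lab):
        return int(np.sum((lab == cls) & (lab[nn] == cls)))

    observed = self_count(labels)
    exceed = sum(self_count(rng.permutation(labels)) >= observed
                 for _ in range(n_perm))
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)

# Hypothetical data: two well-separated clusters, one class per cluster.
rng = np.random.default_rng(1)
pts = np.vstack([rng.normal(0, 1, size=(10, 2)),
                 rng.normal(100, 1, size=(10, 2))])
lab = np.repeat([0, 1], 10)
observed, p_value = rl_permutation_pvalue(pts, lab, cls=0)
```

A small p-value here indicates more same-class nearest neighbors than label shuffling would typically produce, which is the segregation direction; testing for association would use the lower tail instead.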
Overall, the paper extends the NNCT framework by providing a statistically rigorous, computationally simple, and empirically validated class‑specific test that complements existing methods. The new statistic retains correct Type I error control, offers comparable or higher power, and yields richer insight into complex spatial relationships among multiple classes, making it a valuable tool for researchers across a broad spectrum of spatial sciences.