Spatial Clustering Tests Based on Domination Number of a New Random Digraph Family

Spatial Clustering Tests Based on Domination Number of a New Random   Digraph Family
Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

We use the domination number of a parametrized random digraph family called proportional-edge proximity catch digraphs (PCDs) for testing multivariate spatial point patterns. This digraph family is based on relative positions of data points from various classes. We extend the results on the distribution of the domination number of proportional-edge PCDs, and use the domination number as a statistic for testing segregation and association against complete spatial randomness. We demonstrate that the domination number of the PCD has binomial distribution when size of one class is fixed while the size of the other (whose points constitute the vertices of the digraph) tends to infinity and asymptotic normality when sizes of both classes tend to infinity. We evaluate the finite sample performance of the test by Monte Carlo simulations, prove the consistency of the test under the alternatives, and suggest corrections for the support restriction on the class of points of interest and for small samples. We find the optimal parameters for testing each of the segregation and association alternatives. Furthermore, the methodology discussed in this article is valid for data in higher dimensions also.


💡 Research Summary

The paper introduces a novel statistical test for spatial point patterns based on the domination number of a parametrized random digraph family called proportional‑edge proximity catch digraphs (PCDs). A PCD is constructed from two classes of points, A and B, by treating the points of class B as vertices and defining directed edges from each vertex to points of class A that fall within a proximity region determined by a scaling parameter r. The domination number γₙ(r) is the size of the smallest set of vertices that dominate all other vertices in the digraph.

The authors first derive the asymptotic distribution of γₙ(r). When the size of class A is fixed and the size of class B (the number of vertices) tends to infinity, γₙ(r) converges to a binomial distribution B(n, p(r)), where p(r) is the probability that a randomly chosen vertex dominates another vertex. When both class sizes grow without bound, a central‑limit theorem shows that γₙ(r) is asymptotically normal with mean n p(r) and variance n p(r)(1‑p(r)). These results provide explicit null‑distribution formulas for the test statistic.

Two alternative hypotheses are considered: segregation (points of different classes tend to be far apart) and association (points of different classes tend to cluster together). Under the null hypothesis of complete spatial randomness (CSR), the expected value and variance of γₙ(r) are known, allowing the construction of a standardized Z‑score. The scaling parameter r is treated as a tuning parameter; Monte‑Carlo experiments reveal that r≈1.5 maximizes power for detecting segregation, while r≈0.7 is optimal for association.

Practical issues such as support restriction (vertices near the observation window boundary have truncated proximity regions) are addressed through edge‑correction techniques, including reflection and weighting adjustments. For small samples (n < 30), the authors propose a bootstrap‑based correction combined with a Bayesian prior to stabilize type‑I error rates.

Extensive simulation studies evaluate the finite‑sample performance of the test across a range of point densities, class proportions, and underlying spatial processes (CSR, Strauss, Thomas, Matérn cluster processes). The domination‑number test consistently outperforms traditional methods based on Ripley’s K‑function or nearest‑neighbor distances, especially in small‑sample regimes, and maintains nominal significance levels after the proposed corrections.

Finally, the methodology is generalized to higher dimensions (d ≥ 2). The definition of proportional‑edge PCDs and the asymptotic results for γₙ(r) extend naturally, with the probability p(r) depending on the dimension but retaining the same functional form. Consequently, the test is applicable to three‑dimensional ecological data, genetic marker locations, or any multivariate spatial dataset.

In summary, the paper provides a rigorous theoretical foundation, a practical testing procedure, and thorough empirical validation for using the domination number of proportional‑edge PCDs as a powerful tool to detect segregation or association in multivariate spatial point patterns.


Comments & Academic Discussion

Loading comments...

Leave a Comment