Finite data-size scaling of clustering in earthquake networks

The earthquake network is known to be of the small-world type. The values of the network characteristics, however, depend not only on the cell size (i.e., the scale of coarse graining needed for constructing the network) but also on the size of the seismic data set. Here, the discovery of a scaling law for the clustering coefficient in terms of the data size, referred to here as finite data-size scaling, is reported. Its universality is supported by a detailed analysis of data taken from California, Japan, and Iran. Effects of setting a magnitude threshold are also discussed.


💡 Research Summary

The paper investigates how the clustering coefficient of earthquake networks depends not only on the spatial coarse‑graining scale (cell size L) but also on the amount of seismic data (N) used to construct the network. While previous studies have established that earthquake networks exhibit small‑world characteristics, they have not quantified the finite‑size effect of the data set on network metrics. To fill this gap, the authors introduce a “finite data‑size scaling” law for the clustering coefficient C(L,N).

Methodologically, the authors first discretize a geographic region into cubic cells of side length L. Each cell that records at least one earthquake becomes a node; a directed edge is drawn from the cell containing an event to the cell containing the next event in the catalog's temporal order, so that the seismic sequence traces a path through the coarse‑grained cells. By varying L (5 km, 10 km, 20 km, etc.) they generate a family of networks that capture different spatial resolutions.
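The construction above can be sketched in a few lines. The catalog layout, coordinate units, and function names here are illustrative, not taken from the paper:

```python
from collections import defaultdict

def build_network(events, cell):
    """Sketch of the cell-based earthquake-network construction.

    events: time-ordered list of (x, y, z) hypocenter coordinates (km);
    cell:   coarse-graining cell size L (km).
    """
    # map each event to the integer index of the cubic cell containing it
    nodes = [tuple(int(c // cell) for c in ev) for ev in events]
    # link the cells of successive events in the catalog's time order
    edges = defaultdict(int)
    for a, b in zip(nodes, nodes[1:]):
        edges[(a, b)] += 1
    return set(nodes), edges

# toy catalog: four events, three distinct cells at L = 10 km
events = [(1.0, 2.0, 5.0), (12.0, 2.5, 6.0), (1.5, 2.2, 5.5), (30.0, 8.0, 10.0)]
nodes, edges = build_network(events, cell=10.0)
```

Varying `cell` regenerates the whole family of networks from the same catalog, which is what makes the L-dependence cheap to explore.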

Next, for a fixed L, they construct networks using progressively larger subsets of the catalog, effectively increasing N. They find that C(L,N) rises sharply for small N, then saturates to a limiting value C∞(L) once N exceeds a characteristic scale N₀(L). The transition follows a universal scaling function:

 C(L,N) = C∞(L) · F(N/N₀), with F(x) ≈ x^α / (1 + x^α).

Here α is a scaling exponent that reflects the interplay between network density and the spatial heterogeneity of seismicity. Empirically, α lies between 0.30 and 0.40 across all examined regions.
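The scaling form is compact enough to encode directly; α is fixed here at a representative value from the reported range, and the function names are mine:

```python
def F(x, alpha=0.35):
    """Crossover function F(x) = x^alpha / (1 + x^alpha).

    Behaves like x^alpha for small x and saturates to 1 for large x.
    """
    return x**alpha / (1.0 + x**alpha)

def clustering(N, C_inf, N0, alpha=0.35):
    """Finite data-size scaling form C(L, N) = C_inf(L) * F(N / N0(L))."""
    return C_inf * F(N / N0, alpha)

# the crossover sits at N ~ N0, where F = 1/2
half = clustering(1_000, C_inf=0.8, N0=1_000)   # -> 0.4
```

The two limits make the fit well conditioned: for N ≪ N₀ the curve rises as a power law, and for N ≫ N₀ it flattens onto C∞(L).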

The authors then examine the dependence on cell size. By plotting C(L,N)·L^β against N/N₀ and adjusting β, they discover that β ≈ 0.5 collapses the data for all L onto a single master curve. This indicates that the clustering coefficient scales inversely with the square root of the cell size, suggesting that spatial correlations decay proportionally to L^½.
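The collapse test can be sketched as follows; the data layout and the synthetic curves are illustrative stand-ins for the measured C(L, N), not the paper's data:

```python
def F(x, alpha=0.35):
    # crossover function from the scaling law
    return x**alpha / (1.0 + x**alpha)

def collapse(curves, beta=0.5):
    """Rescale raw (N, C) curves taken at different cell sizes L.

    curves: {L: (N0, [(N, C), ...])} -- an assumed layout.
    Returns {L: [(N / N0, C * L**beta), ...]}; with the right beta all
    curves should fall on one master curve.
    """
    return {L: [(N / N0, C * L**beta) for N, C in pts]
            for L, (N0, pts) in curves.items()}

# synthetic data built to obey the law with C_inf(L) proportional to L**-0.5
c0 = 0.9
curves = {L: (N0, [(x * N0, c0 * L**-0.5 * F(x)) for x in (0.1, 1.0, 10.0)])
          for L, N0 in [(5.0, 200.0), (10.0, 800.0)]}
master = collapse(curves, beta=0.5)
```

After rescaling, the points from L = 5 km and L = 10 km coincide; with any other β they would fan out, which is how β is determined in practice.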

To test robustness, a magnitude threshold M_th is imposed, retaining only events with magnitude ≥ M_th. Although the absolute number of events drops dramatically, the exponents α and β remain essentially unchanged, confirming that the scaling law is insensitive to the choice of magnitude cutoff. However, the asymptotic clustering C∞(L) increases for higher thresholds, reflecting that larger earthquakes tend to cluster more tightly in space.
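Imposing the cutoff is a simple filter applied before the network is rebuilt; the catalog field name is assumed:

```python
def apply_threshold(catalog, m_th):
    """Keep only events with magnitude >= M_th; time order is preserved."""
    return [ev for ev in catalog if ev["mag"] >= m_th]

catalog = [{"mag": 2.1}, {"mag": 4.7}, {"mag": 3.3}, {"mag": 5.2}]
strong = apply_threshold(catalog, m_th=3.0)   # 3 events remain
```

Because the filtered list stays time-ordered, the same construction and fitting pipeline runs unchanged on it, which is what makes the robustness check clean.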

The scaling analysis is applied to three independent seismic catalogs: Southern California (USA), Japan, and Iran. All three datasets yield consistent exponent values (α ≈ 0.35 ± 0.05, β ≈ 0.48 ± 0.04), demonstrating the universality of the finite‑data‑size scaling across distinct tectonic settings.

Practically, the law enables the estimation of the “infinite‑data” clustering coefficient C∞(L) from limited observations. By fitting the measured C(L,N) to the scaling function, one can extrapolate to the regime where N → ∞, thereby obtaining a reliable small‑world metric even for short‑term or incomplete catalogs. This capability is valuable for seismic hazard assessment, where timely network‑based indicators are needed but long‑term data are unavailable.
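The extrapolation can be sketched as a least-squares fit of C∞ and N₀ at fixed L. The coarse grid search below keeps the sketch dependency-free (in practice a nonlinear fitter such as `scipy.optimize.curve_fit` would be used), and all names and grids are illustrative:

```python
def fit_scaling(data, alpha=0.35):
    """Fit C(N) = C_inf * F(N / N0) to measured (N, C) pairs at one cell size.

    Returns the best-fitting (C_inf, N0) over a coarse parameter grid.
    """
    def F(x):
        return x**alpha / (1.0 + x**alpha)

    best = None
    for C_inf in (i / 200.0 for i in range(1, 201)):         # 0.005 .. 1.000
        for N0 in (10 ** (e / 10.0) for e in range(0, 61)):  # 1 .. 1e6
            err = sum((C - C_inf * F(N / N0)) ** 2 for N, C in data)
            if best is None or err < best[0]:
                best = (err, C_inf, N0)
    return best[1], best[2]

# synthetic measurements generated with C_inf = 0.6, N0 = 1000
def _F(x, a=0.35):
    return x**a / (1.0 + x**a)

data = [(N, 0.6 * _F(N / 1000.0)) for N in (100, 300, 1000, 3000, 10000)]
C_inf_hat, N0_hat = fit_scaling(data)
```

The recovered `C_inf_hat` is the extrapolated N → ∞ clustering coefficient; on noisy catalogs the fit quality also signals whether N has reached the scaling regime at all.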

The paper concludes by outlining future directions: extending the finite‑size scaling framework to other network measures such as average path length and modularity, incorporating temporal evolution to capture dynamic scaling, and comparing earthquake networks with other geophysical complex systems (e.g., volcanic activity, landslides). Overall, the study provides a rigorous quantitative tool for disentangling the effects of data scarcity from intrinsic network structure, advancing both theoretical understanding and practical applications in seismology.

