Hierarchical Modeling of Abundance in Closed Population Capture-Recapture Models Under Heterogeneity


Hierarchical modeling of abundance in space or time using closed-population mark-recapture under heterogeneity (model M$_{h}$) presents two challenges: (i) finding a flexible likelihood in which abundance appears as an explicit parameter and (ii) fitting the hierarchical model for abundance. The first challenge arises because abundance not only indexes the population size, it also determines the dimension of the capture probabilities in heterogeneity models. A common approach is to use data augmentation to include these capture probabilities directly into the likelihood and fit the model using Bayesian inference via Markov chain Monte Carlo (MCMC). Two such examples of this approach are (i) explicit trans-dimensional MCMC, and (ii) superpopulation data augmentation. The superpopulation approach has the advantage of simple specification that is easily implemented in BUGS and related software. However, it reparameterizes the model so that abundance is no longer included, except as a derived quantity. This is a drawback when hierarchical models for abundance, or related parameters, are desired. Here, we analytically compare the two approaches and show that they are more closely related than might appear superficially. We exploit this relationship to specify the model in a way that allows us to include abundance as a parameter and that facilitates hierarchical modeling using readily available software such as BUGS. We use this approach to model trends in grizzly bear abundance in Yellowstone National Park from 1986-1998.


💡 Research Summary

The paper tackles a long-standing difficulty in closed-population capture-recapture studies that incorporate individual heterogeneity (model Mₕ): how to write a likelihood in which the population size N appears as an explicit parameter even though N also determines the dimension of the capture-probability vector. Two Bayesian data-augmentation strategies have been used to handle this trans-dimensional problem. The first is explicit trans-dimensional Markov chain Monte Carlo (for example, reversible-jump MCMC, RJMCMC), which proposes moves that add or delete individuals and their associated capture probabilities, thereby changing the model dimension. This approach is exact in principle but requires careful design of proposal distributions and complex bookkeeping, and it often mixes slowly. The second is the "superpopulation" data-augmentation method, in which a large artificial population of fixed size M, chosen to be much larger than any plausible N, is introduced. Each of the M latent individuals receives a binary inclusion indicator z_j∼Bernoulli(ψ), and the true abundance is N=∑z_j. Capture probabilities p_j are defined for all M individuals, but only those with z_j=1 contribute to the likelihood. This formulation fixes the dimension of the parameter space and can be coded in a few lines in BUGS and related software such as JAGS, making it attractive for practitioners. However, because N is no longer a model parameter but a derived quantity, it is difficult to place a prior directly on N or to embed N in a hierarchical structure (e.g., time-varying trends, spatial random effects).
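The superpopulation construction described above can be sketched as a small forward simulation. This is only an illustration under stated assumptions, not the authors' code: the function name `simulate_augmented`, the Beta(a, b) form for the capture probabilities, and all numeric values (M, ψ, a, b, the number of occasions T) are hypothetical choices made for the example.

```python
import random

def simulate_augmented(M, psi, a, b, T, seed=0):
    """Draw one data set from the superpopulation form of model M_h.

    M    : fixed size of the augmented (super)population
    psi  : inclusion probability, z_j ~ Bernoulli(psi)
    a, b : Beta parameters for individual capture probabilities (assumed form)
    T    : number of capture occasions
    """
    rng = random.Random(seed)
    # Inclusion indicators: z_j = 1 means individual j is in the real population.
    z = [1 if rng.random() < psi else 0 for _ in range(M)]
    # A capture probability is defined for every one of the M individuals.
    p = [rng.betavariate(a, b) for _ in range(M)]
    # Capture counts over T occasions; individuals with z_j = 0 can never be caught.
    y = [sum(rng.random() < z[j] * p[j] for _ in range(T)) for j in range(M)]
    return z, p, y

z, p, y = simulate_augmented(M=500, psi=0.4, a=2.0, b=5.0, T=6)
N = sum(z)                             # abundance appears only as a derived quantity
n_obs = sum(1 for c in y if c > 0)     # individuals captured at least once
print(f"N = {N}, observed = {n_obs}, augmented size = {len(z)}")
```

The dimension of `z`, `p`, and `y` is always M regardless of N, which is what makes the model easy to code in BUGS-style software; the cost is that N is computed from `z` rather than given a prior of its own.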

The authors first demonstrate mathematically that the two augmentations are not unrelated tricks but two representations of the same underlying probability model. Under the superpopulation formulation, N given ψ follows a Binomial(M, ψ) distribution, so assigning a Beta prior to ψ induces a Beta-Binomial prior on N; in this sense a prior on ψ is equivalent to a particular direct prior on N. This equivalence allows the authors to re-parameterize the superpopulation model so that N re-appears as an explicit parameter while retaining the computational convenience of fixed-dimension data augmentation. The key steps are: (i) assign a prior directly to N (e.g., log-normal, Poisson); (ii) conditional on the current value of N, allocate the first N elements of the p-vector to the "real" individuals and treat the remaining M-N elements as inactive; (iii) use the "zeros trick" or a custom log-likelihood block in BUGS to evaluate the likelihood that depends on the variable-length p-vector. In this way, hierarchical priors on N (such as a random walk, linear trend, or spatial Gaussian process) can be introduced without breaking the data-augmentation structure.
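The link between a Beta prior on ψ and the implied prior on N can be checked numerically. The sketch below assumes the standard superpopulation setup, N | ψ ∼ Binomial(M, ψ) with ψ ∼ Beta(a, b), and evaluates the resulting Beta-Binomial probabilities; the function names and the value M = 50 are illustrative, not taken from the paper. With a = b = 1 (a uniform prior on ψ), the implied prior on N is uniform on 0, 1, ..., M.

```python
import math

def log_beta(x, y):
    """Logarithm of the Beta function B(x, y)."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

def beta_binomial_pmf(n, M, a, b):
    """P(N = n) when N | psi ~ Binomial(M, psi) and psi ~ Beta(a, b)."""
    log_choose = (math.lgamma(M + 1) - math.lgamma(n + 1)
                  - math.lgamma(M - n + 1))
    return math.exp(log_choose + log_beta(n + a, M - n + b) - log_beta(a, b))

M = 50  # illustrative superpopulation size
prior_uniform = [beta_binomial_pmf(n, M, 1.0, 1.0) for n in range(M + 1)]
prior_skewed = [beta_binomial_pmf(n, M, 2.0, 5.0) for n in range(M + 1)]

print(f"total mass (uniform psi): {sum(prior_uniform):.6f}")
print(f"P(N = 0) vs 1/(M+1): {prior_uniform[0]:.6f} vs {1 / (M + 1):.6f}")
```

Changing `a` and `b` changes the shape of the implied prior on N, which shows that the choice of prior for ψ is itself a statement about abundance.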

To illustrate the method, the authors analyze a 13-year series (1986-1998) of grizzly-bear capture data from Yellowstone National Park. For each year t they model N_t with a log-normal trend: log(N_t)∼Normal(μ_t,σ²) with μ_t=β_0+β_1·t, thereby allowing a smooth trend across years. Individual capture probabilities are assumed to follow a Beta(α,β) distribution to represent heterogeneity. The model is fitted in BUGS using three parallel MCMC chains of 50 000 iterations each; convergence diagnostics (Gelman-Rubin R̂<1.1) indicate satisfactory convergence. Posterior summaries show a clear increase in abundance from roughly 150 bears in 1986 (95 % CI 132-170) to about 210 bears in 1998 (95 % CI 190-235). The authors also fit the same data with a traditional RJMCMC implementation of model Mₕ; posterior means and credible intervals for N_t are virtually identical, consistent with the theoretical equivalence of the two approaches.
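The trend model for abundance, log(N_t)∼Normal(β_0+β_1·t, σ²), can be simulated forward to see what kind of trajectories it generates. This is a minimal sketch: the function name `simulate_abundance` and the parameter values below are made up for illustration and are not the paper's estimates, and rounding to an integer is a convenience of the example rather than part of the model as described.

```python
import math
import random

def simulate_abundance(beta0, beta1, sigma, n_years, seed=1):
    """Simulate N_t with log(N_t) ~ Normal(beta0 + beta1 * t, sigma^2)."""
    rng = random.Random(seed)
    abundances = []
    for t in range(n_years):
        mu_t = beta0 + beta1 * t              # linear trend on the log scale
        log_n = rng.gauss(mu_t, sigma)        # year-specific deviation
        abundances.append(round(math.exp(log_n)))  # rounded for display only
    return abundances

# Hypothetical values: 13 years, about 3% growth per year on the log scale.
N_t = simulate_abundance(beta0=math.log(150), beta1=0.03, sigma=0.05, n_years=13)
for year, n in zip(range(1986, 1999), N_t):
    print(year, n)
```

In a full analysis each simulated N_t would be replaced by a latent parameter informed by that year's capture histories, with β_0, β_1, and σ estimated jointly.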

Beyond the case study, the paper discusses several broader implications. First, by keeping N as an explicit parameter, researchers can now attach informative priors, incorporate covariates, and link N across time or space in a fully Bayesian hierarchical framework. Second, the re‑parameterized augmentation retains the simplicity of BUGS‑style coding, eliminating the need for custom trans‑dimensional samplers while still delivering exact inference. Third, the approach is readily transferable to other closed‑population contexts such as disease outbreak investigations in confined groups, small‑area human demographic surveys, or any ecological study where individual capture probabilities are heterogeneous and the total population size is of primary interest.

In conclusion, the authors provide a rigorous analytical bridge between explicit trans‑dimensional MCMC and super‑population data augmentation, and they exploit this bridge to construct a practical, software‑friendly model that treats abundance N as a genuine parameter. The method enables straightforward hierarchical modeling of N, broadens the applicability of closed‑population capture‑recapture analyses, and offers a valuable tool for ecologists, epidemiologists, and demographers seeking robust abundance estimates under individual heterogeneity.

