A hierarchical Bayesian approach for estimating the origin of a mixed population


📝 Abstract

We propose a hierarchical Bayesian model to estimate the proportional contribution of source populations to a newly founded colony. Samples are derived from the first generation offspring in the colony, but mating may occur preferentially among migrants from the same source population. Genotypes of the newly founded colony and source populations are used to estimate the mixture proportions, and the mixture proportions are related to environmental and demographic factors that might affect the colonizing process. We estimate an assortative mating coefficient, mixture proportions, and regression relationships between environmental factors and the mixture proportions in a single hierarchical model. The first-stage likelihood for genotypes in the newly founded colony is a mixture multinomial distribution reflecting the colonizing process. The environmental and demographic data are incorporated into the model through a hierarchical prior structure. A simulation study is conducted to investigate the performance of the model by using different levels of population divergence and number of genetic markers included in the analysis. We use Markov chain Monte Carlo (MCMC) simulation to conduct inference for the posterior distributions of model parameters. We apply the model to a data set derived from grey seals in the Orkney Islands, Scotland. We compare our model with a similar model previously used to analyze these data. The results from both the simulation and application to real data indicate that our model provides better estimates for the covariate effects.


📄 Content

arXiv:0805.3269v1 [stat.AP] 21 May 2008

IMS Collections
Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh
Vol. 3 (2008) 237–250
© Institute of Mathematical Statistics, 2008
DOI: 10.1214/074921708000000174

A hierarchical Bayesian approach for estimating the origin of a mixed population∗

Feng Guo1, Dipak K. Dey2 and Kent E. Holsinger3
Virginia Polytechnic Institute, University of Connecticut and University of Connecticut

Abstract: We propose a hierarchical Bayesian model to estimate the proportional contribution of source populations to a newly founded colony. Samples are derived from the first generation offspring in the colony, but mating may occur preferentially among migrants from the same source population. Genotypes of the newly founded colony and source populations are used to estimate the mixture proportions, and the mixture proportions are related to environmental and demographic factors that might affect the colonizing process. We estimate an assortative mating coefficient, mixture proportions, and regression relationships between environmental factors and the mixture proportions in a single hierarchical model. The first-stage likelihood for genotypes in the newly founded colony is a mixture multinomial distribution reflecting the colonizing process. The environmental and demographic data are incorporated into the model through a hierarchical prior structure. A simulation study is conducted to investigate the performance of the model by using different levels of population divergence and number of genetic markers included in the analysis. We use Markov chain Monte Carlo (MCMC) simulation to conduct inference for the posterior distributions of model parameters. We apply the model to a data set derived from grey seals in the Orkney Islands, Scotland. We compare our model with a similar model previously used to analyze these data. The results from both the simulation and application to real data indicate that our model provides better estimates for the covariate effects.

Contents
1 Introduction
2 Models
3 Simulation study
4 Application to the grey seal data set
5 Conclusions
Appendix: A general approach for updating a proportional vector
Acknowledgments
References

∗Supported by NIH Grant 1R01-GM068449-01A1.
1Department of Statistics, Virginia Polytechnic Institute, Blacksburg, VA 24060, USA, e-mail: feng.guo@vt.edu
2Department of Statistics, University of Connecticut, Storrs, CT 06269, USA, e-mail: dey@stat.uconn.edu
3Department of Ecology and Evolutionary Biology, University of Connecticut, Storrs, CT 06269, USA, e-mail: kent@darwin.eeb.uconn.edu

AMS 2000 subject classifications: Primary 60K35, 60K35; secondary 60K35.
Keywords and phrases: hierarchical Bayes, MCMC, multinomial.

1. Introduction

Fisheries scientists and marine biologists are often faced with the problem of identifying the proportions of individuals in a single catch that come from different stocks. Estimating these proportions is necessary for evaluating the effect of commercial fisheries on individual fisheries stocks and for understanding the ecological factors that influence the relative contributions of different stocks. Similarly, those who study marine mammals are often interested in identifying the source populations for newly founded colonies as well as the environmental or demographic factors that influence the relative contributions of different sources. The increasing ease with which genetic data are collected and the tendency for populations of a species to become genetically differentiated over time have led to the increasing use of genetic markers to estimate the proportional contribution of source populations to mixed stocks. The rationale is simple: allele frequencies are likely to differ among source populations, and genotype frequencies in the harvest site/new habitat are determined by the proportional contributions of the source populations. Both the differences among source populations and the mixture proportions can be detected by appropriate statistical models.

Several methods have been developed for inference of the proportional contribution, m, where m_i is the percentage of individuals in the mixed population originating from source i. Conditional Maximum Likelihood Estimates (MLEs) have been widely used [8, 9]. The conditional MLE assumes the sampled source populations are exhaustive lists of all possible sources and the allele frequenc
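
The rationale stated in the introduction (allele frequencies in the mixed population are a weighted combination of the source frequencies, with weights m) can be illustrated with a small sketch. The code below is not the authors' hierarchical model: it assumes a single locus, sampled alleles rather than diploid genotypes, no assortative mating, and source allele frequencies that are known exactly. Under those assumptions it recovers m by a standard EM iteration for mixture weights, which is one simple way to obtain a conditional maximum likelihood estimate. All function names and the example frequencies are hypothetical and chosen only for illustration.

```python
import random


def mixed_freqs(source_freqs, m):
    """Allele frequencies in the mixed population: sum_i m_i * p_i."""
    n_alleles = len(source_freqs[0])
    return [sum(m[i] * source_freqs[i][a] for i in range(len(m)))
            for a in range(n_alleles)]


def simulate_mixture(source_freqs, m, n, rng):
    """Draw n alleles: pick a source with probability m_i, then an allele from it."""
    n_alleles = len(source_freqs[0])
    sample = []
    for _ in range(n):
        src = rng.choices(range(len(m)), weights=m)[0]
        sample.append(rng.choices(range(n_alleles), weights=source_freqs[src])[0])
    return sample


def em_mixture_proportions(alleles, source_freqs, n_iter=500, tol=1e-10):
    """EM estimate of m, treating source allele frequencies as known (assumption).

    Every observed allele must have positive frequency in at least one source,
    otherwise the normalising constant below is zero.
    """
    k = len(source_freqs)
    m = [1.0 / k] * k  # start from equal contributions
    for _ in range(n_iter):
        totals = [0.0] * k
        for a in alleles:
            # E-step: posterior probability that this allele came from source i
            weights = [m[i] * source_freqs[i][a] for i in range(k)]
            norm = sum(weights)
            for i in range(k):
                totals[i] += weights[i] / norm
        # M-step: new proportions are the average posterior memberships
        new_m = [t / len(alleles) for t in totals]
        converged = max(abs(x - y) for x, y in zip(new_m, m)) < tol
        m = new_m
        if converged:
            break
    return m


# Hypothetical example: two sources, one locus with four alleles.
rng = random.Random(0)
sources = [[0.7, 0.1, 0.1, 0.1],
           [0.1, 0.1, 0.1, 0.7]]
true_m = [0.3, 0.7]
sample = simulate_mixture(sources, true_m, 2000, rng)
m_hat = em_mixture_proportions(sample, sources)
print("estimated mixture proportions:", m_hat)
```

The estimate is only as good as the divergence between sources allows: if the two rows of `sources` were nearly identical, the likelihood would be almost flat in m, which is consistent with the abstract's choice to study performance under different levels of population divergence and numbers of markers.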

This content is AI-processed based on ArXiv data.
