Similar literature
20 similar articles found
1.
Vasco DA. Genetics, 2008, 179(2): 951-963
The estimation of ancestral and current effective population sizes in expanding populations is a fundamental problem in population genetics. Recently it has become possible to scan entire genomes of several individuals within a population. These genomic data sets can be used to estimate basic population parameters such as the effective population size and population growth rate. Full-data-likelihood methods potentially offer a powerful statistical framework for inferring population genetic parameters. However, for large data sets, computationally intensive methods based upon full-likelihood estimates may encounter difficulties. First, the computational method may be prohibitively slow or difficult to implement for large data. Second, estimation bias may markedly affect the accuracy and reliability of parameter estimates, as suggested from past work on coalescent methods. To address these problems, a fast and computationally efficient least-squares method for estimating population parameters from genomic data is presented here. Instead of modeling genomic data using a full likelihood, this new approach uses an analogous function, in which the full data are replaced with a vector of summary statistics. Furthermore, these least-squares estimators may show significantly less estimation bias for growth rate and genetic diversity than a corresponding maximum-likelihood estimator for the same coalescent process. The least-squares statistics also scale up to genome-sized data sets with many nucleotides and loci. These results demonstrate that least-squares statistics will likely prove useful for nonlinear parameter estimation when the underlying population genomic processes have complex evolutionary dynamics involving interactions between mutation, selection, demography, and recombination.

2.
Girod C, Vitalis R, Leblois R, Fréville H. Genetics, 2011, 188(1): 165-179
Reconstructing the demographic history of populations is a central issue in evolutionary biology. Using likelihood-based methods coupled with Monte Carlo simulations, it is now possible to reconstruct past changes in population size from genetic data. Using simulated data sets under various demographic scenarios, we evaluate the statistical performance of Msvar, a full-likelihood Bayesian method that infers past demographic change from microsatellite data. Our simulation tests show that Msvar is very efficient at detecting population declines and expansions, provided the event is neither too weak nor too recent. We further show that Msvar outperforms two moment-based methods (the M-ratio test and Bottleneck) for detecting population size changes, whatever the time and the severity of the event. The same trend emerges from a compilation of empirical studies. The latest version of Msvar provides estimates of the current and the ancestral population size and the time since the population started changing in size. We show that, in the absence of prior knowledge, Msvar provides little information on the mutation rate, which results in biased estimates and/or wide credibility intervals for each of the demographic parameters. However, scaling the population size parameters with the mutation rate and scaling the time with current population size, as coalescent theory requires, significantly improves the quality of the estimates for contraction but not for expansion scenarios. Finally, our results suggest that Msvar is robust to moderate departures from a strict stepwise mutation model.

3.
Personal genome tests are now offered direct-to-consumer (DTC) via genetic variants identified by genome-wide association studies (GWAS) for common diseases. Tests report risk estimates (age-specific and lifetime) for various diseases based on genotypes at multiple loci. However, uncertainty surrounding such risk estimates has not been systematically investigated. With breast cancer as an example, we examined the combined effect of uncertainties in population incidence rates, genotype frequency, effect sizes, and models of joint effects among genetic variants on lifetime risk estimates. We performed simulations to estimate lifetime breast cancer risk for carriers and noncarriers of genetic variants. We derived population-based cancer incidence rates from Surveillance, Epidemiology, and End Results (SEER) Program and comparative international data. We used data for non-Hispanic white women from 2003 to 2005. We derived genotype frequencies and effect sizes from published GWAS and meta-analyses. For a single genetic variant in FGFR2 gene (rs2981582), combination of uncertainty in these parameters produced risk estimates where upper and lower 95% simulation intervals differed by more than 3-fold. Difference in population incidence rates was the largest contributor to variation in risk estimates. For a panel of five genetic variants, estimated lifetime risk of developing breast cancer before age 80 for a woman that carried all risk variants ranged from 6.1% to 21%, depending on assumptions of additive or multiplicative joint effects and breast cancer incidence rates. Epidemiologic parameters involved in computation of disease risk have substantial uncertainty, and cumulative uncertainty should be properly recognized. Reliance on point estimates alone could be seriously misleading.

4.
Vitalis R, Couvet D. Genetics, 2001, 157(2): 911-925
Standard methods for inferring demographic parameters from genetic data are based mainly on one-locus theory. However, the association of genes at different loci (e.g., two-locus identity disequilibrium) may also contain some information about demographic parameters of populations. In this article, we define one- and two-locus parameters of population structure as functions of one- and two-locus probabilities for the identity in state of genes. Since these parameters are known functions of demographic parameters in an infinite island model, we develop moment-based estimators of effective population size and immigration rate from one- and two-locus parameters. We evaluate this method through simulation. Although variance and bias may be quite large, increasing the number of loci on which the estimates are derived improves the method. We simulate an infinite allele model and a K allele model of mutation. Bias and variance are smaller with increasing numbers of alleles per locus. This is, to our knowledge, the first attempt of a joint estimation of local effective population size and immigration rate.

5.
Kitada S, Hayashi T, Kishino H. Genetics, 2000, 156(4): 2063-2079
We developed an empirical Bayes procedure to estimate genetic distances between populations using allele frequencies. This procedure makes it possible to describe the skewness of the genetic distance while taking full account of the uncertainty of the sample allele frequencies. Dirichlet priors of the allele frequencies are specified, and the posterior distributions of the various composite parameters are obtained by Monte Carlo simulation. To avoid overdependence on subjective priors, we adopt a hierarchical model and estimate hyperparameters by maximizing the joint marginal-likelihood function. Taking advantage of the empirical Bayesian procedure, we extend the method to estimate the effective population size using temporal changes in allele frequencies. The method is applied to data sets on red sea bream, herring, northern pike, and ayu broodstock. It is shown that overdispersion overestimates the genetic distance and underestimates the effective population size, if it is not taken into account during the analysis. The joint marginal-likelihood function also estimates the rate of gene flow into island populations.

6.
High-throughput shotgun sequence data make it possible in principle to accurately estimate population genetic parameters without confounding by SNP ascertainment bias. One such statistic of interest is the proportion of heterozygous sites within an individual’s genome, which is informative about inbreeding and effective population size. However, in many cases, the available sequence data of an individual are limited to low coverage, preventing the confident calling of genotypes necessary to directly count the proportion of heterozygous sites. Here, we present a method for estimating an individual’s genome-wide rate of heterozygosity from low-coverage sequence data, without an intermediate step that calls genotypes. Our method jointly learns the shared allele distribution between the individual and a panel of other individuals, together with the sequencing error distributions and the reference bias. We show our method works well, first, by its performance on simulated sequence data and, second, on real sequence data where we obtain estimates using low-coverage data consistent with those from higher coverage. We apply our method to obtain estimates of the rate of heterozygosity for 11 humans from diverse worldwide populations and through this analysis reveal the complex dependency of local sequencing coverage on the true underlying heterozygosity, which complicates the estimation of heterozygosity from sequence data. We show how we can use filters to correct for the confounding arising from sequencing depth. We find in practice that ratios of heterozygosity are more interpretable than absolute estimates and show that we obtain excellent conformity of ratios of heterozygosity with previous estimates from higher-coverage data.

7.
B. Ganter, J. Madsen. Bird Study, 2013, 60(1): 90-101
We estimated the size of the Svalbard population of Pink-footed Geese Anser brachyrhynchus in 1991–98 using three different methods: (a) counts; (b) mark–resight estimates; (c) annual productivity and survival. Count data showed a slight increase of population size, from 33 000 in 1991 to 38 500 in 1998. Mark–resight estimates showed a larger fluctuation, but were almost always greater than counts. By contrast, estimates of survival and productivity suggested stability or at least a less pronounced increase in the population size, the discrepancy in the number estimated when compared to the other methods being especially large in the last two years of the study. A detailed examination of the assumptions underlying each of the methods reveals possible explanations for some, but not all, of the discrepancies. We conclude that goose population estimates derived from total population counts may be less reliable than commonly assumed, and moderate year-to-year trends should not be over-interpreted. Similarly, assessment of annual productivity and survival may be subject to undetected biases, and these uncertainties should be considered when interpreting results and trends in these parameters. Repeated cross-validation of parameter estimation methods in this and other populations is highly desirable.

8.
Molecular sequences obtained at different sampling times from populations of rapidly evolving pathogens and from ancient subfossil and fossil sources are increasingly available with modern sequencing technology. Here, we present a Bayesian statistical inference approach to the joint estimation of mutation rate and population size that incorporates the uncertainty in the genealogy of such temporally spaced sequences by using Markov chain Monte Carlo (MCMC) integration. The Kingman coalescent model is used to describe the time structure of the ancestral tree. We recover information about the unknown true ancestral coalescent tree, population size, and the overall mutation rate from temporally spaced data, that is, from nucleotide sequences gathered at different times, from different individuals, in an evolving haploid population. We briefly discuss the methodological implications and show what can be inferred, in various practically relevant states of prior knowledge. We develop extensions for exponentially growing population size and joint estimation of substitution model parameters. We illustrate some of the important features of this approach on a genealogy of HIV-1 envelope (env) partial sequences.
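
Editorial illustration (not taken from the abstract): the Kingman coalescent named above can be sketched for the simplest case of a constant-size population with all sequences sampled at the same time. While k lineages remain, the waiting time to the next coalescence is exponential with rate k(k-1)/2 in coalescent time units, so the expected tree height for n lineages is 2(1 - 1/n). The function names are hypothetical, and the sketch does not attempt the temporally spaced sampling or the MCMC integration the abstract describes.

```python
import random


def expected_tmrca(n):
    """Expected time to the most recent common ancestor of n lineages,
    in coalescent units: sum over k = 2..n of 2 / (k * (k - 1))."""
    return sum(2.0 / (k * (k - 1)) for k in range(2, n + 1))


def simulate_tmrca(n, rng):
    """Draw one tree height under the constant-size Kingman coalescent.

    While k lineages remain, the waiting time to the next coalescence
    is exponential with rate k * (k - 1) / 2.
    """
    height = 0.0
    for k in range(n, 1, -1):
        height += rng.expovariate(k * (k - 1) / 2.0)
    return height


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the run is repeatable
    n = 10
    heights = [simulate_tmrca(n, rng) for _ in range(20000)]
    print("expected :", expected_tmrca(n))
    print("simulated:", sum(heights) / len(heights))
```

The simulated mean should approach the closed-form expectation as the number of replicates grows; converting coalescent units to generations requires the population size, which is one of the quantities the abstract's method estimates.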

9.
Microsatellites have been widely used to reconstruct human evolution. However, the efficient use of these markers relies on information regarding the process producing the observed variation. Here, we present a novel approach to the locus-by-locus characterization of this process. By analyzing somatic mutations in cancer patients, we estimated the distributions of mutation size for each of 20 loci. The same loci were then typed in three ethnically diverse population samples. The generalized stepwise mutation model was used to test the predicted relationship between population and mutation parameters under two demographic scenarios: constant population size and rapid expansion. The agreement between the observed and expected relationship between population and mutation parameters, even when the latter are estimated in cancer patients, confirms that somatic mutations may be useful for investigating the process underlying population variation. Estimated distributions of mutation size differ substantially amongst loci, and mutations of more than one repeat unit are common. A new statistic, the normalized population variance, is introduced for multilocus estimation of demographic parameters, and for testing demographic scenarios. The observed population variation is not consistent with a constant population size. Time estimates of the putative population expansion are in agreement with those obtained by other methods.

10.
Transect count data form the basis of many butterfly and other insect monitoring programs worldwide. A clear understanding of the limitations of such datasets, including the potential for biases in the statistical methods used to analyze them, is therefore crucial. The classical Zonneveld model (CZ) can extract estimates of a suite of demographic parameters from transect count datasets, and has also been used in theoretical analyses of protandry and reproductive asynchrony. The CZ relies on strong assumptions about the emergence and death processes underlying observed transect count datasets. Though reasonable as a starting place, a growing body of empirical evidence suggests these assumptions will, in many cases, not hold. Here, I explore how violations of these assumptions bias CZ-based estimates of two key population parameters: total population size and mean individual lifespan. To do this, I generalize the Zonneveld model by relaxing the symmetrical emergence distribution and constant death rate assumptions such that the generalized models contain the CZ as a special case. Using the generalized models as data generating processes, I then show that the CZ is able to closely mimic the shape of the abundance time course produced by either variant of the generalized model under a wide range of conditions, but produces highly biased estimates of population size and mean lifespan in doing so. My analysis therefore demonstrates both that the CZ is not robust to violations of its emergence and death assumptions, and that a good observed fit to transect count data does not mean these assumptions are satisfied.

11.
In many animal populations, demographic parameters such as survival and recruitment vary markedly with age, as do parameters related to sampling, such as capture probability. Failing to account for such variation can result in biased estimates of population‐level rates. However, estimating age‐dependent survival rates can be challenging because ages of individuals are rarely known unless tagging is done at birth. For many species, it is possible to infer age based on size. In capture–recapture studies of such species, it is possible to use a growth model to infer the age at first capture of individuals. We show how to build estimates of age‐dependent survival into a capture–mark–recapture model based on data obtained in a capture–recapture study. We first show how estimates of age based on length increments closely match those based on definitive aging methods. In simulated analyses, we show that both individual ages and age‐dependent survival rates estimated from simulated data closely match true values. With our approach, we are able to estimate the age‐specific apparent survival rates of Murray and trout cod in the Murray River, Australia. Our model structure provides a flexible framework within which to investigate various aspects of how survival varies with age and will have extensions within a wide range of ecological studies of animals where age can be estimated based on size.

12.
COMPARISON OF METHODS USED TO ESTIMATE NUMBERS OF WALRUSES ON SEA ICE
The US and former USSR conducted joint surveys of Pacific walruses on sea ice and at land haul-outs in 1975, 1980, 1985, and 1990. One of the difficulties in interpreting results of these surveys has been that, except for the 1990 survey, the Americans and Soviets used different methods for estimating population size from their respective portions of the sea ice data. We used data exchanged between Soviet and American scientists to compare and evaluate the two estimation procedures and to derive a set of alternative estimates from the 1975, 1980, and 1985 surveys based on a single consistent procedure. Estimation method had only a small effect on total population estimates because most walruses were found at land haul-outs. However, the Soviet method is subject to bias that depends on the distribution of the population on the sea ice and this has important implications for interpreting the ice portions of previously reported surveys for walruses and other pinniped species. We recommend that the American method be used in future surveys. Future research on survey methods for walruses should focus on other potential sources of bias and variation.

13.
Synopsis: The very sparse data that are available on the abundance, population structure and biology of the coelacanth Latimeria chalumnae off Grand Comoro are summarised, and some simple numerical analyses are carried out to explore certain aspects of the population dynamics, particularly the age-profile of the population. The object has not been to provide estimates of key demographic parameters, such as mortality rates, but to propose various scenarios that are useful for comparison with real data as they become available. The analysis also makes it possible to reach some preliminary conclusions that are relevant to the management of the coelacanth population. For instance, it appears that the catch rate of coelacanths by artisanal fishermen may have a negligible effect on coelacanth survivorship, and it is more likely that population size and structure are determined by natural mortality rates and birth rates. It is suggested that predation is the main cause of natural mortality and that the main predators of coelacanths are likely to be large sharks. Interference with the traditional patterns of the Comoran artisanal fishery may threaten the coelacanth. Several important gaps in our knowledge of coelacanth demography are identified.

14.
The methods of Bailey and of Jolly and Seber were used to provide maximum likelihood estimates of population parameters for Jackson's classical mark-recapture experiments on males of the tsetse fly Glossina m. morsitans Westwood. These were compared with Jolly-Seber (J-S) estimates for the same fly from more recent work on Antelope Island, Lake Kariba, Zimbabwe. The Bailey estimates of birth and death rates and total population size had markedly lower variances than Jackson's originals. Both sets of estimates provided moving averages over 6-week periods, whereas the Jolly-Seber analysis provided independent weekly estimates and their variance is consequently higher. Saturation deficit and maximum temperature (Tmax) accounted for 11 and 16% respectively of the variance in independent 4-week means of the weekly J-S survival probabilities. Analysis of covariance, carried out on a joint data set of smoothed J-S estimates of the survival probability in Tanzania and Zimbabwe, showed a significant effect of Tmax on survival. When this effect was removed, the survival probability in the Tanzania studies was found to be 8% lower than on Antelope Island. The two effects accounted for 50% of the variance in the joint data. When saturation deficit was substituted for Tmax, regression only accounted for 35% of the variance. If saturation deficit is important in determining tsetse survival, it must act on stages other than the post-teneral adult. Given the continuous increase in mortality, even at moderate temperatures, it is hard to envisage a direct effect of Tmax. There may be an indirect effect, however, via the number of hunger-related deaths resulting from the increase in the feeding rate with increasing temperature.

15.
The important parameter of effective population size is rarely estimable directly from demographic data. Indirect estimates of effective population size may be made from genetic data such as temporal variation of allelic frequencies or linkage disequilibrium in cohorts. We suggest here that an indirect estimate of the effective number of breeders might be based on the excess of heterozygosity expected in a cohort of progeny produced by a limited number of males and females. In computer simulations, heterozygote excesses for 30 unlinked loci having various numbers of alleles and allele-frequency profiles were obtained for cohorts produced by samples of breeders drawn from an age-structured population and having known variance in reproductive success and effective number. The 95% confidence limits around the estimate contained the true effective population size in 70 of 72 trials and the Spearman rank correlation of estimated and actual values was 0.991. An estimate based on heterozygote excess might have certain advantages over the previous estimates, requiring only single-locus and single-cohort data, but the sampling error among individuals and the effect of departures from random union of gametes still need to be explored.
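
Editorial illustration (not taken from the abstract): one way to see why a limited number of breeders produces a heterozygote excess is to note that, with few parents, allele frequencies in the male and female parents differ by chance. At a biallelic locus with allele frequency p_m in male gametes and p_f in female gametes, random union gives a progeny heterozygote frequency of p_m(1 - p_f) + p_f(1 - p_m), which exceeds the Hardy-Weinberg expectation computed from the pooled frequency by exactly (p_m - p_f)^2 / 2. The sketch below checks this identity numerically; the function names are hypothetical, and it is not the authors' estimator of the effective number of breeders.

```python
def progeny_heterozygosity(p_male, p_female):
    """Heterozygote frequency in progeny at a biallelic locus when male
    and female gametes (allele frequencies p_male, p_female) unite at random."""
    return p_male * (1 - p_female) + p_female * (1 - p_male)


def hardy_weinberg_heterozygosity(p_male, p_female):
    """Hardy-Weinberg heterozygosity computed from the pooled progeny
    allele frequency (the average of the two parental frequencies)."""
    p_bar = (p_male + p_female) / 2
    return 2 * p_bar * (1 - p_bar)


def heterozygote_excess(p_male, p_female):
    """Observed minus expected heterozygosity; equals (p_male - p_female)**2 / 2."""
    return (progeny_heterozygosity(p_male, p_female)
            - hardy_weinberg_heterozygosity(p_male, p_female))


if __name__ == "__main__":
    # Illustrative frequencies only, not data from the study.
    for p_m, p_f in [(0.5, 0.5), (0.6, 0.4), (0.9, 0.2)]:
        print(p_m, p_f, heterozygote_excess(p_m, p_f), (p_m - p_f) ** 2 / 2)
```

Because the excess depends on the squared difference between the sexes, it shrinks as the number of breeders grows and the two parental frequencies converge, which is the signal an estimator of this kind would exploit.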

16.
Austerlitz F, Kalaydjieva L, Heyer E. Genetics, 2003, 165(3): 1579-1586
The frequency of a rare mutant allele and the level of allelic association between this allele and one or several closely linked markers are frequently measured in genetic epidemiology. Both quantities are related to the time elapsed since the appearance of the mutation in the population and the intrinsic growth rate of the mutation (which may be different from the average population growth rate). Here, we develop a method that uses these two kinds of genetic data to perform a joint estimation of the age of the mutation and the minimum growth rate that is compatible with its present frequency. In absence of demographic data, it provides a useful estimate of population growth rate. When such data are available, contrasts among estimates from several loci allow demographic processes, affecting all loci similarly, to be distinguished from selection, affecting loci differently. Testing these estimates on populations for which data are available for several disorders shows good congruence with demographic data in some cases whereas in others higher growth rates are obtained, which may be the result of selection or hidden demographic processes.

17.
Abstract: Satellite tracking is currently used to make inferences to avian populations. Cost of transmitters and logistical challenges of working with some species can limit sample size and strength of inferences. Therefore, careful study design including consideration of sample size is important. We used simulations to examine how sample size, population size, and population variance affected probability of making reliable inferences from a sample and the precision of estimates of population parameters. For populations of >100 individuals, a sample >20 birds was needed to make reliable inferences about questions with simple outcomes (i.e., 2 possible outcomes). Sample size demands increased rapidly for more complex problems. For example, in a problem with 3 outcomes, a sample of >75 individuals will be needed for proper inference to the population. Combining data from satellite telemetry studies with data from surveys or other types of sampling may improve inference strength.

18.
Hunter–gatherer population growth rate estimates extracted from archaeological proxies and ethnographic data show remarkable differences, as archaeological estimates are orders of magnitude smaller than ethnographic and historical estimates. This could imply that prehistoric hunter–gatherers were demographically different from recent hunter–gatherers. However, we show that the resolution of archaeological human population proxies is not sufficiently high to detect actual population dynamics and growth rates that can be observed in the historical and ethnographic data. We argue that archaeological and ethnographic population growth rates measure different things; therefore, they are not directly comparable. While ethnographic growth rate estimates of hunter–gatherer populations are directly linked to underlying demographic parameters, archaeological estimates track changes in the long-term mean population size, which reflects changes in the environmental productivity that provide the ultimate constraint for forager population growth. We further argue that because of this constraining effect, hunter–gatherer populations cannot exhibit long-term growth independently of increasing environmental productivity. This article is part of the theme issue ‘Cross-disciplinary approaches to prehistoric demography’.

19.
Antarctic phocid seals and particularly the crabeater (Lobodon carcinophagus) have been observed to display a diurnal cycle in their propensity to haul out on pack ice where they are visible for census. The fact that they are not visible for much of the 24-h period means that density estimates made over broad geographic areas at various times of the day statistically confound this cycle with geographic variability. Limitation of census observations to times of peak haulout results in extreme logistical difficulties and/or considerable reduction in sample size upon which to base population estimates. Reduced sample size results in high variability in population estimates and broad confidence bands. To develop a model with which to correct density estimates for variability due to diurnal cycle, a series of stationary censuses at fixed locations in the Antarctic continental ice pack was made over significant fractions of several days. A unimodal polynomial model for the observed density variation in any one location was statistically significant; a similar model combining multiple locations with densities standardized to peak daily values was also significant. The latter model was used to make corrections for time of day to density estimates in three test data sets taken over broad geographic areas of the Antarctic. Statistical simulation (bootstrap) methods were used to determine if variances of corrected density estimates were lower than those based on uncorrected observations taken only during the peak haulout times of the day. Results were that 95% interval estimates for corrected densities were narrowed to between 40% and 61% of the uncorrected estimates. While there are additional possible sources of variation in haulout tendency, pending further data collection and analyses, the model developed represents a considerably more precise methodology than either averaging over haulout variability or limiting observations to peak daily periods.
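
Editorial illustration (not taken from the abstract): the bootstrap interval estimates mentioned above can be sketched with a plain percentile bootstrap of a mean. The abstract does not say which bootstrap variant or which statistic the authors used, so the percentile method, the function name, and the density values below are all assumptions made for illustration.

```python
import random


def bootstrap_interval(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values`.

    Resamples the data with replacement n_boot times, computes the mean
    of each resample, and returns the alpha/2 and 1 - alpha/2 percentiles.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    n = len(values)
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n
        for _ in range(n_boot)
    )
    lower_index = int(round((alpha / 2) * n_boot))
    upper_index = int(round((1 - alpha / 2) * n_boot)) - 1
    return means[lower_index], means[upper_index]


if __name__ == "__main__":
    # Hypothetical seal densities (animals per unit area), not study data.
    densities = [12, 15, 9, 20, 14, 11, 17, 13]
    print(bootstrap_interval(densities))
```

Comparing the width of such intervals for corrected and uncorrected density estimates is one way to express the kind of narrowing the abstract reports.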

20.
The brush-tailed rock-wallaby (Petrogale penicillata) is an endangered species in southeastern Australia and many of the remaining populations are declining. The steep rocky habitat and shy nature of the species make it difficult to obtain data on population parameters such as abundance and recruitment. Faecal pellet counts from scat plots are commonly used to monitor population trends but these are imprecise and difficult to relate to absolute population size. We conducted a noninvasive genetic sampling 'mark-recapture' study over a 2-year period to identify individuals from faecal DNA samples and estimate the population size of four brush-tailed rock-wallaby colonies located in Wollemi National Park, New South Wales. Scat plots in rock-wallaby colonies were used as sample collection points for this study. Two separate population estimates were carried out for three of the colonies to determine if we could detect recruitment and changes in population size. We determined that there was one large colony of an estimated 67 individuals (95% confidence interval: 55-91) and three smaller colonies. Monitoring of the smaller colonies also detected possible population size increases in all three. Our results indicate that faecal DNA analysis may be a promising method for estimating and monitoring population trends in this species particularly when used with a traditional field survey method.
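
Editorial illustration (not taken from the abstract): the simplest mark-recapture calculation is the two-session Lincoln-Petersen estimate, shown here in Chapman's bias-corrected form, N = (n1 + 1)(n2 + 1) / (m + 1) - 1, where n1 and n2 are the numbers of distinct individuals identified in each session and m is the number identified in both. The abstract does not state which estimator the authors used, and a faecal-DNA study would typically involve more sessions and a more elaborate model, so this sketch and its counts are assumptions for illustration only.

```python
def chapman_estimate(n_first, n_second, n_recaptured):
    """Chapman's bias-corrected Lincoln-Petersen population estimate.

    n_first      -- individuals identified in the first session
    n_second     -- individuals identified in the second session
    n_recaptured -- individuals identified in both sessions
    """
    if n_recaptured > min(n_first, n_second):
        raise ValueError("recaptures cannot exceed either session total")
    return (n_first + 1) * (n_second + 1) / (n_recaptured + 1) - 1


if __name__ == "__main__":
    # Hypothetical genotype counts, not data from the study.
    print(chapman_estimate(n_first=30, n_second=25, n_recaptured=10))
```

The estimate assumes a closed population between sessions and equal detection probability for all individuals, which are the assumptions a field study would need to check.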
