Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Parentage assignment is defined as the identification of the true parents of one focal offspring among a list of candidates and has been commonly used in zoological, ecological, and agricultural studies. Although likelihood‐based parentage assignment is the preferred method in most cases, it requires genotyping a predefined set of DNA markers and providing their population allele frequencies. In the present study, we propose an alternative method of parentage assignment that does not depend on genotype data or prior information on allele frequencies. Our method clusters restriction site‐associated DNA sequencing (RAD‐seq) reads into the RAD loci shared among the compared individuals, after which the likelihood ratio of parentage assignment can be calculated directly using two parameters: the genome heterozygosity and the error rate of sequencing reads. This method was validated on one simulated and two real data sets, in which the true parents were accurately assigned to the focal offspring. However, our method cannot provide statistical confidence that the first-ranked candidate is a true parent.

2.
Kinship plays a fundamental role in the evolution of social systems and is considered a key driver of group living. To understand the role of kinship in the formation and maintenance of social bonds, accurate measures of genetic relatedness are critical. Genotype‐by‐sequencing technologies are rapidly advancing the accuracy and precision of genetic relatedness estimates for wild populations. The ability to assign kinship from genetic data varies depending on a species’ or population's mating system and pattern of dispersal, and empirical data from longitudinal studies are crucial to validate these methods. We use data from a long‐term behavioural study of a polygynandrous, bisexually philopatric marine mammal to measure accuracy and precision of parentage and genetic relatedness estimation against a known partial pedigree. We show that with moderate but obtainable sample sizes of approximately 4,235 SNPs and 272 individuals, highly accurate parentage assignments and genetic relatedness coefficients can be obtained. Additionally, we subsample our data to quantify how data availability affects relatedness estimation and kinship assignment. Lastly, we conduct a social network analysis to investigate the extent to which accuracy and precision of relatedness estimation improve statistical power to detect an effect of relatedness on social structure. Our results provide practical guidance for minimum sample sizes and sequencing depth for future studies, as well as thresholds for post hoc interpretation of previous analyses.

3.
Choi SC, Hey J. Genetics, 2011, 189(2): 561-577
A new approach to assigning individuals to populations using genetic data is described. Most existing methods work by maximizing Hardy-Weinberg and linkage equilibrium within populations, neither of which will apply for many demographic histories. By including a demographic model, within a likelihood framework based on coalescent theory, we can jointly study demographic history and population assignment. Genealogies and population assignments are sampled from a posterior distribution using a general isolation-with-migration model for multiple populations. A measure of partition distance between assignments facilitates not only the summary of a posterior sample of assignments, but also the estimation of the posterior density for the demographic history. It is shown that joint estimates of assignment and demographic history are possible, including estimation of population phylogeny for samples from three populations. The new method is compared to results of a widely used assignment method, using simulated and published empirical data sets.

4.
The cost of parentage assignment precludes its application in many selective breeding programmes and molecular ecology studies, and/or limits the circumstances or number of individuals to which it is applied. Pooling samples from more than one individual, and using appropriate genetic markers and algorithms to determine parental contributions to pools, is one means of reducing the cost of parentage assignment. This paper describes and validates a novel maximum likelihood (ML) parentage-assignment method that can be used to accurately assign parentage to pooled samples of multiple individuals (previously published ML methods are applicable to samples of single individuals only) using low-density single nucleotide polymorphism (SNP) ‘quantitative’ (also referred to as ‘continuous’) genotype data. It is demonstrated with simulated data that, when applied to pools, this ‘quantitative maximum likelihood’ method assigns parentage with greater accuracy than established maximum likelihood parentage-assignment approaches, which rely on accurate discrete genotype calls; exclusion methods; and estimating parental contributions to pools by solving the weighted least squares problem. Quantitative maximum likelihood can be applied to pools generated using either a ‘pooling-for-individual-parentage-assignment’ approach, whereby each individual in a pool is tagged or traceable and from a known and mutually exclusive set of possible parents; or a ‘pooling-by-phenotype’ approach, whereby individuals of the same, or similar, phenotype/s are pooled. Although computationally intensive when applied to large pools, quantitative maximum likelihood has the potential to substantially reduce the cost of parentage assignment, even if applied to pools comprised of few individuals.
Subject terms: Plant breeding, Agricultural genetics, Animal breeding, Genetic markers

5.
1. Traditional estimation of age-specific survival and mortality rates in vertebrates is limited to individuals with known age. Although this subject has been studied extensively using effective capture-recapture and capture-recovery models, inference remains challenging because of large numbers of incomplete records (i.e. unknown age of many individuals) and because of the inadequate duration of the studies. 2. Here, we present a hierarchical model for capture-recapture/recovery (CRR) data sets with large proportions of unknown times of birth and death. The model uses a Bayesian framework to draw inference on population-level age-specific demographic rates using parametric survival functions and applies this information to reconstruct times of birth and death for individuals with unknown age. 3. We simulated a set of CRR data sets with varying study span and proportions of individuals with known age, and varying recapture and recovery probabilities. We used these data sets to compare our method to a traditional CRR model, which requires knowledge of individual ages. Subsequently, we applied our method to a subset of a long-term CRR data set on Soay sheep. 4. Our results show that this method performs better than the common CRR model when sample sizes are low. Still, our model is sensitive to the choice of priors with low recapture probability and short studies. In such cases, priors that overestimate survival perform better than those that underestimate it. Also, the model was able to estimate accurately ages at death for Soay sheep, with an average error of 0.94 years and to identify differences in mortality rate between sexes. 5. Although many of the problems in the estimation of age-specific survival can be reduced through more efficient sampling schemes, most ecological data sets are still sparse and with a large proportion of missing records. 
Thus, improved sampling still needs to be combined with statistical models capable of overcoming the unavoidable limitations of any fieldwork. We show that our approach provides reliable estimates of parameters and unknown times of birth and death even with the most incomplete data sets while being flexible enough to accommodate multiple recapture probabilities and covariates.

6.
In this paper, we examine the effects of the State Innovation Models Initiative (SIM) on population-level health status. SIM provided $250 million to six states in 2013 for broad delivery system reforms. We use data from the Behavioral Risk Factor Surveillance System for the years 2010–2016. Our sample is restricted to individuals ages 45 and older residing in 6 SIM and 15 control states. Treatment effects in a difference-in-difference design are estimated using a latent factor model for multiple indicators of health status. In addition to estimates for the primary sample, we obtain estimates for six subsamples based on strata of age, education, income, race and urban/rural status. We find that individuals in states that implemented SIM show significant improvements in health status. The effects of SIM are greater among older, Medicare eligible individuals, including those living in rural areas. The State Innovation Models Initiative, which provided financial incentives for states to implement health care delivery system reforms, led to population-level improvements in health status.

7.
Anderson EC, Garza JC. Genetics, 2006, 172(4): 2567-2582
Likelihood-based parentage inference depends on the distribution of a likelihood-ratio statistic, which, in most cases of interest, cannot be exactly determined, but only approximated by Monte Carlo simulation. We provide importance-sampling algorithms for efficiently approximating very small tail probabilities in the distribution of the likelihood-ratio statistic. These importance-sampling methods allow the estimation of small false-positive rates and hence permit likelihood-based inference of parentage in large studies involving a great number of potential parents and many potential offspring. We investigate the performance of these importance-sampling algorithms in the context of parentage inference using single-nucleotide polymorphism (SNP) data and find that they may accelerate the computation of tail probabilities by more than a millionfold. We subsequently use the importance-sampling algorithms to calculate the power available with SNPs for large-scale parentage studies, paying particular attention to the effect of genotyping errors and the occurrence of related individuals among the members of the putative mother-father-offspring trios. These simulations show that 60-100 SNPs may allow accurate pedigree reconstruction, even in situations involving thousands of potential mothers, fathers, and offspring. In addition, we compare the power of exclusion-based parentage inference to that of the likelihood-based method. Likelihood-based inference is much more powerful under many conditions; exclusion-based inference would require 40% more SNP loci to achieve the same accuracy as the likelihood-based approach in one common scenario. Our results demonstrate that SNPs are a powerful tool for parentage inference in large managed and/or natural populations.
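
The idea of estimating a small false-positive rate by importance sampling can be illustrated with a toy sketch. This is not the authors' algorithm: it assumes a single candidate parent and offspring, independent biallelic SNPs with a common allele frequency `p`, Hardy-Weinberg proportions, and no genotyping error, and all function names are hypothetical. Pairs are simulated under the parent-offspring hypothesis and each draw is reweighted by the inverse likelihood ratio, which gives an unbiased estimate of the tail probability under the unrelated hypothesis.

```python
import math
import random
from itertools import product


def hwe(p):
    """Hardy-Weinberg genotype frequencies for 0, 1, 2 copies of the allele."""
    q = 1.0 - p
    return [q * q, 2.0 * p * q, p * p]


def transmission(p):
    """P(offspring genotype | parent genotype), other allele drawn at random."""
    q = 1.0 - p
    return [[q, p, 0.0], [0.5 * q, 0.5, 0.5 * p], [0.0, q, p]]


def log_lr(parent, child, p):
    """Log likelihood ratio: parent-offspring versus unrelated, summed over loci."""
    trans, freq = transmission(p), hwe(p)
    total = 0.0
    for g, o in zip(parent, child):
        if trans[g][o] == 0.0:
            return -math.inf  # Mendelian incompatibility
        total += math.log(trans[g][o] / freq[o])
    return total


def tail_prob_importance(p, n_loci, threshold, n_draws, rng):
    """Estimate P(log LR > threshold | unrelated) from parent-offspring draws."""
    trans, freq = transmission(p), hwe(p)
    genotypes = [0, 1, 2]
    acc = 0.0
    for _ in range(n_draws):
        parent = rng.choices(genotypes, weights=freq, k=n_loci)
        child = [rng.choices(genotypes, weights=trans[g])[0] for g in parent]
        llr = log_lr(parent, child, p)
        if llr > threshold:
            acc += math.exp(-llr)  # importance weight = 1 / LR
    return acc / n_draws


def tail_prob_exact(p, n_loci, threshold):
    """Brute-force enumeration; feasible only for a handful of loci."""
    freq = hwe(p)
    total = 0.0
    for parent in product(range(3), repeat=n_loci):
        for child in product(range(3), repeat=n_loci):
            if log_lr(parent, child, p) > threshold:
                total += math.prod(freq[g] * freq[o] for g, o in zip(parent, child))
    return total


if __name__ == "__main__":
    rng = random.Random(1)
    print(tail_prob_importance(0.3, 4, 1.0, 20000, rng))
    print(tail_prob_exact(0.3, 4, 1.0))
```

With only four loci the exact value can be enumerated for comparison; the appeal of the reweighting is that it still works when the tail probability is far too small to hit by simulating unrelated pairs directly.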

8.
J. Wang, A. W. Santure. Genetics, 2009, 181(4): 1579-1594
Likelihood methods have been developed to partition individuals in a sample into sibling clusters using genetic marker data without parental information. Most of these methods assume either both sexes are monogamous to infer full sibships only or only one sex is polygamous to infer full sibships and paternal or maternal (but not both) half sibships. We extend our previous method to the more general case of both sexes being polygamous to infer full sibships, paternal half sibships, and maternal half sibships and to the case of a two-generation sample of individuals to infer parentage jointly with sibships. The extension not only expands enormously the scope of application of the method, but also increases its statistical power. The method is implemented for both diploid and haplodiploid species and for codominant and dominant markers, with mutations and genotyping errors accommodated. The performance and robustness of the method are evaluated by analyzing both simulated and empirical data sets. Our method is shown to be much more powerful than pairwise methods in both parentage and sibship assignments because of the more efficient use of marker information. It is little affected by inbreeding in parents and is moderately robust to nonrandom mating and linkage of markers. We also show that individually much less informative markers, such as SNPs or AFLPs, can reach the same power for parentage and sibship inferences as the highly informative marker simple sequence repeats (SSRs), as long as a sufficient number of loci are employed in the analysis.

9.
Genetic paternity testing can provide sire identity data for offspring when females have been exposed to multiple males. However, correct paternity assignment can be influenced by factors determined in the laboratory and by size and genetic composition of breeding groups. In the present study, DNA samples from 26 commingled beef bulls and their calves from the Nebraska Reference Herd-1 (NRH1), along with previously reported Illinois Reference/Resource Families data, were used to estimate the impact of sire number and sire relatedness on microsatellite-based paternity testing. Assay performance was measured by exclusion probabilities and probabilities of unambiguous parentage (PUP) were derived. Proportion of calves with unambiguous parentage (PCUP) was also calculated to provide a readily understandable whole-herd measure of unambiguous paternity assignment. For NRH1, theoretical and observed PCUP values were in close agreement (85.3 and 85.8%, respectively) indicating good predictive value. While the qualitative effects on PUP values of altering sire number and sire relatedness were generally predictable, we demonstrate that the impacts of these variables, and their interaction effects, can be large, are non-linear, and are quantitatively distinct for different combinations of sire number and degree of sire relatedness. In view of the potentially complex dynamics and practical consequences of these relationships in both research and animal production settings, we suggest that a priori estimation of the quantitative impact of a given set of interacting breeding group-specific and assay-specific parameters on PUP may be indicated, particularly when candidate sire pools are large, sire relatedness may be high, and/or loci numbers or heterozygosity values may be limiting.
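
The non-linear effect of sire number can be seen in a simplified textbook calculation. The sketch below is not the study's procedure: it assumes independent loci and unrelated candidate sires (the abstract stresses that relatedness changes the picture), and the function names and per-locus values are hypothetical. A random non-sire is excluded if at least one locus excludes him, and parentage is unambiguous only if every non-sire in the pool is excluded.

```python
from math import prod


def combined_exclusion(per_locus_exclusion):
    """Probability that a random non-sire is excluded by at least one locus."""
    return 1.0 - prod(1.0 - pe for pe in per_locus_exclusion)


def prob_unambiguous_parentage(exclusion_prob, n_sires):
    """Probability that all n_sires - 1 non-sires are excluded.

    Assumes unrelated candidate sires, so exclusions are independent.
    """
    return exclusion_prob ** (n_sires - 1)


if __name__ == "__main__":
    # Eight hypothetical loci, each excluding a random male 60% of the time.
    pe = combined_exclusion([0.6] * 8)
    for n in (2, 10, 26, 50):
        print(n, round(prob_unambiguous_parentage(pe, n), 4))
```

Because the combined exclusion probability is raised to the power of the number of non-sires, a panel that looks adequate for a few bulls can lose resolution quickly as the pool grows.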

10.
Molecular techniques are making ever more genetic markers available for use in parentage assignment and measures of relatedness. We present a program, Kinship, designed to use likelihood techniques to test for any non-inbred pedigree relationship between pairs of individuals, using single-locus codominant genetic markers. Kinship calculates the likelihood that each pair of individuals in a data set is related by a given pedigree hypothesis, and likelihood ratios for any pair of hypotheses. The program also uses a simulation routine to attach statistical significance to its results.

11.
Genetic techniques are frequently used to sample and monitor wildlife populations. The goal of these studies is to maximize the ability to distinguish individuals for various genetic inference applications, a process which is often complicated by genotyping error. However, wildlife studies usually have fixed budgets, which limit the number of genetic markers available for inclusion in a study marker panel. Prior to our study, a formal algorithm for selecting a marker panel that included genotyping error, laboratory costs, and ability to distinguish individuals did not exist. We developed a constrained nonlinear programming optimization algorithm to determine the optimal number of markers for a marker panel, initially applied to a pilot study designed to estimate black bear abundance in central Georgia. We extend the algorithm to other genetic applications (e.g., parentage or population assignment) and incorporate possible null alleles. Our algorithm can be used in wildlife pilot studies to assess the feasibility of genetic sampling for multiple genetic inference applications. © 2011 The Wildlife Society.

12.
Burczyk J, Adams WT, Birkes DS, Chybicki IJ. Genetics, 2006, 173(1): 363-372
Estimating seed and pollen gene flow in plants on the basis of samples of naturally regenerated seedlings can provide much needed information about "realized gene flow," but seems to be one of the greatest challenges in plant population biology. Traditional parentage methods, because of their inability to discriminate between male and female parentage of seedlings, unless supported by uniparentally inherited markers, are not capable of precisely describing seed and pollen aspects of gene flow realized in seedlings. Here, we describe a maximum-likelihood method for modeling female and male parentage in a local plant population on the basis of genotypic data from naturally established seedlings and when the location and genotypes of all potential parents within the population are known. The method models female and male reproductive success of individuals as a function of factors likely to influence reproductive success (e.g., distance of seed dispersal, distance between mates, and relative fecundity, i.e., female and male selection gradients). The method is designed to account for levels of seed and pollen gene flow into the local population from unsampled adults; therefore, it is well suited to isolated, but also wide-spread natural populations, where extensive seed and pollen dispersal complicates traditional parentage analyses. Computer simulations were performed to evaluate the utility and robustness of the model and estimation procedure and to assess how the exclusion power of genetic markers (isozymes or microsatellites) affects the accuracy of the parameter estimation. In addition, the method was applied to genotypic data collected in Scots pine (isozymes) and oak (microsatellites) populations to obtain preliminary estimates of long-distance seed and pollen gene flow and the patterns of local seed and pollen dispersal in these species.

13.
In parentage assignment by exclusion, using multiple and very polymorphic loci, genotyping errors are a major cause of non‐assignment. Using stochastic simulations, we tested whether allowing for mismatches at one or more alleles could recover assignment power. This was very efficient provided the set of loci used had a high assignment power (> 99%) and the error rate was not too high (below 3–4%). In these cases, most of the theoretical assignment power could be recovered. We also showed the efficiency of the method in a practical experiment with rainbow trout.
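
A minimal sketch of exclusion with a mismatch tolerance follows. It is a simplification under stated assumptions, not the procedure tested in the study: it checks single candidate parents rather than parent pairs, counts a mismatch whenever offspring and candidate share no allele at a locus, and assigns only when exactly one candidate stays within the tolerance. The function names and genotypes are hypothetical.

```python
def count_mismatches(offspring, candidate):
    """Number of loci at which offspring and candidate share no allele."""
    return sum(1 for o, c in zip(offspring, candidate) if not set(o) & set(c))


def assign_by_exclusion(offspring, candidates, max_mismatches=0):
    """Return the single compatible candidate, or None if zero or several remain."""
    compatible = [
        name
        for name, genotype in candidates.items()
        if count_mismatches(offspring, genotype) <= max_mismatches
    ]
    return compatible[0] if len(compatible) == 1 else None


if __name__ == "__main__":
    # Three loci, each genotype a pair of allele labels. The third locus of
    # sire_a is assumed to carry a genotyping error.
    offspring = [(1, 2), (3, 4), (5, 6)]
    candidates = {
        "sire_a": [(1, 7), (3, 8), (9, 9)],
        "sire_b": [(7, 7), (8, 8), (9, 9)],
    }
    print(assign_by_exclusion(offspring, candidates, max_mismatches=0))
    print(assign_by_exclusion(offspring, candidates, max_mismatches=1))
```

Raising the tolerance recovers parents lost to genotyping error but also lets more false candidates through, which is consistent with the abstract's caveat that the approach needs a marker set with high assignment power.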

14.
Genetic data from polymorphic microsatellite loci were employed to estimate paternity and maternity in a local population of nine-banded armadillos (Dasypus novemcinctus) in northern Florida. The parentage assessments took advantage of maximum likelihood procedures developed expressly for situations when individuals of neither gender can be excluded a priori as candidate parents. The molecular data for 290 individuals, interpreted alone and in conjunction with detailed biological and spatial information for the population, demonstrate high exclusion probabilities and reasonably strong likelihoods of genetic parentage assignment in many cases; low mean probabilities of successful reproductive contribution to the local population by individual armadillo adults in a given year; and statistically significant microspatial associations of parents and their offspring. Results suggest that molecular assays of highly polymorphic genetic systems can add considerable power to assessments of biological parentage in natural populations even when neither parent is otherwise known.

15.
Stream-dwelling fish populations have long served as important models of animal movement. Populations of adult stream-dwelling fishes are generally composed of a mix of relatively sedentary and mobile individuals. However, we do not know whether this pattern that we typically observe among adults is indicative of patterns of movement that occur throughout the life cycle. Therefore, we do not know whether we can apply these patterns to understanding or predicting processes such as migration and thus the potential for the evolution of genetic differences among populations. We test the general hypothesis that patterns of movement throughout the life cycle are consistent with patterns of movement inferred by indirect genetic methods and, more specifically, that the characteristics of the mobile fraction of the population are consistent with patterns of genetic differentiation. We used parentage analyses to infer the movements of alevin brook charr (Salvelinus fontinalis) in Freshwater River, Newfoundland, Canada, and a capture-recapture study of one cohort in this population to infer movement throughout the rest of the life cycle. We found that alevins move large distances shortly after emergence, primarily in the downstream direction, and that the population is composed of a mix of relatively sedentary and mobile individuals throughout all other intervals of the life cycle. In contrast, when we considered movements of individuals first captured as juveniles and eventually recovered as reproductively mature adults, we found relatively large and uniform distributions of net movement distance. Thus, heterogeneity in individual movement of adults is not representative of patterns of movement throughout the life cycle and therefore may provide only limited inference of population-level processes such as gene flow.

16.
Araki H, Blouin MS. Molecular Ecology, 2005, 14(13): 4097-4109
Parentage assignment is widely applied to studies on mating systems, population dynamics and natural selection. However, little is known about the consequence of assignment errors, especially when some parents are not sampled. We investigated the effects of two types of error in parentage assignment, failing to assign a true parent (type A) and assigning an untrue parent (type B), on an estimate of the relative reproductive success (RRS) of two groups of parents. Employing a mathematical approach, we found that (i) when all parents are sampled, minimizing either type A or type B error ensures the minimum bias on RRS, and (ii) when a large number of parents is not sampled, type B error substantially biases the estimated RRS towards one. Interestingly, however, (iii) when all parents were sampled and both error rates were moderately high, type A error biased the estimated RRS even more than type B error. We propose new methods to obtain an unbiased estimate of RRS and the number of offspring whose parents are not sampled (zW(z)), by correcting the error effects. Applying them to genotypic data from steelhead trout (Oncorhynchus mykiss), we illustrated how to estimate and control the assignment errors. In the data, we observed up to a 30% assignment error and a strong trade-off between the two types of error, depending on the stringency of the assignment decision criterion. We show that our methods can efficiently estimate an unbiased RRS and zW(z) regardless of assignment method, and how to maximize the statistical power to detect a difference in reproductive success between groups.

17.
Best linear unbiased allele-frequency estimation in complex pedigrees (total citations: 4; self-citations: 0; citations by others: 4)
McPeek MS, Wu X, Ober C. Biometrics, 2004, 60(2): 359-367
Many types of genetic analyses depend on estimates of allele frequencies. We consider the problem of allele-frequency estimation based on data from related individuals. The motivation for this work is data collected on the Hutterites, an isolated founder population, so we focus particularly on the case in which the relationships among the sampled individuals are specified by a large, complex pedigree for which maximum likelihood estimation is impractical. For this case, we propose to use the best linear unbiased estimator (BLUE) of allele frequency. We derive this estimator, which is equivalent to the quasi-likelihood estimator for this problem, and we describe an efficient algorithm for computing the estimate and its variance. We show that our estimator has certain desirable small-sample properties in common with the maximum likelihood estimator (MLE) for this problem. We treat both the case when parental origin of each allele is known and when it is unknown. The results are extended to prediction of allele frequency in some set of individuals S based on genotype data collected on a set of individuals R. We compare the mean-squared error of the BLUE, the commonly used naive estimator (sample frequency) and the MLE when the latter is feasible to calculate. The results indicate that although the MLE performs the best of the three, the BLUE is close in performance to the MLE and is substantially easier to calculate, making it particularly useful for large complex pedigrees in which MLE calculation is impractical or infeasible. We apply our method to allele-frequency estimation in a Hutterite data set.
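
For orientation, a best linear unbiased estimator of a common mean has the standard generalized-least-squares form shown below. The notation is assumed for illustration and is not quoted from the paper: Y is the vector of per-individual allele proportions (0, 1/2, or 1), K is a known matrix proportional to the covariance of Y as implied by the pedigree, and 1 is a vector of ones.

```latex
\hat{p}_{\mathrm{BLUE}}
  = \left(\mathbf{1}^{\top} K^{-1} \mathbf{1}\right)^{-1} \mathbf{1}^{\top} K^{-1} Y,
\qquad
\operatorname{Var}\!\left(\hat{p}_{\mathrm{BLUE}}\right)
  = \sigma^{2} \left(\mathbf{1}^{\top} K^{-1} \mathbf{1}\right)^{-1},
\quad \text{where } \operatorname{Cov}(Y) = \sigma^{2} K .
```

When K is the identity matrix (unrelated, non-inbred individuals), the estimator reduces to the ordinary sample frequency, which is the naive estimator the abstract uses as a comparison.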

18.
Modern genetic parentage methods reveal that alternative reproductive strategies are common in both males and females. Under ideal conditions, genetic methods accurately connect the parents to offspring produced by extra-pair matings or conspecific brood parasitism. However, some breeding systems and sampling scenarios present significant complications for accurate parentage assignment. We used simulated genetic pedigrees to assess the reliability of parentage assignment for a series of challenging sampling regimes that reflect realistic conditions for many brood-parasitic birds: absence of genetic samples from sires, absence of samples from brood parasites and female kin-structured populations. Using 18 microsatellite markers and empirical allele frequencies from two populations of a conspecific brood parasite, the wood duck (Aix sponsa), we simulated brood parasitism and determined maternity using two widely used programs, CERVUS and COLONY. Errors in assignment were generally modest for most sampling scenarios but differed by program: CERVUS suffered from false assignment of parasitic offspring, whereas COLONY sometimes failed to assign offspring to their known mothers. Notably, COLONY was able to accurately infer unsampled parents. Reducing the number of markers (nine loci rather than 18) caused the assignment error to worsen slightly with COLONY but to balloon with CERVUS. One potential error with important biological implications was rare in all cases: few nesting females were incorrectly excluded as the mother of their own offspring, an error that could falsely indicate brood parasitism. We consider the implications of our findings for both a retrospective assessment of previous studies and suggestions for best practices for future studies.

19.
Contemporary pollen flow in forest plant species is measured by the probability of paternal identity (PPI) for two randomly sampled offspring, drawn from a single female, and contrasting that with PPI for two random offspring, drawn from different females. Two different estimation strategies have emerged: (a) an indirect approach, using the 'genetic structure' of the pollen received by different mothers and (b) a direct approach, based on parentage analysis. The indirect strategy is somewhat limited by the assumptions, but is widely useful. The direct approach is most appropriate where a large majority of the true fathers can be identified exactly, which is sometimes possible with high-resolution SSR markers. Using the parentage approach, we develop estimates of PPI, showing that the obvious estimates are severely biased, and providing an unbiased alternative. We then illustrate the methods with SSR data from a 36-tree isolated population of Pinus sylvestris from the Meseta region of Spain, for which categorical paternity assignment was available for over 95% of offspring. For all the females combined, we estimate that PPI=0.0425, indicating uneven male reproductive contributions. Different (but overlapping) arrays of males pollinate different females, and for the average female, PPI=0.317, indicating substantial 'pollen structure' for the population. We also relate the direct measures of PPI to those available from indirect approaches, and show that they are generally comparable.
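
Why the obvious estimate is biased can be seen in a short sketch for a single mother. It assumes each offspring has a categorically assigned father, and it uses the standard without-replacement correction; whether this matches the authors' estimator exactly is an assumption, and the function names are hypothetical. The plug-in estimate sums squared paternal proportions, which implicitly pairs each offspring with itself, whereas the corrected version counts only pairs of distinct offspring.

```python
from collections import Counter


def ppi_naive(fathers):
    """Plug-in estimate: sum of squared paternal proportions (biased upward)."""
    n = len(fathers)
    return sum((count / n) ** 2 for count in Counter(fathers).values())


def ppi_unbiased(fathers):
    """Share of distinct offspring pairs that have the same assigned father."""
    n = len(fathers)
    if n < 2:
        raise ValueError("need at least two offspring")
    same_father_pairs = sum(c * (c - 1) for c in Counter(fathers).values())
    return same_father_pairs / (n * (n - 1))


if __name__ == "__main__":
    # Hypothetical paternity assignments for the offspring of one mother.
    fathers = ["A", "A", "B", "B"]
    print(ppi_naive(fathers), ppi_unbiased(fathers))
```

The gap between the two versions shrinks as the number of offspring per mother grows, so the correction matters most for small families.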

20.
Wang J. Molecular Ecology, 2010, 19(22): 5061-5078
Genetic markers are widely used to determine the parentage of individuals in studies of mating systems, reproductive success, dispersals, quantitative genetic parameters and in the management of conservation populations. These markers are, however, imperfect for parentage analyses because of the presence of genotyping errors and undetectable alleles, which may cause incompatible genotypes (mismatches) between parents and offspring and thus result in false exclusions of true parentage. Highly polymorphic markers widely used in parentage analyses, such as microsatellites, are especially prone to genotyping errors. In this investigation, I derived the probabilities of excluding a random (related) individual from parentage and the probabilities of Mendelian-inconsistent errors (mismatches) and Mendelian-consistent errors (which do not cause mismatches) in parent-offspring dyads, when a marker having null alleles, allelic dropouts and false alleles is used in a parentage analysis. These probabilities are useful in evaluating the impact of various types of genotyping errors on the information content of a set of markers, and thus on the power of a parentage analysis; in determining the threshold number of genetic mismatches that is appropriate for a parentage exclusion analysis; and in estimating the rates of genotyping errors and frequencies of null alleles from observed mismatches between known parent-offspring dyads. These applications are demonstrated by numerical examples using both hypothetical and empirical data sets and discussed in the context of practical parentage exclusion analyses.
