Similar literature
20 similar articles found (search time: 15 ms)
1.

Background

Genetic relatedness or similarity between individuals is a key concept in population, quantitative and conservation genetics. When the pedigree of a population is available and assuming a founder population from which the genealogical records start, genetic relatedness between individuals can be estimated by the coancestry coefficient. If pedigree data is lacking or incomplete, estimation of the genetic similarity between individuals relies on molecular markers, using either molecular coancestry or molecular covariance. Some relationships between genealogical and molecular coancestries and covariances have already been described in the literature.

Methods

We show how the expected values of the empirical measures of similarity based on molecular marker data are functions of the genealogical coancestry. From these formulas, it is easy to derive estimators of genealogical coancestry from molecular data. We include variation of allelic frequencies in the estimators.

Results

The estimators are illustrated with simulated examples and with a real dataset from dairy cattle. In general, estimators are accurate and only slightly biased. From the real data set, estimators based on covariances are more compatible with genealogical coancestries than those based on molecular coancestries. A frequently used estimator based on the average of estimated coancestries produced inflated coancestries and numerical instability. The consequences of unknown gene frequencies in the founder population are briefly discussed, along with alternatives to overcome this limitation.

Conclusions

Estimators of genealogical coancestry based on molecular data are easy to derive. Estimators based on molecular covariance are more accurate than those based on identity by state. A correction considering the random distribution of allelic frequencies improves accuracy of these estimators, especially for populations with very strong drift.
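
The link between marker similarity and genealogical coancestry described in the Methods above can be sketched in a few lines. This is a minimal illustration, not the authors' estimator: it assumes the textbook relation E[f_M] = f + (1 - f) * sum(p^2), where f_M is the molecular coancestry (identity by state), f is the genealogical coancestry, and p are the founder allele frequencies, taken here as known. All function names are hypothetical, and pooling loci before correcting is a simplification.

```python
from itertools import product


def molecular_coancestry(geno_i, geno_j):
    """Mean over loci of the probability that one allele drawn at random
    from individual i is identical in state to one drawn from individual j.
    Each genotype is a list of (allele, allele) tuples, one per locus."""
    total = 0.0
    for locus_i, locus_j in zip(geno_i, geno_j):
        matches = sum(a == b for a, b in product(locus_i, locus_j))
        total += matches / 4.0
    return total / len(geno_i)


def expected_homozygosity(freqs_per_locus):
    """Mean over loci of sum(p^2), using founder allele frequencies
    (assumed known here, which is rarely true in practice)."""
    per_locus = [sum(p * p for p in freqs) for freqs in freqs_per_locus]
    return sum(per_locus) / len(per_locus)


def coancestry_from_ibs(f_mol, s):
    """Invert E[f_M] = f + (1 - f) * s to get an estimate of f."""
    return (f_mol - s) / (1.0 - s)


# Two biallelic loci with founder frequencies of 0.5 for each allele.
founder_freqs = [[0.5, 0.5], [0.5, 0.5]]
ind_a = [(1, 1), (1, 2)]
ind_b = [(1, 2), (1, 2)]
f_m = molecular_coancestry(ind_a, ind_b)
f_hat = coancestry_from_ibs(f_m, expected_homozygosity(founder_freqs))
```

With these toy genotypes the pair is no more similar than two random founders would be, so the corrected estimate comes out at zero.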

2.
Relatedness estimators are widely used in genetic studies, but effects of population structure on performance of estimators, criteria to evaluate estimators, and benefits of using such estimators in conservation programs have to date received little attention. In this article we present new estimators, based on the relationship between coancestry and molecular similarity between individuals, and compare them with existing estimators using Monte Carlo simulation of populations, either panmictic or structured. Estimators were evaluated using statistical criteria and a diversity criterion that minimized relatedness. Results show that ranking of estimators depends on the population structure. An existing estimator based on two-gene and four-gene coefficients of identity performs best in panmictic populations, whereas a new estimator based on coancestry performs best in structured populations. The number of marker alleles and loci did not affect ranking of estimators. Statistical criteria were insufficient to evaluate estimators for their use in conservation programs. The regression coefficient of pedigree relatedness on estimated relatedness (beta2) was substantially lower than unity for all estimators, causing overestimation of the diversity conserved. A simple correction to achieve beta2 = 1 improves both existing and new estimators. Using relatedness estimates with correction considerably increased diversity in structured populations, but did not do so or even decreased diversity in panmictic populations.
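
The abstract does not spell out its correction, so the following is only one plausible reading: compute beta2 as the least-squares slope of pedigree relatedness on estimated relatedness, then shrink the estimates toward their mean by that factor so that the slope becomes 1. The function names are hypothetical, and in practice beta2 would have to come from a calibration set or from simulation, because pedigree values are usually unknown.

```python
def slope(y, x):
    """Least-squares slope of the regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def shrink_estimates(estimated, beta2):
    """Shrink estimates toward their mean by the factor beta2."""
    mean_est = sum(estimated) / len(estimated)
    return [mean_est + beta2 * (e - mean_est) for e in estimated]


# Toy data: the estimates are twice as spread out as the pedigree values.
pedigree = [0.0, 0.25, 0.5]
estimated = [-0.2, 0.3, 0.8]
beta2 = slope(pedigree, estimated)
corrected = shrink_estimates(estimated, beta2)
```

After shrinking, the slope of pedigree relatedness on the corrected values is 1 by construction, which is the property the abstract asks for.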

3.
Greaves S, Sanson B, White P, Vincent JP. Genetics, 1999, 152(4): 1753-1766
Applications of quantitative genetics and conservation genetics often require measures of pairwise relationships between individuals, which, in the absence of known pedigree structure, can be estimated only by use of molecular markers. Here we introduce methods for the joint estimation of the two-gene and four-gene coefficients of relationship from data on codominant molecular markers in randomly mating populations. In a comparison with other published estimators of pairwise relatedness, we find these new "regression" estimators to be computationally simpler and to yield similar or lower sampling variances, particularly when many loci are used or when loci are hypervariable. Two examples are given in which the new estimators are applied to natural populations, one that reveals isolation-by-distance in an annual plant and the other that suggests a genetic basis for a coat color polymorphism in bears.

4.
It is crucial to understand the genetic health and implications of inbreeding in wildlife populations, especially of vulnerable species. Using extensive demographic and genetic data, we investigated the relationships among pedigree inbreeding coefficients, metrics of molecular heterozygosity and fitness for a large population of endangered African wild dogs (Lycaon pictus) in South Africa. Molecular metrics based on 19 microsatellite loci were significantly, but modestly correlated to inbreeding coefficients in this population. Inbred wild dogs with inbreeding coefficients of ≥0.25 and subordinate individuals had shorter lifespans than outbred and dominant contemporaries, suggesting some deleterious effects of inbreeding. However, this trend was confounded by pack-specific effects as many inbred individuals originated from a single large pack. Despite wild dogs being endangered and existing in small populations, findings within our sample population indicated that molecular metrics were not robust predictors in models of fitness based on breeding pack formation, dominance, reproductive success or lifespan of individuals. Nonetheless, our approach has generated a vital database for future comparative studies to examine these relationships over longer periods of time. Such detailed assessments are essential given knowledge that wild canids can be highly vulnerable to inbreeding effects over a few short generations.

5.
The software package COANCESTRY implements seven relatedness estimators and three inbreeding estimators to estimate relatedness and inbreeding coefficients from multilocus genotype data. Two likelihood estimators that allow for inbred individuals and account for genotyping errors are for the first time included in this user-friendly program for PCs running Windows operating system. A simulation module is built in the program to simulate multilocus genotype data of individuals with a predefined relationship, and to compare the estimators and the simulated relatedness values to facilitate the selection of the best estimator in a particular situation. Bootstrapping and permutations are used to obtain the 95% confidence intervals of each relatedness or inbreeding estimate, and to test the difference in averages between groups.
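
The bootstrap confidence interval mentioned above can be illustrated with a generic percentile bootstrap over loci. This sketch is not COANCESTRY's implementation: it assumes that the multilocus estimate is a plain average of per-locus estimates (real estimators usually weight loci), and the function name and defaults are hypothetical.

```python
import random


def bootstrap_ci(per_locus_estimates, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of per-locus estimates.
    Loci are resampled with replacement; the seed makes runs repeatable."""
    rng = random.Random(seed)
    n_loci = len(per_locus_estimates)
    means = []
    for _ in range(n_boot):
        sample = [per_locus_estimates[rng.randrange(n_loci)]
                  for _ in range(n_loci)]
        means.append(sum(sample) / n_loci)
    means.sort()
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper


# Hypothetical per-locus relatedness estimates for one pair of individuals.
locus_values = [0.1, 0.4, 0.5, 0.6, 0.9]
ci_low, ci_high = bootstrap_ci(locus_values)
```

With only a handful of loci the interval is wide, which is one reason such intervals are worth reporting alongside point estimates.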

6.
Molecular marker data collected from natural populations allows information on genetic relationships to be established without referencing an exact pedigree. Numerous methods have been developed to exploit the marker data. These fall into two main categories: method of moment estimators and likelihood estimators. Method of moment estimators are essentially unbiased, but utilise weighting schemes that are only optimal if the analysed pair is unrelated. Thus, they differ in their efficiency at estimating parameters for different relationship categories. Likelihood estimators show smaller mean squared errors but are much more biased. Both types of estimator have been used in variance component analysis to estimate heritability. All marker-based heritability estimators require that adequate levels of the true relationship be present in the population of interest and that adequate amounts of informative marker data are available. I review the different approaches to relationship estimation, with particular attention to optimizing the use of this relationship information in subsequent variance component estimation.

7.
A proper probabilistic proof of a generalization of Wright's classical formula relating the coefficient of inbreeding of an individual at the head of a pedigree to the genotypic probability structure of this individual at one gene locus is presented. It is shown that in general the knowledge of gene frequencies realized within the initial populations from which individuals entering the pedigree are selected at random is not sufficient to predict expected genotypic frequencies in the resulting inbred population. To treat any arbitrary situation concerning the choice of individuals and genotypes to enter the pedigree, it is necessary to determine an additional set of coefficients, which depend only on the type of the pedigree. A basic method for computing these coefficients is outlined briefly.

8.
Several estimators have been proposed that use molecular marker data to infer the degree of relatedness for pairs of individuals. The objective of this study was to evaluate the performance of seven estimators when applied to marker data of a set of 33 key individuals from a large complex apple pedigree. The evaluation considered different scenarios of allele frequencies and different numbers of marker loci. The method of moments estimators were Similarity, Queller-Goodknight, Lynch-Ritland and Wang. The maximum likelihood estimators were Thompson, Anderson-Weir and Jacquard. The pedigree-based coancestry coefficients were taken as the point of reference in calculating correlations and root mean square error (RMSE). The marker data comprised 86 multi-allelic SSR markers on 17 linkage groups, covering 11 Morgans. Additionally, we simulated 10 datasets conditional on the real pedigree to support the results on the real dataset. None of the estimators outperformed the others. Knowledge of allele frequencies appeared to be the most influential, i.e., the highest correlations and lowest RMSE were found when frequencies from the founder population were available. When equal allele frequencies were used, all estimators resulted in very similar, but on average lower, correlations. The use of allele frequencies estimated from the set of 33 individuals gave, on average, the poorest results. The maximum likelihood estimators and the Lynch-Ritland estimator were the most sensitive to allele frequencies. The results from the simulation study fully supported the trends in results of the real dataset. This study indicated that high correlations (up to 0.90) and small RMSE (below 0.03), may be obtained when population allelic frequencies are available. In this scenario, the performances of the various estimators were similar, but seemed to favor the maximum likelihood estimators. 
In the absence of reliable allele frequencies the method of moments estimators were shown to be more robust. The number of marker loci influenced the average performance of the estimators; however, the ranking was not affected. Correlations up to 0.80 were obtained when two markers per chromosome and appropriate allele frequencies were available. Adding more markers to the current dataset may lead to marginal improvements.
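
The two evaluation criteria used in this study, correlation and RMSE against pedigree-based coancestry, are simple to compute. The sketch below uses hypothetical function names and toy values rather than the apple data; it also shows why both criteria are needed, since a constant bias leaves the correlation at 1 while the RMSE picks it up.

```python
import math


def rmse(estimates, reference):
    """Root mean square error of the estimates against reference values."""
    squared = [(e - r) ** 2 for e, r in zip(estimates, reference)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Toy example: marker-based estimates carry a constant upward bias of 0.1.
pedigree_coancestry = [0.0, 0.25, 0.5]
marker_estimates = [0.1, 0.35, 0.6]
error = rmse(marker_estimates, pedigree_coancestry)
correlation = pearson(marker_estimates, pedigree_coancestry)
```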

9.
Gene diversity is sometimes estimated from samples that contain inbred or related individuals. If inbred or related individuals are included in a sample, then the standard estimator for gene diversity produces a downward bias caused by an inflation of the variance of estimated allele frequencies. We develop an unbiased estimator for gene diversity that relies on kinship coefficients for pairs of individuals with known relationship and that reduces to the standard estimator when all individuals are noninbred and unrelated. Applying our estimator to data simulated based on allele frequencies observed for microsatellite loci in human populations, we find that the new estimator performs favorably compared with the standard estimator in terms of bias and similarly in terms of mean squared error. For human population-genetic data, we find that a close linear relationship previously seen between gene diversity and distance from East Africa is preserved when adjusting for the inclusion of close relatives.
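
A kinship-adjusted gene diversity estimate in the spirit of this abstract can be sketched as follows. The form used here is an assumption derived from Var(p_hat) = p(1 - p) * mean_kinship, where mean_kinship averages the kinship matrix over all ordered pairs, including each individual with itself (self-kinship (1 + F)/2). It is not claimed to be the authors' exact formula, and the function name is hypothetical.

```python
from collections import Counter


def kinship_adjusted_gene_diversity(genotypes, kinship):
    """Gene diversity at one locus, adjusted for relatedness in the sample.
    genotypes: list of (allele, allele) tuples, one per diploid individual.
    kinship: square matrix (list of lists) of kinship coefficients, with
    self-kinship (1 + F) / 2 on the diagonal."""
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total_alleles = 2 * n
    homozygosity = sum((c / total_alleles) ** 2 for c in counts.values())
    mean_kinship = sum(sum(row) for row in kinship) / (n * n)
    return (1.0 - homozygosity) / (1.0 - mean_kinship)


genotypes = [(1, 2), (1, 2)]
unrelated = [[0.5, 0.0], [0.0, 0.5]]
full_sibs = [[0.5, 0.25], [0.25, 0.5]]
h_unrelated = kinship_adjusted_gene_diversity(genotypes, unrelated)
h_sibs = kinship_adjusted_gene_diversity(genotypes, full_sibs)
```

For unrelated, noninbred individuals the mean kinship is 1/(2n), so the expression reduces to the standard 2n/(2n - 1) correction, matching the property stated in the abstract. Treating the same two genotypes as full sibs raises the estimate, which offsets the downward bias that relatives introduce.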

10.
Lee SH, Van der Werf JH. Genetics, 2006, 174(2): 1009-1016
Dominance (intralocus allelic interactions) often plays an important role in quantitative trait variation. However, few studies about dominance in QTL mapping have been reported in outbred animal or human populations. This is because common dominance effects can be predicted mainly for many full sibs, which do not often occur in outbred or natural populations with a general pedigree. Moreover, incomplete genotypes for such a pedigree make it infeasible to estimate dominance relationship coefficients between individuals. In this study, identity-by-descent (IBD) coefficients are estimated on the basis of population-wide linkage disequilibrium (LD), which makes it possible to track dominance relationships between unrelated founders. Therefore, it is possible to use dominance effects in QTL mapping without full sibs. Incomplete genotypes with a complex pedigree and many markers can be efficiently dealt with by a Markov chain Monte Carlo method for estimating IBD and dominance relationship matrices (D(RM)). It is shown by simulation that the use of D(RM) increases the likelihood ratio at the true QTL position and the mapping accuracy and power with complete dominance, overdominance, and recessive inheritance modes when using 200 genotyped and phenotyped individuals.

11.
Maintaining genetic variation and controlling the increase in inbreeding are crucial requirements in animal conservation programs. The most widely accepted strategy for achieving these objectives is to maximize the effective population size by minimizing the global coancestry obtained from a particular pedigree. However, for most natural or captive populations genealogical information is absent. In this situation, microsatellites have been traditionally the markers of choice to characterize genetic variation, and several estimators of genealogical coefficients have been developed using marker data, with unsatisfactory results. The development of high-throughput genotyping techniques states the necessity of reviewing the paradigm that genealogical coancestry is the best parameter for measuring genetic diversity. In this study, the Illumina PorcineSNP60 BeadChip was used to obtain genome-wide estimates of rates of coancestry and inbreeding and effective population size for an ancient strain of Iberian pigs that is now in serious danger of extinction and for which very accurate genealogical information is available (the Guadyerbas strain). Genome-wide estimates were compared with those obtained from microsatellite and from pedigree data. Estimates of coancestry and inbreeding computed from the SNP chip were strongly correlated with genealogical estimates and these correlations were substantially higher than those between microsatellite and genealogical coefficients. Also, molecular coancestry computed from SNP information was a better predictor of genealogical coancestry than coancestry computed from microsatellites. Rates of change in coancestry and inbreeding and effective population size estimated from molecular data were very similar to those estimated from genealogical data. However, estimates of effective population size obtained from changes in coancestry or inbreeding differed. 
Our results indicate that genome-wide information represents a useful alternative to genealogical information for measuring and maintaining genetic diversity.

12.
We present the program spip for simulating multilocus genetic data on individuals in age‐structured populations. In addition to genetic data on sampled individuals, the pedigree connecting all individuals in the population is recorded. This allows investigation of the relationship between family structure and population parameters. We foresee that spip will be useful for evaluating multilocus estimators of pairwise relatedness and population structure, and for simulating the distribution of relatedness in populations with varying demographies. It also provides a method for simulating genetic drift in complex populations.

13.
It is common practice to use microsatellites to detect parents and their offspring in wild and captive populations, in order to reconstruct a pedigree. However, correct inference is often constrained by a number of factors, including the absence of demographic data and ignorance regarding the completeness of parental sampling. Here we present a new Bayesian estimator that simultaneously estimates the pedigree and the size of the unsampled population. The method is robust to genotyping error, and can estimate pedigrees in the absence of demographic data. Using a large-scale microsatellite assay in four wild cichlid fish populations of Lake Tanganyika (1000 individuals in total), we assess the performance of the Bayesian estimator against the most popular assignment program, Cervus. We found small but significant pedigrees in each of the tested populations using the Bayesian procedure, but Cervus had very high type I error rates when the size of the unsampled population was assumed to be lower than it actually was. The need for pedigree relationships to infer adaptive processes in natural populations places strong constraints on sampling design and identification of multigenerational pedigrees in natural populations.

14.
In Greece, seven native horse breeds have been identified so far. Among these, the Skyros pony stands out for its distinct phenotype. In the present study, the aim was to assess genetic diversity in this breed, by using different types of genetic loci and available genealogical information. Its relationships with the other Greek, as well as foreign, domestic breeds were also investigated. Through microsatellite and pedigree analysis it appeared that the Skyros presented a similar level of genetic diversity to the other European breeds. Nevertheless, comparisons between DNA-based and pedigree-based results revealed that a loss of genetic diversity had probably already occurred before the beginning of breed registration. Tests indicated the possible existence of a recent bottleneck in two of the three main herds of Skyros pony. Nonetheless, relatively high levels of heterozygosity and Polymorphism Information Content indicated sufficient residual genetic variability, probably useful in planning future strategies for breed conservation. Three other Greek breeds were also analyzed. A comparison of these with domestic breeds elsewhere revealed the closest relationships to be with the Middle Eastern types, whereas the Skyros itself remained isolated, without any close relationship whatsoever.

15.
Molecular markers make it possible to estimate the pairwise relatedness between the members of a breeding pool when their selection history is no longer available or has become too complex for a classical pedigree analysis. The field of population genetics has several estimation procedures at its disposal, but when the genotyped individuals are highly selected inbred lines, their application is not warranted as the theoretical assumptions on which these estimators were built, usually linkage equilibrium between marker loci or even Hardy–Weinberg equilibrium, are not met. An alternative approach requires the availability of a genotyped reference set of inbred lines, which makes it possible to correct the observed marker similarities for their inherent upward bias when used as a coancestry measure. However, this approach does not guarantee that the resulting coancestry matrix is at least positive semi-definite (psd), a necessary condition for its use as a covariance matrix. In this paper we present the weighted alikeness in state (WAIS) estimator. This marker-based coancestry estimator is compared to several other commonly applied relatedness estimators under realistic hybrid breeding conditions in a number of simulations. We also fit a linear mixed model to phenotypical data from a commercial maize breeding programme and compare the likelihood of the different variance structures. WAIS is shown to be psd which makes it suitable for modelling the covariance between genetic components in linear mixed models involved in breeding value estimation or association studies. Results indicate that it generally produces a low root mean squared error under different breeding circumstances and provides a fit to the data that is comparable to that of several other marker-based alternatives. Recommendations for each of the examined coancestry measures are provided.

16.
Studies of inbreeding depression or kin selection require knowledge of relatedness between individuals. If pedigree information is lacking, one has to rely on genotypic information to infer relatedness. In this study we investigated the performance (absolute and relative) of 10 marker-based relatedness estimators using allele frequencies at microsatellite loci obtained from natural populations of two bird species and one mammal species. Using Monte Carlo simulations we show that many factors affect the performance of estimators and that different sets of loci promote the use of different estimators: in general, there is no single best-performing estimator. The use of locus-specific weights turns out to greatly improve the performance of estimators when marker loci are used that differ strongly in allele frequency distribution. Microsatellite-based estimates are expected to explain between 25 and 79% of variation in true relatedness depending on the microsatellite dataset and on the population composition (i.e. the frequency distribution of relationship in the population). We recommend performing Monte Carlo simulations to decide which estimator to use in studies of pairwise relatedness.

17.
Simple sequence repeats (SSR) are the most widely used molecular markers for relatedness inference due to their multi-allelic nature and high informativeness. However, there is a growing trend toward using high-throughput and inter-specific transferable single-nucleotide polymorphisms (SNP) and Diversity Arrays Technology (DArT) in forest genetics owing to their wide genome coverage. We compared the efficiency of 15 SSRs, 181 SNPs and 2816 DArTs to estimate the relatedness coefficients, and their effects on genetic parameters’ precision, in a relatively small data set of an open-pollinated progeny trial of Eucalyptus grandis (Hill ex Maiden) with limited relationship from the pedigree. Both simulations and real data of Eucalyptus grandis were used to study the statistical performance of three relatedness estimators based on co-dominant markers. Relatedness estimates in pairs of individuals belonging to the same family (related) were higher for DArTs than for SNPs and SSRs. DArTs performed better compared to SSRs and SNPs in estimated relatedness coefficients in pairs of individuals belonging to different families (unrelated) and showed higher ability to discriminate unrelated from related individuals. The likelihood-based estimator exhibited the lowest root mean squared error (RMSE); however, the differences in RMSE among the three estimators studied were small. For the growth traits, heritability estimates based on SNPs yielded, on average, smaller standard errors compared to those based on SSRs and DArTs. Estimated relatedness in the realized relationship matrix and heritabilities can be accurately inferred from co-dominant or sufficiently dense dominant markers in a relatively small E. grandis data set with shallow pedigree.

18.
Wang J. Genetics, 2006, 173(3): 1679-1692
A variety of estimators have been developed to use genetic marker information in inferring the admixture proportions (parental contributions) of a hybrid population. The majority of these estimators used allele frequency data, ignored molecular information that is available in markers such as microsatellites and DNA sequences, and assumed that mutations are absent since the admixture event. As a result, these estimators may fail to deliver an estimate or give rather poor estimates when admixture is ancient and thus mutations are not negligible. A previous molecular estimator based its inference of admixture proportions on the average coalescent times between pairs of genes taken from within and between populations. In this article I propose an estimator that considers the entire genealogy of all of the sampled genes and infers admixture proportions from the numbers of segregating sites in DNA sequence samples. By considering the genealogy of all sequences rather than pairs of sequences, this new estimator also allows the joint estimation of other interesting parameters in the admixture model, such as admixture time, divergence time, population size, and mutation rate. Comparative analyses of simulated data indicate that the new coalescent estimator generally yields better estimates of admixture proportions than the previous molecular estimator, especially when the parental populations are not highly differentiated. It also gives reasonably accurate estimates of other admixture parameters. A human mtDNA sequence data set was analyzed to demonstrate the method, and the analysis results are discussed and compared with those from previous studies.

19.
Best linear unbiased allele-frequency estimation in complex pedigrees
McPeek MS, Wu X, Ober C. Biometrics, 2004, 60(2): 359-367
Many types of genetic analyses depend on estimates of allele frequencies. We consider the problem of allele-frequency estimation based on data from related individuals. The motivation for this work is data collected on the Hutterites, an isolated founder population, so we focus particularly on the case in which the relationships among the sampled individuals are specified by a large, complex pedigree for which maximum likelihood estimation is impractical. For this case, we propose to use the best linear unbiased estimator (BLUE) of allele frequency. We derive this estimator, which is equivalent to the quasi-likelihood estimator for this problem, and we describe an efficient algorithm for computing the estimate and its variance. We show that our estimator has certain desirable small-sample properties in common with the maximum likelihood estimator (MLE) for this problem. We treat both the case when parental origin of each allele is known and when it is unknown. The results are extended to prediction of allele frequency in some set of individuals S based on genotype data collected on a set of individuals R. We compare the mean-squared error of the BLUE, the commonly used naive estimator (sample frequency) and the MLE when the latter is feasible to calculate. The results indicate that although the MLE performs the best of the three, the BLUE is close in performance to the MLE and is substantially easier to calculate, making it particularly useful for large complex pedigrees in which MLE calculation is impractical or infeasible. We apply our method to allele-frequency estimation in a Hutterite data set.
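
For orientation, a BLUE of this kind has the usual generalized least squares form. The notation below is assumed for illustration rather than taken from the paper: Y_i is half the number of copies of the allele carried by individual i, K is a kinship matrix with K_ii = 1 + F_i (F_i the inbreeding coefficient) and K_ij = 2φ_ij (φ_ij the kinship coefficient), and 1 is a vector of ones.

```latex
\hat{p}_{\mathrm{BLUE}}
  = \left(\mathbf{1}^{\top} K^{-1} \mathbf{1}\right)^{-1}
    \mathbf{1}^{\top} K^{-1} Y ,
\qquad
\operatorname{Var}\!\left(\hat{p}_{\mathrm{BLUE}}\right)
  = \tfrac{1}{2}\, p (1 - p)
    \left(\mathbf{1}^{\top} K^{-1} \mathbf{1}\right)^{-1} .
```

As a sanity check under these assumptions, setting K = I for n unrelated, noninbred individuals gives the sample frequency as the estimate, with variance p(1 - p)/(2n), which is the naive estimator the abstract compares against.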

20.
Relationships play a very important role in studies on quantitative genetics. In traditional breeding, pedigree records are used to establish relationships between animals; while this kind of relationship actually represents one kind of relatedness, it cannot distinguish individual specificity, capture the variation between individuals or determine the actual genetic superiority of an animal. However, with the popularization of high-throughput genotypes, assessments of relationships among animals based on genomic information could be a better option. In this study, we compared the relationships between animals based on pedigree and genomic information from two pig breeding herds with different genetic backgrounds and a simulated dataset. Two different methods were implemented to calculate genomic relationship coefficients and genomic kinship coefficients, respectively. Our results show that, for the same kind of relative, the average genomic relationship coefficients (G matrix) were very close to the pedigree relationship coefficients (A matrix), and on average, the corresponding values were halved in genomic kinship coefficients (K matrix). However, the genomic relationship yielded a larger variation than the pedigree relationship, and the latter was similar to that expected for one relative with no or little variation. Two genomic relationship coefficients were highly correlated, for farm1, farm2 and simulated data, and the correlations for the parent-offspring, full-sib and half-sib were 0.95, 0.90 and 0.85; 0.93, 0.96 and 0.89; and 0.52, 0.85 and 0.77, respectively. When the inbreeding coefficient was measured, the genomic information also yielded a higher inbreeding coefficient and a larger variation than that yielded by the pedigree information. For the two genetically divergent Large White populations, the pedigree relationship coefficients between the individuals were 0, and 62 310 and 175 271 animal pairs in the G matrix and K matrix were greater than 0. 
Our results demonstrated that genomic information outperformed the pedigree information; it can more accurately reflect the relationships and capture the variation that is not detected by pedigree. This information is very helpful in the estimation of genomic breeding values or gene mapping. In addition, genomic information is useful for pedigree correction. Further, our findings also indicate that genomic information can establish the genetic connection between different groups with different genetic background. In addition, it can be used to provide a more accurate measurement of the inbreeding of an animal, which is very important for the assessment of a population structure and breeding plan. However, the approaches for measuring genomic relationships need further investigation.
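
The abstract does not name the two methods it used, so the sketch below shows one widely used construction only as an illustration: a genomic relationship matrix of the form G = ZZ'/(2 * sum(p(1 - p))), usually attributed to VanRaden, with a kinship matrix taken as K = G/2. Genotypes are coded as 0, 1 or 2 copies of a reference allele, allele frequencies are estimated from the sample itself, and the function name is hypothetical.

```python
def genomic_relationship(genotypes):
    """Genomic relationship matrix from allele counts.
    genotypes: list of rows, one per animal, each a list of 0/1/2 counts,
    one per SNP. Assumes at least one SNP is polymorphic in the sample,
    otherwise the scaling factor is zero."""
    n_animals = len(genotypes)
    n_snps = len(genotypes[0])
    freqs = [sum(row[k] for row in genotypes) / (2.0 * n_animals)
             for k in range(n_snps)]
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    # Centre each allele count by twice the allele frequency.
    centred = [[row[k] - 2.0 * freqs[k] for k in range(n_snps)]
               for row in genotypes]
    return [[sum(centred[i][k] * centred[j][k] for k in range(n_snps)) / scale
             for j in range(n_animals)]
            for i in range(n_animals)]


# Three animals typed at two SNPs.
snp_counts = [[0, 2], [2, 0], [1, 1]]
G = genomic_relationship(snp_counts)
K = [[value / 2.0 for value in row] for row in G]
```

Halving G to obtain K mirrors the observation above that kinship coefficients are on average half the corresponding relationship coefficients.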
