Similar literature
1.

Background

Genetic relatedness or similarity between individuals is a key concept in population, quantitative and conservation genetics. When the pedigree of a population is available and assuming a founder population from which the genealogical records start, genetic relatedness between individuals can be estimated by the coancestry coefficient. If pedigree data is lacking or incomplete, estimation of the genetic similarity between individuals relies on molecular markers, using either molecular coancestry or molecular covariance. Some relationships between genealogical and molecular coancestries and covariances have already been described in the literature.

Methods

We show how the expected values of the empirical measures of similarity based on molecular marker data are functions of the genealogical coancestry. From these formulas, it is easy to derive estimators of genealogical coancestry from molecular data. We include variation of allelic frequencies in the estimators.

Results

The estimators are illustrated with simulated examples and with a real dataset from dairy cattle. In general, estimators are accurate and only slightly biased. From the real data set, estimators based on covariances are more compatible with genealogical coancestries than those based on molecular coancestries. A frequently used estimator based on the average of estimated coancestries produced inflated coancestries and numerical instability. The consequences of unknown gene frequencies in the founder population are briefly discussed, along with alternatives to overcome this limitation.

Conclusions

Estimators of genealogical coancestry based on molecular data are easy to derive. Estimators based on molecular covariance are more accurate than those based on identity by state. A correction considering the random distribution of allelic frequencies improves accuracy of these estimators, especially for populations with very strong drift.
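As a rough illustration of the identity-by-state measure mentioned in this abstract, the sketch below computes molecular coancestry between two diploid genotypes and then rescales it into a coancestry estimate. The abstract gives no formulas, so the correction used here is an assumption: it relies on the textbook relation E[f_M] = f + (1 - f) Σp², where p are founder allele frequencies at a locus. The function names and the toy genotypes are hypothetical and are not taken from the paper.

```python
from itertools import product


def molecular_coancestry(geno_i, geno_j):
    """Mean identity by state between two diploid individuals.

    Each genotype is a list of (allele_a, allele_b) tuples, one per locus.
    """
    if len(geno_i) != len(geno_j):
        raise ValueError("genotypes must cover the same loci")
    total = 0.0
    for locus_i, locus_j in zip(geno_i, geno_j):
        # Four possible pairs: one allele drawn from each individual.
        matches = sum(a == b for a, b in product(locus_i, locus_j))
        total += matches / 4.0
    return total / len(geno_i)


def estimate_coancestry(f_mol, founder_freqs):
    """Rescale molecular coancestry using founder allele frequencies.

    Assumes E[f_mol] = f + (1 - f) * sum(p^2); solving for f gives the
    expression below. This is an assumed textbook form, not the paper's.
    """
    expected_ibs = sum(p * p for p in founder_freqs)
    if expected_ibs >= 1.0:
        raise ValueError("locus is monomorphic; coancestry is not estimable")
    return (f_mol - expected_ibs) / (1.0 - expected_ibs)


# Toy example with two biallelic loci (hypothetical data).
ind_1 = [("A", "A"), ("A", "B")]
ind_2 = [("A", "A"), ("B", "B")]
f_m = molecular_coancestry(ind_1, ind_2)
f_hat = estimate_coancestry(f_m, [0.5, 0.5])
```

Note that the rescaled value can fall outside [0, 1] when founder frequencies are misspecified, which is one reason the unknown founder frequencies discussed in the abstract matter.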

2.
Hardy OJ 《Molecular ecology》2003,12(6):1577-1588
A new estimator of the pairwise relatedness coefficient between individuals adapted to dominant genetic markers is developed. This estimator does not assume genotypes to be in Hardy-Weinberg proportions but requires a knowledge of the departure from these proportions (i.e. the inbreeding coefficient). Simulations show that the estimator provides accurate estimates, except for some particular types of individual pairs such as full-sibs, and performs better than a previously developed estimator. When comparing marker-based relatedness estimates with pedigree expectations, a new approach to account for the change of the reference population is developed and shown to perform satisfactorily. Simulations also illustrate that this new relatedness estimator can be used to characterize isolation by distance within populations, leading to essentially unbiased estimates of the neighbourhood size. In this context, the estimator appears fairly robust to moderate errors made on the assumed inbreeding coefficient. The analysis of real data sets suggests that dominant markers (random amplified polymorphic DNA, amplified fragment length polymorphism) may be as valuable as co-dominant markers (microsatellites) in studying microgeographic isolation-by-distance processes. It is argued that the estimators developed should find major applications, notably for conservation biology.

3.
Estimates of relatedness coefficients, based on genetic marker data, are often necessary for studies of genetics and ecology. Whilst many estimators based on method‐of‐moment or maximum‐likelihood methods exist for diploid organisms, no such estimators exist for organisms with multiple ploidy levels, which occur in some insect and plant species. Here, we extend five estimators to account for different levels of ploidy: one relatedness coefficient estimator, three coefficients of coancestry estimators and one maximum‐likelihood estimator. We use arrhenotoky (when unfertilized eggs develop into haploid males) as an example in evaluations of estimator performance by Monte Carlo simulation. Also, three virtual sex‐determination systems are simulated to evaluate their performances for higher levels of ploidy. Additionally, we used two real data sets to test the robustness of these estimators under actual conditions. We make available a software package, PolyRelatedness, for other researchers to apply to organisms that have various levels of ploidy.

4.
Maintaining genetic variation and controlling the increase in inbreeding are crucial requirements in animal conservation programs. The most widely accepted strategy for achieving these objectives is to maximize the effective population size by minimizing the global coancestry obtained from a particular pedigree. However, for most natural or captive populations genealogical information is absent. In this situation, microsatellites have traditionally been the markers of choice to characterize genetic variation, and several estimators of genealogical coefficients have been developed using marker data, with unsatisfactory results. The development of high-throughput genotyping techniques calls for a review of the paradigm that genealogical coancestry is the best parameter for measuring genetic diversity. In this study, the Illumina PorcineSNP60 BeadChip was used to obtain genome-wide estimates of rates of coancestry and inbreeding and effective population size for an ancient strain of Iberian pigs that is now in serious danger of extinction and for which very accurate genealogical information is available (the Guadyerbas strain). Genome-wide estimates were compared with those obtained from microsatellite and from pedigree data. Estimates of coancestry and inbreeding computed from the SNP chip were strongly correlated with genealogical estimates and these correlations were substantially higher than those between microsatellite and genealogical coefficients. Also, molecular coancestry computed from SNP information was a better predictor of genealogical coancestry than coancestry computed from microsatellites. Rates of change in coancestry and inbreeding and effective population size estimated from molecular data were very similar to those estimated from genealogical data. However, estimates of effective population size obtained from changes in coancestry or inbreeding differed. Our results indicate that genome-wide information represents a useful alternative to genealogical information for measuring and maintaining genetic diversity.

5.
Analyses of pairwise relatedness represent a key component to addressing many topics in biology. However, such analyses have been limited because most available programs provide a means to estimate relatedness based on only a single estimator, making comparison across estimators difficult. Second, all programs to date have been platform specific, working only on a specific operating system. This has the undesirable outcome of making choice of relatedness estimator limited by operating system preference, rather than being based on scientific rationale. Here, we present a new R package, called related, that can calculate relatedness based on seven estimators, can account for genotyping errors, missing data and inbreeding, and can estimate 95% confidence intervals. Moreover, simulation functions are provided that allow for easy comparison of the performance of different estimators and for analyses of how much resolution to expect from a given data set. Because this package works in R, it is platform independent. Combined, this functionality should allow for more appropriate analyses and interpretation of pairwise relatedness and will also allow for the integration of relatedness data into larger R workflows.

6.
Molecular markers make it possible to estimate the pairwise relatedness between the members of a breeding pool when their selection history is no longer available or has become too complex for a classical pedigree analysis. The field of population genetics has several estimation procedures at its disposal, but when the genotyped individuals are highly selected inbred lines, their application is not warranted as the theoretical assumptions on which these estimators were built, usually linkage equilibrium between marker loci or even Hardy–Weinberg equilibrium, are not met. An alternative approach requires the availability of a genotyped reference set of inbred lines, which makes it possible to correct the observed marker similarities for their inherent upward bias when used as a coancestry measure. However, this approach does not guarantee that the resulting coancestry matrix is at least positive semi-definite (psd), a necessary condition for its use as a covariance matrix. In this paper we present the weighted alikeness in state (WAIS) estimator. This marker-based coancestry estimator is compared to several other commonly applied relatedness estimators under realistic hybrid breeding conditions in a number of simulations. We also fit a linear mixed model to phenotypical data from a commercial maize breeding programme and compare the likelihood of the different variance structures. WAIS is shown to be psd, which makes it suitable for modelling the covariance between genetic components in linear mixed models involved in breeding value estimation or association studies. Results indicate that it generally produces a low root mean squared error under different breeding circumstances and provides a fit to the data that is comparable to that of several other marker-based alternatives. Recommendations for each of the examined coancestry measures are provided.

7.
Wang J 《Molecular ecology》2004,13(10):3169-3178
Knowledge of the genetic relatedness between a pair of individuals is important in many research areas of quantitative genetics, conservation genetics, evolution and ecology. Many estimators have been developed to estimate such pairwise relatedness (r) using codominant markers, such as microsatellites and enzymes. In contrast, only two estimators are proposed to use dominant markers, such as random amplified polymorphic DNAs (RAPDs) and amplified fragment length polymorphisms (AFLPs), in relatedness inference. They are both biased estimators, and their statistical properties and robustness to the sampling errors in allele frequency have not been investigated. In this short paper, I propose two new pairwise relatedness estimators for dominant markers, and compare them in precision, accuracy and robustness to sampling with the two previous estimators using simulations. It was found that the new estimator based on the least squares approach is unbiased when allele frequencies are known or estimated from a sample without correcting for sampling effects. It has, however, a low precision and as a result, an intermediate overall performance among the four estimators in terms of the mean squared deviation (MSD) of estimates from actual values of r. The new estimator based on a similarity index is slightly biased but has generally the lowest MSD among the four estimators compared, regardless of the number of loci, type of actual relationships, allele frequencies known or estimated from samples. Simulations also show that the confidence intervals estimated by bootstrapping are appropriate for different estimators provided that the number of loci used in the estimation is not small.

8.
Fernández J  Toro MA  Caballero A 《Genetics》2008,179(1):683-692
Within the context of a conservation program the management of subdivided populations implies a compromise between the control of the global genetic diversity, the avoidance of high inbreeding levels, and, sometimes, the maintenance of a certain degree of differentiation between subpopulations. We present a dynamic and flexible methodology, based on genealogical information, for the maximization of the genetic diversity (measured through the global population coancestry) in captive subdivided populations while controlling/restricting the levels of inbreeding. The method is able to implement specific restrictions on the desired relative levels of coancestry between and within subpopulations. By accounting for the particular genetic population structure, the method determines the optimal contributions (i.e., number of offspring) of each individual, the number of migrants, and the particular subpopulations involved in the exchange of individuals. Computer simulations are used to illustrate the procedure and its performance in a range of reasonable scenarios. The method performs well in most situations and is shown to be more efficient than the commonly accepted one-migrant-per-generation strategy.

9.
There has been remarkably little attention to using the high resolution provided by genotyping‐by‐sequencing (i.e., RADseq and similar methods) for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of data that could lead to downward‐biased, yet precise, estimates of relatedness. Here, we assess the applicability of genotyping‐by‐sequencing for relatedness inferences given its relatively high genotyping error rate. Individuals of known relatedness were simulated under genotyping error, allelic dropout and missing data scenarios based on an empirical ddRAD data set, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (Genetics Research, 67, 175) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP data set with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite data set of tens of markers. The simulation‐based approach used here can be easily implemented by others on their own genotyping‐by‐sequencing data sets to confirm the most appropriate and powerful estimator for their data.

10.

Background

The most efficient method to maintain genetic diversity in populations under conservation programmes is to optimize, for each potential parent, the number of offspring left to the next generation by minimizing the global coancestry. Coancestry is usually calculated from genealogical data but molecular markers can be used to replace genealogical coancestry with molecular coancestry. Recent studies showed that optimizing contributions based on coancestry calculated from a large number of SNP markers can maintain higher levels of diversity than optimizing contributions based on genealogical data. In this study, we investigated how SNP density and effective population size impact the use of molecular coancestry to maintain diversity.

Results

At low SNP densities, the genetic diversity maintained using genealogical coancestry for optimization was higher than that maintained using molecular coancestry. The performance of molecular coancestry improved with increasing marker density, and, for the scenarios evaluated, it was as efficient as genealogical coancestry if SNP density reached at least 3 times the effective population size. However, increasing SNP density gave diminishing returns in terms of maintained diversity. While a benefit of 12% was achieved when marker density increased from 10 to 100 SNP/Morgan, the benefit was only 2% when it increased from 100 to 500 SNP/Morgan.

Conclusions

The marker density of most SNP chips already available for farm animals is sufficient for molecular coancestry to outperform genealogical coancestry in conservation programmes aimed at maintaining genetic diversity. For the purpose of effectively maintaining genetic diversity, a marker density of around 500 SNPs/Morgan can be considered the most cost-effective density when developing SNP chips for new species. Since the costs of developing SNP chips are decreasing, chips with 500 SNPs/Morgan should become available for non-domestic species in the near future.

11.
The Demerelate package offers algorithms to calculate different interindividual relatedness measurements. Three different allele sharing indices, five pairwise weighted estimates of relatedness and four pairwise weighted estimates with sample size correction are implemented to analyse kinship structures within populations. Statistics are based on randomization tests; modelling relatedness coefficients by logistic regression, modelling relatedness with geographic distance by Mantel correlation and comparing mean relatedness between populations using pairwise t‐tests. Demerelate provides an advance on previous software packages by including some estimators not available in R to date, along with FIS, as well as combining analysis of relatedness and spatial structuring. A UPGMA tree visualizes genetic relatedness among individuals. Additionally, Demerelate summarizes information on data sets (allele vs. genotype frequencies; heterozygosity; FIS values). Demerelate is – to our knowledge – the first R package implementing basic allele sharing indices such as Blouin's Mxy relatedness, the estimator of Wang corrected for sample size (wangxy), estimators based on Moran's I adapted to genetic relatedness as well as combining all estimators with geographic information. The R environment enables users to better understand relatedness within populations due to the flexibility of Demerelate in accepting different data sets as empirical data, reference data and geographical data, and in providing intermediate results. Each statistic and tool can be used separately, which helps to understand the suitability of the data for relatedness analysis, and can be easily implemented in custom pipelines.

12.
Reynolds J  Weir BS  Cockerham CC 《Genetics》1983,105(3):767-779
A distance measure for populations diverging by drift only is based on the coancestry coefficient θ, and three estimators of the distance D = -ln(1 - θ) are constructed for multiallelic, multilocus data. Simulations of a monoecious population mating at random showed that a weighted ratio of single-locus estimators performed better than an unweighted average or a least squares estimator. Jackknifing over loci provided satisfactory variance estimates of distance values. In the drift situation, in which mutation is excluded, the weighted estimator of D appears to be a better measure of distance than others that have appeared in the literature.
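The distance D = -ln(1 - θ) in this abstract can be evaluated directly once θ is known. The minimal sketch below does that, and also shows why the transform is convenient under pure drift. The drift recursion θ_t = 1 - (1 - 1/(2N))^t is the standard result for an idealized random-mating population of constant size N; it is an assumption added here for illustration, not something stated in the abstract. The function names are hypothetical, and the three estimators of θ compared in the paper are not reproduced.

```python
import math


def reynolds_distance(theta):
    """Distance D = -ln(1 - theta) for a coancestry coefficient theta."""
    if not 0.0 <= theta < 1.0:
        raise ValueError("theta must lie in [0, 1)")
    return -math.log(1.0 - theta)


def theta_after_drift(generations, pop_size):
    """Expected coancestry after t generations of pure drift.

    Assumes an idealized population of constant size N with no mutation,
    so that theta_t = 1 - (1 - 1/(2N))^t (an added textbook assumption).
    """
    return 1.0 - (1.0 - 1.0 / (2.0 * pop_size)) ** generations


# Under this model D grows linearly with time, roughly t / (2N).
theta_10 = theta_after_drift(generations=10, pop_size=50)
d_10 = reynolds_distance(theta_10)
```

With these toy values, D comes out close to t/(2N) = 0.1, which is the sense in which D behaves as a distance proportional to divergence time when drift is the only force.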

13.
The previously reported maximum-likelihood pairwise relatedness (r) estimator of Thompson and Milligan (M) was extended to allow for negative r estimates under the regression interpretation of r. This was achieved by establishing the equivalency of the likelihoods used in the kinship program and the likelihoods of Thompson. The new maximum-likelihood (ML) estimator was evaluated by Monte Carlo simulations. It was found that the new ML estimator became unbiased significantly faster than the original M estimator as the amount of genotype information was increased. The effects of allele frequency estimation errors on the new and existing relatedness estimators were also considered.

14.
Genomics provides new opportunities for conservation genetics. Conservation genetics in livestock is based on estimating diversity by pedigree relatedness and managing diversity by choosing those animals that maximize genetic diversity. Animals can be chosen as parents for the next generation, as donors of material to a gene bank, or as breeds for targeting conservation efforts. Genomics provides opportunities to estimate diversity for specific parts of the genome, such as neutral and adaptive diversity and genetic diversity underlying specific traits. This enables us to choose candidates for conservation based on specific genetic diversity (e.g. diversity of traits or adaptive diversity) or to monitor the loss of diversity without conservation. In wild animals direct genetic management, by choosing candidates for conservation as in livestock, is generally not practiced. With dense marker maps opportunities exist for monitoring relatedness and genetic diversity in wild populations, thus enabling a more active management of diversity.

15.
Conservation programmes aim at maximizing the survival probability of populations, by minimizing the loss of genetic diversity, which allows populations to adapt to changes, and controlling inbreeding increases. The best known strategy to achieve these goals is optimizing the contributions of the parents to minimize global coancestry in their offspring. Results on neutral scenarios showed that management based on molecular coancestry could maintain more diversity than management based on genealogical coancestry when a large number of markers were available. However, if the population has deleterious mutations, managing using optimal contributions can lead to a decrease in fitness, especially using molecular coancestry, because both beneficial and harmful alleles are maintained, compromising the long‐term viability of the population. We introduce here two strategies to avoid this problem: The first one uses molecular coancestry calculated removing markers with low minor allele frequencies, as they could be linked to selected loci. The second one uses a coancestry based on segments of identity by descent, which measures the proportion of genome segments shared by two individuals because of a common ancestor. We compare these strategies under two contrasting mutational models of fitness effects, one assuming many mutations of small effect and another with few mutations of large effect. Using markers at intermediate frequencies maintains a larger fitness than using all markers, but leads to maintaining less diversity. Using the segment‐based coancestry provides a compromise solution between maintaining diversity and fitness, especially when the population has some inbreeding load.

16.
17.
Methods to evaluate populations for alleles to improve an elite hybrid (total citations: 1; self-citations: 0; citations by others: 1)
Elite hybrids can be improved by the introgression of favorable alleles not already present in the hybrid. Our first objective was to evaluate several estimators derived from quantitative genetic theory that attempt to quantify the relative number of useful alleles in potential donor populations. Secondly, we wanted to evaluate two proposed ways of determining relatedness of donor populations to the parents of the elite hybrid. Two experiments, each consisting of 21 maize populations of known pedigree, were grown at three and four environments in Minnesota in 1991. Yield and plant height means were used to provide estimates of each of the following statistics: (1) LPLU, a minimally biased statistic, (2) UBND, the minimum estimate of an upper bound, (3) NI, the net improvement, (4) PTC, the predicted three-way cross, and (5) TCSC, the testcross of the populations. These statistics are biased estimators of the relative number of unique favorable alleles contained within a population compared to a reference elite hybrid. Based on rank correlations, all statistics except NI ranked populations similarly. The percent novel germplasm relative to the single cross to be improved was positively correlated with the estimates of favorable alleles except when NI was used as the estimator. The relationship estimators agreed with the genetic constitution of the donor populations. Strong positive correlations existed between diversity, based on the relationship rankings, and all the estimator rankings, except NI. Potential donor populations were effectively identified by LPLU, UBND, PTC, and TCSC. NI was not a good estimator of unique favorable alleles.

18.
Studies of inbreeding depression or kin selection require knowledge of relatedness between individuals. If pedigree information is lacking, one has to rely on genotypic information to infer relatedness. In this study we investigated the performance (absolute and relative) of 10 marker-based relatedness estimators using allele frequencies at microsatellite loci obtained from natural populations of two bird species and one mammal species. Using Monte Carlo simulations we show that many factors affect the performance of estimators and that different sets of loci promote the use of different estimators: in general, there is no single best-performing estimator. The use of locus-specific weights turns out to greatly improve the performance of estimators when marker loci are used that differ strongly in allele frequency distribution. Microsatellite-based estimates are expected to explain between 25 and 79% of variation in true relatedness depending on the microsatellite dataset and on the population composition (i.e. the frequency distribution of relationship in the population). We recommend performing Monte Carlo simulations to decide which estimator to use in studies of pairwise relatedness.

19.
The computer program identix estimates relatedness in natural populations using multilocus genotypic data. Queller & Goodnight's (1989) and Lynch & Ritland's (1999) estimators of pairwise relatedness are implemented, as well as the identity index of Mathieu et al. (1990). Estimates of the confidence intervals around these pairwise values are also provided. The null hypothesis of no relatedness (multilocus genotypes are independent draws from a panmictic population) is tested using a permutation method that compares the observed distribution of the moments of pairwise relatedness coefficients to that expected in an unstructured population.

20.
MOTIVATION: Ranking feature sets is a key issue for classification, for instance, phenotype classification based on gene expression. Since ranking is often based on error estimation, and error estimators suffer differing degrees of imprecision in small-sample settings, it is important to choose a computationally feasible error estimator that yields good feature-set ranking. RESULTS: This paper examines the feature-ranking performance of several kinds of error estimators: resubstitution, cross-validation, bootstrap and bolstered error estimation. It does so for three classification rules: linear discriminant analysis, three-nearest-neighbor classification and classification trees. Two measures of performance are considered. One counts the number of the truly best feature sets appearing among the best feature sets discovered by the error estimator and the other computes the mean absolute error between the top ranks of the truly best feature sets and their ranks as given by the error estimator. Our results indicate that bolstering is superior to bootstrap, and bootstrap is better than cross-validation, for discovering top-performing feature sets for classification when using small samples. A key issue is that bolstered error estimation is tens of times faster than bootstrap, and faster than cross-validation, and is therefore feasible for feature-set ranking when the number of feature sets is extremely large.
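To make the difference between two of the error estimators named in this abstract concrete, the sketch below contrasts resubstitution with leave-one-out cross-validation for a k-nearest-neighbour rule on a tiny sample. It is only an illustration under stated assumptions: the data are invented, the function names are hypothetical, and the bootstrap and bolstered estimators studied in the paper are not implemented here.

```python
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority label among the k training points closest to query.

    train is a list of (features, label) pairs; features are tuples.
    """
    def sq_dist(point):
        return sum((a - b) ** 2 for a, b in zip(point[0], query))

    nearest = sorted(train, key=sq_dist)[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def resubstitution_error(data, k=3):
    """Error rate when the classifier is tested on its own training set."""
    wrong = sum(knn_predict(data, x, k) != y for x, y in data)
    return wrong / len(data)


def leave_one_out_error(data, k=3):
    """Error rate when each point is predicted from all the others."""
    wrong = 0
    for i, (x, y) in enumerate(data):
        held_out_train = data[:i] + data[i + 1:]
        wrong += knn_predict(held_out_train, x, k) != y
    return wrong / len(data)


# Invented one-feature sample with two classes that touch near x = 2.
sample = [((0.0,), 0), ((1.0,), 0), ((2.0,), 0),
          ((2.5,), 1), ((3.5,), 1), ((4.5,), 1)]
resub = resubstitution_error(sample, k=1)
loo = leave_one_out_error(sample, k=1)
```

With k = 1 and distinct points, every training point is its own nearest neighbour, so resubstitution reports no errors, while leave-one-out exposes the two misclassified points at the class boundary. This optimistic bias in small samples is the kind of imprecision the abstract refers to.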
