Similar literature
 Found 20 similar articles (search time: 46 ms)
1.
Estimates of relatedness coefficients, based on genetic marker data, are often necessary for studies of genetics and ecology. Whilst many estimators based on method‐of‐moment or maximum‐likelihood methods exist for diploid organisms, no such estimators exist for organisms with multiple ploidy levels, which occur in some insect and plant species. Here, we extend five estimators to account for different levels of ploidy: one relatedness coefficient estimator, three coefficients of coancestry estimators and one maximum‐likelihood estimator. We use arrhenotoky (when unfertilized eggs develop into haploid males) as an example in evaluations of estimator performance by Monte Carlo simulation. Also, three virtual sex‐determination systems are simulated to evaluate their performances for higher levels of ploidy. Additionally, we used two real data sets to test the robustness of these estimators under actual conditions. We make available a software package, PolyRelatedness, for other researchers to apply to organisms that have various levels of ploidy.

2.
An estimator for pairwise relatedness using molecular markers   Total citations: 21, self-citations: 0, citations by others: 21
Wang J 《Genetics》2002,160(3):1203-1215
I propose a new estimator for jointly estimating two-gene and four-gene coefficients of relatedness between individuals from an outbreeding population with data on codominant genetic markers and compare it, by Monte Carlo simulations, to previous ones in precision and accuracy for different distributions of population allele frequencies, numbers of alleles per locus, actual relationships, sample sizes, and proportions of relatives included in samples. In contrast to several previous estimators, the new estimator is well behaved and applies to any number of alleles per locus and any allele frequency distribution. The estimates for two- and four-gene coefficients of relatedness from the new estimator are unbiased irrespective of the sample size and have sampling variances decreasing consistently with an increasing number of alleles per locus to the minimum asymptotic values determined by the variation in identity-by-descent among loci per se, regardless of the actual relationship. The new estimator is also robust for small sample sizes and for unknown relatives being included in samples for estimating allele frequencies. Compared to previous estimators, the new one is generally advantageous, especially for highly polymorphic loci and/or small sample sizes.

3.
Molecular marker data provide a means of circumventing the problem of not knowing the population structure of a natural population, as observed similarities between a pair's genotypes provide information on their genetic relationship. Numerous method-of-moment (MOM) estimators have been developed for estimating relationship coefficients using this information. Here, I present a simplified form of Wang's 2002 relationship estimator that is not dependent upon a previously required weighting scheme, thus improving the efficiency of the estimator when used with genuinely related pairs. The new estimator is compared against other estimators under a range of conditions, including situations where the parameter estimates are truncated to lie within the legitimate parameter space. The advantages of the new estimator are most notable for the two-gene coefficient of relatedness. Truncating the MOM estimators yields parameter estimates whose properties resemble those of maximum likelihood estimates: they generally have lower sampling variances but are biased.

4.
Relatedness estimators are widely used in genetic studies, but effects of population structure on performance of estimators, criteria to evaluate estimators, and benefits of using such estimators in conservation programs have to date received little attention. In this article we present new estimators, based on the relationship between coancestry and molecular similarity between individuals, and compare them with existing estimators using Monte Carlo simulation of populations, either panmictic or structured. Estimators were evaluated using statistical criteria and a diversity criterion that minimized relatedness. Results show that ranking of estimators depends on the population structure. An existing estimator based on two-gene and four-gene coefficients of identity performs best in panmictic populations, whereas a new estimator based on coancestry performs best in structured populations. The number of marker alleles and loci did not affect ranking of estimators. Statistical criteria were insufficient to evaluate estimators for their use in conservation programs. The regression coefficient of pedigree relatedness on estimated relatedness (beta2) was substantially lower than unity for all estimators, causing overestimation of the diversity conserved. A simple correction to achieve beta2 = 1 improves both existing and new estimators. Using relatedness estimates with correction considerably increased diversity in structured populations, but did not do so or even decreased diversity in panmictic populations.

5.
K Huang  S T Guo  M R Shattuck  S T Chen  X G Qi  P Zhang  B G Li 《Heredity》2015,114(2):133-142
Relatedness between individuals is central to ecological genetics. Multiple methods are available to quantify relatedness from molecular data, including method-of-moment and maximum-likelihood estimators. We describe a maximum-likelihood estimator for autopolyploids, and quantify its statistical performance under a range of biologically relevant conditions. The statistical performances of five additional polyploid estimators of relatedness were also quantified under identical conditions. When comparing truncated estimators, the maximum-likelihood estimator exhibited lower root mean square error under some conditions and was more biased for non-relatives, especially when the number of alleles per locus was low. However, even under these conditions, this bias was reduced to be statistically insignificant with more robust genetic sampling. We also considered ambiguity in polyploid heterozygote genotyping and developed a weighting methodology for candidate genotypes. The statistical performances of three polyploid estimators under both ideal and actual conditions (including inbreeding and double reduction) were compared. The software package POLYRELATEDNESS is available to perform this estimation and supports a maximum ploidy of eight.

6.
Studies of genetics and ecology often require estimates of relatedness coefficients based on genetic marker data. However, with the presence of null alleles, an observed genotype can represent one of several possible true genotypes. This results in biased estimates of relatedness. As the numbers of marker loci are often limited, loci with null alleles cannot be abandoned without substantial loss of statistical power. Here, we show how loci with null alleles can be incorporated into six estimators of relatedness (two novel). We evaluate the performance of various estimators before and after correction for null alleles. If the frequency of a null allele is <0.1, some estimators can be used directly without adjustment; if it is >0.5, the potency of estimation is too low and such a locus should be excluded. We make available a software package entitled PolyRelatedness v1.6, which enables researchers to optimize these estimators to best fit a particular data set.

7.
There has been remarkably little attention to using the high resolution provided by genotyping‐by‐sequencing (i.e., RADseq and similar methods) for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of data that could lead to downward‐biased, yet precise, estimates of relatedness. Here, we assess the applicability of genotyping‐by‐sequencing for relatedness inferences given its relatively high genotyping error rate. Individuals of known relatedness were simulated under genotyping error, allelic dropout and missing data scenarios based on an empirical ddRAD data set, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (Genetics Research, 67, 175) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP data set with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite data set of tens of markers. The simulation‐based approach used here can be easily implemented by others on their own genotyping‐by‐sequencing data sets to confirm the most appropriate and powerful estimator for their data.

8.
Studies of inbreeding depression or kin selection require knowledge of relatedness between individuals. If pedigree information is lacking, one has to rely on genotypic information to infer relatedness. In this study we investigated the performance (absolute and relative) of 10 marker-based relatedness estimators using allele frequencies at microsatellite loci obtained from natural populations of two bird species and one mammal species. Using Monte Carlo simulations we show that many factors affect the performance of estimators and that different sets of loci promote the use of different estimators: in general, there is no single best-performing estimator. The use of locus-specific weights turns out to greatly improve the performance of estimators when marker loci are used that differ strongly in allele frequency distribution. Microsatellite-based estimates are expected to explain between 25 and 79% of variation in true relatedness depending on the microsatellite dataset and on the population composition (i.e. the frequency distribution of relationship in the population). We recommend performing Monte Carlo simulations to decide which estimator to use in studies of pairwise relatedness.

9.
Wang J 《Molecular ecology》2004,13(10):3169-3178
Knowledge of the genetic relatedness between a pair of individuals is important in many research areas of quantitative genetics, conservation genetics, evolution and ecology. Many estimators have been developed to estimate such pairwise relatedness (r) using codominant markers, such as microsatellites and enzymes. In contrast, only two estimators have been proposed for relatedness inference with dominant markers, such as random amplified polymorphic DNAs (RAPDs) and amplified fragment length polymorphisms (AFLPs). They are both biased estimators, and their statistical properties and robustness to the sampling errors in allele frequency have not been investigated. In this short paper, I propose two new pairwise relatedness estimators for dominant markers, and compare them in precision, accuracy and robustness to sampling with the two previous estimators using simulations. It was found that the new estimator based on the least squares approach is unbiased when allele frequencies are known or estimated from a sample without correcting for sampling effects. It has, however, a low precision and as a result, an intermediate overall performance among the four estimators in terms of the mean squared deviation (MSD) of estimates from actual values of r. The new estimator based on a similarity index is slightly biased but has generally the lowest MSD among the four estimators compared, regardless of the number of loci, the type of actual relationship, and whether allele frequencies are known or estimated from samples. Simulations also show that the confidence intervals estimated by bootstrapping are appropriate for different estimators provided that the number of loci used in the estimation is not small.

10.
Marker-based methods for estimating heritability have been proposed as an effective means to study quantitative traits in long-lived organisms and natural populations. However, practical examinations to evaluate the usefulness and robustness of a regression method are limited. Using several quantitative traits of Japanese flounder Paralichthys olivaceus, the present study examined the influence of relatedness estimator and population structure on the estimation of heritability and genetic correlation under a regression method with 7 microsatellite loci. Significant heritability and genetic correlation were detected for several quantitative traits in 2 laboratory populations but not in a natural population. In the laboratory populations, upward bias in heritability appeared depending on the relatedness estimators and the populations. Upward bias in heritability increased as the actual variance of relatedness decreased, suggesting that the estimates of heritability under the regression method tend to be overestimated due to the underestimation of the actual variance of relatedness. Therefore, relationship structure and precise estimation of relatedness are critical for applying this method.

11.
1. Total species richness for an assemblage or site is a valuable measure in conservation monitoring and assessment, but protocols for sampling and species richness determination in wetland habitats such as ponds, bogs or mires remain largely unrefined. 2. Techniques for estimation of total richness of an assemblage based upon replicated sampling offer the opportunity to derive useful estimates of total richness based upon small numbers of samples, and limit sampling‐derived disturbance which can be particularly problematic in small aquatic habitats. 3. We quantified the performance of eight of the most commonly encountered estimators of species richness for a variety of littoral zone macrofauna from ponds, comparing estimated richness to maximum richness derived from sampling. 4. Estimates using non‐parametric techniques based on species incidence provided the most accurate and precise estimates. The estimators Chao2 and incidence‐based coverage estimator (ICE) from this category were reliable and consistent slight over‐estimators; the abundance‐based estimator Chao1 also performed well. 5. Species inventory based on relatively small numbers of samples might be significantly improved by use of non‐parametric estimators for quantification of species richness. 6. Use of non‐parametric estimators of species richness can assist biodiversity inventory by preventing erroneous rankings of habitat richness based upon observed species numbers from limited sampling.
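
As a rough illustration of the abundance‐based estimator named in this abstract, the sketch below computes the bias‐corrected form of Chao1 from a list of per‐species counts. The function name `chao1` and the input layout are assumptions made for this example; the abstract does not say which variant of Chao1 the authors used.

```python
def chao1(abundances):
    """Bias-corrected Chao1 richness estimate from per-species counts.

    `abundances` is a hypothetical input: one integer count per species.
    """
    counts = [n for n in abundances if n > 0]
    s_obs = len(counts)                      # species actually observed
    f1 = sum(1 for n in counts if n == 1)    # singletons
    f2 = sum(1 for n in counts if n == 2)    # doubletons
    # Bias-corrected form; stays defined even when there are no doubletons.
    return s_obs + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Seven observed species, three singletons, two doubletons.
    print(chao1([10, 5, 1, 1, 1, 2, 2]))
```

The estimate exceeds the observed count only when singletons are present, which reflects the idea that many rare species in a sample hint at further unseen species.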

12.
Shrinkage Estimators for Covariance Matrices   Total citations: 1, self-citations: 0, citations by others: 1
Estimation of covariance matrices in small samples has been studied by many authors. Standard estimators, like the unstructured maximum likelihood estimator (ML) or restricted maximum likelihood (REML) estimator, can be very unstable with the smallest estimated eigenvalues being too small and the largest too big. A standard approach to more stably estimating the matrix in small samples is to compute the ML or REML estimator under some simple structure that involves estimation of fewer parameters, such as compound symmetry or independence. However, these estimators will not be consistent unless the hypothesized structure is correct. If interest focuses on estimation of regression coefficients with correlated (or longitudinal) data, a sandwich estimator of the covariance matrix may be used to provide standard errors for the estimated coefficients that are robust in the sense that they remain consistent under misspecification of the covariance structure. With large matrices, however, the inefficiency of the sandwich estimator becomes worrisome. We consider here two general shrinkage approaches to estimating the covariance matrix and regression coefficients. The first involves shrinking the eigenvalues of the unstructured ML or REML estimator. The second involves shrinking an unstructured estimator toward a structured estimator. For both cases, the data determine the amount of shrinkage. These estimators are consistent and give consistent and asymptotically efficient estimates for regression coefficients. Simulations show the improved operating characteristics of the shrinkage estimators of the covariance matrix and the regression coefficients in finite samples. The final estimator chosen includes a combination of both shrinkage approaches, i.e., shrinking the eigenvalues and then shrinking toward structure.
We illustrate our approach on a sleep EEG study that requires estimation of a 24 x 24 covariance matrix and for which inferences on mean parameters critically depend on the covariance estimator chosen. We recommend making inference using a particular shrinkage estimator that provides a reasonable compromise between structured and unstructured estimators.
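
The second approach in this abstract, shrinking an unstructured estimator toward a structured one, can be sketched as a convex combination of the sample covariance matrix and an independence (diagonal) target. This is only a minimal illustration: the function names and the fixed `weight` argument are assumptions for the example, whereas in the paper the data determine the amount of shrinkage.

```python
def sample_covariance(rows):
    """Unstructured sample covariance (divisor n - 1) of a list of observations."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(p)]
        for i in range(p)
    ]


def shrink_toward_diagonal(cov, weight):
    """Return (1 - weight) * cov + weight * diag(cov).

    `weight` in [0, 1] is supplied by the caller here (an assumption of this
    sketch); 0 keeps the unstructured estimate, 1 gives the independence target.
    """
    p = len(cov)
    return [
        [(1 - weight) * cov[i][j] + (weight * cov[i][j] if i == j else 0.0)
         for j in range(p)]
        for i in range(p)
    ]


if __name__ == "__main__":
    data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
    print(shrink_toward_diagonal(sample_covariance(data), 0.5))
```

Variances on the diagonal are left unchanged while covariances are pulled toward zero, which is what makes the shrunken matrix more stable in small samples.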

13.
Estimation of relatedness by DNA fingerprinting   Total citations: 28, self-citations: 0, citations by others: 28
The recent discovery of hypervariable VNTR (variable number of tandem repeat) loci has led to much excitement among population biologists regarding the feasibility of deriving individual estimates of relatedness in field populations by DNA fingerprinting. It is shown that unbiased estimates of relatedness cannot be obtained at the individual level without knowledge of the allelic distributions in both the individuals of interest and the base population unless the proportion of shared marker alleles between unrelated individuals is essentially zero. Since the latter is usually on the order of 0.1-0.5 and since there are enormous practical difficulties in obtaining the former, only an approximate estimator for the relatedness can be given. The bias of this estimator is individual specific and inversely related to the number of marker loci and frequencies of marker alleles. Substantial sampling variance in estimates of relatedness arises from variation in identity by descent within and between loci and, with finite numbers of alleles, from variation in identity in state between genes that are not identical by descent. In the extreme case of 25 assayed loci, each with an effectively infinite number of alleles, the standard error of a relatedness estimate is no less than 14%, 20%, 35%, and 53% of the expectation for full sibs and second-, third-, and fourth-order relationships, respectively. Attempts to ascertain relatedness by means of DNA fingerprinting should proceed with caution.

14.
Several estimators have been proposed that use molecular marker data to infer the degree of relatedness for pairs of individuals. The objective of this study was to evaluate the performance of seven estimators when applied to marker data of a set of 33 key individuals from a large complex apple pedigree. The evaluation considered different scenarios of allele frequencies and different numbers of marker loci. The method of moments estimators were Similarity, Queller-Goodnight, Lynch-Ritland and Wang. The maximum likelihood estimators were Thompson, Anderson-Weir and Jacquard. The pedigree-based coancestry coefficients were taken as the point of reference in calculating correlations and root mean square error (RMSE). The marker data comprised 86 multi-allelic SSR markers on 17 linkage groups, covering 11 Morgans. Additionally, we simulated 10 datasets conditional on the real pedigree to support the results on the real dataset. None of the estimators outperformed the others. Knowledge of allele frequencies appeared to be the most influential, i.e., the highest correlations and lowest RMSE were found when frequencies from the founder population were available. When equal allele frequencies were used, all estimators resulted in very similar, but on average lower, correlations. The use of allele frequencies estimated from the set of 33 individuals gave, on average, the poorest results. The maximum likelihood estimators and the Lynch-Ritland estimator were the most sensitive to allele frequencies. The results from the simulation study fully supported the trends in results of the real dataset. This study indicated that high correlations (up to 0.90) and small RMSE (below 0.03) may be obtained when population allelic frequencies are available. In this scenario, the performances of the various estimators were similar, but seemed to favor the maximum likelihood estimators.
In the absence of reliable allele frequencies the method of moments estimators were shown to be more robust. The number of marker loci influenced the average performance of the estimators; however, the ranking was not affected. Correlations up to 0.80 were obtained when two markers per chromosome and appropriate allele frequencies were available. Adding more markers to the current dataset may lead to marginal improvements.

15.
The Demerelate package offers algorithms to calculate different interindividual relatedness measurements. Three different allele sharing indices, five pairwise weighted estimates of relatedness and four pairwise weighted estimates with sample size correction are implemented to analyse kinship structures within populations. Statistics are based on randomization tests; modelling relatedness coefficients by logistic regression, modelling relatedness with geographic distance by Mantel correlation and comparing mean relatedness between populations using pairwise t‐tests. Demerelate provides an advance on previous software packages by including some estimators not available in R to date, along with FIS, as well as combining analysis of relatedness and spatial structuring. A UPGMA tree visualizes genetic relatedness among individuals. Additionally, Demerelate summarizes information on data sets (allele vs. genotype frequencies; heterozygosity; FIS values). Demerelate is, to our knowledge, the first R package implementing basic allele sharing indices such as Blouin's Mxy relatedness, the estimator of Wang corrected for sample size (wangxy), estimators based on Moran's I adapted to genetic relatedness as well as combining all estimators with geographic information. The R environment enables users to better understand relatedness within populations due to the flexibility of Demerelate of accepting different data sets as empirical data, reference data, geographical data and by providing intermediate results. Each statistic and tool can be used separately, which helps to understand the suitability of the data for relatedness analysis, and can be easily implemented in custom pipelines.
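
To make the idea of an allele sharing index concrete, the sketch below scores a pair of diploid genotypes per locus as the fraction of alleles in each individual that also occur in the other, then averages over loci. This is one simple convention chosen for illustration; it is not Demerelate's code (the package is written in R), and the function names and genotype layout are assumptions. It should not be read as the exact definition of Blouin's Mxy.

```python
def locus_sharing(gx, gy):
    """Fraction of alleles in each genotype that also appear in the other.

    `gx` and `gy` are hypothetical diploid genotypes given as 2-tuples of
    allele labels, e.g. ("A", "B").
    """
    shared = sum(a in gy for a in gx) + sum(a in gx for a in gy)
    return shared / (len(gx) + len(gy))


def allele_sharing(ind_x, ind_y):
    """Average the per-locus sharing score over all loci typed in both individuals."""
    scores = [locus_sharing(gx, gy) for gx, gy in zip(ind_x, ind_y)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    x = [("A", "A"), ("C", "D")]
    y = [("A", "B"), ("E", "F")]
    print(allele_sharing(x, y))
```

Unlike the weighted estimators mentioned in the abstract, an index of this kind ignores population allele frequencies, so sharing of common alleles counts as much as sharing of rare ones.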

16.
Simple sequence repeats (SSR) are the most widely used molecular markers for relatedness inference due to their multi-allelic nature and high informativeness. However, there is a growing trend toward using high-throughput and inter-specific transferable single-nucleotide polymorphisms (SNP) and Diversity Arrays Technology (DArT) in forest genetics owing to their wide genome coverage. We compared the efficiency of 15 SSRs, 181 SNPs and 2816 DArTs to estimate the relatedness coefficients, and their effects on genetic parameters’ precision, in a relatively small data set of an open-pollinated progeny trial of Eucalyptus grandis (Hill ex Maiden) with limited relationship from the pedigree. Both simulations and real data of Eucalyptus grandis were used to study the statistical performance of three relatedness estimators based on co-dominant markers. Relatedness estimates in pairs of individuals belonging to the same family (related) were higher for DArTs than for SNPs and SSRs. DArTs performed better compared to SSRs and SNPs in estimated relatedness coefficients in pairs of individuals belonging to different families (unrelated) and showed higher ability to discriminate unrelated from related individuals. The likelihood-based estimator exhibited the lowest root mean squared error (RMSE); however, the differences in RMSE among the three estimators studied were small. For the growth traits, heritability estimates based on SNPs yielded, on average, smaller standard errors compared to those based on SSRs and DArTs. Estimated relatedness in the realized relationship matrix and heritabilities can be accurately inferred from co-dominant or sufficiently dense dominant markers in a relatively small E. grandis data set with shallow pedigree.

17.
Understanding the functional relationship between the sample size and the performance of species richness estimators is necessary to optimize limited sampling resources against estimation error. Nonparametric estimators such as Chao and Jackknife demonstrate strong performances, but consensus is lacking as to which estimator performs better under constrained sampling. We explore a method to improve the estimators under such a scenario. The method we propose involves randomly splitting species‐abundance data from a single sample into two equally sized samples, and using an appropriate incidence‐based estimator to estimate richness. To test this method, we assume a lognormal species‐abundance distribution (SAD) with varying coefficients of variation (CV), generate samples using MCMC simulations, and use the expected mean‐squared error as the performance criterion of the estimators. We test this method for Chao, Jackknife, ICE, and ACE estimators. Between abundance‐based estimators with the single sample, and incidence‐based estimators with the split‐in‐two samples, Chao2 performed the best when CV < 0.65, and incidence‐based Jackknife performed the best when CV > 0.65, given that the ratio of sample size to observed species richness is greater than a critical value given by a power function of CV with respect to abundance of the sampled population. The proposed method increases the performance of the estimators substantially and is more effective when more rare species are in an assemblage. We also show that the splitting method works qualitatively similarly well when the SADs are log series, geometric series, and negative binomial. We demonstrate an application of the proposed method by estimating richness of zooplankton communities in samples of ballast water.
The proposed splitting method is an alternative to sampling a large number of individuals to increase the accuracy of richness estimations; therefore, it is appropriate for a wide range of resource‐limited sampling scenarios in ecology.
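
The splitting idea described in this abstract can be sketched as follows: individuals from one abundance sample are randomly divided into two equal halves, each half is reduced to a presence list, and an incidence‐based estimator is applied to the two halves. The sketch uses the bias‐corrected form of Chao2 as the incidence‐based estimator; the function names, the input layout and the handling of an odd total (one individual is dropped) are assumptions of this example, not details taken from the paper.

```python
import random


def split_abundances(abundances, rng):
    """Randomly split individuals into two equally sized presence sets.

    `abundances[i]` is the count of species i. If the total is odd, one
    randomly chosen individual is left out (an assumption of this sketch).
    """
    individuals = [sp for sp, n in enumerate(abundances) for _ in range(n)]
    rng.shuffle(individuals)
    half = len(individuals) // 2
    return set(individuals[:half]), set(individuals[half:2 * half])


def chao2(samples):
    """Bias-corrected Chao2 from a list of per-sample species sets."""
    m = len(samples)
    incidence = {}
    for sample in samples:
        for sp in sample:
            incidence[sp] = incidence.get(sp, 0) + 1
    s_obs = len(incidence)
    q1 = sum(1 for k in incidence.values() if k == 1)  # species in exactly one sample
    q2 = sum(1 for k in incidence.values() if k == 2)  # species in exactly two samples
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the split is reproducible
    first, second = split_abundances([12, 7, 3, 1, 1, 1], rng)
    print(chao2([first, second]))
```

Because the split is random, repeating it with different seeds and averaging the estimates would be a natural extension, although the abstract does not state whether the authors did so.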

18.
The software package COANCESTRY implements seven relatedness estimators and three inbreeding estimators to estimate relatedness and inbreeding coefficients from multilocus genotype data. Two likelihood estimators that allow for inbred individuals and account for genotyping errors are for the first time included in this user-friendly program for PCs running the Windows operating system. A simulation module is built into the program to simulate multilocus genotype data of individuals with a predefined relationship, and to compare the estimators and the simulated relatedness values to facilitate the selection of the best estimator in a particular situation. Bootstrapping and permutations are used to obtain the 95% confidence intervals of each relatedness or inbreeding estimate, and to test the difference in averages between groups.
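
A percentile bootstrap of the kind mentioned in this abstract might look like the sketch below, which resamples loci with replacement and recomputes a multilocus average each time. The abstract does not describe how COANCESTRY performs its bootstrap, so resampling over loci, averaging per‐locus values, and the names `bootstrap_ci`, `n_boot` and `seed` are all assumptions made for illustration.

```python
import random


def bootstrap_ci(per_locus_values, n_boot=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of per-locus estimates.

    `per_locus_values` is a hypothetical list holding one relatedness
    estimate per locus for a single pair of individuals.
    """
    rng = random.Random(seed)
    k = len(per_locus_values)
    # Each replicate draws k loci with replacement and averages them.
    means = sorted(
        sum(rng.choice(per_locus_values) for _ in range(k)) / k
        for _ in range(n_boot)
    )
    lo_idx = int((1 - level) / 2 * n_boot)
    hi_idx = int((1 + level) / 2 * n_boot) - 1
    return means[lo_idx], means[hi_idx]


if __name__ == "__main__":
    values = [0.42, 0.55, 0.31, 0.60, 0.48, 0.52, 0.39, 0.47]
    low, high = bootstrap_ci(values)
    print(low, high)
```

With few loci the interval is wide and unstable, which is consistent with the caution in abstract 9 that bootstrap intervals are appropriate only when the number of loci is not small.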

19.
Using striped bass (Morone saxatilis) and six multiplexed microsatellite markers, we evaluated procedures for estimating allele frequencies by pooling DNA from multiple individuals, a method suggested as cost-effective relative to individual genotyping. Using moment-based estimators, we estimated allele frequencies in experimental DNA pools and found that the three primary laboratory steps, DNA quantitation and pooling, PCR amplification, and electrophoresis, accounted for 23, 48, and 29%, respectively, of the technical variance of estimates in pools containing DNA from 2-24 individuals. Exact allele-frequency estimates could be made for pools of sizes 2-8, depending on the locus, by using an integer-valued estimator. Larger pools of size 12 and 24 tended to yield biased estimates; however, replicates of these estimates detected allele frequency differences among pools with different allelic compositions. We also derive an unbiased estimator of Hardy-Weinberg disequilibrium coefficients that uses multiple DNA pools and analyze the cost-efficiency of DNA pooling. DNA pooling yields the most potential cost savings when a large number of loci are employed using a large number of individuals, a situation becoming increasingly common as microsatellite loci are developed in increasing numbers of taxa.

20.
Maximum-likelihood estimation of relatedness   Total citations: 8, self-citations: 0, citations by others: 8
Milligan BG 《Genetics》2003,163(3):1153-1167
Relatedness between individuals is central to many studies in genetics and population biology. A variety of estimators have been developed to enable molecular marker data to quantify relatedness. Despite this, no effort has been given to characterize the traditional maximum-likelihood estimator in relation to the remainder. This article quantifies its statistical performance under a range of biologically relevant sampling conditions. Under the same range of conditions, the statistical performance of five other commonly used estimators of relatedness is quantified. Comparison among these estimators indicates that the traditional maximum-likelihood estimator exhibits a lower standard error under essentially all conditions. Only for very large amounts of genetic information do most of the other estimators approach the likelihood estimator. However, the likelihood estimator is more biased than any of the others, especially when the amount of genetic information is low or the actual relationship being estimated is near the boundary of the parameter space. Even under these conditions, the amount of bias can be greatly reduced, potentially to biologically irrelevant levels, with suitable genetic sampling. Additionally, the likelihood estimator generally exhibits the lowest root mean-square error, an indication that the bias in fact is quite small. Alternative estimators restricted to yield only biologically interpretable estimates exhibit lower standard errors and greater bias than do unrestricted ones, but generally do not improve over the maximum-likelihood estimator and in some cases exhibit even greater bias. Although some nonlikelihood estimators exhibit better performance with respect to specific metrics under some conditions, none approach the high level of performance exhibited by the likelihood estimator across all conditions and all metrics of performance.
