Similar articles
 20 similar articles found (search time: 31 ms)
1.
Anderson AD  Weir BS 《Genetics》2007,176(1):421-440
A maximum-likelihood estimator for pairwise relatedness is presented for the situation in which the individuals under consideration come from a large outbred subpopulation of the population for which allele frequencies are known. We demonstrate via simulations that a variety of commonly used estimators that do not take this kind of misspecification of allele frequencies into account will systematically overestimate the degree of relatedness between two individuals from a subpopulation. A maximum-likelihood estimator that includes F(ST) as a parameter is introduced with the goal of producing the relatedness estimates that would have been obtained if the subpopulation allele frequencies had been known. This estimator is shown to work quite well, even when the value of F(ST) is misspecified. Bootstrap confidence intervals are also examined and shown to exhibit close to nominal coverage when F(ST) is correctly specified.

2.
Maximum-likelihood estimation of relatedness   Total citations: 8, self-citations: 0, citations by others: 8
Milligan BG 《Genetics》2003,163(3):1153-1167
Relatedness between individuals is central to many studies in genetics and population biology. A variety of estimators have been developed to enable molecular marker data to quantify relatedness. Despite this, no effort has been given to characterize the traditional maximum-likelihood estimator in relation to the remainder. This article quantifies its statistical performance under a range of biologically relevant sampling conditions. Under the same range of conditions, the statistical performance of five other commonly used estimators of relatedness is quantified. Comparison among these estimators indicates that the traditional maximum-likelihood estimator exhibits a lower standard error under essentially all conditions. Only for very large amounts of genetic information do most of the other estimators approach the likelihood estimator. However, the likelihood estimator is more biased than any of the others, especially when the amount of genetic information is low or the actual relationship being estimated is near the boundary of the parameter space. Even under these conditions, the amount of bias can be greatly reduced, potentially to biologically irrelevant levels, with suitable genetic sampling. Additionally, the likelihood estimator generally exhibits the lowest root mean-square error, an indication that the bias in fact is quite small. Alternative estimators restricted to yield only biologically interpretable estimates exhibit lower standard errors and greater bias than do unrestricted ones, but generally do not improve over the maximum-likelihood estimator and in some cases exhibit even greater bias. Although some nonlikelihood estimators exhibit better performance with respect to specific metrics under some conditions, none approach the high level of performance exhibited by the likelihood estimator across all conditions and all metrics of performance.
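
Editorial illustration (not taken from the article): a minimal sketch of a dyadic maximum-likelihood relatedness estimate, assuming non-inbred diploid individuals, unlinked codominant loci, Hardy-Weinberg proportions and known allele frequencies. The pair is assumed to share 0, 1 or 2 alleles identical by descent with probabilities k0, k1, k2, and r = k1/2 + k2. The function names, the coarse grid search and the data layout are all hypothetical choices made for this sketch; the article's own implementation is not described in the abstract.

```python
import math

def hw_prob(g, p):
    """Hardy-Weinberg probability of an unordered genotype g = (a, b)."""
    a, b = g
    return p[a] * p[a] if a == b else 2.0 * p[a] * p[b]

def orderings(g):
    """Ordered allele pairs consistent with an unordered genotype."""
    a, b = g
    return [(a, b)] if a == b else [(a, b), (b, a)]

def ibd_mode_probs(g1, g2, p):
    """P(g1, g2 | m alleles shared identical by descent) for m = 0, 1, 2."""
    p0 = hw_prob(g1, p) * hw_prob(g2, p)
    p2 = hw_prob(g1, p) if sorted(g1) == sorted(g2) else 0.0
    p1 = 0.0
    for x in orderings(g1):
        for y in orderings(g2):
            shared = 0.0
            for i in (0, 1):          # allele of individual 1 that is shared
                for j in (0, 1):      # position it occupies in individual 2
                    if y[j] == x[i]:
                        shared += p[y[1 - j]]  # the other allele is a random draw
            p1 += p[x[0]] * p[x[1]] * shared / 4.0
    return p0, p1, p2

def ml_relatedness(genotype_pairs, freqs, steps=20):
    """Grid search over (k0, k1, k2) on the simplex; returns (r, (k0, k1, k2))."""
    modes = [ibd_mode_probs(g1, g2, p)
             for (g1, g2), p in zip(genotype_pairs, freqs)]
    best = (-math.inf, 0.0, (1.0, 0.0, 0.0))
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            k0, k1 = i / steps, j / steps
            k2 = (steps - i - j) / steps
            loglik = 0.0
            for p0, p1, p2 in modes:
                lik = k0 * p0 + k1 * p1 + k2 * p2
                if lik <= 0.0:        # genotypes impossible under this (k0, k1, k2)
                    loglik = -math.inf
                    break
                loglik += math.log(lik)
            if loglik > best[0]:
                best = (loglik, 0.5 * k1 + k2, (k0, k1, k2))
    return best[1], best[2]
```

Because the search is restricted to the simplex, estimates always lie in [0, 1]; this boundary constraint is the kind of restriction the abstract associates with bias when the true relationship is near the edge of the parameter space.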

3.
K Huang  S T Guo  M R Shattuck  S T Chen  X G Qi  P Zhang  B G Li 《Heredity》2015,114(2):133-142
Relatedness between individuals is central to ecological genetics. Multiple methods are available to quantify relatedness from molecular data, including method-of-moment and maximum-likelihood estimators. We describe a maximum-likelihood estimator for autopolyploids, and quantify its statistical performance under a range of biologically relevant conditions. The statistical performances of five additional polyploid estimators of relatedness were also quantified under identical conditions. When comparing truncated estimators, the maximum-likelihood estimator exhibited lower root mean square error under some conditions and was more biased for non-relatives, especially when the number of alleles per locus was low. However, even under these conditions, this bias was reduced to be statistically insignificant with more robust genetic sampling. We also considered ambiguity in polyploid heterozygote genotyping and developed a weighting methodology for candidate genotypes. The statistical performances of three polyploid estimators under both ideal and actual conditions (including inbreeding and double reduction) were compared. The software package POLYRELATEDNESS is available to perform this estimation and supports a maximum ploidy of eight.

4.
Wang J 《Molecular ecology》2004,13(10):3169-3178
Knowledge of the genetic relatedness between a pair of individuals is important in many research areas of quantitative genetics, conservation genetics, evolution and ecology. Many estimators have been developed to estimate such pairwise relatedness (r) using codominant markers, such as microsatellites and enzymes. In contrast, only two estimators are proposed to use dominant markers, such as random amplified polymorphic DNAs (RAPDs) and amplified fragment length polymorphisms (AFLPs), in relatedness inference. They are both biased estimators, and their statistical properties and robustness to the sampling errors in allele frequency have not been investigated. In this short paper, I propose two new pairwise relatedness estimators for dominant markers, and compare them in precision, accuracy and robustness to sampling with the two previous estimators using simulations. It was found that the new estimator based on the least squares approach is unbiased when allele frequencies are known or estimated from a sample without correcting for sampling effects. It has, however, a low precision and as a result, an intermediate overall performance among the four estimators in terms of the mean squared deviation (MSD) of estimates from actual values of r. The new estimator based on a similarity index is slightly biased but has generally the lowest MSD among the four estimators compared, regardless of the number of loci, type of actual relationships, allele frequencies known or estimated from samples. Simulations also show that the confidence intervals estimated by bootstrapping are appropriate for different estimators provided that the number of loci used in the estimation is not small.

5.
Measuring the information content of markers in relationship/relatedness inferences is important in selecting highly informative markers to attain a given statistical power with the minimal genotyping effort. Using information-theoretic principles, I introduce the informativeness for relationship (I(R)) and the informativeness for relatedness (I(r)) to measure the amount of information provided by markers in inferring pairwise relationships (R) and relatedness (r), respectively. I also propose a fast and accurate algorithm to calculate the power (PW(R)) of a set of markers in differentiating two candidate relationships, and the reciprocal of the mean squared deviations of relatedness estimates (RMSD) to measure the amount of information of markers actually used by an estimator in estimating relatedness. All four measurements (I(R), I(r), PW(R), RMSD) apply to dominant and codominant markers, haploid and diploid individuals, and take account of mutations and typing errors in data. The statistical properties of the four measurements and their relationships are investigated analytically and are examined by applying these methods to simulated and empirical data.

6.
Hardy OJ 《Molecular ecology》2003,12(6):1577-1588
A new estimator of the pairwise relatedness coefficient between individuals adapted to dominant genetic markers is developed. This estimator does not assume genotypes to be in Hardy-Weinberg proportions but requires a knowledge of the departure from these proportions (i.e. the inbreeding coefficient). Simulations show that the estimator provides accurate estimates, except for some particular types of individual pairs such as full-sibs, and performs better than a previously developed estimator. When comparing marker-based relatedness estimates with pedigree expectations, a new approach to account for the change of the reference population is developed and shown to perform satisfactorily. Simulations also illustrate that this new relatedness estimator can be used to characterize isolation by distance within populations, leading to essentially unbiased estimates of the neighbourhood size. In this context, the estimator appears fairly robust to moderate errors made on the assumed inbreeding coefficient. The analysis of real data sets suggests that dominant markers (random amplified polymorphic DNA, amplified fragment length polymorphism) may be as valuable as co-dominant markers (microsatellites) in studying microgeographic isolation-by-distance processes. It is argued that the estimators developed should find major applications, notably for conservation biology.

7.
Studies in genetics and ecology often require estimates of relatedness coefficients based on genetic marker data. Many diploid estimators have been developed using either method‐of‐moments or maximum‐likelihood estimates. However, there are no relatedness estimators for polyploids. The development of a moment estimator for polyploids with polysomic inheritance, which simultaneously incorporates the two‐gene relatedness coefficient and various ‘higher‐order’ coefficients, is described here. The performance of the estimator is compared to other estimators under a variety of conditions. When using a small number of loci, the estimator is biased because of an increase in ill‐conditioned matrices. However, the estimator becomes asymptotically unbiased with large numbers of loci. The ambiguity of polyploid heterozygotes (when balanced heterozygotes cannot be distinguished from unbalanced heterozygotes) is also considered; as with low numbers of loci, genotype ambiguity leads to bias. A software package, PolyRelatedness, implementing this method and supporting a maximum ploidy of 8 is provided.

8.
An estimator for pairwise relatedness using molecular markers   Total citations: 21, self-citations: 0, citations by others: 21
Wang J 《Genetics》2002,160(3):1203-1215
I propose a new estimator for jointly estimating two-gene and four-gene coefficients of relatedness between individuals from an outbreeding population with data on codominant genetic markers and compare it, by Monte Carlo simulations, to previous ones in precision and accuracy for different distributions of population allele frequencies, numbers of alleles per locus, actual relationships, sample sizes, and proportions of relatives included in samples. In contrast to several previous estimators, the new estimator is well behaved and applies to any number of alleles per locus and any allele frequency distribution. The estimates for two- and four-gene coefficients of relatedness from the new estimator are unbiased irrespective of the sample size and have sampling variances decreasing consistently with an increasing number of alleles per locus to the minimum asymptotic values determined by the variation in identity-by-descent among loci per se, regardless of the actual relationship. The new estimator is also robust for small sample sizes and for unknown relatives being included in samples for estimating allele frequencies. Compared to previous estimators, the new one is generally advantageous, especially for highly polymorphic loci and/or small sample sizes.

9.
Estimation of relatedness by DNA fingerprinting   Total citations: 28, self-citations: 0, citations by others: 28
The recent discovery of hypervariable VNTR (variable number of tandem repeat) loci has led to much excitement among population biologists regarding the feasibility of deriving individual estimates of relatedness in field populations by DNA fingerprinting. It is shown that unbiased estimates of relatedness cannot be obtained at the individual level without knowledge of the allelic distributions in both the individuals of interest and the base population unless the proportion of shared marker alleles between unrelated individuals is essentially zero. Since the latter is usually on the order of 0.1-0.5 and since there are enormous practical difficulties in obtaining the former, only an approximate estimator for the relatedness can be given. The bias of this estimator is individual specific and inversely related to the number of marker loci and frequencies of marker alleles. Substantial sampling variance in estimates of relatedness arises from variation in identity by descent within and between loci and, with finite numbers of alleles, from variation in identity in state between genes that are not identical by descent. In the extreme case of 25 assayed loci, each with an effectively infinite number of alleles, the standard error of a relatedness estimate is no less than 14%, 20%, 35%, and 53% of the expectation for full sibs and second-, third-, and fourth-order relationships, respectively. Attempts to ascertain relatedness by means of DNA fingerprinting should proceed with caution.

10.
Studies of inbreeding depression or kin selection require knowledge of relatedness between individuals. If pedigree information is lacking, one has to rely on genotypic information to infer relatedness. In this study we investigated the performance (absolute and relative) of 10 marker-based relatedness estimators using allele frequencies at microsatellite loci obtained from natural populations of two bird species and one mammal species. Using Monte Carlo simulations we show that many factors affect the performance of estimators and that different sets of loci promote the use of different estimators: in general, there is no single best-performing estimator. The use of locus-specific weights turns out to greatly improve the performance of estimators when marker loci are used that differ strongly in allele frequency distribution. Microsatellite-based estimates are expected to explain between 25 and 79% of variation in true relatedness depending on the microsatellite dataset and on the population composition (i.e. the frequency distribution of relationship in the population). We recommend performing Monte Carlo simulations to decide which estimator to use in studies of pairwise relatedness.

11.
There has been remarkably little attention to using the high resolution provided by genotyping‐by‐sequencing (i.e., RADseq and similar methods) for assessing relatedness in wildlife populations. A major hurdle is the genotyping error, especially allelic dropout, often found in this type of data that could lead to downward‐biased, yet precise, estimates of relatedness. Here, we assess the applicability of genotyping‐by‐sequencing for relatedness inferences given its relatively high genotyping error rate. Individuals of known relatedness were simulated under genotyping error, allelic dropout and missing data scenarios based on an empirical ddRAD data set, and their true relatedness was compared to that estimated by seven relatedness estimators. We found that an estimator chosen through such analyses can circumvent the influence of genotyping error, with the estimator of Ritland (Genetics Research, 67, 175) shown to be unaffected by allelic dropout and to be the most accurate when there is genotyping error. We also found that the choice of estimator should not rely solely on the strength of correlation between estimated and true relatedness as a strong correlation does not necessarily mean estimates are close to true relatedness. We also demonstrated how even a large SNP data set with genotyping error (allelic dropout or otherwise) or missing data still performs better than a perfectly genotyped microsatellite data set of tens of markers. The simulation‐based approach used here can be easily implemented by others on their own genotyping‐by‐sequencing data sets to confirm the most appropriate and powerful estimator for their data.

12.
Hu XS 《Heredity》2005,94(3):338-346
The 'spatial' pattern of the correlation of pairwise relatedness among loci within a chromosome is an important aspect for an insight into genomic evolution in natural populations. In this article, a statistical genetic method is presented for estimating the correlation of pairwise relatedness among linked loci. The probabilities of identity-in-state (IIS) are related to the probabilities of identity-by-descent (IBD) for the two- and three-loci cases. By decomposing the joint probabilities of two- or three-loci IBD, the probability of pairwise relatedness at a single locus and its correlation among linked loci can be simultaneously estimated. To provide effective statistical methods for estimation, weighted least square (LS) and maximum likelihood (ML) methods are evaluated through extensive Monte Carlo simulations. Results show that the ML method gives a better performance than the weighted LS method with haploid genotypic data. However, there are no significant differences between the two methods when two- or three-loci diploid genotypic data are employed. Compared with the optimal size for haploid genotypic data, a smaller optimal sample size is predicted with diploid genotypic data.

13.
Relatedness estimators are widely used in genetic studies, but effects of population structure on performance of estimators, criteria to evaluate estimators, and benefits of using such estimators in conservation programs have to date received little attention. In this article we present new estimators, based on the relationship between coancestry and molecular similarity between individuals, and compare them with existing estimators using Monte Carlo simulation of populations, either panmictic or structured. Estimators were evaluated using statistical criteria and a diversity criterion that minimized relatedness. Results show that ranking of estimators depends on the population structure. An existing estimator based on two-gene and four-gene coefficients of identity performs best in panmictic populations, whereas a new estimator based on coancestry performs best in structured populations. The number of marker alleles and loci did not affect ranking of estimators. Statistical criteria were insufficient to evaluate estimators for their use in conservation programs. The regression coefficient of pedigree relatedness on estimated relatedness (beta2) was substantially lower than unity for all estimators, causing overestimation of the diversity conserved. A simple correction to achieve beta2 = 1 improves both existing and new estimators. Using relatedness estimates with correction considerably increased diversity in structured populations, but did not do so or even decreased diversity in panmictic populations.

14.
Analyses of pairwise relatedness represent a key component to addressing many topics in biology. However, such analyses have been limited because most available programs provide a means to estimate relatedness based on only a single estimator, making comparison across estimators difficult. Second, all programs to date have been platform specific, working only on a specific operating system. This has the undesirable outcome of making choice of relatedness estimator limited by operating system preference, rather than being based on scientific rationale. Here, we present a new R package, called related, that can calculate relatedness based on seven estimators, can account for genotyping errors, missing data and inbreeding, and can estimate 95% confidence intervals. Moreover, simulation functions are provided that allow for easy comparison of the performance of different estimators and for analyses of how much resolution to expect from a given data set. Because this package works in R, it is platform independent. Combined, this functionality should allow for more appropriate analyses and interpretation of pairwise relatedness and will also allow for the integration of relatedness data into larger R workflows.

15.
Ritland K 《Molecular ecology》2005,14(10):3157-3165
Estimators for pairwise relatedness designed for dominant markers are derived, based on a genetic model that accounts for the full structure of pairwise relatedness between two individuals at a diploid locus with dominance. They jointly estimate 'relatedness' and 'fraternity', in which case the estimators are inherently multilocus, as at least two loci of differing gene frequency are required. Extensions to cases of zero fraternity and isolation by distance (inbreeding) are also examined. Properties of estimators are examined by simulation and compared to the estimator of Hardy. The most statistical power for pairwise relatedness occurs when roughly half of individuals are the recessive phenotype. Estimation procedures are implemented in the computer program mark.

16.
Wang J 《Genetical research》2007,89(3):135-153
Knowledge of the genetic relatedness among individuals is essential in diverse research areas such as behavioural ecology, conservation biology, quantitative genetics and forensics. How to estimate relatedness accurately from genetic marker information has been explored recently by many methodological studies. In this investigation I propose a new likelihood method that uses the genotypes of a triad of individuals in estimating pairwise relatedness (r). The idea is to use a third individual as a control (reference) in estimating the r between two other individuals, thus reducing the chance of genes identical in state being mistakenly inferred as identical by descent. The new method allows for inbreeding and accounts for genotype errors in data. Analyses of both simulated and human microsatellite and SNP datasets show that the quality of r estimates (measured by the root mean squared error, RMSE) is generally improved substantially by the new triadic likelihood method (TL) over the dyadic likelihood method and five moment estimators. Simulations also show that genotyping errors/mutations, when ignored, result in underestimates of r for related dyads, and that incorporating a model of typing errors in the TL method improves r estimates for highly related dyads but impairs those for loosely related or unrelated dyads. The effects of inbreeding were also investigated through simulations. It is concluded that, because most dyads in a natural population are unrelated or only loosely related, the overall performance of the new triadic likelihood method is the best, offering r estimates with a RMSE that is substantially smaller than the five commonly used moment estimators and the dyadic likelihood method.

17.
The estimation of relatedness within social groups, such as the colonies of a population of social insects, is an important field for evaluating hypotheses concerning the evolution and maintenance of social behaviour. The methodology of this estimation from genetic data in the absence of pedigree information has been poorly understood; we develop this methodology for b, the regression coefficient of relatedness, and discuss its applications. Both b and G (the pedigree coefficient of relatedness) are potentially asymmetric coefficients, whereas φ, r, and FST are necessarily symmetric. We develop an estimator for b suitable for small samples, and also one for standard deviation, and examine the properties of both using sampling simulations. The b estimator returns values slightly below E(b), and the standard deviation estimator yields conservative confidence intervals. A comparative study of b and FST shows that, given the same set of data, b is estimated with greater reliability than is FST. As is the case for FST, b can be used to examine population structure at various levels, and b possesses the advantage of an estimator for its standard error, which can also be used to test for heterogeneity among the loci surveyed. The actual numbers of identical genes held in common by interacting individuals, and not simply their proportions, need to be considered in using coefficients of relatedness in inclusive fitness calculations. This necessity is handled by the weighted coefficients of relatedness, G′ and b′, which have been referred to in the literature as r (as have most relatedness measures).

18.
Estimates of relatedness coefficients, based on genetic marker data, are often necessary for studies of genetics and ecology. Whilst many estimates based on method‐of‐moment or maximum‐likelihood methods exist for diploid organisms, no such estimators exist for organisms with multiple ploidy levels, which occur in some insect and plant species. Here, we extend five estimators to account for different levels of ploidy: one relatedness coefficient estimator, three coefficients of coancestry estimators and one maximum‐likelihood estimator. We use arrhenotoky (when unfertilized eggs develop into haploid males) as an example in evaluations of estimator performance by Monte Carlo simulation. Also, three virtual sex‐determination systems are simulated to evaluate their performances for higher levels of ploidy. Additionally, we used two real data sets to test the robustness of these estimators under actual conditions. We make available a software package, PolyRelatedness, for other researchers to apply to organisms that have various levels of ploidy.

19.
Enjalbert J  David JL 《Genetics》2000,156(4):1973-1982
Using multilocus individual heterozygosity, a method is developed to estimate the outcrossing rates of a population over a few previous generations. Considering that individuals originate either from outcrossing or from n successive selfing generations from an outbred ancestor, a maximum-likelihood (ML) estimator is described that gives estimates of past outcrossing rates in terms of proportions of individuals with different n values. Heterozygosities at several unlinked codominant loci are used to assign n values to each individual. This method also allows a test of whether populations are in inbreeding equilibrium. The estimator's reliability was checked using simulations for different mating histories. We show that this ML estimator can provide estimates of outcrossing rates for the final generation outcrossing rate (t(0)) and a mean of the preceding rates (t(p)) and can detect major temporal variation in the mating system. The method is most efficient for low to intermediate outcrossing levels. Applied to nine populations of wheat, this method gave estimates of t(0) and t(p). These estimates confirmed the absence of outcrossing t(0) = 0 in the two populations subjected to manual selfing. For free-mating wheat populations, it detected lower final generation outcrossing rates t(0) = 0-0.06 than those expected from global heterozygosity t = 0.02-0.09. This estimator appears to be a new and efficient way to describe the multilocus heterozygosity of a population, complementary to Fis and progeny analysis approaches.

20.
The software package COANCESTRY implements seven relatedness estimators and three inbreeding estimators to estimate relatedness and inbreeding coefficients from multilocus genotype data. Two likelihood estimators that allow for inbred individuals and account for genotyping errors are for the first time included in this user-friendly program for PCs running the Windows operating system. A simulation module is built into the program to simulate multilocus genotype data of individuals with a predefined relationship, and to compare the estimators and the simulated relatedness values to facilitate the selection of the best estimator in a particular situation. Bootstrapping and permutations are used to obtain the 95% confidence intervals of each relatedness or inbreeding estimate, and to test the difference in averages between groups.
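
Editorial illustration (not COANCESTRY's code): a minimal sketch of a percentile bootstrap over loci for a 95% confidence interval, assuming the multilocus estimate is simply the unweighted mean of per-locus relatedness estimates. The function name `bootstrap_ci`, its parameters and the unweighted averaging are hypothetical; real estimators typically weight loci, and the abstract does not specify how the program resamples.

```python
import random

def bootstrap_ci(per_locus, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of per-locus estimates.

    Loci are resampled with replacement; each replicate is the mean of
    the resampled values. Returns (lower, upper).
    """
    rng = random.Random(seed)            # fixed seed for reproducibility
    n = len(per_locus)
    means = []
    for _ in range(n_boot):
        sample = [per_locus[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical per-locus estimates for one pair of individuals.
estimates = [0.62, 0.41, 0.55, 0.30, 0.71, 0.48, 0.52, 0.39, 0.66, 0.44]
print(bootstrap_ci(estimates))
```

With few loci the resampled means take only a limited set of values, so intervals built this way can be unreliable; this is consistent with the caveat in entry 4 that bootstrap intervals are appropriate only when the number of loci is not small.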
