Similar articles
 20 similar articles found
1.
The prediction of identity by descent (IBD) probabilities is essential for all methods that map quantitative trait loci (QTL). The IBD probabilities may be predicted from marker genotypes and/or pedigree information. Here, a method is presented that predicts IBD probabilities at a given chromosomal location given data on a haplotype of markers spanning that position. The method is based on a simplification of the coalescence process, and assumes that the number of generations since the base population and the effective population size are known, although effective size may be estimated from the data. The probability that two gametes are IBD at a particular locus increases as the number of markers surrounding the locus with identical alleles increases. This effect is more pronounced when effective population size is high. Hence, as effective population size increases, the IBD probabilities become more sensitive to the marker data, which should favour finer scale mapping of the QTL. The IBD probability prediction method was developed for the situation where the pedigree of the animals was unknown (i.e. all information came from the marker genotypes), and the situation where, say, T generations of unknown pedigree are followed by some generations where pedigree and marker genotypes are known.

2.
An empirical comparison between three different methods for estimation of pair-wise identity-by-descent (IBD) sharing at marker loci was conducted in order to quantify the resulting differences in power and localization precision in variance components-based linkage analysis. On the examined simulated, error-free data set, it was found that an increase in accuracy of allele sharing calculation resulted in an increase in power to detect linkage. Linkage analysis based on approximate multi-marker IBD matrices computed by a Markov chain Monte Carlo approach was much more powerful than linkage analysis based on exact single-marker IBD probabilities. A "multiple two-point" approximation to true "multipoint" IBD computation was found to be roughly intermediate in power. Both multi-marker approaches were similar to each other in accuracy of localization of the quantitative trait locus and far superior to the single-marker approach. The overall conclusions of this study with respect to power are expected to also hold for different data structures and situations, even though the degree of superiority of one approach over another depends on the specific circumstances. It should be kept in mind, however, that an increase in computational accuracy is expected to go hand in hand with a decrease in robustness to various sources of errors.

3.
We generalize a recently introduced graphical framework to compute the probability that haplotypes or genotypes of two individuals drawn from a finite, subdivided population match. As in the previous work, we assume an infinite-alleles model. We focus on the case of a population divided into two subpopulations, but the underlying framework can be applied to a general model of population subdivision. We examine the effect of population subdivision on the match probabilities and the accuracy of the product rule, which approximates multi-locus match probabilities as a product of one-locus match probabilities. We quantify the deviation from predictions of the product rule by R, the ratio of the multi-locus match probability to the product of the one-locus match probabilities. We carry out the computation for two loci and find that ignoring subdivision can lead to underestimation of the match probabilities if the population under consideration actually has subdivision structure and the individuals originate from the same subpopulation. On the other hand, under a given model of population subdivision, we find that the ratio R for two loci is only slightly greater than 1 for a large range of symmetric and asymmetric migration rates. Keeping in mind that the infinite-alleles model is not the appropriate mutation model for STR loci, we conclude that, for two loci and biologically reasonable parameter values, population subdivision may lead to results that disfavor innocent suspects because of an increase in identity-by-descent in finite populations. On the other hand, for the same range of parameters, population subdivision does not lead to a substantial increase in linkage disequilibrium between loci. Those results are consistent with established practice.

4.
A new deterministic method for predicting simultaneous inbreeding coefficients at three and four loci is presented. The method involves calculating the conditional probability of IBD (identical by descent) at one locus given IBD at other loci, and multiplying this probability by the prior probability of the latter loci being simultaneously IBD. The conditional probability is obtained by applying a novel regression model, and the prior probability from the theory of digenic measures of Weir and Cockerham. The model was validated for a finite monoecious population mating at random, with a constant effective population size, and with or without selfing, and also for an infinite population with a constant intermediate proportion of selfing. We assumed discrete generations. Deterministic predictions were very accurate when compared with simulation results, and robust to alternative forms of implementation. These simultaneous inbreeding coefficients were more sensitive to changes in effective population size than in marker spacing. Extensions to predict simultaneous inbreeding coefficients at more than four loci are now possible.

5.
F. Rousset. Genetics 1996, 142(4):1357-1362
Expected values of Wright's F-statistics are functions of probabilities of identity in state. These values may be quite different under an infinite allele model and under stepwise mutation processes such as those occurring at microsatellite loci. However, a relationship between the probability of identity in state in stepwise mutation models and the distribution of coalescence times can be deduced from the relationship between probabilities of identity by descent and the distribution of coalescence times. The values of F_IS and F_ST can be computed using this property. Examination of the conditional probability of identity in state given some coalescence time and of the distribution of coalescence times is also useful for explaining the properties of F_IS and F_ST at high mutation rate loci, as shown here in an island model of population structure.

6.
For finite populations, differences in individual histories can cause between-locus allelic dependencies even for unlinked loci. The main motivation for this study is to quantify the effect of such dependencies on genotypic match probabilities. We compare the two-locus match probability, the probability that two individuals (four gametes) chosen at random will have the same genotype at both loci, with the probability computed as the product of the one-locus match probabilities. It is demonstrated that the product rule probability always underestimates the two-locus match probability. For highly mutable minisatellite loci, these probabilities can differ by an order of magnitude or more. A simplified three-locus problem is explored, providing evidence that the degree of underestimation worsens for more loci.

7.
Related individuals are identical by descent (IBD) at a genetic locus if they share the same DNA material from a common ancestor. Continuous gamete IBD data consist of the lengths of (in order) IBD and non-IBD regions along the genomes for gametes segregating from two related individuals and can be used to distinguish different relationships. Under the assumption that the crossovers follow a Poisson process, we show that the exact calculation of the likelihood of a particular relationship for a given gamete IBD datum is tractable. Great-grandparent/great-grandchild and cousin relationships are used as examples to illustrate our methods.

8.
Gametogenesis processes and multilocus gene identity by descent. Total citations: 2 (self-citations: 1; citations by others: 1)
With few exceptions, the determination of unconditional probability of genes shared identical by descent (IBD) by relatives can be very difficult, especially if the relationship is complex or if multiple loci are involved. It is particularly difficult if one needs the IBD probability in an explicit form, expressed in terms of interlocus recombination fractions. In this paper, I will further extend the concept of gametogenesis process introduced elsewhere and indicate that it completely determines the gene IBD events of interest in pedigrees. I will demonstrate that the gametogenesis process not only serves as a convenient conceptual framework in considering IBD events in pedigrees but also provides a simple yet powerful tool to solve a wide range of seemingly difficult problems. In particular, I consider the problem of multilocus IBD probability for relative pairs, k siblings, and a group of pedigree members. In addition, I consider the problem of multilocus autozygosity probability and the problem of gene preservation in close relatives.

9.
Gradients of variation, or clines, have always intrigued biologists. Classically, they have been interpreted as the outcomes of antagonistic interactions between selection and gene flow. Alternatively, clines may also establish neutrally with isolation by distance (IBD) or secondary contact between previously isolated populations. The relative importance of natural selection and these two neutral processes in the establishment of clinal variation can be tested by comparing genetic differentiation at neutral genetic markers and at the studied trait. A third neutral process, surfing of a newly arisen mutation during the colonization of a new habitat, is more difficult to test. Here, we designed a spatially explicit approximate Bayesian computation (ABC) simulation framework to evaluate whether the strong cline in the genetically based reddish coloration observed in the European barn owl (Tyto alba) arose as a by-product of a range expansion or whether selection has to be invoked to explain this colour cline, for which we have previously ruled out the actions of IBD or secondary contact. Using ABC simulations and genetic data on 390 individuals from 20 locations genotyped at 22 microsatellite loci, we first determined how barn owls colonized Europe after the last glaciation. Using these results in new simulations on the evolution of the colour phenotype, and assuming various genetic architectures for the colour trait, we demonstrate that the observed colour cline cannot be due to the surfing of a neutral mutation. Taking advantage of spatially explicit ABC, which proved to be a powerful method to disentangle the respective roles of selection and drift in range expansions, we conclude that the formation of the colour cline observed in the barn owl must be due to natural selection.

10.
We investigate the probabilities of identity-by-descent at three loci in order to find a signature which differentiates between the two types of crossing over events: recombination and gene conversion. We use a Markov chain to model coalescence, recombination, gene conversion and mutation in a sample of size two. Using numerical analysis, we calculate the total probability of identity-by-descent at the three loci, and partition these probabilities based on a partial ordering of coalescent events at the three loci. We use these results to compute the probabilities of four different patterns of conditional identity and non-identity at the three loci under recombination and gene conversion. Although recombination and gene conversion do make different predictions, the differences are not likely to be useful in distinguishing between them using three locus patterns between pairs of DNA sequences. This implies that measures of genetic identity in larger samples will be needed to distinguish between gene conversion and recombination.

11.
Begg CB, Eng KH, Hummer AJ. Biometrics 2007, 63(2):522-530
Cancer investigators frequently conduct studies to examine tumor samples from pairs of apparently independent primary tumors with a view to determine whether they share a "clonal" origin. The genetic fingerprints of the tumors are compared using a panel of markers, often representing loss of heterozygosity (LOH) at distinct genetic loci. In this article we evaluate candidate significance tests for this purpose. The relevant information is derived from the observed correlation of the tumors with respect to the occurrence of LOH at individual loci, a phenomenon that can be evaluated using Fisher's exact test. Information is also available from the extent to which losses at the same locus occur on the same parental allele. Data from these combined sources of information can be evaluated using a simple adaptation of Fisher's exact test. The test statistic is the total number of loci at which concordant mutations occur on the same parental allele, with higher values providing more evidence in favor of a clonal origin for the two tumors. The test is shown to have high power for detecting clonality for plausible models of the alternative (clonal) hypothesis, and for reasonable numbers of informative loci, preferably located on distinct chromosomal arms. The method is illustrated using studies to identify clonality in contralateral breast cancer. Interpretation of the results of these tests requires caution due to simplifying assumptions regarding the possible variability in mutation probabilities between loci, and possible imbalances in the mutation probabilities between parental alleles. Nonetheless, we conclude that the method represents a simple, powerful strategy for distinguishing independent tumors from those of clonal origin.

12.
Linkage and the Limits to Natural Selection. Total citations: 20 (self-citations: 11; citations by others: 9)
N. H. Barton. Genetics 1995, 140(2):821-841
The probability of fixation of a favorable mutation is reduced if selection at other loci causes inherited variation in fitness. A general method for calculating the fixation probability of an allele that can find itself in a variety of genetic backgrounds is applied to find the effect of substitutions, fluctuating polymorphisms, and deleterious mutations in a large population. With loose linkage, r, the effects depend on the additive genetic variance in relative fitness, var(W), and act by reducing effective population size by N/N_e = 1 + var(W)/2r^2. However, tightly linked loci can have a substantial effect not predictable from N_e. Linked deleterious mutations reduce the fixation probability of weakly favored alleles by exp(-2U/R), where U is the total mutation rate and R is the map length in Morgans. Substitutions can cause a greater reduction: an allele with advantage s < s_crit = (π^2/6) log_e(S/s)[var(W)/R] is very unlikely to be fixed. (S is the advantage of the substitution impeding fixation.) Fluctuating polymorphisms at many (n) linked loci can also have a substantial effect, reducing fixation probability by exp [ &2Kn var(W)/R] [K = -1/E((u - u)^2/uv) depending on the frequencies (u,v) at the selected polymorphisms]. Hitchhiking due to all three kinds of selection may substantially impede adaptation that depends on weakly favored alleles.
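
As a rough illustration of the two closed-form expressions quoted in this abstract, the sketch below evaluates the loose-linkage reduction in effective population size and the reduction factor due to linked deleterious mutations. It assumes the flattened term "var (W)/2r(2)" reads as var(W)/(2r^2); the function names and the sample values are hypothetical and are not taken from the paper.

```python
from math import exp


def ne_reduction_loose_linkage(var_w: float, r: float) -> float:
    """Return N/N_e = 1 + var(W) / (2 r^2).

    Assumption: the abstract's 'var (W)/2r(2)' is read as var(W)/(2 r^2),
    with r the recombination rate under loose linkage.
    """
    if r <= 0:
        raise ValueError("recombination rate r must be positive")
    return 1.0 + var_w / (2.0 * r ** 2)


def deleterious_mutation_factor(u_total: float, map_length: float) -> float:
    """Return exp(-2U/R): the factor by which linked deleterious mutations
    reduce the fixation probability of weakly favored alleles
    (U = total mutation rate, R = map length in Morgans)."""
    if map_length <= 0:
        raise ValueError("map length R must be positive")
    return exp(-2.0 * u_total / map_length)


# Hypothetical values, for illustration only.
ratio = ne_reduction_loose_linkage(var_w=0.02, r=0.5)
factor = deleterious_mutation_factor(u_total=0.5, map_length=10.0)
```

With free recombination (r = 0.5) the correction to N/N_e is simply 2 var(W), so the effect stays small unless fitness variance is large.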

13.
Zaykin DV, Pudovkin A, Weir BS. Genetics 2008, 180(1):533-545
The correlation between alleles at a pair of genetic loci is a measure of linkage disequilibrium. The square of the sample correlation multiplied by sample size provides the usual test statistic for the hypothesis of no disequilibrium for loci with two alleles and this relation has proved useful for study design and marker selection. Nevertheless, this relation holds only in a diallelic case, and an extension to multiple alleles has not been made. Here we introduce a similar statistic, R^2, which leads to a correlation-based test for loci with multiple alleles: for a pair of loci with k and m alleles, and a sample of n individuals, the approximate distribution of n(k - 1)(m - 1)/(km) R^2 under independence between loci is χ^2 with (k - 1)(m - 1) degrees of freedom. One advantage of this statistic is that it can be interpreted as the total correlation between a pair of loci. When the phase of two-locus genotypes is known, the approach is equivalent to a test for the overall correlation between rows and columns in a contingency table. In the phase-known case, R^2 is the sum of the squared sample correlations for all km 2 × 2 subtables formed by collapsing to one allele vs. the rest at each locus. We examine the approximate distribution under the null of independence for R^2 and report its close agreement with the exact distribution obtained by permutation. The test for independence using R^2 is a strong competitor to approaches such as Pearson's chi square, Fisher's exact test, and a test based on Cressie and Read's power divergence statistic. We combine this approach with our previous composite-disequilibrium measures to address the case when the genotypic phase is unknown. Calculation of the new multiallele test statistic and its P-value is very simple and utilizes the approximate distribution of R^2. We provide a computer program that evaluates approximate as well as "exact" permutational P-values.
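
The phase-known construction described in this abstract can be sketched as follows: collapse the k × m table of haplotype counts to each "one allele vs. the rest" 2 × 2 subtable, sum the squared correlations to get R^2, and scale by n(k - 1)(m - 1)/(km). This is a minimal sketch under the assumption that the input is a plain table of counts; the function names are hypothetical and this is not the authors' program. The P-value step (comparison with a chi-square distribution) is left out.

```python
def collapsed_r2(table, i, j):
    """Squared correlation for the 2x2 subtable 'allele i vs. rest' at the
    first locus and 'allele j vs. rest' at the second locus."""
    n = sum(sum(row) for row in table)
    p_i = sum(table[i]) / n                      # frequency of allele i
    q_j = sum(row[j] for row in table) / n       # frequency of allele j
    p_ij = table[i][j] / n                       # joint frequency
    denom = p_i * (1 - p_i) * q_j * (1 - q_j)
    if denom == 0:                               # monomorphic after collapsing
        return 0.0
    d = p_ij - p_i * q_j                         # disequilibrium coefficient
    return d * d / denom


def multiallelic_statistic(table):
    """Return (R^2, scaled statistic, degrees of freedom) for a k x m table."""
    k, m = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    r2_total = sum(collapsed_r2(table, i, j)
                   for i in range(k) for j in range(m))
    stat = n * (k - 1) * (m - 1) / (k * m) * r2_total
    return r2_total, stat, (k - 1) * (m - 1)


# Hypothetical 2 x 2 counts: the statistic reduces to n * r^2.
r2, stat, df = multiallelic_statistic([[30, 10], [10, 50]])
```

For two alleles at each locus all four collapsed subtables give the same squared correlation, so the scaled statistic equals the sample size times r^2, which matches the diallelic relation stated at the start of the abstract.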

14.
Existing methods for identity by descent (IBD) segment detection were designed for SNP array data, not sequence data. Sequence data have a much higher density of genetic variants and a different allele frequency distribution, and can have higher genotype error rates. Consequently, best practices for IBD detection in SNP array data do not necessarily carry over to sequence data. We present a method, IBDseq, for detecting IBD segments in sequence data and a method, SEQERR, for estimating genotype error rates at low-frequency variants by using detected IBD. The IBDseq method estimates probabilities of genotypes observed with error for each pair of individuals under IBD and non-IBD models. The ratio of estimated probabilities under the two models gives a LOD score for IBD. We evaluate several IBD detection methods that are fast enough for application to sequence data (IBDseq, Beagle Refined IBD, PLINK, and GERMLINE) under multiple parameter settings, and we show that IBDseq achieves high power and accuracy for IBD detection in sequence data. The SEQERR method estimates genotype error rates by comparing observed and expected rates of pairs of homozygote and heterozygote genotypes at low-frequency variants in IBD segments. We demonstrate the accuracy of SEQERR in simulated data, and we apply the method to estimate genotype error rates in sequence data from the UK10K and 1000 Genomes projects.
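
The LOD score mentioned in this abstract is a base-10 log of a likelihood ratio. The sketch below shows only that arithmetic, under the simplifying assumption that per-marker genotype probabilities under the two models are already available and multiply across markers. The function name and inputs are hypothetical; this is not IBDseq's actual interface or probability model.

```python
from math import log10


def lod_score(p_ibd, p_non_ibd):
    """Sum of per-marker log10 likelihood ratios (IBD vs. non-IBD).

    Assumption: markers are treated as independent, so the overall ratio is
    the product of per-marker ratios and the LOD is the sum of their logs.
    """
    if len(p_ibd) != len(p_non_ibd):
        raise ValueError("probability lists must have the same length")
    return sum(log10(a / b) for a, b in zip(p_ibd, p_non_ibd))


# Hypothetical per-marker probabilities for one pair of individuals.
lod = lod_score([0.1, 0.01], [0.01, 0.01])
```

A positive score means the observed genotypes are more probable under the IBD model, and a negative score favors the non-IBD model.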

15.
Coalescent likelihood is the probability of observing the given population sequences under the coalescent model. Computation of coalescent likelihood under the infinite sites model is a classic problem in coalescent theory. Existing methods are based on either importance sampling or Markov chain Monte Carlo and are inexact. In this paper, we develop a simple method that can compute the exact coalescent likelihood for many data sets of moderate size, including real biological data whose likelihood was previously thought to be difficult to compute exactly. Our method works for both panmictic and subdivided populations. Simulations demonstrate that the practical range of exact coalescent likelihood computation for panmictic populations is significantly larger than what was previously believed. We investigate the application of our method in estimating mutation rates by maximum likelihood. A main application of the exact method is comparing the accuracy of approximate methods. To demonstrate the usefulness of the exact method, we evaluate the accuracy of the program Genetree in computing the likelihood for subdivided populations.

16.
Hu XS. Heredity 2005, 94(3):338-346
The 'spatial' pattern of the correlation of pairwise relatedness among loci within a chromosome is an important aspect for an insight into genomic evolution in natural populations. In this article, a statistical genetic method is presented for estimating the correlation of pairwise relatedness among linked loci. The probabilities of identity-in-state (IIS) are related to the probabilities of identity-by-descent (IBD) for the two- and three-loci cases. By decomposing the joint probabilities of two- or three-loci IBD, the probability of pairwise relatedness at a single locus and its correlation among linked loci can be simultaneously estimated. To provide effective statistical methods for estimation, weighted least squares (LS) and maximum likelihood (ML) methods are evaluated through extensive Monte Carlo simulations. Results show that the ML method gives a better performance than the weighted LS method with haploid genotypic data. However, there are no significant differences between the two methods when two- or three-loci diploid genotypic data are employed. Compared with the optimal size for haploid genotypic data, a smaller optimal sample size is predicted with diploid genotypic data.

17.
Meuwissen TH, Goddard ME. Genetics 2007, 176(4):2551-2560
A novel multipoint method, based on an approximate coalescence approach, to analyze multiple linked markers is presented. Unlike other approximate coalescence methods, it considers all markers simultaneously but only two haplotypes at a time. We demonstrate the use of this method for linkage disequilibrium (LD) mapping of QTL and estimation of effective population size. The method estimates identity-by-descent (IBD) probabilities between pairs of marker haplotypes. Both LD and combined linkage and LD mapping rely on such IBD probabilities. The method is approximate in that it considers only the information on a pair of haplotypes, whereas a full modeling of the coalescence process would simultaneously consider all haplotypes. However, full coalescence modeling is computationally feasible only for few linked markers. Using simulations of the coalescence process, the method is shown to give almost unbiased estimates of the effective population size. Compared to direct marker and haplotype association analyses, IBD-based QTL mapping showed clearly a higher power to detect a QTL and a more realistic confidence interval for its position. The modeling of LD could be extended to estimate other LD-related parameters such as recombination rates.

18.
Multilocus genotype probabilities, estimated using the assumption of independent association of alleles within and across loci, are subject to sampling fluctuation, since allele frequencies used in such computations are derived from samples drawn from a population. We derive exact sampling variances of estimated genotype probabilities and provide simple approximation of sampling variances. Computer simulations conducted using real DNA typing data indicate that, while the sampling distribution of estimated genotype probabilities is not symmetric around the point estimate, the confidence interval of estimated (single-locus or multilocus) genotype probabilities can be obtained from the sampling of a logarithmic transformation of the estimated values. This, in turn, allows an examination of heterogeneity of estimators derived from data on different reference populations. Applications of this theory to DNA typing data at VNTR loci suggest that use of different reference population data may yield significantly different estimates. However, significant differences generally occur with rare (less than 1 in 40,000) genotype probabilities. Conservative estimates of five-locus DNA profile probabilities are always less than 1 in 1 million in an individual from the United States, irrespective of the racial/ethnic origin.
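
The point estimate whose sampling variance this abstract studies can be sketched as below: under independence of alleles within loci, a homozygote has probability p^2 and a heterozygote 2pq, and under independence across loci the single-locus values are multiplied. The data layout, function names, and allele frequencies are hypothetical; the sampling-variance and confidence-interval derivations of the paper are not reproduced here.

```python
def genotype_probability(allele_a, allele_b, freqs):
    """Single-locus genotype probability assuming independence of alleles
    within the locus: p^2 for a homozygote, 2pq for a heterozygote."""
    p, q = freqs[allele_a], freqs[allele_b]
    return p * p if allele_a == allele_b else 2.0 * p * q


def profile_probability(profile, freqs_by_locus):
    """Multilocus probability assuming independence across loci
    (product of the single-locus genotype probabilities)."""
    prob = 1.0
    for locus, (allele_a, allele_b) in profile.items():
        prob *= genotype_probability(allele_a, allele_b, freqs_by_locus[locus])
    return prob


# Hypothetical allele frequencies and a two-locus profile.
freqs_by_locus = {
    "L1": {"A": 0.1, "B": 0.9},
    "L2": {"X": 0.2, "Y": 0.8},
}
profile = {"L1": ("A", "B"), "L2": ("X", "X")}
estimate = profile_probability(profile, freqs_by_locus)  # 0.18 * 0.04
```

Because the estimate is a product of sample allele frequencies, its logarithm is a sum, which is one way to see why a logarithmic transformation is a natural scale for the confidence intervals the abstract describes.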

19.
A linkage disequilibrium-based method for fine mapping quantitative trait loci (QTL) has been described that uses similarity between individuals' marker haplotypes to determine if QTL alleles are identical by descent (IBD) to model covariances among individuals' QTL alleles for a mixed linear model. Mapping accuracy with this method was found to be sensitive to the number of linked markers that was included in the haplotype when fitting the model at a putative position of the QTL. The objective of this study was to determine the optimal haplotype structure for this IBD-based method for fine mapping a QTL in a previously identified QTL region. Haplotypes consisting of 1, 2, 4, 6, or all 10 available markers were fit as a "sliding window" across the QTL region under ideal and nonideal simulated population conditions. It was found that using haplotypes of 4 or 6 markers as a sliding "window" resulted in the greatest mapping accuracy under nearly all conditions, although the true IBD state at a putative QTL position was most accurately predicted by IBD probabilities obtained using all markers. Using 4 or 6 markers resulted in greater discrimination of IBD probabilities between positions while maintaining sufficient accuracy of IBD probabilities to detect the QTL. Fitting IBD probabilities on the basis of a single marker resulted in the worst mapping accuracy under all conditions because it resulted in poor accuracy of IBD probabilities. In conclusion, for fine mapping using IBD methods, marker information must be used in a manner that results in sensitivity of IBD probabilities to the putative position of the QTL while maintaining sufficient accuracy of IBD probabilities to detect the QTL. Contrary to expectation, use of haplotypes of 4-6 markers to derive IBD probabilities, rather than all available markers, best fits these criteria. Thus for populations similar to those simulated here, optimal mapping accuracy for this IBD-based fine-mapping method is obtained with a haplotype structure including a subset of all available markers.

20.
Jochens A, Caliebe A, Rösler U, Krawczak M. Genetics 2011, 189(4):1403-1411
The rate of microsatellite mutation is dependent upon both the allele length and the repeat motif, but the exact nature of this relationship is still unknown. We analyzed data on the inheritance of human Y-chromosomal microsatellites in father-son duos, taken from 24 published reports and comprising 15,285 directly observable meioses. At the six microsatellites analyzed (DYS19, DYS389I, DYS390, DYS391, DYS392, and DYS393), a total of 162 mutations were observed. For each locus, we employed a maximum-likelihood approach to evaluate one of several single-step mutation models on the basis of the data. For five of the six loci considered, a novel logistic mutation model was found to provide the best fit according to Akaike's information criterion. This implies that the mutation probability at the loci increases (nonlinearly) with allele length at a rate that differs between upward and downward mutations. For DYS392, the best fit was provided by a linear model in which upward and downward mutation probabilities increase equally with allele length. This is the first study to empirically compare different microsatellite mutation models in a locus-specific fashion.
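
To make the model-comparison step in this abstract concrete, the sketch below pairs a generic logistic curve in allele length with the standard definition of Akaike's information criterion, AIC = 2k - 2 ln L. The logistic parameterization, parameter names, and numbers are assumptions for illustration only; the abstract does not give the paper's actual model equations.

```python
from math import exp


def logistic_mutation_probability(allele_length, intercept, slope):
    """Generic logistic curve in allele length (hypothetical
    parameterization, not the paper's model)."""
    return 1.0 / (1.0 + exp(-(intercept + slope * allele_length)))


def aic(log_likelihood, n_params):
    """Akaike's information criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


# Hypothetical maximized log-likelihoods for two candidate models.
aic_linear = aic(log_likelihood=-105.0, n_params=2)
aic_logistic = aic(log_likelihood=-100.0, n_params=4)
best = "logistic" if aic_logistic < aic_linear else "linear"
```

In this made-up comparison the logistic model is preferred because its gain in log-likelihood outweighs the penalty for its two extra parameters.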
