Similar articles
 Found 20 similar articles (search time: 296 ms)
1.
Li YM  Xiang Y  Sun ZQ 《Human heredity》2008,65(3):121-128
Quantitative trait locus (QTL) mapping can be accomplished through selective genotyping, which is based on differences in marker allele frequencies between an upper sample and a lower sample of a population. Amplifying the differences in marker allele frequencies between the extreme samples may increase the probability of successfully mapping a QTL. Shannon entropy, which is a nonlinear function of allele frequencies, can be used to amplify these differences. In this paper, we present a novel measure of linkage disequilibrium (LD) between a marker and a single QTL that is based on the comparison of the entropy and conditional entropy of a marker in extreme samples of a population. This measure of LD between the marker and the trait locus can be used when the marker allele frequencies are known in the extreme samples of a population. We investigate the mapping performance in both analytic and simulation scenarios of a single QTL linked to a single marker. Our results show that the measure has very reasonable performance. In addition, a simulation study is performed on the basis of the haplotype frequencies of 10 SNPs of angiotensin-I converting enzyme (ACE) genes.
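
As a rough illustration of the quantity this abstract builds on, the sketch below computes the Shannon entropy of marker allele frequencies in an upper and a lower extreme sample. It is not the authors' LD measure, which also involves conditional entropy and is not specified in the abstract. The function name and the example frequencies are assumptions made for illustration only.

```python
import math


def shannon_entropy(freqs):
    """Shannon entropy (in nats) of allele frequencies that sum to 1."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    # Terms with p == 0 contribute nothing (limit of p*log(p) as p -> 0).
    return -sum(p * math.log(p) for p in freqs if p > 0)


# Hypothetical marker allele frequencies in the two extreme samples.
upper_sample = [0.7, 0.3]
lower_sample = [0.4, 0.6]

h_upper = shannon_entropy(upper_sample)
h_lower = shannon_entropy(lower_sample)

# Entropy is nonlinear in the frequencies, so the gap between the two
# samples is not a simple multiple of the frequency difference.
entropy_gap = h_lower - h_upper
```

Because entropy peaks at equal allele frequencies, a sample whose frequencies sit closer to 0.5 has higher entropy, which is why the lower sample in this made-up example scores higher.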

2.
Xiong M  Fan R  Jin L 《Human heredity》2002,53(3):158-172
As dense maps of single nucleotide polymorphism (SNP) markers become available, population-based linkage disequilibrium (LD) mapping, or association study, is becoming one of the major tools for identifying quantitative trait loci (QTL) and for fine gene mapping. However, in many cases, LD between the marker and trait locus is not very strong. Approaches that maximize the potential of detecting LD will be essential for the success of LD mapping of QTL. In this paper, we propose two strategies for increasing the probability of detecting LD: (1) phenotypic selection and (2) haplotype LD mapping. To provide the foundations for LD mapping of QTL under selection, we develop analytic tools for assessing the impact of phenotypic selection on allele and haplotype frequencies, and LD under three trait models: single trait locus, two unlinked trait loci, and two linked trait loci with or without epistasis. In addition to a traditional χ² test, which compares allele or haplotype frequencies in the selected sample with those in the population sample, we present multiple regression methods for LD mapping of QTL, and investigate which methods are effective in employing phenotypic selection for QTL mapping. We also develop a statistical framework for investigating and comparing the power of the single marker and multilocus haplotype test for LD mapping of QTL. Finally, the proposed methods are applied to mapping QTL influencing variation in systolic blood pressure in an isolated Chinese population.
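
The traditional chi-square comparison mentioned in this abstract can be sketched for the simplest case: a biallelic marker, with allele counts in a selected sample and in a population sample arranged as a 2x2 table. This is the standard Pearson statistic without continuity correction, not the authors' regression methods. The function name and the counts are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (1 df, no continuity correction)
    for the 2x2 table [[a, b], [c, d]].

    Rows: selected sample, population sample.
    Columns: counts of allele A, counts of allele a.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Hypothetical allele counts: 120 A / 80 a in the selected sample,
# 200 A / 200 a in the population sample.
statistic = chi_square_2x2(120, 80, 200, 200)
```

The statistic would be compared against a chi-square distribution with one degree of freedom. Multiallelic markers or haplotypes need the general r x c form instead.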

3.
Miller JR  Hawthorne D 《Genetics》2005,171(3):1353-1364
Given the relative ease of identifying genetic markers linked to QTL (compared to finding the loci themselves), it is natural to ask whether linked markers can be used to address questions concerning the contemporary dynamics and recent history of the QTL. In particular, can a marker allele found associated with a QTL allele in a QTL mapping study be used to track population dynamics or the history of the QTL allele? For this strategy to succeed, the marker-QTL haplotype must persist in the face of recombination over the relevant time frame. Here we investigate the dynamics of marker-QTL haplotype frequencies under recombination, population structure, and divergent selection to assess the potential utility of linked markers for a population genetic study of QTL. For two scenarios, described as "secondary contact" and "novel allele," we use both deterministic and stochastic methods to describe the influence of gene flow between habitats, the strength of divergent selection, and the genetic distance between a marker and the QTL on the persistence of marker-QTL haplotypes. We find that for most reasonable values of selection on a locus (s ≤ 0.5) and migration (m > 1%) between differentially selected populations, haplotypes of typically spaced markers (5 cM) and QTL do not persist long enough (>100 generations) to provide accurate inference of the allelic state at the QTL.
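
To give a feel for why a marker 5 cM from a QTL loses its association over about 100 generations, the sketch below applies the textbook result that, in a large randomly mating population with no selection or migration, LD decays by a factor of (1 - c) per generation, where c is the recombination fraction. This is a deliberate simplification: the study itself models selection, migration, and population structure, none of which appear here. The function names and the use of Haldane's mapping function are choices made for this illustration.

```python
import math


def haldane_recombination_fraction(distance_cm):
    """Recombination fraction for a map distance in centimorgans,
    using Haldane's mapping function."""
    d_morgans = distance_cm / 100.0
    return 0.5 * (1.0 - math.exp(-2.0 * d_morgans))


def ld_remaining(distance_cm, generations):
    """Fraction of the initial LD left after the given number of
    generations under recombination alone: (1 - c) ** t."""
    c = haldane_recombination_fraction(distance_cm)
    return (1.0 - c) ** generations


# Marker-QTL distance and time frame quoted in the abstract.
fraction_left = ld_remaining(5.0, 100)
```

Under these simplified assumptions, less than 1% of the initial association survives at 5 cM after 100 generations, which is in line with the abstract's qualitative conclusion.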

4.
A novel multiple regression method (RM) is developed to predict identity-by-descent probabilities at a locus L (IBDL), among individuals without pedigree, given information on surrounding markers and population history. These IBDL probabilities are a function of the increase in linkage disequilibrium (LD) generated by drift in a homogeneous population over generations. Three parameters are sufficient to describe population history: effective population size (Ne), number of generations since foundation (T), and marker allele frequencies among founders (p). IBDL are used in a simulation study to map a quantitative trait locus (QTL) via variance component estimation. RM is compared to a coalescent method (CM) in terms of power and robustness of QTL detection. Differences between RM and CM are small but significant. For example, RM is more powerful than CM in dioecious populations, but not in monoecious populations. Moreover, RM is more robust than CM when marker phases are unknown or when there is complete LD among founders or Ne is wrong, and less robust when p is wrong. CM utilises all marker haplotype information, whereas RM utilises information contained in each individual marker and all possible marker pairs but not in higher order interactions. RM consists of a family of models encompassing four different population structures, and two ways of using marker information, which contrasts with the single model that must cater for all possible evolutionary scenarios in CM.

5.
Linkage disequilibrium (LD) mapping can be successful if there is strong nonrandom association between marker alleles and an allele affecting a trait of interest. The principles of LD mapping of dichotomous traits are well understood, but less is known about LD mapping of a quantitative-trait locus (QTL). It is shown in this report that selective genotyping can increase the power to detect and map a rare allele of large effect at a QTL. Two statistical tests of the association between an allele and a quantitative character are proposed. These tests are approximately independent, so information from them can be combined. Analytic theory is developed to show that these two tests are effective in detecting the presence of a low-frequency allele with a relatively large effect on the character when the QTL is either already a candidate locus or closely linked to a marker locus that is in strong LD with the QTL. The latter situation is expected in a rapidly growing population in which the allele of large effect was present initially in one copy. Therefore, the proposed tests are useful under the same conditions as those for successful LD mapping of a dichotomous trait or disease. Simulations show that, for detection of the presence of a QTL, these tests are more powerful than a simple t-test. The tests also provide a basis for defining a measure of association, gamma, between a low-frequency allele at a putative QTL and a low-frequency allele at a marker locus.

6.
Deng HW  Li YM  Li MX  Liu PY 《Human heredity》2003,56(4):160-165
Hardy-Weinberg disequilibrium (HWD) measures have been proposed using dense markers to fine map a quantitative trait locus (QTL) to regions < approximately 1 cM. Earlier HWD measures may introduce bias in the fine mapping because they are dependent on marker allele frequencies across loci. Hence, HWD indices that do not depend on marker allele frequencies are desired for fine mapping. Based on our earlier work, here we present four new HWD indices that do not depend on marker allele frequencies. Two are for use when marker allele frequencies in a study population are known, and two are for use when marker allele frequencies in a study population are not known and are only known in the extreme samples. The new measures are a function of the genetic distance between the marker locus and a QTL. Through simulations, we investigated and compared the fine mapping performance of the new HWD measures with that of the earlier ones. Our results show that when marker allele frequencies vary across loci, the new measures presented here are more robust and powerful.
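
For orientation, the classical Hardy-Weinberg disequilibrium coefficient for a biallelic marker is D = P_AA - p_A², the excess of observed AA homozygotes over the Hardy-Weinberg expectation. The sketch below computes it from genotype counts. It is not one of the four new indices, whose formulas are not given in the abstract, and the function name and counts are hypothetical. Note that this raw coefficient depends on allele frequency, which is the limitation the abstract describes.

```python
def hwd_coefficient(n_aa_major, n_het, n_aa_minor):
    """Classical HWD coefficient D = P_AA - p_A**2 from genotype counts.

    n_aa_major: number of AA individuals
    n_het:      number of Aa individuals
    n_aa_minor: number of aa individuals
    """
    n = n_aa_major + n_het + n_aa_minor
    if n == 0:
        raise ValueError("no genotypes supplied")
    p_a = (2 * n_aa_major + n_het) / (2 * n)  # frequency of allele A
    p_aa = n_aa_major / n                     # observed AA frequency
    return p_aa - p_a ** 2


# Hypothetical genotype counts in an extreme sample.
d_extreme = hwd_coefficient(40, 30, 30)
```

A value of zero corresponds to Hardy-Weinberg proportions, and positive values indicate an excess of homozygotes.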

7.
Animal breeding faces one of the most significant changes of the past decades: the implementation of genomic selection. Genomic selection uses dense marker maps to predict the breeding value of animals with reported accuracies that are up to 0.31 higher than those of pedigree indexes, without the need to phenotype the animals themselves, or close relatives thereof. The basic principle is that because of the high marker density, each quantitative trait locus (QTL) is in linkage disequilibrium (LD) with at least one nearby marker. The process involves assembling a reference population of animals with known phenotypes and genotypes to estimate the marker effects. Marker effects have been estimated with several different methods that generally aim at reducing the dimensions of the marker data. Nearly all reported models only included additive effects. Once the marker effects are estimated, breeding values of young selection candidates can be predicted with reported accuracies up to 0.85. Although results from simulation studies suggest that different models may yield more accurate genomic estimated breeding values (GEBVs) for different traits, depending on the underlying QTL distribution of the trait, there is so far little evidence from studies based on real data to support this. The accuracy of genomic predictions strongly depends on characteristics of the reference populations, such as number of animals, number of markers, and the heritability of the recorded phenotype. Another important factor is the relationship between animals in the reference population and the evaluated animals. The breakup of LD between markers and QTL across generations advocates frequent re-estimation of marker effects to maintain the accuracy of GEBVs at an acceptable level. Therefore, at low frequencies of re-estimating marker effects, it becomes more important that the model that estimates the marker effects capitalizes on LD information that is persistent across generations.

8.
Selective DNA pooling is an efficient method to identify chromosomal regions that harbor quantitative trait loci (QTL) by comparing marker allele frequencies in pooled DNA from phenotypically extreme individuals. Currently used single marker analysis methods can detect linkage of markers to a QTL but do not provide separate estimates of QTL position and effect, nor do they utilize the joint information from multiple markers. In this study, two interval mapping methods for analysis of selective DNA pooling data were developed and evaluated. One was based on least squares regression (LS-pool) and the other on approximate maximum likelihood (ML-pool). Both methods simultaneously utilize information from multiple markers and multiple families and can be applied to different family structures (half-sib, F2 cross and backcross). The results from these two interval mapping methods were compared with results from single marker analysis by simulation. The results indicate that both LS-pool and ML-pool provided greater power to detect the QTL than single marker analysis. They also provide separate estimates of QTL location and effect. With large family sizes, both LS-pool and ML-pool provided similar power and estimates of QTL location and effect as selective genotyping. With small family sizes, however, the LS-pool method resulted in severely biased estimates of QTL location for distal QTL but this bias was reduced with the ML-pool.

9.
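For fine mapping of quantitative trait loci (QTL), this paper uses extreme samples of a population and dense marker loci and, by comparing the entropy and conditional entropy of a marker, presents an entropy-based index. The index is a function of the linkage disequilibrium coefficient between the marker and the trait locus, and it does not depend on marker allele frequencies. It corresponds to the Hardy-Weinberg disequilibrium (HWD) index that we proposed previously for fine mapping of QTL, but for fine mapping of QTL the index proposed here is more powerful than the HWD index. Through computer simulation, we investigated the properties of the index under different genetic parameters. The simulation results show that the index is effective for fine mapping.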

10.
We compared the accuracies of four genomic-selection prediction methods as affected by marker density, level of linkage disequilibrium (LD), quantitative trait locus (QTL) number, sample size, and level of replication in populations generated from multiple inbred lines. Marker data on 42 two-row spring barley inbred lines were used to simulate high and low LD populations from multiple inbred line crosses: the first included many small full-sib families and the second was derived from five generations of random mating. True breeding values (TBV) were simulated on the basis of 20 or 80 additive QTL. Methods used to derive genomic estimated breeding values (GEBV) were random regression best linear unbiased prediction (RR–BLUP), Bayes-B, a Bayesian shrinkage regression method, and BLUP from a mixed model analysis using a relationship matrix calculated from marker data. Using the best methods, accuracies of GEBV were comparable to accuracies from phenotype for predicting TBV without requiring the time and expense of field evaluation. We identified a trade-off between a method's ability to capture marker-QTL LD vs. marker-based relatedness of individuals. The Bayesian shrinkage regression method primarily captured LD, the BLUP methods captured relationships, while Bayes-B captured both. Under most of the study scenarios, mixed-model analysis using a marker-derived relationship matrix (BLUP) was more accurate than methods that directly estimated marker effects, suggesting that relationship information was more valuable than LD information. When markers were in strong LD with large-effect QTL, or when predictions were made on individuals several generations removed from the training data set, however, the ranking of method performance was reversed and BLUP had the lowest accuracy.

11.

Background

Information for mapping of quantitative trait loci (QTL) comes from two sources: linkage disequilibrium (non-random association of allele states) and cosegregation (non-random association of allele origin). Information from LD can be captured by modeling conditional means and variances at the QTL given marker information. Similarly, information from cosegregation can be captured by modeling conditional covariances. Here, we consider a Bayesian model based on gene frequency (BGF) where both conditional means and variances are modeled as a function of the conditional gene frequencies at the QTL. The parameters in this model include these gene frequencies, additive effect of the QTL, its location, and the residual variance. Bayesian methodology was used to estimate these parameters. The priors used were: logit-normal for gene frequencies, normal for the additive effect, uniform for location, and inverse chi-square for the residual variance. Computer simulation was used to compare the power to detect and accuracy to map QTL by this method with those from least squares analysis using a regression model (LSR).

Results

To simplify the analysis, data from unrelated individuals in a purebred population were simulated, where only LD information contributes to map the QTL. LD was simulated in a chromosomal segment of 1 cM with one QTL by random mating in a population of size 500 for 1000 generations and in a population of size 100 for 50 generations. The comparison was studied under a range of conditions, which included SNP density of 0.1, 0.05 or 0.02 cM, sample size of 500 or 1000, and phenotypic variance explained by QTL of 2 or 5%. Both 1-SNP and 2-SNP models were considered. Power to detect the QTL with BGF ranged from 0.4 to 0.99 and was close or equal to the power of regression using least squares (LSR). Precision of the QTL position estimated by BGF, quantified by the mean absolute error, ranged from 0.11 to 0.21 cM and was better than the precision of LSR, which ranged from 0.12 to 0.25 cM.

Conclusions

In conclusion, given a high SNP density, the gene frequency model can be used to map QTL with considerable accuracy even within a 1 cM region.

12.
Localization of a quantitative trait locus via a Bayesian approach   Cited by: 1 (self-citations: 0, citations by others: 1)
A Bayesian approach to the direct mapping of a quantitative trait locus (QTL), fully utilizing information from multiple linked gene markers, is presented in this paper. The joint posterior distribution (a mixture distribution modeling the linkage between a biallelic QTL and N gene markers) is computationally challenging and invites exploration via Markov chain Monte Carlo methods. The parameters' complete marginal posterior densities are obtained, allowing a diverse range of inferences. Parameters estimated include the QTL genotype probabilities for the sires and the offspring, the allele frequencies for the QTL, and the position and additive and dominance effects of the QTL. The methodology is applied through simulation to a half-sib design to form an outbred pedigree structure where there is an entire class of missing information. The capacity of the technique to accurately estimate parameters is examined for a range of scenarios.

13.
OBJECTIVES: Describe the inflation in nonparametric multipoint LOD scores due to inter-marker linkage disequilibrium (LD) across many markers with varied allele frequencies. METHOD: Using simulated two-generation families with and without parents, we conducted nonparametric multipoint linkage analysis with 2 to 10 markers with minor allele frequencies (MAF) of 0.5 and 0.1. RESULTS: Misspecification of population haplotype frequencies by assuming linkage equilibrium caused inflated multipoint LOD scores due to inter-marker LD when parental genotypes were not included. Inflation increased as more markers in LD were included and decreased as markers in equilibrium were added. When marker allele frequencies were unequal, the r² measure of LD was a better predictor of inflation than D'. CONCLUSION: This observation strongly supports the evaluation of LD in multipoint linkage analyses, and further suggests that unaccounted-for LD may be suspected when two-point and multipoint linkage analyses show a marked disparity in regions with elevated r² measures of LD. Given the increasing popularity of high-density genome-wide SNP screens, inter-marker LD should be a concern in future linkage studies.

14.
A key question for the implementation of marker-assisted selection (MAS) using markers in linkage disequilibrium with quantitative trait loci (QTLs) is how many markers surrounding each QTL should be used to ensure the marker or marker haplotypes are in sufficient linkage disequilibrium (LD) with the QTL. In this paper we compare the accuracy of MAS using either single markers or marker haplotypes in an Angus cattle data set consisting of 9323 genome-wide single nucleotide polymorphisms (SNPs) genotyped in 379 Angus cattle. The extent of LD in the data set was such that the average marker-marker r² was 0.2 at 200 kb. The accuracy of MAS increased as the number of markers in the haplotype surrounding the QTL increased, although only when the number of markers in the haplotype was 4 or greater did the accuracy exceed that achieved when the SNP in the highest LD with the QTL was used. A large number of phenotypic records (>1000) were required to accurately estimate the effects of the haplotypes.

15.
Positional cloning by linkage disequilibrium   Cited by: 6 (self-citations: 0, citations by others: 6)
Recently, metric linkage disequilibrium (LD) maps that assign an LD unit (LDU) location for each marker have been developed (Maniatis et al. 2002). Here we present a multiple pairwise method for positional cloning by LD within a composite likelihood framework and investigate the operating characteristics of maps in physical units (kb) and LDU for two bodies of data (Daly et al. 2001; Jeffreys et al. 2001) on which current ideas of blocks are based. False-negative indications of a disease locus (type II error) were examined by selecting one single-nucleotide polymorphism (SNP) at a time as causal and taking its allelic count (0, 1, or 2, for the three genotypes) as a pseudophenotype, Y. By use of regression and correlation, association between every pseudophenotype and the allelic count of each SNP locus (X) was based on an adaptation of the Malecot model, which includes a parameter for location of the putative gene. By expressing locations in kb or LDU, greater power for localization was observed when the LDU map was fitted. The efficiency of the kb map, relative to the LDU map, to describe LD varied from a maximum of 0.87 to a minimum of 0.36, with a mean of 0.62. False-positive indications of a disease locus (type I error) were examined by simulating an unlinked causal SNP and the allele count was used as a pseudophenotype. The type I error was in good agreement with Wald's likelihood theorem for both metrics and all models that were tested. Unlike tests that select only the most significant marker, haplotype, or haploset, these methods are robust to large numbers of markers in a candidate region. Contrary to predictions from tagging SNPs that retain haplotype diversity, the sample with smaller size but greater SNP density gave less error. The locations of causal SNPs were estimated with the same precision in blocks and steps, suggesting that block definition may be less useful than anticipated for mapping a causal SNP. These results provide a guide to efficient positional cloning by SNPs and a benchmark against which the power of positional cloning by haplotype-based alternatives may be measured.

16.
Although the effects of linkage disequilibrium (LD) on partition of genetic variance have received attention in quantitative genetics, there has been little discussion on how this phenomenon affects attribution of variance to a given locus. This paper reinforces the point that standard metrics used for assessing the contribution of a locus to variance can be misleading when there is LD and that factors such as distribution of effects and of allelic frequencies over loci, or existence of frequency-dependent effects, play a role as well. An apparently new metric is proposed for measuring how much of the variability is contributed by a locus when LD exists. Effects of intervening factors, such as type and extent of LD, number of loci, distribution of effects, and of allelic frequencies over loci, as well as a model for generating frequency-dependent effects, are illustrated via hypothetical simulation scenarios. Implications on the interpretation of genome-wide association studies (GWAS), as typically carried out in human genetics, where single marker regression and the assumption of a sole quantitative trait locus (QTL) are common, are discussed. It is concluded that the standard attributions to variance contributed by a single QTL from a GWAS analysis may be misleading, conceptually and statistically, when a trait is complex and affected by sets of many genes in linkage disequilibrium. Yet another factor to consider in the “missing heritability” saga?

17.
Meuwissen TH  Goddard ME 《Genetics》2007,176(4):2551-2560
A novel multipoint method, based on an approximate coalescence approach, to analyze multiple linked markers is presented. Unlike other approximate coalescence methods, it considers all markers simultaneously but only two haplotypes at a time. We demonstrate the use of this method for linkage disequilibrium (LD) mapping of QTL and estimation of effective population size. The method estimates identity-by-descent (IBD) probabilities between pairs of marker haplotypes. Both LD and combined linkage and LD mapping rely on such IBD probabilities. The method is approximate in that it considers only the information on a pair of haplotypes, whereas a full modeling of the coalescence process would simultaneously consider all haplotypes. However, full coalescence modeling is computationally feasible only for a small number of linked markers. Using simulations of the coalescence process, the method is shown to give almost unbiased estimates of the effective population size. Compared to direct marker and haplotype association analyses, IBD-based QTL mapping showed clearly higher power to detect a QTL and a more realistic confidence interval for its position. The modeling of LD could be extended to estimate other LD-related parameters such as recombination rates.

18.
Fan R  Jung J  Jin L 《Genetics》2006,172(1):663-686
In this article, population-based regression models are proposed for high-resolution linkage disequilibrium mapping of quantitative trait loci (QTL). Two regression models, the "genotype effect model" and the "additive effect model," are proposed to model the association between the markers and the trait locus. The marker can be either diallelic or multiallelic. If only one marker is used, the method is similar to a classical setting by Nielsen and Weir, and the additive effect model is equivalent to the haplotype trend regression (HTR) method by Zaykin et al. If two/multiple marker data with phase ambiguity are used in the analysis, the proposed models can be used to analyze the data directly. By analytical formulas, we show that the genotype effect model can be used to model the additive and dominance effects simultaneously; the additive effect model takes care of the additive effect only. On the basis of the two models, F-test statistics are proposed to test association between the QTL and markers. By a simulation study, we show that the two models have reasonable type I error rates for a data set of moderate sample size. The noncentrality parameter approximations of F-test statistics are derived to enable power calculations and comparisons. By a simulation study, it is found that the noncentrality parameter approximations of F-test statistics work very well. Using the noncentrality parameter approximations, we compare the power of the two models with that of the HTR. In addition, a simulation study is performed to make a comparison on the basis of the haplotype frequencies of 10 SNPs of angiotensin-1 converting enzyme (ACE) genes.

19.
Statistics for linkage disequilibrium (LD), the non-random association of alleles at two loci, depend on the frequencies of the alleles at the loci under consideration. Here, we examine the r² measure of LD and its mathematical relationship to allele frequencies, quantifying the constraints on its maximum value. Assuming independent uniform distributions for the allele frequencies of two biallelic loci, we find that the mean maximum value of r² is approximately 0.43051, and that r² can exceed a threshold of 4/5 in only approximately 14.232% of the allele frequency space. If one locus is assumed to have known allele frequencies (the situation in an association study in which LD between a known marker locus and an unknown trait locus is of interest), we find that the mean maximum value of r² is greatest when the known locus has a minor allele frequency of approximately 0.30131. We find that in 1/4 of the space of allowed values of minor allele frequencies and haplotype frequencies at a pair of loci, the unconstrained maximum r² allowing for the possibility of recombination between the loci exceeds the constrained maximum assuming that no recombination has occurred. Finally, we use r²_max to examine the connection between r² and the D′ measure of linkage disequilibrium, finding that r²/r²_max = D′² for approximately 72.683% of the space of allowed values of (p_a, p_b, p_ab). Our results concerning the properties of r² have the potential to inform the interpretation of unusual LD behavior and to assist in the design of LD-based association-mapping studies.
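
The two LD statistics discussed in this abstract can be computed from the allele frequencies p_a and p_b and the haplotype frequency p_ab using their standard definitions: D = p_ab - p_a·p_b, r² = D² / (p_a(1 - p_a)p_b(1 - p_b)), and D′ = D / D_max. The sketch below is a minimal illustration of those definitions and does not reproduce the paper's analysis of r²_max. The function name and return format are assumptions.

```python
def ld_statistics(p_a, p_b, p_ab):
    """Return (D, r_squared, abs_d_prime) for two biallelic loci.

    p_a, p_b: frequencies of allele a at locus 1 and allele b at locus 2
    p_ab:     frequency of the haplotype carrying both a and b
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("allele frequencies must lie strictly in (0, 1)")
    d = p_ab - p_a * p_b
    r_squared = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    # D_max depends on the sign of D (standard Lewontin normalization).
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    abs_d_prime = abs(d) / d_max
    return d, r_squared, abs_d_prime


# Unequal allele frequencies: D' reaches 1 while r² stays well below 1.
d, r2, d_prime = ld_statistics(0.5, 0.2, 0.2)
```

In this example the rarer allele b occurs only on haplotypes carrying a, so |D′| is 1, yet r² is only 0.25 because the allele frequencies differ. This is the kind of frequency constraint on r² that the abstract quantifies.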

20.

Background

A haplotype approach to genomic prediction using high density data in dairy cattle as an alternative to single-marker methods is presented. With the assumption that haplotypes are in stronger linkage disequilibrium (LD) with quantitative trait loci (QTL) than single markers, this study focuses on the use of haplotype blocks (haploblocks) as explanatory variables for genomic prediction. Haploblocks were built based on the LD between markers, which allowed variable reduction. The haploblocks were then used to predict three economically important traits (milk protein, fertility and mastitis) in the Nordic Holstein population.

Results

The haploblock approach improved prediction accuracy compared with the commonly used individual single nucleotide polymorphism (SNP) approach. Furthermore, using an average LD threshold to define the haploblocks (LD≥0.45 between any two markers) increased the prediction accuracies for all three traits, although the improvement was most significant for milk protein (up to 3.1 % improvement in prediction accuracy, compared with the individual SNP approach). Hotelling’s t-tests were performed, confirming the improvement in prediction accuracy for milk protein. Because the phenotypic values were in the form of de-regressed proofs, the improved accuracy for milk protein may be due to higher reliability of the data for this trait compared with the reliability of the mastitis and fertility data. Comparisons between best linear unbiased prediction (BLUP) and Bayesian mixture models also indicated that the Bayesian model produced the most accurate predictions in every scenario for the milk protein trait, and in some scenarios for fertility.

Conclusions

The haploblock approach to genomic prediction is a promising method for genomic selection in animal breeding. Building haploblocks based on LD reduced the number of variables without the loss of information. This method may play an important role in future genomic prediction involving whole genome sequences.
