Similar literature
1.
Genotype data from the Illumina Linkage III SNP panel (n = 4,720 SNPs) and the Affymetrix 10K mapping array (n = 11,120 SNPs) were used to test the effects of linkage disequilibrium (LD) between SNPs in a linkage analysis in the Collaborative Study on the Genetics of Alcoholism pedigree collection (143 pedigrees; 1,614 individuals). The average r² between adjacent markers across the genetic map was 0.099 ± 0.003 in the Illumina III panel and 0.17 ± 0.003 in the Affymetrix 10K array. In order to determine the effect of LD between marker loci in a nonparametric multipoint linkage analysis, markers in strong LD with another marker (r² > 0.40) were removed (n = 471 loci in the Illumina panel; n = 1,804 loci in the Affymetrix panel) and the linkage analysis results were compared to the results using the entire marker sets. In all analyses using the ALDX1 phenotype, 8 linkage regions on 5 chromosomes (2, 7, 10, 11, X) were detected (peak markers p < 0.01), and the Illumina panel detected an additional region on chromosome 6. Analysis of the same pedigree set and ALDX1 phenotype using short tandem repeat markers (STRs) resulted in 3 linkage regions on 3 chromosomes (peak markers p < 0.01). These results suggest that in this pedigree set, LD between loci with spacing similar to the SNP panels tested may not significantly affect the overall detection of linkage regions in a genome scan. Moreover, since the data quality and information content are greatly improved in the SNP panels over STR genotyping methods, new linkage regions may be identified due to higher information content and data quality in a dense SNP linkage panel.
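The marker-removal step described in this abstract can be illustrated with a minimal sketch. It assumes genotypes coded as 0/1/2 allele counts and estimates r² as the squared Pearson correlation of those counts, which is a common approximation and not necessarily the estimator the authors used. The function names and the greedy keep-the-first rule are hypothetical choices made for illustration.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (0/1/2 counts)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker: correlation undefined, treated as no LD
    return sxy * sxy / (sxx * syy)


def prune_by_ld(genotypes, threshold=0.40):
    """Greedy pruning (an assumed rule): keep a marker only if its r^2 with
    every marker kept so far does not exceed the threshold.

    genotypes: list of per-marker genotype vectors, in map order.
    Returns the indices of the markers that are kept.
    """
    kept = []
    for i, g in enumerate(genotypes):
        if all(r_squared(g, genotypes[j]) <= threshold for j in kept):
            kept.append(i)
    return kept
```

With real panels one would normally restrict the comparison to markers within a map window rather than to all kept markers; the all-pairs loop here is only for brevity.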

2.
3.
Dense SNP maps can be highly informative for linkage studies. But when parental genotypes are missing, multipoint linkage scores can be inflated in regions with substantial marker-marker linkage disequilibrium (LD). Such regions were observed in the Affymetrix SNP genotypes for the Genetic Analysis Workshop 14 (GAW14) Collaborative Study on the Genetics of Alcoholism (COGA) dataset, providing an opportunity to test a novel simulation strategy for studying this problem. First, an inheritance vector (with or without linkage present) is simulated for each replicate, i.e., locations of recombinations and transmission of parental chromosomes are determined for each meiosis. Then, two sets of founder haplotypes are superimposed onto the inheritance vector: one set that is inferred from the actual data and which contains the pattern of LD; and one set created by randomly selecting parental alleles based on the known allele frequencies, with no correlation (LD) between markers. Applying this strategy to a map of 176 SNPs (66 Mb of chromosome 7) for 100 replicates of 116 sibling pairs, significant inflation of multipoint linkage scores was observed in regions of high LD when parental genotypes were set to missing, with no linkage present. Similar inflation was observed in analyses of the COGA data for these affected sib pairs with parental genotypes set to missing, but not after reducing the marker map until r² between any pair of markers was …
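The two-stage simulation described in this abstract (first the inheritance pattern, then founder haplotypes laid over it) might be sketched as follows. This is an illustration under stated assumptions and not the authors' code: markers are biallelic with alleles coded 0/1, crossovers between adjacent markers occur independently with a given recombination fraction (no interference), and all function names are hypothetical.

```python
import random


def simulate_meiosis(thetas, rng):
    """Simulate one meiosis: return, for each marker, which of the parent's
    two haplotypes (0 or 1) is transmitted.

    thetas[i] is the recombination fraction between marker i and marker i + 1.
    """
    state = rng.randint(0, 1)  # which parental haplotype is copied first
    origins = [state]
    for theta in thetas:
        if rng.random() < theta:
            state = 1 - state  # crossover between adjacent markers
        origins.append(state)
    return origins


def transmit(parent_haplotypes, origins):
    """Superimpose a parent's two haplotypes onto the simulated origins."""
    return [parent_haplotypes[o][m] for m, o in enumerate(origins)]


def random_founder_haplotype(freqs, rng):
    """LD-free founder haplotype: allele 1 is drawn independently at each
    marker with the given frequency, so markers are uncorrelated."""
    return [1 if rng.random() < f else 0 for f in freqs]
```

The LD-preserving arm of the comparison would replace `random_founder_haplotype` with haplotypes inferred from the observed data, while reusing the same simulated origins.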

4.
OBJECTIVES: Describe the inflation in nonparametric multipoint LOD scores due to inter-marker linkage disequilibrium (LD) across many markers with varied allele frequencies. METHOD: Using simulated two-generation families with and without parents, we conducted nonparametric multipoint linkage analysis with 2 to 10 markers with minor allele frequencies (MAF) of 0.5 and 0.1. RESULTS: Misspecification of population haplotype frequencies by assuming linkage equilibrium caused inflated multipoint LOD scores due to inter-marker LD when parental genotypes were not included. Inflation increased as more markers in LD were included and decreased as markers in equilibrium were added. When marker allele frequencies were unequal, the r² measure of LD was a better predictor of inflation than D'. CONCLUSION: This observation strongly supports the evaluation of LD in multipoint linkage analyses, and further suggests that unaccounted-for LD may be suspected when two-point and multipoint linkage analyses show a marked disparity in regions with elevated r² measures of LD. Given the increasing popularity of high-density genome-wide SNP screens, inter-marker LD should be a concern in future linkage studies.
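To see why r² and D' can diverge when allele frequencies are unequal, the standard two-locus definitions can be computed directly from haplotype and allele frequencies. The sketch below uses the textbook formulas; the function name is hypothetical, and both markers are assumed to be polymorphic (frequencies strictly between 0 and 1).

```python
def ld_measures(p_ab, p_a, p_b):
    """Return (D, |D'|, r^2) for two biallelic markers.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a, p_b: frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2
```

For example, with allele frequencies 0.5 and 0.1 (the two MAF values used in the abstract) and every copy of the rarer allele sitting on the same background, `ld_measures(0.1, 0.5, 0.1)` gives a D' of about 1 but an r² of only about 0.11.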

5.
Model misspecification and multipoint linkage analysis.
Pairwise linkage analysis is robust to genetic model misspecification provided dominance is correctly specified, the primary effect being inflation of the recombination fraction. By contrast, we show that multipoint analysis under misspecified models is not robust when a putative disease locus is placed between close flanking markers, with potentially spuriously negative multipoint lod scores being produced. The problem is due to incorrect attribution of segregation of a disease allele and the consequent conclusion of (unlikely) double crossovers between flanking markers. As a possible solution, we propose the use of high disease allele frequencies, as this allows probabilistically for nonsegregation (through parental homozygosity or dual matings). We show analytically and through analysis of pedigree data simulated under a two-locus heterogeneity model that using a disease allele frequency of 0.05 in the dominant case and 0.25 in the recessive case is quite robust in producing positive multipoint lod scores with close flanking markers across a broad range of conditions including varying allele frequencies, epistasis, genetic heterogeneity and phenocopies.

6.
We have compared the efficiency of the lod score test which assumes heterogeneity (lod2) to the standard lod score test which assumes homogeneity (lod1) when three-point linkage analysis is used in successive map intervals. If it is assumed that a gene located midway between two linked marker loci is responsible for a proportion of disease cases, then the lod1 test loses power relative to the lod2 test, as the proportion of linked families decreases, as the flanking markers are more closely linked, and as more map intervals are tested. Moreover, when multipoint analysis is used, linkage for a disease gene is more likely to be incorrectly excluded from a complete and dense linkage map if true genetic heterogeneity is ignored. We thus conclude that, in general, the lod2 linkage test is more efficient for detecting a true linkage when a complete genetic marker map is screened for a heterogeneous disorder.
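The contrast between the two tests can be made concrete with a small sketch. It assumes the usual admixture formulation of the heterogeneity lod score, in which a proportion alpha of families is linked; the abstract does not spell out its formula, so this form, the grid search over alpha, and the function names are assumptions made for illustration.

```python
import math


def admixture_lod(family_lods, alpha):
    """Heterogeneity lod at a fixed map position for a given linked
    proportion alpha; family_lods holds each family's lod score there."""
    return sum(math.log10(alpha * 10 ** z + (1 - alpha)) for z in family_lods)


def lod1(family_lods):
    """Homogeneity lod: the special case alpha = 1 (all families linked)."""
    return sum(family_lods)


def lod2(family_lods, steps=100):
    """Heterogeneity lod maximized over alpha on a grid; returns (lod, alpha)."""
    return max((admixture_lod(family_lods, a / steps), a / steps)
               for a in range(steps + 1))
```

With per-family lods of 1.0, 1.0 and -2.0, the homogeneity score sums to 0, whereas the score maximized over alpha is positive, because the unlinked family is no longer forced to count fully against linkage.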

7.
Computational constraints currently limit exact multipoint linkage analysis to pedigrees of moderate size. We introduce new algorithms that allow analysis of larger pedigrees by reducing the time and memory requirements of the computation. We use the observed pedigree genotypes to reduce the number of inheritance patterns that need to be considered. The algorithms are implemented in a new version (version 2.1) of the software package GENEHUNTER. Performance gains depend on marker heterozygosity and on the number of pedigree members available for genotyping, but typically are 10-1,000-fold, compared with the performance of the previous release (version 2.0). As a result, families with up to 30 bits of inheritance information have been analyzed, and further increases in family size are feasible. In addition to computation of linkage statistics and haplotype determination, GENEHUNTER can also perform single-locus and multilocus transmission/disequilibrium tests. We describe and implement a set of permutation tests that allow determination of empirical significance levels in the presence of linkage disequilibrium among marker loci.
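The "bits" figure quoted in this abstract can be related to pedigree size with a short calculation. The sketch assumes the convention usually given for inheritance-vector methods: two bits per non-founder (one per meiosis), minus one bit per founder because founder phase is arbitrary. The abstract does not define the term itself, so this reading and the function names are assumptions.

```python
def inheritance_bits(n_nonfounders, n_founders):
    """Effective size of the inheritance vector: 2n - f bits."""
    return 2 * n_nonfounders - n_founders


def inheritance_patterns(n_nonfounders, n_founders):
    """Number of distinct inheritance vectors to consider at each position."""
    return 2 ** inheritance_bits(n_nonfounders, n_founders)
```

A nuclear family with two parents and three children has 2 × 3 − 2 = 4 bits, or 16 inheritance patterns. At 30 bits the count exceeds one billion per map position, which is why reducing the set of patterns using the observed genotypes matters.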

8.
Genomewide linkage studies are tending toward the use of single-nucleotide polymorphisms (SNPs) as the markers of choice. However, linkage disequilibrium (LD) between tightly linked SNPs violates the fundamental assumption of linkage equilibrium (LE) between markers that underlies most multipoint calculation algorithms currently available, and this leads to inflated affected-relative-pair allele-sharing statistics when founders' multilocus genotypes are unknown. In this study, we investigate the impact that the degree of LD, marker allele frequency, and association type have on estimating the probabilities of sharing alleles identical by descent in multipoint calculations and hence on type I error rates of different sib-pair linkage approaches that assume LE. We show that marker-marker LD does not inflate type I error rates of affected sib pair (ASP) statistics in the whole parameter space, and that, in any case, discordant sib pairs (DSPs) can be used to control for marker-marker LD in ASPs. We advocate the ASP/DSP design with appropriate sib-pair statistics that test the difference in allele sharing between ASPs and DSPs.

9.
Multipoint linkage analysis of choroideremia (TCD) and seven X chromosomal restriction fragment length polymorphisms (RFLPs) was carried out in 18 Finnish TCD families. The data place TCD distal to PGK and DXS72, very close to DXYS1 and DXYS5 (Zmax = 24 at θ = 0) and proximal to DXYS4 and DXYS12. This agrees with the data obtained from other linkage studies and from physical mapping. All the TCD males and carrier females studied have the same DXYS1 allele in coupling with TCD. In Northeastern Finland, 66/69 chromosomes carrying TCD had the same haplotype at loci DXS72, DXYS1, DXYS4, and DXYS12. The same haplotype is seen in only 15/99 chromosomes not carrying TCD. Moreover, in 71/104 non-TCD chromosomes, the haplotype at six marker loci is different from those seen in any of the 76 TCD chromosomes. This supports the previously described hypothesis that the large Northern Finnish choroideremia pedigrees, comprising a total of over 80 living patients representing more than a fifth of all TCD patients described worldwide, carry the same mutation. These linkage and haplotype data provide improved opportunities for prenatal diagnosis based on RFLP studies.

10.
Stephan W, Song YS, Langley CH. Genetics 2006, 172(4): 2647-2663
We analyzed a three-locus model of genetic hitchhiking with one locus experiencing positive directional selection and two partially linked neutral loci. Following the original hitchhiking approach by Maynard Smith and Haigh, our analysis is purely deterministic. In the first half of the selected phase after a favored mutation has entered the population, hitchhiking may lead to a strong increase of linkage disequilibrium (LD) between the two neutral sites if both are < 0.1s away from the selected site (where s is the selection coefficient). In the second half of the selected phase, the main effect of hitchhiking is to destroy LD. This occurs very quickly (before the end of the selected phase) when the selected site is between both neutral loci. This pattern cannot be attributed to the well-known variation-reducing effect of hitchhiking but is a consequence of secondary hitchhiking effects on the recombinants created in the selected phase. When the selected site is outside the neutral loci (which are, say, < 0.1s apart), however, a fast decay of LD is observed only if the selected site is in the immediate neighborhood of one of the neutral sites (i.e., if the recombination rate r between the selected site and one of the neutral sites satisfies r < 0.1s). If the selected site is far away from the neutral sites (say, r > 0.3s), the decay rate of LD approaches that of neutrality. Averaging over a uniform distribution of initial gamete frequencies shows that the expected LD at the end of the hitchhiking phase is driven toward zero, while the variance is increased when the selected site is well outside the two neutral sites. When the direction of LD is polarized with respect to the more common allele at each neutral site, hitchhiking creates more positive than negative linkage disequilibrium. Thus, hitchhiking may have a distinctively patterned LD-reducing effect, in particular near the target of selection.
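As a point of reference for the neutral decay rate mentioned in this abstract, the textbook deterministic result for two neutral loci under random mating is that LD shrinks by a factor (1 − r) each generation. The sketch below shows only that baseline, not the three-locus hitchhiking model itself; the function names are hypothetical.

```python
import math


def neutral_ld(d0, r, generations):
    """LD after the given number of generations: D_t = D_0 * (1 - r)**t."""
    return d0 * (1 - r) ** generations


def generations_to_halve(r):
    """Generations needed for LD to fall to half its initial value (0 < r < 1)."""
    return math.log(0.5) / math.log(1 - r)
```

For unlinked loci (r = 0.5) LD halves every generation, while at r = 0.01 it takes roughly 69 generations, which gives a sense of scale for what "faster than neutral" means near a selected site.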

11.
Thomas A. Human Heredity 2007, 64(1): 16-26
We review recent developments of MCMC integration methods for computations on graphical models for two applications in statistical genetics: modelling allelic association and pedigree based linkage analysis. We discuss and illustrate estimation of graphical models from haploid and diploid genotypes, and the importance of MCMC updating schemes beyond what is strictly necessary for irreducibility. We then outline an approach combining these methods to compute linkage statistics when alleles at the marker loci are in linkage disequilibrium. Other extensions suitable for analysis of SNP genotype data in pedigrees are also discussed and programs that implement these methods, and which are available from the author's web site, are described. We conclude with a discussion of how this still experimental approach might be further developed.

12.
The lowdown on linkage disequilibrium
Gaut BS, Long AD. The Plant Cell 2003, 15(7): 1502-1506

13.
The calculation of multipoint likelihoods of pedigree data is crucial for extracting the full available information needed for both parametric and nonparametric linkage analysis. Recent mathematical advances in both the Elston-Stewart and Lander-Green algorithms for computing exact multipoint likelihoods of pedigree data have enabled researchers to analyze data sets containing more markers and more individuals both faster and more efficiently. This paper presents novel algorithms that further extend the computational boundary of the Elston-Stewart algorithm. They have been implemented in the software package VITESSE v. 2 and are shown to be several orders of magnitude faster than the original implementation of the Elston-Stewart algorithm in VITESSE v. 1 on a variety of real pedigree data. VITESSE v. 2 was faster by a factor ranging from 168 to over 1,700 on these data sets, thus making a qualitative difference in the analysis. The main algorithm is based on the faster computation of the conditional probability of a component nuclear family within the pedigree by summing over the joint genotypes of the children instead of the parents, as is done in VITESSE v. 1. This change in summation allows the parent-child transmission part of the calculation to be computed not only for each parent separately, but also for each locus separately, by using inheritance vectors as is done in the Lander-Green algorithm. Computing both of these separately can lead to substantial computational savings. The use of inheritance vectors in the nuclear family calculation represents a partial synthesis of the techniques of the Lander-Green algorithm into the Elston-Stewart algorithm. In addition, the technique of local set recoding is introduced to further reduce the complexity of the nuclear family computation. These new algorithms, however, are not universally faster on all types of pedigree data compared to the method implemented in VITESSE v. 1 of summing over the parents. Therefore, a hybrid algorithm is introduced which combines the strengths of both summation methods by using a numerical heuristic to decide which of the two to use for a given nuclear family within the pedigree, and it is shown to be faster than either method on its own. Finally, this paper discusses various complexity issues regarding both the Elston-Stewart and Lander-Green algorithms and possible future directions of further synthesis.

14.
Li YM, Xiang Y. Journal of Genetics 2011, 90(3): 453-457
We conclude that composite linkage disequilibrium (LD) measures should be adopted in population-based LD mapping or association mapping studies, since they are unaffected by Hardy-Weinberg disequilibrium (HWD). Although some properties of composite LD measures have been recently studied, the effects of genotyping errors on composite LD measures have not been examined. In this report, we derived deterministic formulas to evaluate the impact of genotyping errors on the composite LD measures Δ'AB and rAB, and compared the robustness of Δ'AB and rAB in the presence of genotyping errors. The results showed that Δ'AB and rAB depend on the allele frequencies and the assumed error model, and show varying degrees of robustness in the presence of errors. In general, whether there is HWD or not, rAB is more robust than Δ'AB except in some special cases, and the difference in robustness between Δ'AB and rAB becomes less severe as the difference between the frequencies of the two SNP alleles A and B becomes smaller.

15.
We previously developed a method of partitioning genetic variance of a quantitative trait to loci in specific chromosomal regions. In this paper, we compare this method, the multipoint IBD (identical by descent) method (MIM), with parametric multipoint linkage analysis (MLINK). A simulation study was performed comparing the methods for the major-locus, mixed, and two-locus models. The criterion for comparisons between MIM and MLINK was the average lod score from multiple replicates of simulated data sets (the expected lod score, ELOD). The effects of gene frequency, dominance, model misspecification, marker spacing, and informativeness were also considered in a smaller set of simulations. Within the context of the models examined, the MIM approach was found to be comparable in power with parametric multipoint linkage analysis when (a) parental data are unknown, (b) the effect of the major locus is small and there is additional genetic variation, or (c) the parameters of the major-locus model are misspecified. The performance of the MIM method relative to MLINK was markedly lower when the allele frequency at the trait locus was 0.2 versus 0.5, particularly for the case when parental data were assumed to be known. Dominance at the trait major locus, as well as marker spacing and heterozygosity, did not appear to have a large effect on the ELOD comparisons.

16.
A selective sweep describes the reduction of diversity due to strong positive selection. If the mutation rate to a selectively beneficial allele is sufficiently high, Pennings and Hermisson (Mol Biol Evol 23(5):1076–1084, 2006a) have shown that it becomes likely that a selective sweep is caused by several individuals. Such an event is called a soft sweep, and the complementary event of a single origin of the beneficial allele, the classical case, is called a hard sweep. We give analytical expressions for the linkage disequilibrium (LD) between two neutral loci linked to the selected locus, depending on the recurrent mutation to the beneficial allele, measured by $D$ and $\widehat{\sigma_D^2}$, a quantity introduced by Ohta and Kimura (Genetics 63(1):229–238, 1969), and conclude that the LD pattern of a soft sweep differs substantially from that of a hard sweep due to haplotype structure. The analytical results are compared with simulations.

17.
Methods based on variance components are powerful tools for linkage analysis of quantitative traits, because they allow simultaneous consideration of all pedigree members. The central idea is to identify loci making a significant contribution to the population variance of a trait, by use of allele-sharing probabilities derived from genotyped marker loci. The technique is only as powerful as the methods used to infer these probabilities, but, to date, no implementation has made full use of the inheritance information in mapping data. Here we present a new implementation that uses an exact multipoint algorithm to extract the full probability distribution of allele sharing at every point in a mapped region. At each locus in the region, the program fits a model that partitions total phenotypic variance into components due to environmental factors, a major gene at the locus, and other unlinked genes. Numerical methods are used to derive maximum-likelihood estimates of the variance components, under the assumption of multivariate normality. A likelihood-ratio test is then applied to detect any significant effect of the hypothesized major gene. Simulations show the method to have greater power than does traditional sib-pair analysis. The method is freely available in a new release of the software package GENEHUNTER.
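The variance partition described in this abstract implies a specific expected covariance between the trait values of two relatives. The sketch below assumes the standard additive form of that model, with no dominance component and uncorrelated environmental effects; the abstract does not give its exact parameterization, so the formula and all names are assumptions for illustration.

```python
def trait_covariance(ibd_share, twice_kinship, var_major, var_poly, var_env,
                     same_person=False):
    """Expected trait covariance between two pedigree members.

    ibd_share:     proportion of alleles shared IBD at the tested locus (0 to 1)
    twice_kinship: twice the kinship coefficient (0.5 for full sibs)
    var_major:     variance due to the major gene at the tested locus
    var_poly:      variance due to other, unlinked genes
    var_env:       environmental variance (contributes only on the diagonal)
    """
    cov = ibd_share * var_major + twice_kinship * var_poly
    if same_person:
        cov += var_env
    return cov
```

Under this form, the likelihood-ratio test compares the fit with `var_major` estimated freely against the fit with `var_major` fixed at zero.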

18.
Huang J  Jiang Y 《Human heredity》2001,52(2):83-98
We study the properties of a modified lod score method for testing linkage that incorporates linkage disequilibrium (LD-lod). By examination of its score statistic, we show that the LD-lod score method adaptively combines two sources of information: (a) the IBD sharing score which is informative for linkage regardless of the existence of LD and (b) the contrast between allele-specific IBD sharing scores which is informative for linkage only in the presence of LD. We also consider the connection between the LD-lod score method and the transmission-disequilibrium test (TDT) for triad data and the mean test for affected sib pair (ASP) data. We show that, for triad data, the recessive LD-lod test is asymptotically equivalent to the TDT; and for ASP data, it is an adaptive combination of the TDT and the ASP mean test. We demonstrate that the LD-lod score method has relatively good statistical efficiency in comparison with the ASP mean test and the TDT for a broad range of LD and the genetic models considered in this report. Therefore, the LD-lod score method is an interesting approach for detecting linkage when the extent of LD is unknown, such as in a genome-wide screen with a dense set of genetic markers.
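The two reference tests named in this abstract have simple closed forms, sketched below in their usual textbook versions. The sketch assumes a biallelic marker for the TDT and fully informative IBD counts for the sib-pair mean test; it does not implement the LD-lod method itself, and the function names are hypothetical.

```python
import math


def tdt_chi_square(transmitted, not_transmitted):
    """TDT statistic (1 df) from heterozygous parents: the number of times
    the test allele was transmitted versus not transmitted."""
    b, c = transmitted, not_transmitted
    return (b - c) ** 2 / (b + c)


def asp_mean_test(n_ibd0, n_ibd1, n_ibd2):
    """Mean test for affected sib pairs: z score for the mean proportion of
    alleles shared IBD against its null value of 0.5 (null variance 1/8)."""
    n = n_ibd0 + n_ibd1 + n_ibd2
    mean_share = (0.5 * n_ibd1 + n_ibd2) / n
    return (mean_share - 0.5) / math.sqrt(1 / (8 * n))
```

For instance, 60 transmissions against 40 non-transmissions gives a TDT statistic of 4.0, and 100 sib pairs sharing 0, 1 and 2 alleles IBD in counts of 10, 50 and 40 give a mean-test z score of about 4.24.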

19.
20.
