Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Most linkage programs assume linkage equilibrium among multiple linked markers. This assumption may lead to bias for tightly linked markers where strong linkage disequilibrium (LD) exists. We used simulated data from Genetic Analysis Workshop 14 to examine the possible effect of LD on multipoint linkage analysis. Single-nucleotide polymorphism packets from a non-disease-related region that was generated with LD were used for both model-free and parametric linkage analyses. Results showed that high LD among markers can induce false-positive evidence of linkage for affected sib-pair analysis when parental data are missing. Bias can be eliminated with parental data and can be reduced when additional markers not in LD are included in the analyses.

2.
Genomewide linkage studies are tending toward the use of single-nucleotide polymorphisms (SNPs) as the markers of choice. However, linkage disequilibrium (LD) between tightly linked SNPs violates the fundamental assumption of linkage equilibrium (LE) between markers that underlies most multipoint calculation algorithms currently available, and this leads to inflated affected-relative-pair allele-sharing statistics when founders' multilocus genotypes are unknown. In this study, we investigate the impact that the degree of LD, marker allele frequency, and association type have on estimating the probabilities of sharing alleles identical by descent in multipoint calculations and hence on type I error rates of different sib-pair linkage approaches that assume LE. We show that marker-marker LD does not inflate type I error rates of affected sib pair (ASP) statistics in the whole parameter space, and that, in any case, discordant sib pairs (DSPs) can be used to control for marker-marker LD in ASPs. We advocate the ASP/DSP design with appropriate sib-pair statistics that test the difference in allele sharing between ASPs and DSPs.

3.
Dense SNP maps can be highly informative for linkage studies. But when parental genotypes are missing, multipoint linkage scores can be inflated in regions with substantial marker-marker linkage disequilibrium (LD). Such regions were observed in the Affymetrix SNP genotypes for the Genetic Analysis Workshop 14 (GAW14) Collaborative Study on the Genetics of Alcoholism (COGA) dataset, providing an opportunity to test a novel simulation strategy for studying this problem. First, an inheritance vector (with or without linkage present) is simulated for each replicate, i.e., locations of recombinations and transmission of parental chromosomes are determined for each meiosis. Then, two sets of founder haplotypes are superimposed onto the inheritance vector: one set that is inferred from the actual data and which contains the pattern of LD; and one set created by randomly selecting parental alleles based on the known allele frequencies, with no correlation (LD) between markers. Applying this strategy to a map of 176 SNPs (66 Mb of chromosome 7) for 100 replicates of 116 sibling pairs, significant inflation of multipoint linkage scores was observed in regions of high LD when parental genotypes were set to missing, with no linkage present. Similar inflation was observed in analyses of the COGA data for these affected sib pairs with parental genotypes set to missing, but not after reducing the marker map until r² between any pair of markers was

4.
OBJECTIVES: Describe the inflation in nonparametric multipoint LOD scores due to inter-marker linkage disequilibrium (LD) across many markers with varied allele frequencies. METHOD: Using simulated two-generation families with and without parents, we conducted nonparametric multipoint linkage analysis with 2 to 10 markers with minor allele frequencies (MAF) of 0.5 and 0.1. RESULTS: Misspecification of population haplotype frequencies by assuming linkage equilibrium caused inflated multipoint LOD scores due to inter-marker LD when parental genotypes were not included. Inflation increased as more markers in LD were included and decreased as markers in equilibrium were added. When marker allele frequencies were unequal, the r² measure of LD was a better predictor of inflation than D'. CONCLUSION: This observation strongly supports the evaluation of LD in multipoint linkage analyses, and further suggests that unaccounted for LD may be suspected when two-point and multipoint linkage analyses show a marked disparity in regions with elevated r² measures of LD. Given the increasing popularity of high-density genome-wide SNP screens, inter-marker LD should be a concern in future linkage studies.
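
To make the contrast between r² and D' concrete, the following is a minimal sketch using the standard textbook definitions of both measures for two biallelic markers. The function name `ld_measures` and its parameter names are illustrative assumptions, not taken from the abstract or from any linkage package.

```python
def ld_measures(p_ab: float, p_a: float, p_b: float) -> tuple[float, float, float]:
    """Return (D, D', r^2) for two biallelic markers.

    p_ab : frequency of the haplotype carrying allele A at marker 1
           and allele B at marker 2 (hypothetical input name)
    p_a  : frequency of allele A at marker 1
    p_b  : frequency of allele B at marker 2
    """
    if not (0.0 < p_a < 1.0 and 0.0 < p_b < 1.0):
        raise ValueError("both markers must be polymorphic")

    # D is the deviation of the observed haplotype frequency from the
    # product expected under linkage equilibrium.
    d = p_ab - p_a * p_b

    # D' rescales |D| by the largest value it could take given the
    # allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2 is the squared correlation between the two allele indicators.
    r2 = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Equal allele frequencies (0.5 and 0.5), alleles always co-occur.
print(ld_measures(p_ab=0.5, p_a=0.5, p_b=0.5))
# Unequal allele frequencies (0.5 and 0.1), rarer allele always on the A background.
print(ld_measures(p_ab=0.1, p_a=0.5, p_b=0.1))
```

In the second call D' is still at its maximum while r² is much smaller, which illustrates how the two measures can diverge when allele frequencies are unequal.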

5.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.

6.
Current genome-wide linkage-mapping single-nucleotide polymorphism (SNP) panels with densities of 0.3 cM are likely to have increased intermarker linkage disequilibrium (LD) compared to 5-cM microsatellite panels. The resulting difference in haplotype frequencies versus that predicted may affect multipoint linkage analysis with ungenotyped founders; a common haplotype may be assumed to be rare, leading to inflation of identical-by-descent (IBD) allele-sharing estimates and evidence for linkage. Using data simulated for the Genetic Analysis Workshop 14, we assessed bias in allele-sharing measures and nonparametric linkage (NPL all) and Kong and Cox LOD (KC-LOD) scores in a targeted analysis of regions with and without LD and with and without genes. Using over 100 replicates, we found that if founders were not genotyped, multipoint IBD estimates and delta parameters were modestly inflated and NPL all and KC-LOD scores were biased upwards in the region with LD and no gene; rather than centering on the null, the mean NPL all and KC-LOD scores were 0.51 ± 0.91 and 0.19 ± 0.38, respectively. Reduction of LD by dropping markers reduced this upward bias. These trends were not seen in the non-LD region with no gene. In regions with genes (with and without LD), a slight loss in power with dropping markers was suggested. These results indicate that LD should be considered in dense scans; removal of markers in LD may reduce false-positive results although information may also be lost. Methods to address LD in a high-throughput manner are needed for efficient, robust genomic scans with dense SNPs.

7.
Li B, Leal SM. Human Heredity. 2008;65(4):199-208.
Missing genotype data can increase false-positive evidence for linkage when either parametric or nonparametric analysis is carried out ignoring intermarker linkage disequilibrium (LD). Previously it was demonstrated by Huang et al. [1] that no bias occurs in this situation for affected sib-pairs with unrelated parents when either both parents are genotyped or genotype data is available for two additional unaffected siblings when parental genotypes are missing. However, this is not the case for autosomal recessive consanguineous pedigrees, where missing genotype data for any pedigree member within a consanguinity loop can increase false-positive evidence of linkage. False-positive evidence for linkage is further increased when cryptic consanguinity is present. The amount of false-positive evidence for linkage, and which family members aid in its reduction, is highly dependent on which family members are genotyped. When parental genotype data is available, the false-positive evidence for linkage is usually not as strong as when parental genotype data is unavailable. For a pedigree with an affected proband whose first-cousin parents have been genotyped, further reduction in the false-positive evidence of linkage can be obtained by including genotype data from additional affected siblings of the proband or genotype data from the proband's sibling-grandparents. When parental genotypes are unavailable, false-positive evidence for linkage can be reduced by including genotype data from either unaffected siblings of the proband or the proband's married-in-grandparents in the analysis.

8.
The reintroduction of biallelic markers, now in the form of single-nucleotide polymorphisms (SNPs), has again raised concerns about the practicality of the use of markers with low heterozygosity for genomic screening for complex traits, even if thousands of such markers are available. Like the early blood-group markers (e.g., Rh and MNS), tightly linked biallelic SNPs can be combined into composite markers with heterozygosity similar to that of short-tandem-repeat polymorphisms. The assumptions that underlie the equivalence between single-locus multiallelic and composite markers are presented. We used computer simulation to determine the power of the Haseman-Elston test for linkage with composite markers when not all of these assumptions hold. The Genometric Analysis Simulation Program was used to simulate continuous and discrete traits, one single-locus four-allele marker, and six biallelic markers. We studied composite markers created from pairs, trios, and quartets of biallelic markers in nuclear families and in independent sib pairs. The power to detect linkage with a two-point approach for composite markers and with a multipoint approach that incorporated all six biallelic markers was compared with that for a single-locus, four-allele reference marker. Although the power to detect linkage with a single biallelic marker was considerably less than that of the reference marker, the power to detect linkage with two- and three-locus composite markers was quite similar to that of the reference marker. The power to detect linkage with four-locus composite markers was similar to that of a multipoint approach.

9.
The maximum-likelihood-binomial (MLB) method, based on the binomial distribution of parental marker alleles among affected offspring, recently was shown to provide promising results by two-point linkage analysis of affected-sibship data. In this article, we extend the MLB method to multipoint linkage analysis, using the general framework of hidden Markov models. Furthermore, we perform a large simulation study to investigate the robustness and power of the MLB method, compared with those of the maximum-likelihood-score (MLS) method as implemented in MAPMAKER/SIBS, in the multipoint analysis of different affected-sibship samples. Analyses of multiple-affected sibships by means of the MLS were conducted by consideration of all possible sib pairs, with (weighted MLS [MLSw]) or without (unweighted MLS [MLSu]) application of a classic weighting procedure. In simulations under the null hypothesis, the MLB provided very consistent type I errors regardless of the type of family sample (sib pairs or multiple-affected sibships), as did the MLS for samples with sib pairs only. When samples included multiple-affected sibships, the MLSu led to inflation of low type I errors, whereas the MLSw yielded very conservative tests. Power comparisons showed that the MLB generally was more powerful than the MLS, except in recessive models with allele frequencies <.3. Missing parental marker data did not strongly influence type I error and power results in these multipoint analyses. The MLB approach, which in a natural way accounts for multiple-affected sibships and which provides a simple likelihood-ratio test for linkage, is an interesting alternative for multipoint analysis of sibships.

10.
Once genetic linkage has been identified for a complex disease, the next step is often association analysis, in which single-nucleotide polymorphisms (SNPs) within the linkage region are genotyped and tested for association with the disease. If a SNP shows evidence of association, it is useful to know whether the linkage result can be explained, in part or in full, by the candidate SNP. We propose a novel approach that quantifies the degree of linkage disequilibrium (LD) between the candidate SNP and the putative disease locus through joint modeling of linkage and association. We describe a simple likelihood of the marker data conditional on the trait data for a sample of affected sib pairs, with disease penetrances and disease-SNP haplotype frequencies as parameters. We estimate model parameters by maximum likelihood and propose two likelihood-ratio tests to characterize the relationship of the candidate SNP and the disease locus. The first test assesses whether the candidate SNP and the disease locus are in linkage equilibrium so that the SNP plays no causal role in the linkage signal. The second test assesses whether the candidate SNP and the disease locus are in complete LD so that the SNP or a marker in complete LD with it may account fully for the linkage signal. Our method also yields a genetic model that includes parameter estimates for disease-SNP haplotype frequencies and the degree of disease-SNP LD. Our method provides a new tool for detecting linkage and association and can be extended to study designs that include unaffected family members.

11.
12.
Genotype data from the Illumina Linkage III SNP panel (n = 4,720 SNPs) and the Affymetrix 10 k mapping array (n = 11,120 SNPs) were used to test the effects of linkage disequilibrium (LD) between SNPs in a linkage analysis in the Collaborative Study on the Genetics of Alcoholism pedigree collection (143 pedigrees; 1,614 individuals). The average r² between adjacent markers across the genetic map was 0.099 ± 0.003 in the Illumina III panel and 0.17 ± 0.003 in the Affymetrix 10 k array. In order to determine the effect of LD between marker loci in a nonparametric multipoint linkage analysis, markers in strong LD with another marker (r² > 0.40) were removed (n = 471 loci in the Illumina panel; n = 1,804 loci in the Affymetrix panel) and the linkage analysis results were compared to the results using the entire marker sets. In all analyses using the ALDX1 phenotype, 8 linkage regions on 5 chromosomes (2, 7, 10, 11, X) were detected (peak markers p < 0.01), and the Illumina panel detected an additional region on chromosome 6. Analysis of the same pedigree set and ALDX1 phenotype using short tandem repeat markers (STRs) resulted in 3 linkage regions on 3 chromosomes (peak markers p < 0.01). These results suggest that in this pedigree set, LD between loci with spacing similar to the SNP panels tested may not significantly affect the overall detection of linkage regions in a genome scan. Moreover, since the data quality and information content are greatly improved in the SNP panels over STR genotyping methods, new linkage regions may be identified due to higher information content and data quality in a dense SNP linkage panel.
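
The marker-removal step in this abstract (dropping markers whose r² with another marker exceeds 0.40) can be sketched as a simple greedy filter. The abstract does not say which marker of a correlated pair was removed, so the keep-the-earlier-marker rule below is an assumption. The function name `prune_markers` and the input layout (a precomputed symmetric matrix of pairwise r² values in map order) are likewise hypothetical.

```python
def prune_markers(r2: list[list[float]], threshold: float = 0.40) -> list[int]:
    """Greedy LD pruning sketch.

    r2        : symmetric matrix of pairwise r^2 values, markers in map order
                (assumed to be computed beforehand)
    threshold : a marker is dropped if its r^2 with any already-kept
                marker exceeds this value
    Returns the indices of the markers that are kept.
    """
    kept: list[int] = []
    for i in range(len(r2)):
        # Keep marker i only if it is not in strong LD with a kept marker.
        if all(r2[i][j] <= threshold for j in kept):
            kept.append(i)
    return kept


# Toy r^2 matrix for four markers (made-up numbers for illustration only).
toy_r2 = [
    [1.00, 0.80, 0.10, 0.05],
    [0.80, 1.00, 0.50, 0.10],
    [0.10, 0.50, 1.00, 0.45],
    [0.05, 0.10, 0.45, 1.00],
]
print(prune_markers(toy_r2))  # indices of retained markers
```

A greedy pass like this depends on marker order. Other rules, such as preferring the marker with the higher minor allele frequency, are equally plausible and would retain a different subset.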

13.
OBJECTIVES: Linkage disequilibrium (LD) between closely spaced SNPs can be accommodated in linkage analysis by specifying the multi-SNP haplotype frequencies, if known. Phased haplotypes in candidate regions can provide gold standard haplotype frequency estimates, and may be of inherent interest as markers. We evaluated the effects of different methods of haplotype frequency estimation, and the use of marker phase information, on linkage analysis of a multi-SNP cluster in a candidate region for Alzheimer's disease (AD). METHODS: We performed parametric linkage analysis of a five-SNP cluster in extended pedigrees to compare the use of: (1) haplotype frequencies estimated by molecular phase determination, maximum likelihood estimation, or by assuming linkage equilibrium (LE); (2) AD families or controls as the frequency source; and (3) unphased or molecularly phased SNP data. RESULTS: There was moderate to strong pairwise LD among the five SNPs. Falsely assuming LE substantially inflated the LOD score, but the method of haplotype frequency estimation and particular sample used made little difference provided that LD was accommodated. Use of phased haplotypes produced a modest increase in the LOD score over unphased SNPs. CONCLUSIONS: Ignoring LD between markers can lead to substantially inflated evidence for linkage in LOD score analysis of extended pedigrees with missing data. Use of marker phase information in linkage analysis may be important in disease studies where the costs of family recruitment and phenotyping greatly exceed the costs of phase determination.

14.
Fan R, Jung J. Human Heredity. 2003;56(4):166-187.
This paper proposes variance component models for high resolution joint linkage disequilibrium (LD) and linkage mapping of quantitative trait loci (QTL) based on sibship data; this can include population data if independent individuals are treated as single sibships. One application of these models is late onset complex disease gene mapping, when parental data are not available. The models simultaneously incorporate both LD and linkage information. The LD information is contained in mean coefficients of sibship data. The linkage information is contained in the variance-covariance matrices of trait values for sibships with at least two siblings. We derive formulas for calculating the probability of sharing two trait alleles identical by descent (IBD) for sibpairs in interval mapping of QTL; this is the coefficient of dominant variance of the trait covariance of sibpairs on major QTL. To investigate the performance of the formulas, we calculate the numerical values via the formulas and get satisfactory approximations. We compare the power and sample sizes for both LD and linkage mapping. By simulation and theoretical analysis, we compare the results with those of Fulker and Abecasis "AbAw" approach. It is well known that the resolution of linkage analysis can be low for complex disease gene mapping. LD mapping, on the other hand, can increase mapping precision and is useful in high resolution mapping. Linkage analysis is less sensitive to population subdivisions and admixtures. The level of LD is sensitive to population stratification which may easily lead to spurious association. Performing a joint analysis of LD and linkage mapping can help to overcome the limits of both approaches. Moreover, the advantages of the two complementary strategies can be utilized maximally. In practice, linkage analysis may be performed using pedigree data to identify suggestive linkage between markers and trait loci based on a sparse marker map. 
In the presence of linkage, joint LD and linkage mapping can be carried out to do fine gene mapping based on a dense genetic map using both pedigree and population data. Population and pedigree data of any type can be combined to perform a joint analysis of high resolution LD and linkage mapping of QTL by generalizing the method.

15.
Model-free linkage analysis using likelihoods.
Misspecification of transmission model parameters can produce artifactually negative lod scores at small recombination fractions and in multipoint analysis. To avoid this problem, we have tried to devise a test that aims to detect a genetic effect at a particular locus, rather than attempting to estimate the map position of a locus with specified effect. Maximizing likelihoods over transmission model parameters, as well as linkage parameters, can produce seriously biased parameter estimates and so yield tests that lack power for the detection of linkage. However, constraining the transmission model parameters to produce the correct population prevalence largely avoids this problem. For computational convenience, we recommend that the likelihoods under linkage and non-linkage are independently maximized over a limited set of transmission models, ranging from Mendelian dominant to null effect and from null effect to Mendelian recessive. In order to test for a genetic effect at a given map position, the likelihood under linkage is maximized over admixture, the proportion of families linked. Application to simulated data for a wide range of transmission models in both affected sib pairs and pedigrees demonstrates that the new method is well behaved under the null hypothesis and provides a powerful test for linkage when it is present. This test requires no specification of transmission model parameters, apart from an approximate estimate of the population prevalence. It can be applied equally to sib pairs and pedigrees, and, since it does not diminish the lod score at test positions very close to a marker, it is suitable for application to multipoint data.

16.
The results of sib-pair linkage studies may be compromised if a substantial number of putative sib pairs are not actually sib pairs. For classification of pairs in a sib-pair genome scan, I propose multipoint methods that are based on a Markov-process model of allele sharing along the chromosome. These methods can be implemented by standard algorithms that compute multipoint marker allele-sharing probabilities for sib pairs. When marker data from at least half the genome are used, misclassification rates are small. The methods will be implemented in an upcoming version of the computer software package S.A.G.E.
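
As an illustration of what a Markov-process model of allele sharing can look like, here is a minimal sketch of the classical transition matrix for the number of alleles shared identical by descent (0, 1, or 2) by full sibs at two loci separated by recombination fraction θ. It assumes full sibs and no crossover interference. The function name `sib_ibd_transition` is hypothetical and this is not S.A.G.E.'s implementation. Other relationship types would need their own matrices, which are not shown.

```python
def sib_ibd_transition(theta: float) -> list[list[float]]:
    """Transition probabilities P(IBD = j at locus 2 | IBD = i at locus 1)
    for a full-sib pair, with i, j in {0, 1, 2}.

    theta : recombination fraction between the two loci (0 <= theta <= 0.5)
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")

    # psi is the probability that sharing status for one parent's alleles
    # is unchanged between the loci: both meioses recombine or neither does.
    psi = theta ** 2 + (1.0 - theta) ** 2
    chi = 1.0 - psi

    return [
        [psi * psi, 2.0 * psi * chi, chi * chi],
        [psi * chi, psi * psi + chi * chi, psi * chi],
        [chi * chi, 2.0 * psi * chi, psi * psi],
    ]


# Completely linked loci: sharing status cannot change.
print(sib_ibd_transition(0.0))
# Unlinked loci: every row equals the prior for full sibs.
print(sib_ibd_transition(0.5))
```

Each row sums to 1. At θ = 0.5 every row equals the unconditional full-sib sharing distribution (1/4, 1/2, 1/4), so the chain forgets its starting state between unlinked loci.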

17.
Autism is characterized by impairments in reciprocal communication and social interaction and by repetitive and stereotyped patterns of activities and interests. Evidence for a strong underlying genetic predisposition comes from twin and family studies, although susceptibility genes have not yet been identified. A whole-genome screen for linkage, using 83 sib pairs with autism, has been completed, and 119 markers have been genotyped in 13 candidate regions in a further 69 sib pairs. The addition of new families and markers provides further support for previous reports of linkages on chromosomes 7q and 16p. Two new regions of linkage have also been identified on chromosomes 2q and 17q. The most significant finding was a multipoint maximum LOD score (MLS) of 3.74 at marker D2S2188 on chromosome 2; this MLS increased to 4.80 when only sib pairs fulfilling strict diagnostic criteria were included. The susceptibility region on chromosome 7 was the next most significant, generating a multipoint MLS of 3.20 at marker D7S477. Chromosome 16 generated a multipoint MLS of 2.93 at D16S3102, whereas chromosome 17 generated a multipoint MLS of 2.34 at HTTINT2. With the addition of new families, there was no increased allele sharing at a number of other loci originally showing some evidence of linkage. These results support the continuing collection of multiplex sib-pair families to identify autism-susceptibility genes.

18.
We performed multipoint linkage analysis using 83 markers from the SNP Consortium (TSC) SNP linkage map in 3 regions covering 190 cM previously scanned with microsatellite markers and found to be linked to type 2 diabetes. Since the average linkage disequilibrium present in the TSC SNP marker clusters is relatively low, we assumed the intracluster genetic distances were a reasonable small nonzero distance (0.03 cM) and performed linkage analysis using GENEHUNTER PLUS and ASM linkage analysis software. We found that for the pedigree structures and missing data patterns in our samples the average information content in all three regions and the LOD score curves in two regions obtained from the TSC SNP markers were similar to results obtained from microsatellite marker maps with 10 cM average spacing. We also give an algorithm which extends the Lander-Green algorithm to permit multipoint linkage analysis of clusters of tightly linked markers with arbitrarily high levels of intracluster linkage disequilibrium.

19.
Wang T, Elston RC. Human Heredity. 2005;60(3):134-142.
The lack of replication of model-free linkage analyses performed on complex diseases raises questions about the robustness of these methods to various biases. The confounding effect of population stratification on a genetic association study has long been recognized in the genetic epidemiology community. Because the estimation of the number of alleles shared identical by descent (IBD) does not depend on the marker allele frequency when founders of families are observed, model-free linkage analysis is usually thought to be robust to population stratification. However, for common complex diseases, the genotypes of founders are often unobserved and therefore population stratification has the potential to impair model-free linkage analysis. Here, we demonstrate that, when some or all of the founder genotypes are missing, population stratification can introduce deleterious effects on various model-free linkage methods or designs. For an affected sib pair design, it can cause excess false-positive discoveries even when the trait distribution is homogeneous among subpopulations. After incorporating a control group of discordant sib pairs or for a quantitative trait, two circumstances must be met for population stratification to be a confounder: the distributions for both the marker and the trait must be heterogeneous among subpopulations. When this occurs, the bias can result in either a liberal, and hence invalid, test or a conservative test. Bias can be eliminated or alleviated by inclusion of founders' or other family members' genotype data. When this is not possible, new methods need to be developed to be robust to population stratification.

20.
Single-marker linkage-disequilibrium (LD) methods cannot fully describe disequilibrium in an entire chromosomal region surrounding a disease allele. With the advent of myriad tightly linked microsatellite markers, we have an opportunity to extend LD analysis from single markers to multiple-marker haplotypes. Haplotype analysis has increased statistical power to disclose the presence of a disease locus in situations where it correctly reflects the historical process involved. For maximum efficiency, evidence of LD ought to come not just from a single haplotype, which may well be rare, but in addition from many similar haplotypes that could have descended from the same ancestral founder but have been trimmed in succeeding generations. We present such an analysis, called the "trimmed-haplotype method." We focus on chromosomal regions that are small enough that disequilibrium in significant portions of them may have been preserved in some pedigrees and yet that contain enough markers to minimize coincidental occurrence of the haplotype in the absence of a disease allele: perhaps regions 1-2 cM in length. In general, we could have no idea what haplotype an ancestral founder carried generations ago, nor do we usually have a precise chromosomal location for the disease-susceptibility locus. Therefore, we must search through all possible haplotypes surrounding multiple locations. Since such repeated testing obliterates the sampling distribution of the test, we employ bootstrap methods to calculate significance levels. Trimmed-haplotype analysis is performed on family data in which genotypes have been assembled into haplotypes. It can be applied either to conventional parent-affected-offspring triads or to multiplex pedigrees. We present a method for summarizing the LD evidence, in any pedigree, that can be employed in trimmed-haplotype analysis as well as in other methods.
