Similar articles
 20 similar articles found; search time 46 ms
1.
Dense SNP maps can be highly informative for linkage studies. But when parental genotypes are missing, multipoint linkage scores can be inflated in regions with substantial marker-marker linkage disequilibrium (LD). Such regions were observed in the Affymetrix SNP genotypes for the Genetic Analysis Workshop 14 (GAW14) Collaborative Study on the Genetics of Alcoholism (COGA) dataset, providing an opportunity to test a novel simulation strategy for studying this problem. First, an inheritance vector (with or without linkage present) is simulated for each replicate, i.e., locations of recombinations and transmission of parental chromosomes are determined for each meiosis. Then, two sets of founder haplotypes are superimposed onto the inheritance vector: one set that is inferred from the actual data and which contains the pattern of LD; and one set created by randomly selecting parental alleles based on the known allele frequencies, with no correlation (LD) between markers. Applying this strategy to a map of 176 SNPs (66 Mb of chromosome 7) for 100 replicates of 116 sibling pairs, significant inflation of multipoint linkage scores was observed in regions of high LD when parental genotypes were set to missing, with no linkage present. Similar inflation was observed in analyses of the COGA data for these affected sib pairs with parental genotypes set to missing, but not after reducing the marker map until r² between any pair of markers was
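The two founder-haplotype sets described in this abstract can be illustrated with a minimal sketch. It is not the authors' simulation code: the function names, the 0/1 allele coding, and the two-marker pool are hypothetical, and the inheritance-vector step is omitted. One sampler draws whole haplotypes from an observed pool (keeping the LD pattern); the other draws each marker independently from its allele frequency (no LD).

```python
import random

def allele_frequencies(haplotypes):
    """Frequency of allele 1 at each marker in a pool of 0/1-coded haplotypes."""
    n = len(haplotypes)
    return [sum(h[m] for h in haplotypes) / n for m in range(len(haplotypes[0]))]

def sample_founders_with_ld(haplotypes, count, rng):
    """Draw whole haplotypes from the pool, so marker-marker correlation is kept."""
    return [rng.choice(haplotypes) for _ in range(count)]

def sample_founders_without_ld(haplotypes, count, rng):
    """Draw each marker independently from its allele frequency (no LD)."""
    freqs = allele_frequencies(haplotypes)
    return [tuple(1 if rng.random() < f else 0 for f in freqs) for _ in range(count)]

# Hypothetical two-marker pool in complete LD (illustration only).
pool = [(0, 0), (1, 1)]
rng = random.Random(14)
with_ld = sample_founders_with_ld(pool, 200, rng)
without_ld = sample_founders_without_ld(pool, 200, rng)
```

In the LD-preserving sample the two markers always carry the same allele, whereas the independent sample also contains the (0, 1) and (1, 0) haplotypes that never occur in the pool.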

2.
Most multipoint linkage programs assume linkage equilibrium among the markers being studied. The assumption is appropriate for the study of sparsely spaced markers with intermarker distances exceeding a few centimorgans, because linkage equilibrium is expected over these intervals for almost all populations. However, with recent advancements in high-throughput genotyping technology, much denser markers are available, and linkage disequilibrium (LD) may exist among the markers. Applying linkage analyses that assume linkage equilibrium to dense markers may lead to bias. Here, we demonstrated that, when some or all of the parental genotypes are missing, assuming linkage equilibrium among tightly linked markers where strong LD exists can cause apparent oversharing of multipoint identity by descent (IBD) between sib pairs and false-positive evidence for multipoint model-free linkage analysis of affected sib pair data. LD can also mimic linkage between a disease locus and multiple tightly linked markers, thus causing false-positive evidence of linkage using parametric models, particularly when heterogeneity LOD score approaches are applied. Bias can be eliminated by inclusion of parental genotype data and can be reduced when additional unaffected siblings are included in the analysis.

3.
Current genome-wide linkage-mapping single-nucleotide polymorphism (SNP) panels with densities of 0.3 cM are likely to have increased intermarker linkage disequilibrium (LD) compared to 5-cM microsatellite panels. The resulting difference in haplotype frequencies versus that predicted may affect multipoint linkage analysis with ungenotyped founders; a common haplotype may be assumed to be rare, leading to inflation of identical-by-descent (IBD) allele-sharing estimates and evidence for linkage. Using data simulated for the Genetic Analysis Workshop 14, we assessed bias in allele-sharing measures and nonparametric linkage (NPL_all) and Kong and Cox LOD (KC-LOD) scores in a targeted analysis of regions with and without LD and with and without genes. Using over 100 replicates, we found that if founders were not genotyped, multipoint IBD estimates and delta parameters were modestly inflated and NPL_all and KC-LOD scores were biased upwards in the region with LD and no gene; rather than centering on the null, the mean NPL_all and KC-LOD scores were 0.51 ± 0.91 and 0.19 ± 0.38, respectively. Reduction of LD by dropping markers reduced this upward bias. These trends were not seen in the non-LD region with no gene. In regions with genes (with and without LD), a slight loss in power with dropping markers was suggested. These results indicate that LD should be considered in dense scans; removal of markers in LD may reduce false-positive results although information may also be lost. Methods to address LD in a high-throughput manner are needed for efficient, robust genomic scans with dense SNPs.

4.
Single-nucleotide polymorphisms (SNPs) are rapidly replacing microsatellites as the markers of choice for genetic linkage studies and many other studies of human pedigrees. Here, we describe an efficient approach for modeling linkage disequilibrium (LD) between markers during multipoint analysis of human pedigrees. Using a gene-counting algorithm suitable for pedigree data, our approach enables rapid estimation of allele and haplotype frequencies within clusters of tightly linked markers. In addition, with the use of a hidden Markov model, our approach allows for multipoint pedigree analysis with large numbers of SNP markers organized into clusters of markers in LD. Simulation results show that our approach resolves previously described biases in multipoint linkage analysis with SNPs that are in LD. An updated version of the freely available Merlin software package uses the approach described here to perform many common pedigree analyses, including haplotyping and haplotype frequency estimation, parametric and nonparametric multipoint linkage analysis of discrete traits, variance-components and regression-based analysis of quantitative traits, calculation of identity-by-descent or kinship coefficients, and case selection for follow-up association studies. To illustrate the possibilities, we examine a data set that provides evidence of linkage of psoriasis to chromosome 17.

5.
Linkage disequilibrium (LD) content was calculated for the Genetic Analysis Workshop 14 Affymetrix and Illumina single-nucleotide polymorphism (SNP) genome scans of the Collaborative Study on the Genetics of Alcoholism samples. Pair-wise LD was measured as both D' and r² on 505 pedigree founder individuals. The r² estimates were then used to correct the multipoint identity by descent matrix (MIBD) calculation to account for LD, and LOD scores on chromosomes 3 and 18 were calculated for COGA's ttdt3 electrophysiological trait using those MIBDs. Extensive LD was observed throughout both marker sets, and it was higher in Affymetrix's denser SNP map. However, SNP density did not solely account for Affymetrix's higher LD. MIBD estimation procedures assume linkage equilibrium to construct genotypes of non-genotyped pedigree founder individuals, and dense SNP genotyping maps are likely to contain moderate to high LD between markers. LOD score plots calculated after correction for LD followed the same general pattern as uncorrected ones. Since in our study almost half of the pedigree founders were genotyped, it is possible that LD had a minor impact on the LOD scores. Caution should probably be taken when using high density SNP maps when many non-genotyped founders are present in the study pedigrees.
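For readers unfamiliar with the two pairwise LD measures named in this abstract, the following minimal sketch computes D' and r² from the standard definitions. It assumes phased, biallelic haplotypes coded 0/1, which is a simplification: the study worked from founder genotypes, and the function name and input layout here are hypothetical.

```python
def pairwise_ld(haplotypes, i, j):
    """Return (D', r^2) between markers i and j of 0/1-coded phased haplotypes."""
    n = len(haplotypes)
    p_a = sum(h[i] for h in haplotypes) / n          # frequency of allele 1 at marker i
    p_b = sum(h[j] for h in haplotypes) / n          # frequency of allele 1 at marker j
    p_ab = sum(1 for h in haplotypes if h[i] == 1 and h[j] == 1) / n
    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    # D is scaled by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max, d * d / denom
```

Two markers that always carry the same allele give D' = 1 and r² = 1; four equally frequent haplotypes give 0 for both. D' can equal 1 while r² is well below 1 when allele frequencies differ, which is one reason the two measures are often reported together.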

6.
Model misspecification and multipoint linkage analysis.   Total citations: 9; self-citations: 0; citations by others: 9
Pairwise linkage analysis is robust to genetic model misspecification provided dominance is correctly specified, the primary effect being inflation of the recombination fraction. By contrast, we show that multipoint analysis under misspecified models is not robust when a putative disease locus is placed between close flanking markers, with potentially spuriously negative multipoint lod scores being produced. The problem is due to incorrect attribution of segregation of a disease allele and the consequent conclusion of (unlikely) double crossovers between flanking markers. As a possible solution, we propose the use of high disease allele frequencies, as this allows probabilistically for nonsegregation (through parental homozygosity or dual matings). We show analytically and through analysis of pedigree data simulated under a two-locus heterogeneity model that using a disease allele frequency of 0.05 in the dominant case and 0.25 in the recessive case is quite robust in producing positive multipoint lod scores with close flanking markers across a broad range of conditions including varying allele frequencies, epistasis, genetic heterogeneity and phenocopies.

7.
Most linkage programs assume linkage equilibrium among multiple linked markers. This assumption may lead to bias for tightly linked markers where strong linkage disequilibrium (LD) exists. We used simulated data from Genetic Analysis Workshop 14 to examine the possible effect of LD on multipoint linkage analysis. Single-nucleotide polymorphism packets from a non-disease-related region that was generated with LD were used for both model-free and parametric linkage analyses. Results showed that high LD among markers can induce false-positive evidence of linkage for affected sib-pair analysis when parental data are missing. Bias can be eliminated with parental data and can be reduced when additional markers not in LD are included in the analyses.

8.
Chronic lymphocytic leukemia (CLL) and other B-cell lymphoproliferative disorders (LPDs) show clear evidence of familial aggregation, but the inherited basis is largely unknown. To identify a susceptibility gene for CLL, we conducted a genomewide linkage analysis of 115 pedigrees, using a high-density single-nucleotide polymorphism (SNP) array containing 11,560 markers. Multipoint linkage analyses were undertaken using both nonparametric (model-free) and parametric (model-based) methods. Our results confirm that the presence of high linkage disequilibrium (LD) between SNP markers can lead to inflated nonparametric linkage (NPL) and LOD scores. After the removal of high-LD SNPs, we obtained a maximum NPL of 3.14 (P=.0008) on chromosome 11p11. The same genomic position also yielded the highest multipoint heterogeneity LOD (HLOD) score under both dominant (HLOD 1.95) and recessive (HLOD 2.78) models. In addition, four other chromosomal positions (5q22-23, 6p22, 10q25, and 14q32) displayed HLOD scores >1.15 (which corresponds to a nominal P value <.01). None of the regions coincided with areas of common chromosomal abnormalities frequently observed for CLL. These findings strengthen the argument for an inherited predisposition to CLL and related B-cell LPDs.

9.
We used a maximum-likelihood based multipoint linkage approach implemented in SOLAR to examine simultaneously linkage for three electrophysiological endophenotypes from the Collaborative Study of the Genetics of Alcoholism: TTTH1, TTTH2, and TTTH3. These endophenotypes have been identified as markers of alcohol dependence susceptibility. Data were from 905 individuals in 143 families. Measured covariates considered included sex, age at electrophysiology data collection, habitual smoking status, and the maximum number of drinks consumed in a 24-hour period. Comparisons were made among genome-wide univariate, bivariate, and trivariate linkage analyses using genotypes based on microsatellite markers supplied by the Center for Inherited Disease Research, and genotypes based on single-nucleotide polymorphism markers provided by Illumina. All LODs were corrected to a standard equivalent to 1 degree of freedom. Using the trivariate approach and the microsatellite-based genotypes, we estimated a maximum multipoint linkage signal of LOD = 2.66 on chromosome 7q at 157 cM. Analyses using the Illumina SNP genotypes produced similar results, yielding a maximum multipoint LOD of 2.95 on 7q at 174 cM. These regions of interest correspond to those identified in the univariate and bivariate linkage screens. Our results suggest that trivariate multipoint linkage analyses have utility in the further characterization of chromosomal regions potentially containing genes influencing the phenotypes being examined. Based on a comparison of the number of LOD scores achieving statistical significance, our results suggest that the microsatellite- and Illumina SNP-based genotypes have similar utility for detecting genomic regions of interest.

10.
11.
Genomewide linkage studies are tending toward the use of single-nucleotide polymorphisms (SNPs) as the markers of choice. However, linkage disequilibrium (LD) between tightly linked SNPs violates the fundamental assumption of linkage equilibrium (LE) between markers that underlies most multipoint calculation algorithms currently available, and this leads to inflated affected-relative-pair allele-sharing statistics when founders' multilocus genotypes are unknown. In this study, we investigate the impact that the degree of LD, marker allele frequency, and association type have on estimating the probabilities of sharing alleles identical by descent in multipoint calculations and hence on type I error rates of different sib-pair linkage approaches that assume LE. We show that marker-marker LD does not inflate type I error rates of affected sib pair (ASP) statistics in the whole parameter space, and that, in any case, discordant sib pairs (DSPs) can be used to control for marker-marker LD in ASPs. We advocate the ASP/DSP design with appropriate sib-pair statistics that test the difference in allele sharing between ASPs and DSPs.

12.
We report the results of a genomewide scan for age-related macular degeneration (AMD) in 158 multiplex families. AMD classification was based on fundus photography and was assigned a grade ranging from 1 (no disease) to 5 (exudative disease). Genotyping was performed by the National Heart, Lung, and Blood Institute Mammalian Genotyping Service at Marshfield (404 short tandem repeat markers). The sample included 158 families with two or more siblings with AMD, 490 affected individuals, 101 unaffected individuals, and 38 whose affection status was unknown. Relative pairs included 511 affected sibling, 28 avuncular, 53 cousin, 7 grandparent-grandchild, and 9 grand-avuncular pairs. Two-point parametric and multipoint parametric and nonparametric analyses were performed. Maximum two-point LOD scores of 1.0-2.0 were found for markers on chromosomes 1, 2, 8, 10, 14, 15, and 22. Multipoint analyses were consistent with the two-point results for chromosomes 1, 2, 8, 10, and 22 and provided evidence for additional linkage regions on chromosomes 3, 6, 8, 12, 16, and X. Our signals on chromosomes 1q, 6p, and 10q are consistent with some other previously published results. Significant linkage to AMD was found for one marker on chromosome 2, two adjacent markers on chromosome 3, two adjacent markers on chromosome 6, and seven contiguous markers on chromosome 8, with empirical P values of .00001. The consistency of many of the other signals across both two-point and multipoint, as well as parametric and nonparametric, analyses indicates several other regions worthy of follow-up.

13.
Genotype data from the Illumina Linkage III SNP panel (n = 4,720 SNPs) and the Affymetrix 10 k mapping array (n = 11,120 SNPs) were used to test the effects of linkage disequilibrium (LD) between SNPs in a linkage analysis in the Collaborative Study on the Genetics of Alcoholism pedigree collection (143 pedigrees; 1,614 individuals). The average r² between adjacent markers across the genetic map was 0.099 ± 0.003 in the Illumina III panel and 0.17 ± 0.003 in the Affymetrix 10 k array. In order to determine the effect of LD between marker loci in a nonparametric multipoint linkage analysis, markers in strong LD with another marker (r² > 0.40) were removed (n = 471 loci in the Illumina panel; n = 1,804 loci in the Affymetrix panel) and the linkage analysis results were compared to the results using the entire marker sets. In all analyses using the ALDX1 phenotype, 8 linkage regions on 5 chromosomes (2, 7, 10, 11, X) were detected (peak markers p < 0.01), and the Illumina panel detected an additional region on chromosome 6. Analysis of the same pedigree set and ALDX1 phenotype using short tandem repeat markers (STRs) resulted in 3 linkage regions on 3 chromosomes (peak markers p < 0.01). These results suggest that in this pedigree set, LD between loci with spacing similar to the SNP panels tested may not significantly affect the overall detection of linkage regions in a genome scan. Moreover, since the data quality and information content are greatly improved in the SNP panels over STR genotyping methods, new linkage regions may be identified due to higher information content and data quality in a dense SNP linkage panel.
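The marker-removal step in this abstract (dropping markers in strong LD with another marker, r² > 0.40) can be sketched as a simple greedy filter. This is one possible reading, not the authors' procedure: the abstract does not say which marker of a correlated pair was dropped, and the function name and the precomputed r² matrix are hypothetical.

```python
def prune_markers(r2, threshold=0.40):
    """Greedy LD pruning: keep a marker only if its r^2 with every
    previously kept marker is at or below the threshold.

    r2 is a symmetric matrix (list of lists) of pairwise r^2 values,
    indexed in map order. Returns the indices of the retained markers.
    """
    kept = []
    for i in range(len(r2)):
        if all(r2[i][j] <= threshold for j in kept):
            kept.append(i)
    return kept

# Hypothetical r^2 matrix for three markers (illustration only).
example_r2 = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.5],
    [0.1, 0.5, 1.0],
]
retained = prune_markers(example_r2)
```

Because the filter walks the map in order, the result depends on which marker is visited first; a different tie-breaking rule (for example, keeping the more informative marker of each pair) would retain a different set.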

14.
We have completed a genome scan of a 12-generation, 3,400-member pedigree with schizophrenia. Samples from 210 individuals were collected from the pedigree. We performed an "affecteds-only" genome-scan analysis using 43 members of the pedigree. The affected individuals included 29 patients with schizophrenia, 10 with schizoaffective disorders, and 4 with psychosis not otherwise specified. Two sets of white-European allele frequencies were used: one from a Swedish control population (46 unrelated individuals) and one from the pedigree (210 individuals). All analyses pointed to the same region: D6S264, located at 6q25.2, showed a maximum LOD score of 3.45 when allele frequencies in the Swedish control population were used, compared with a maximum LOD score of 2.59 when the pedigree's allele frequencies were used. We analyzed additional markers in the 6q25 region and found a maximum LOD score of 6.6 with marker D6S253, as well as a 6-cM haplotype (markers D6S253-D6S264) that segregated, after 12 generations, with the majority of the affected individuals. Multipoint analysis was performed with the markers in the 6q25 region, and a maximum LOD score of 7.7 was obtained. To evaluate the significance of the genome scan, we simulated the complete analysis under the assumption of no linkage. The results showed that a LOD score >2.2 should be considered as suggestive of linkage, whereas a LOD score >3.7 should be considered as significant. These results suggest that a common ancestral region was inherited by the affected individuals in this large pedigree.

15.
Familial hypobetalipoproteinemia (FHBL) is an apparently autosomal dominant disorder of lipid metabolism characterized by less than fifth percentile age- and sex-specific levels of apolipoprotein beta (apobeta) and low-density lipoprotein-cholesterol. In a minority of cases, FHBL is due to truncation-producing mutations in the apobeta gene on chromosome 2p23-24. Previously, we reported on a four-generation FHBL kindred in which we had ruled out linkage of the trait to the apobeta gene. To locate other loci containing genes for low apobeta levels in the kindred, a genomewide search was conducted. Regions on 3p21.1-22 with two-point LOD scores >1.5 were identified. Additional markers were typed in the region of these signals. Two-point LOD scores in the region of D3S2407 increased to 3.35 at θ = 0. GENEHUNTER confirmed this finding with a nonparametric multipoint LOD score of 7.5 (P=.0004). Additional model-free analyses were conducted with the square root of the apobeta level as the phenotype. Results from the Loki and SOLAR programs further confirmed linkage of FHBL to 3p21.1-22. Weaker linkage to a region near D19S916 was also indicated by Loki and SOLAR. Thus, a heretofore unidentified genetic susceptibility locus for FHBL may reside on chromosome 3.
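As background for the two-point LOD score reported at a recombination fraction of zero, the sketch below evaluates the textbook LOD formula for the simplest case: phase-known, fully informative meioses. That assumption is much stronger than anything in the study, which analyzed a real kindred with dedicated software, and the function name is hypothetical.

```python
import math

def two_point_lod(recombinants, meioses, theta):
    """LOD(theta) = log10( L(theta) / L(0.5) ) for phase-known meioses,
    where L(theta) = theta^k * (1 - theta)^(n - k),
    k = recombinants and n = meioses."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= recombinants <= meioses:
        raise ValueError("recombinants must lie in [0, meioses]")
    if theta == 0.0:
        # A single observed recombinant rules out theta = 0 entirely.
        if recombinants > 0:
            return float("-inf")
        return meioses * math.log10(2)
    non_recombinants = meioses - recombinants
    return (recombinants * math.log10(theta)
            + non_recombinants * math.log10(1 - theta)
            + meioses * math.log10(2))
```

Under this simplified model each non-recombinant meiosis contributes log10(2), about 0.30, to the LOD at a recombination fraction of zero, so ten such meioses give roughly 3.01.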

16.
In complex disease studies, it is crucial to perform multipoint linkage analysis with many markers and to use robust nonparametric methods that take account of all pedigree information. Currently available methods fall short in both regards. In this paper, we describe how to extract complete multipoint inheritance information from general pedigrees of moderate size. This information is captured in the multipoint inheritance distribution, which provides a framework for a unified approach to both parametric and nonparametric methods of linkage analysis. Specifically, the approach includes the following: (1) Rapid exact computation of multipoint LOD scores involving dozens of highly polymorphic markers, even in the presence of loops and missing data. (2) Non-parametric linkage (NPL) analysis, a powerful new approach to pedigree analysis. We show that NPL is robust to uncertainty about mode of inheritance, is much more powerful than commonly used nonparametric methods, and loses little power relative to parametric linkage analysis. NPL thus appears to be the method of choice for pedigree studies of complex traits. (3) Information-content mapping, which measures the fraction of the total inheritance information extracted by the available marker data and points out the regions in which typing additional markers is most useful. (4) Maximum-likelihood reconstruction of many-marker haplotypes, even in pedigrees with missing data. We have implemented NPL analysis, LOD-score computation, information-content mapping, and haplotype reconstruction in a new computer package, GENEHUNTER. The package allows efficient multipoint analysis of pedigree data to be performed rapidly in a single user-friendly environment.

17.
Mild/moderate (common) myopia is a very common disorder, with both genetic and environmental influences. The environmental factors are related to near work and can be measured. There are no known genetic loci for common myopia. Our goal is to find evidence for a myopia susceptibility gene causing common myopia. Cycloplegic and manifest refraction were performed on 44 large American families of Ashkenazi Jewish descent, each with at least two affected siblings. Individuals with at least -1.00 diopter or lower in each meridian of both eyes were classified as myopic. Microsatellite genotyping with 387 markers was performed by the Center for Inherited Disease Research. Linkage analyses were conducted with parametric and nonparametric methods by use of 12 different penetrance models. The family-based association test was used for an association scan. A maximum multipoint parametric heterogeneity LOD (HLOD) score of 3.54 was observed at marker D22S685, and nonparametric linkage analyses gave consistent results, with a P value of .0002 at this marker. The parametric multipoint HLOD scores exceeded 3.0 for a 4-cM interval, and significant evidence of genetic heterogeneity was observed. This genomewide scan is the first step toward identifying a gene on chromosome 22 with an influence on common myopia. At present, we are following up our linkage results on chromosome 22 with a dense map of >1,500 single-nucleotide-polymorphism markers for fine mapping and association analyses. Identification of a susceptibility locus in this region may eventually lead to a better understanding of gene-environment interactions in the causation of this complex trait.

18.
Advances in dinucleotide-based genetic maps open possibilities for large scale genotyping at high resolution. The current rate-limiting steps in use of these dense maps are data interpretation (allele definition), data entry, and statistical calculations. We have recently reported automated allele identification methods. Here we show that a 10-cM framework map of the human X chromosome can be analyzed on two lanes of an automated sequencer per individual (10–12 loci per lane). We use this map and analysis strategy to generate allele data for an X-linked recessive spastic paraplegia family with a known PLP mutation. We analyzed 198 genotypes in a single gel and used the data to test three methods of data analysis: manual meiotic breakpoint mapping, automated concordance analysis, and whole chromosome multipoint linkage analysis. All methods pinpointed the correct location of the gene. We propose that multipoint exclusion mapping may permit valid inflation of LOD scores using the equation max LOD − (next best LOD).

19.
OBJECTIVES: Linkage disequilibrium (LD) between closely spaced SNPs can be accommodated in linkage analysis by specifying the multi-SNP haplotype frequencies, if known. Phased haplotypes in candidate regions can provide gold standard haplotype frequency estimates, and may be of inherent interest as markers. We evaluated the effects of different methods of haplotype frequency estimation, and the use of marker phase information, on linkage analysis of a multi-SNP cluster in a candidate region for Alzheimer's disease (AD). METHODS: We performed parametric linkage analysis of a five-SNP cluster in extended pedigrees to compare the use of: (1) haplotype frequencies estimated by molecular phase determination, maximum likelihood estimation, or by assuming linkage equilibrium (LE); (2) AD families or controls as the frequency source; and (3) unphased or molecularly phased SNP data. RESULTS: There was moderate to strong pairwise LD among the five SNPs. Falsely assuming LE substantially inflated the LOD score, but the method of haplotype frequency estimation and particular sample used made little difference provided that LD was accommodated. Use of phased haplotypes produced a modest increase in the LOD score over unphased SNPs. CONCLUSIONS: Ignoring LD between markers can lead to substantially inflated evidence for linkage in LOD score analysis of extended pedigrees with missing data. Use of marker phase information in linkage analysis may be important in disease studies where the costs of family recruitment and phenotyping greatly exceed the costs of phase determination.

20.
Prostate cancer is one of the most common cancers among men and has long been recognized to occur in familial clusters. Brothers and sons of affected men have a 2-3-fold increased risk of developing prostate cancer. However, identification of genetic susceptibility loci for prostate cancer has been extremely difficult. Although the suggestion of linkage has been reported for many chromosomes, the most promising regions have been difficult to replicate. In this study, we compare genome linkage scans using microsatellites with those using single-nucleotide polymorphisms (SNPs), performed in 467 men with prostate cancer from 167 families. For the microsatellites, the ABI Prism Linkage Mapping Set version 2, with 402 microsatellite markers, was used, and, for the SNPs, the Early Access Affymetrix Mapping 10K array was used. Our results show that the presence of linkage disequilibrium (LD) among SNPs can lead to inflated LOD scores, and this seems to be an artifact due to the assumption of linkage equilibrium that is required by the current genetic-linkage software. After excluding SNPs with high LD, we found a number of new LOD-score peaks with values of at least 2.0 that were not found by the microsatellite markers: chromosome 8, with a maximum model-free LOD score of 2.2; chromosome 2, with a LOD score of 2.1; chromosome 6, with a LOD score of 4.2; and chromosome 12, with a LOD score of 3.9. The LOD scores for chromosomes 6 and 12 are difficult to interpret, because they occurred only at the extreme ends of the chromosomes. The greatest gain provided by the SNP markers was a large increase in the linkage information content, with an average information content of 61% for the SNPs, versus an average of 41% for the microsatellite markers. The strengths and weaknesses of microsatellite versus SNP markers are illustrated by the results of our genome linkage scans.
