Similar articles
1.
We present the results of extensive simulations that emulate the development and distribution of linkage disequilibrium (LD) between single-nucleotide polymorphisms (SNPs) and a gene locus that is phenotypically stratified into two classes (disease phenotype and wild-type phenotype). Our approach, based on coalescence theory, allows an explicit modeling of the demographic history of the population without conditioning on the age of the mutation, and serves as an efficient tool to carry out simulations. More specifically, we compare the influence that a constant population size or an exponentially growing population has on the amount of LD. These results indicate that attempts to locate single disease genes are most likely to succeed in small and constant populations. On the other hand, if we consider an exponentially growing population that started to expand from an initially constant population of reasonable size, then our simulations indicate a lower success rate. The power to detect association is enhanced if haplotypes constructed from several SNPs are used as markers. The versatility of the coalescence approach also allows the analysis of other relevant factors that influence the chances that a disease gene will be located. We show that several alleles leading to the same disease have no substantial influence on the amount of LD, as long as the differences between the disease-causing alleles are confined to the same region of the gene locus and as long as each allele occurs at an appreciable frequency. Our simulations indicate that mapping of less-frequent diseases is more likely to be successful. Moreover, we show that successful attempts to map complex diseases depend crucially on the phenotype-genotype correlations of all alleles at the disease locus. An analysis of lipoprotein lipase data indicates that our simulations capture the major features of LD occurring in biological data.

2.
In this report, we describe a simple correction for multiple testing of single-nucleotide polymorphisms (SNPs) in linkage disequilibrium (LD) with each other, on the basis of the spectral decomposition (SpD) of matrices of pairwise LD between SNPs. This method provides a useful alternative to more computationally intensive permutation tests. A user-friendly interface (SNPSpD) for performing this correction is available online (http://genepi.qimr.edu.au/general/daleN/SNPSpD/). Additionally, output from SNPSpD includes eigenvalues, principal-component coefficients, and factor "loadings" after varimax rotation, enabling the selection of a subset of SNPs that optimize the information in a genomic region.
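The abstract above does not spell out the correction itself, so the following is only a minimal sketch of one commonly cited eigenvalue-variance form, M_eff = 1 + (M - 1)(1 - Var(λ)/M), where the λ are the eigenvalues of the M x M pairwise LD correlation matrix. That formula, the function names, and the Šidák-style threshold are assumptions made for illustration and are not taken from the abstract. The sketch avoids an explicit decomposition by using the fact that, for a symmetric correlation matrix, the eigenvalues sum to M and their squares sum to the sum of all squared entries.

```python
def effective_number_of_tests(ld):
    """Estimate the effective number of independent tests.

    `ld` is an M x M symmetric correlation matrix (list of lists) with
    ones on the diagonal. Formula assumed for illustration:
    M_eff = 1 + (M - 1) * (1 - Var(lambda) / M).
    """
    m = len(ld)
    if m == 0:
        raise ValueError("LD matrix must not be empty")
    if m == 1:
        return 1.0
    # Sum of squared eigenvalues equals the sum of all squared entries
    # of a symmetric matrix; the mean eigenvalue is 1 (trace / M).
    sum_sq = sum(value * value for row in ld for value in row)
    var_lambda = (sum_sq - m) / (m - 1)  # sample variance of eigenvalues
    return 1 + (m - 1) * (1 - var_lambda / m)


def corrected_alpha(alpha, ld):
    """Sidak-style per-test threshold using the effective number of tests."""
    m_eff = effective_number_of_tests(ld)
    return 1 - (1 - alpha) ** (1 / m_eff)


independent = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
redundant = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
print(effective_number_of_tests(independent))
print(effective_number_of_tests(redundant))
```

Under this assumed formula, fully independent SNPs count as M tests and perfectly correlated SNPs collapse to a single test, which is the behaviour one would expect from any LD-aware correction.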

3.
Common genetic polymorphisms may explain a portion of the heritable risk for common diseases. Within candidate genes, the number of common polymorphisms is finite, but direct assay of all existing common polymorphisms is inefficient, because genotypes at many of these sites are strongly correlated. Thus, it is not necessary to assay all common variants if the patterns of allelic association between common variants can be described. We have developed an algorithm to select the maximally informative set of common single-nucleotide polymorphisms (tagSNPs) to assay in candidate-gene association studies, such that all known common polymorphisms either are directly assayed or exceed a threshold level of association with a tagSNP. The algorithm is based on the r² linkage disequilibrium (LD) statistic, because r² is directly related to statistical power to detect disease associations with unassayed sites. We show that, at a relatively stringent r² threshold (r² > 0.8), the LD-selected tagSNPs resolve >80% of all haplotypes across a set of 100 candidate genes, regardless of recombination, and tag specific haplotypes and clades of related haplotypes in nonrecombinant regions. Thus, if the patterns of common variation are described for a candidate gene, analysis of the tagSNP set can comprehensively interrogate for main effects from common functional variation. We demonstrate that, although common variation tends to be shared between populations, tagSNPs should be selected separately for populations with different ancestries.
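As a rough illustration of threshold-based tagSNP selection, the sketch below uses a simple greedy rule: repeatedly pick the untagged SNP that exceeds the r² threshold with the most other untagged SNPs, then place those SNPs in its bin. This is a simplified stand-in and not necessarily the published algorithm; the function name `select_tag_snps` and the toy matrix are hypothetical.

```python
def select_tag_snps(r2, threshold=0.8):
    """Greedy tagSNP selection from a symmetric pairwise r^2 matrix.

    Returns (tags, bins), where `tags` lists the chosen SNP indices and
    `bins` maps each tag to the sorted indices it covers (itself included).
    """
    untagged = set(range(len(r2)))
    tags = []
    bins = {}
    while untagged:
        # Choose the SNP covering the most untagged SNPs; iterating over
        # the sorted indices makes ties resolve to the lowest index.
        best = max(
            sorted(untagged),
            key=lambda i: sum(
                1 for j in untagged if j != i and r2[i][j] > threshold
            ),
        )
        covered = {j for j in untagged if j == best or r2[best][j] > threshold}
        tags.append(best)
        bins[best] = sorted(covered)
        untagged -= covered
    return tags, bins


# Toy example: SNPs 0-2 are in strong mutual LD, SNP 3 stands alone.
toy_r2 = [
    [1.0, 0.9, 0.9, 0.1],
    [0.9, 1.0, 0.9, 0.1],
    [0.9, 0.9, 1.0, 0.1],
    [0.1, 0.1, 0.1, 1.0],
]
tags, bins = select_tag_snps(toy_r2)
print(tags, bins)
```

In the toy matrix, one tag is enough for the three correlated SNPs, while the uncorrelated SNP has to be assayed directly.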

4.
Clarke GM, Cardon LR. Genetics 2005, 171(4): 2085-2095
Parent-offspring trios are widely collected for disease gene-mapping studies and are being extensively genotyped as part of the International HapMap Project. With dense maps of markers on trios, the effects of LD and linkage can be separated, allowing estimation of recombination rates in a model-free setting. Here we define a model-free multipoint method on the basis of dense sequence polymorphism data from parent-offspring trios to estimate intermarker recombination rates. We use simulations to show that this method has up to 92% power to detect recombination hotspots of intensity 25 times background over a region of size 10 kb typed at density 1 marker per 2.5 kb and almost 100% power to detect large hotspots of intensity >125 times background over regions of size 10 kb typed with just 1 marker per 5 kb (alpha = 0.05). We found strong agreement at megabase scales between estimates from our method applied to HapMap trio data and estimates from the genetic map. At finer scales, using Centre d'Etude du Polymorphisme Humain (CEPH) pedigree data across a 10-Mb region of chromosome 20, a comparison of population recombination rate estimates obtained from our method with estimates obtained using a coalescent-based approximate-likelihood method implemented in PHASE 2.0 shows detection of the same coldspots and most hotspots; the Spearman rank correlation between the estimates from our method and those from PHASE is 0.58 (p < 2.2 × 10⁻¹⁶).

5.

Background  

Genome-wide association studies with single nucleotide polymorphisms (SNPs) show great promise to identify genetic determinants of complex human traits. In current analyses, genotype calling and imputation of missing genotypes are usually considered as two separate tasks. The genotypes of SNPs are first determined one at a time from allele signal intensities. Then the missing genotypes, i.e., no-calls caused by imperfectly separated signal clouds, are imputed based on the linkage disequilibrium (LD) between multiple SNPs. Although many statistical methods have been developed to improve either genotype calling or imputation of missing genotypes, treating the two steps independently can lead to loss of genetic information.

6.
Single-nucleotide polymorphisms (SNPs) are commonly used to study genetics for common diseases and predict pharmacological response. The selection of likely informative SNPs in association studies depends on their allele frequencies and on the linkage disequilibrium (LD) between SNPs, both of which may show interethnic differences. Among three populations consisting of 207 Chinese, 858 French, and 395 Spanish, we compared the allele frequency distributions of 64 intragenic SNPs of 35 candidate genes for cardiovascular diseases. Twenty-eight of these SNPs from 12 genes were also examined for intragenic LD. About 20% of SNPs were restricted to Europeans, being monomorphic in Chinese, among them mostly nonsynonymous coding SNPs and noncoding SNPs. Only 1.6% of SNPs were specific to Chinese, commensurate with the detection of these SNPs almost exclusively in Caucasians. Similarly, these SNPs were more often rare (<0.1 minor allele frequency) in Chinese (44.3%) than in Europeans (31.1%). The variant allele frequencies and intermarker LDs in terms of D' and Δ² were highly correlated between French and Spanish populations (r = 0.98-0.99, p < 0.001). However, only moderate correlations of allele frequencies and D' were found between the Chinese and the European populations (r = 0.7 and 0.3, respectively) despite a high correlation of Δ² values (r = 0.8). These results suggest that ethnic considerations are important in the selection of SNPs for association studies of candidate genes, as this may affect the power of the study as well as the likelihood of asking relevant questions and getting medically meaningful answers.
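For readers unfamiliar with the two LD measures compared above, the sketch below computes D, D' and the squared allelic correlation for a pair of biallelic SNPs from phased haplotype counts. It assumes that the Delta-squared statistic in the abstract is the quantity more commonly written r²; the function name and the example counts are hypothetical and are not data from the study.

```python
def pairwise_ld(n11, n12, n21, n22):
    """Compute (D, D', r^2) for two biallelic loci.

    n_xy is the count of haplotypes carrying allele x at the first locus
    and allele y at the second locus (alleles labelled 1 and 2).
    """
    total = n11 + n12 + n21 + n22
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p1 = (n11 + n12) / total  # frequency of allele 1 at the first locus
    q1 = (n11 + n21) / total  # frequency of allele 1 at the second locus
    denom = p1 * (1 - p1) * q1 * (1 - q1)
    if denom == 0:
        raise ValueError("LD is undefined when a locus is monomorphic")
    d = n11 / total - p1 * q1
    # D is scaled by the largest magnitude it could take given the
    # allele frequencies; the bound depends on the sign of D.
    if d >= 0:
        d_max = min(p1 * (1 - q1), (1 - p1) * q1)
    else:
        d_max = min(p1 * q1, (1 - p1) * (1 - q1))
    d_prime = abs(d) / d_max if d != 0 else 0.0
    r2 = d * d / denom
    return d, d_prime, r2


# Hypothetical counts: one of the four haplotypes is absent.
d, d_prime, r2 = pairwise_ld(40, 10, 0, 50)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```

With these counts D' reaches its maximum because one haplotype is missing, while r² stays well below 1 because the allele frequencies differ between the two loci. This is one reason the two measures can correlate differently across populations.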

7.
Genetic diversity in modern sunflower (Helianthus annuus L.) cultivars (elite oilseed inbred lines) has been shaped by domestication and breeding bottlenecks and wild and exotic allele introgression, the former narrowing and the latter broadening genetic diversity. To assess single nucleotide polymorphism (SNP) frequencies, nucleotide diversity, and linkage disequilibrium (LD) in modern cultivars, alleles were resequenced from 81 genic loci distributed throughout the sunflower genome. DNA polymorphisms were abundant; 1078 SNPs (1/45.7 bp) and 178 insertions-deletions (INDELs) (1/277.0 bp) were identified in 49.4 kbp of DNA/genotype. SNPs were twofold more frequent in noncoding (1/32.1 bp) than coding (1/62.8 bp) sequences. Nucleotide diversity was only slightly lower in inbred lines (θ = 0.0094) than wild populations (θ = 0.0128). Mean haplotype diversity was 0.74. When extrapolated across the genome (~3500 Mbp), sunflower was predicted to harbor at least 76.4 million common SNPs among modern cultivar alleles. LD decayed more slowly in inbred lines than wild populations (mean LD declined to 0.32 by 5.5 kbp in the former, the maximum physical distance surveyed), a difference attributed to domestication and breeding bottlenecks. SNP frequencies and LD decay are sufficient in modern sunflower cultivars for very high-density genetic mapping and high-resolution association mapping.

8.
One approach to identify potentially important segments of the human genome is to search for DNA regions with nonrandom patterns of human sequence variation. Previous studies have investigated these patterns primarily in and around candidate gene regions. Here, we determined patterns of DNA sequence variation in 2.5 Mb of finished sequence from five regions on human chromosome 21. By sequencing 13 individual chromosomes, we identified 1460 single-nucleotide polymorphisms (SNPs) and obtained unambiguous haplotypes for all chromosomes. For all five chromosomal regions, we observed segments with high linkage disequilibrium (LD), extending from 1.7 to >81 kb (average 21.7 kb), disrupted by segments of similar or larger size with no significant LD between SNPs. At least 25% of the contig sequences consisted of segments with high LD between SNPs. Each of these segments was characterized by a restricted number of observed haplotypes, with the major haplotype found in over 60% of all chromosomes. In contrast, the interspersed segments with low LD showed significantly more haplotype patterns. The position and extent of the segments of high LD with restricted haplotype variability did not coincide with the location of coding sequences. Our results indicate that LD and haplotype patterns need to be investigated with closely spaced SNPs throughout the human genome, independent of the location of coding sequences, to reliably identify regions with significant LD useful for disease association studies.

9.
Li N, Stephens M. Genetics 2003, 165(4): 2213-2233
We introduce a new statistical model for patterns of linkage disequilibrium (LD) among multiple SNPs in a population sample. The model overcomes limitations of existing approaches to understanding, summarizing, and interpreting LD by (i) relating patterns of LD directly to the underlying recombination process; (ii) considering all loci simultaneously, rather than pairwise; (iii) avoiding the assumption that LD necessarily has a "block-like" structure; and (iv) being computationally tractable for huge genomic regions (up to complete chromosomes). We examine in detail one natural application of the model: estimation of underlying recombination rates from population data. Using simulation, we show that in the case where recombination is assumed constant across the region of interest, recombination rate estimates based on our model are competitive with the very best of current available methods. More importantly, we demonstrate, on real and simulated data, the potential of the model to help identify and quantify fine-scale variation in recombination rate from population data. We also outline how the model could be useful in other contexts, such as in the development of more efficient haplotype-based methods for LD mapping.

10.
We have developed an online program, WCLUSTAG, for tag SNP selection that allows the user to specify variable tagging thresholds for different SNPs. Tag SNPs are selected such that a SNP with user-specified tagging threshold C will have a minimum R² of C with at least one tag SNP. This flexible feature is useful for researchers who wish to prioritize genomic regions or SNPs in an association study. AVAILABILITY: The online WCLUSTAG program is available at http://bioinfo.hku.hk/wclustag/
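The tagging condition stated above can be checked mechanically. The sketch below is not WCLUSTAG's code or interface; it is a hypothetical helper that, given a pairwise R² matrix, one threshold C per SNP, and a candidate tag set, reports which SNPs fail to reach their own threshold with any tag.

```python
def untagged_snps(r2, thresholds, tags):
    """Return indices of SNPs whose per-SNP threshold is not satisfied.

    r2         : symmetric matrix of pairwise R^2 values (list of lists)
    thresholds : thresholds[i] is the user-specified C for SNP i
    tags       : collection of indices chosen as tag SNPs
    """
    tag_set = set(tags)
    missing = []
    for i, c in enumerate(thresholds):
        if i in tag_set:
            continue  # a tag SNP is assayed directly
        if not any(r2[i][t] >= c for t in tag_set):
            missing.append(i)
    return missing


# Hypothetical three-SNP region with SNP 0 as the only tag.
r2 = [
    [1.00, 0.85, 0.60],
    [0.85, 1.00, 0.40],
    [0.60, 0.40, 1.00],
]
print(untagged_snps(r2, [0.8, 0.8, 0.5], [0]))
print(untagged_snps(r2, [0.8, 0.9, 0.5], [0]))
```

Raising the threshold for a single high-priority SNP, as in the second call, is enough to make the same tag set insufficient, which is the kind of prioritization the variable thresholds are meant to allow.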

11.
DNA polymorphisms and linkage disequilibrium in the angiotensinogen gene (total citations: 4; self-citations: 0; citations by others: 4)
A number of recent studies have implicated the angiotensinogen gene in the aetiology of essential hypertension in Caucasian, Japanese and African Caribbean subjects. We have genotyped 153 healthy white Caucasian subjects at a dinucleotide repeat polymorphism and seven diallelic sites in the coding or flanking regions of the angiotensinogen gene, including one polymorphism not previously studied. We have also documented patterns of linkage disequilibrium between polymorphisms. There is evidence of variation in the frequency of several mutations when compared with published results from other Caucasian control populations, possibly due to cryptic ethnic differences between these groups. This should be considered in the design and interpretation of studies of the angiotensinogen gene. Received: 10 November 1995 / Revised: 25 March 1996

12.
Effectiveness of marker-assisted selection (MAS) and quantitative trait locus (QTL) mapping using population-wide linkage disequilibrium (LD) between markers and QTLs depends on the extent of LD and how it declines with distance between markers and QTLs in a population. Marker-QTL LD can be predicted from LD between markers. Our previous work evaluated LD measures between multi-allelic markers as predictors of usable LD of multi-allelic markers with QTLs. Since single nucleotide polymorphisms (SNPs) are the current marker of choice for high-density genotyping and LD-mapping of QTLs, the objective of this study was to use LD between multi-allelic markers to predict LD among biallelic SNPs or between SNPs and QTLs. Observable LD between multi-allelic markers was evaluated using nine measures. These included two pooled and standardized measures of LD between pairs of alleles at two markers based on Lewontin's LD measure, two pooled measures of squared correlations between alleles, one standardized measure using Hardy-Weinberg heterozygosities, and four measures based on the chi-square statistic for testing for association between alleles at two loci. The standardized chi-square measure that best predicted usable LD between multi-allelic markers and QTLs, based on our previous work, overestimated usable SNP-SNP or SNP-QTL LD. Instead, three other measures were found to be good predictors of usable SNP-SNP or SNP-QTL LD when LD is generated by drift. Therefore, the LD measure between multi-allelic markers that is best for predicting usable LD in a population depends on the type of markers (i.e. multi-allelic or biallelic) that will eventually be used for QTL mapping or MAS.

13.
Single-nucleotide polymorphisms (SNPs) may be extremely important for deciphering the impact of genetic variation on complex human diseases. The ultimate value of SNPs for linkage and association mapping studies depends in part on the distribution of SNP allele frequencies and intermarker linkage disequilibrium (LD) across populations. Limited information is available about these distributions on a genomewide scale, particularly for LD. Using 114 SNPs from 33 genes, we compared these distributions in five American populations (727 individuals) of African, European, Chinese, Hispanic, and Japanese descent. The allele frequencies were highly correlated across populations but differed by >20% for at least one pair of populations in 35% of SNPs. The correlation in LD was high for some pairs of populations but not for others (e.g., Chinese American or Japanese American vs. any other population). Regardless of population, average minor-allele frequencies were significantly higher for SNPs in noncoding regions (20%-25%) than for SNPs in coding regions (12%-16%). Interestingly, we found that intermarker LD may be strongest with pairs of SNPs in which both markers are nonconservative substitutions, compared to pairs of SNPs where at least one marker is a conservative substitution. These results suggest that population differences and marker location within the gene may be important factors in the selection of SNPs for use in the study of complex disease with linkage or association mapping methods.

14.
The positional cloning of genes underlying common complex diseases relies on the identification of linkage disequilibrium (LD) between genetic markers and disease. We have examined 127 polymorphisms in three genomic regions in a sample of 575 chromosomes from unrelated individuals of British ancestry. To establish phase, 800 individuals were genotyped in 160 families. The fine structure of LD was found to be highly irregular. Forty-five percent of the variation in disequilibrium measures could be explained by physical distance. Additional factors, such as allele frequency, type of polymorphism, and genomic location, explained <5% of the variation. Nevertheless, disequilibrium was occasionally detectable at 500 kb and was present for over one-half of marker pairs separated by <50 kb. Although these findings are encouraging for the prospects of a genomewide LD map, they suggest caution in interpreting localization due to allelic association.

15.
16.
Wang S, Huang S, Liu N, Chen L, Oh C, Zhao H. BMC Genetics 2005, 6(Z1): S28
There is currently a great interest in using single-nucleotide polymorphisms (SNPs) in genetic linkage and association studies because of the abundance of SNPs as well as the availability of high-throughput genotyping technologies. In this study, we compared the performance of whole-genome scans using SNPs with microsatellites on 143 pedigrees from the Collaborative Studies on Genetics of Alcoholism provided by Genetic Analysis Workshop 14. A total of 315 microsatellites and 10,081 SNPs from Affymetrix on 22 autosomal chromosomes were used in our analyses. We found that the results from the two scans had good overall concordance. One region on chromosome 2 and two regions on chromosome 7 showed significant linkage signals (i.e., NPL ≥ 2) for alcoholism from both the SNP and microsatellite scans. The different results observed between the two scans may be explained by the difference observed in information content between the SNPs and the microsatellites.

17.
The aim of the present analysis is to combine evidence for association from the two most commonly used designs in genetic association analysis, the case-control design and the transmission disequilibrium test (TDT) design. The cases here are affected offspring from nuclear families and are used in both the case-control and TDT designs. As a result, inference from these designs is not independent. We applied a simple logistic regression method for combining evidence for association from case-control and TDT designs to single-nucleotide polymorphism data purchased on a region on chromosome 3, replicate 1 of the Aipotu population. Combining the evidence from the case-control and TDT designs yielded a 5-10% reduction in the standard errors of the relative risk estimates. The authors did not know the results before the analyses were conducted.

18.
We measured linkage disequilibrium in mostly noncoding regions of Cryptomeria japonica, a conifer belonging to Cupressaceae. Linkage disequilibrium was extensive and did not decay even at a distance of 100 kb. The average estimate of the population recombination rate per base pair was 1.55 × 10⁻⁵ and was <1/70 of that in the coding regions. We discuss the impact of low recombination rates in a large part of the genome on association studies.

19.

Background  

The human genome contains millions of common single nucleotide polymorphisms (SNPs), and these SNPs play an important role in understanding the association between genetic variations and human diseases. Many SNPs show correlated genotypes, or linkage disequilibrium (LD), so it is not necessary to genotype all SNPs for an association study. Many algorithms have been developed to find a small subset of SNPs, called tag SNPs, that are sufficient to infer all the other SNPs. Algorithms based on the r² LD statistic have gained popularity because r² is directly related to statistical power to detect disease associations. Most existing r²-based algorithms use pairwise LD. Recent studies show that multi-marker LD can help further reduce the number of tag SNPs. However, existing tag SNP selection algorithms based on multi-marker LD are both time-consuming and memory-consuming. They cannot work on chromosomes containing more than 100 k SNPs using length-3 tagging rules.

20.
CTLA4 and CD28 are important regulators of T lymphocyte activation. Gene region 2q33 carrying genes for both CTLA4 and CD28 has been shown to be linked to many autoimmune diseases. Disease associations with particular CTLA4 gene polymorphisms have been reported. Recently, first lines of evidence emerged for functional effects of CTLA4 gene polymorphisms. Two independent studies reported a reduced inhibitory function of CTLA4 in individuals with certain CTLA4 genotypes: those with a high number of microsatellite repeats in one study and those with allele +49*G in exon 1 in the other one. We analyzed the strength of linkage disequilibrium between the three known CTLA4 polymorphisms among 577 independent chromosomes. Our results show that the polymorphisms previously suggested to be the functional risk factors nearly always occur together in a very frequent haplotype. Due to this strong linkage disequilibrium, we conclude that the previous reports studying merely a single polymorphism could not distinguish which variation actually caused the functional difference. Hence, either mutagenesis approaches or studies with data on all linked polymorphisms are still needed to determine the genuine functional risk polymorphism in this gene region.
