Similar articles
 Found 20 similar articles (search time: 19 ms)
1.
The genetic etiology of most cancers remains largely unclear and it has been hypothesised that common genetic variants with modest effects on disease susceptibility cause the bulk of this unexplained risk. Case-control association studies are considered the most effective strategy to identify these low-penetrance genes. While such studies have traditionally focused on putative functional single nucleotide polymorphisms (SNPs) in candidate genes, a more comprehensive approach can now be taken as a result of two recent developments: the mapping of the human genome, including the identification of almost ten million SNPs; and the development of high-throughput genotyping technologies that enable hundreds of thousands of SNPs to be genotyped in a single reaction, in multiple subjects and at an affordable cost. All common genomic variation can be captured by genotyping SNPs in gene-, pathway- or genome-wide-based strategies and these are now being applied to many diseases, including cancer. We present an outline of each of these approaches, including recent published examples, and discuss a number of challenges that remain to be addressed.

2.
Although intensive studies have attempted to elucidate the genetic background of bronchial asthma (BA), one of the most common of the chronic inflammatory diseases in human populations, genetic factors associated with its pathogenesis are still not well understood. We surveyed 29 possible candidate genes for this disease for single nucleotide polymorphisms (SNPs), the most frequent type of genetic variation, in genomic DNAs from Japanese BA patients. We identified 33 SNPs, only four of which had been reported previously, among 14 of those genes. We also performed association studies using 585 BA patients and 343 normal controls for these SNPs. Of the 33 SNPs tested, 32 revealed no positive association with BA, but a T924C polymorphism in the thromboxane A2 receptor gene showed significant association (χ² = 4.71, P = 0.030), especially with respect to adult patients (χ² = 6.20, P = 0.013). Our results suggest that variants of the TBXA2R gene or some nearby gene(s) may play an important role in the pathogenesis of adult BA.
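As a quick check on the statistics quoted in this abstract, the following sketch converts a chi-square statistic into a P value. It assumes a test with 1 degree of freedom, which the abstract does not state. It also includes the textbook Pearson chi-square for a 2x2 table; the function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def chi2_p_value_1df(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom.

    For 1 df, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


def chi2_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


# Statistics quoted in the abstract, evaluated under the 1-df assumption.
p_all_patients = chi2_p_value_1df(4.71)
p_adult_patients = chi2_p_value_1df(6.20)

# Hypothetical allele counts (rows: cases, controls; columns: allele C, allele T).
example_chi2 = chi2_2x2(10, 20, 20, 10)
```

Under the 1-df assumption, the two quoted statistics round to the P values given in the abstract (0.030 and 0.013).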

3.
In modern genetic epidemiology studies, the association between the disease and a genomic region, such as a candidate gene, is often investigated using multiple SNPs. We propose a multilocus test of genetic association that can account for genetic effects that might be modified by variants in other genes or by environmental factors. We consider use of the venerable and parsimonious Tukey's 1-degree-of-freedom model of interaction, which is natural when individual SNPs within a gene are associated with disease through a common biological mechanism; in contrast, many standard regression models are designed as if each SNP has unique functional significance. On the basis of Tukey's model, we propose a novel but computationally simple generalized test of association that can simultaneously capture both the main effects of the variants within a genomic region and their interactions with the variants in another region or with an environmental exposure. We compared performance of our method with that of two standard tests of association, one ignoring gene-gene/gene-environment interactions and the other based on a saturated model of interactions. We demonstrate major power advantages of our method both in analysis of data from a case-control study of the association between colorectal adenoma and DNA variants in the NAT2 genomic region, which are well known to be related to a common biological phenotype, and under different models of gene-gene interactions with use of simulated data.

4.
A great majority of genetic markers discovered in recent genome-wide association studies have small effect sizes, and they explain only a small fraction of the genetic contribution to the diseases. How many more variants can we expect to discover and what study sizes are needed? We derive the connection between the cumulative risk of the SNP variants, the latent genetic risk model and the heritability of the disease. We determine the sample size required for case-control studies in order to achieve a certain expected number of discoveries in a collection of most significant SNPs. Assuming allele frequencies and effect sizes similar to those of the currently validated SNPs, a complex phenotype such as type 2 diabetes would need approximately 800 variants to explain its 40% heritability. Much smaller numbers of variants are needed if we assume rare-variant but higher-penetrance models. We estimate that up to 50,000 cases and an equal number of controls are needed to discover 800 common low-penetrance variants among the top 5000 SNPs. Under common and rare low-penetrance models, the very large studies required to discover the numerous variants are probably at the limit of practical feasibility. Under rare-variant models with medium to high penetrance (odds ratios between 1.6 and 4.0), studies comparable in size to many existing studies are adequate provided the genotyping technology can interrogate more and rarer variants.

5.
It is common practice in genome-wide association studies (GWAS) to focus on the relationship between disease risk and genetic variants one marker at a time. When relevant genes are identified it is often possible to implicate biological intermediates and pathways likely to be involved in disease aetiology. However, single genetic variants typically explain small amounts of disease risk. Our idea is to construct allelic scores that explain greater proportions of the variance in biological intermediates, and subsequently use these scores to data mine GWAS. To investigate the approach's properties, we indexed three biological intermediates where the results of large GWAS meta-analyses were available: body mass index, C-reactive protein and low density lipoprotein levels. We generated allelic scores in the Avon Longitudinal Study of Parents and Children, and in publicly available data from the first Wellcome Trust Case Control Consortium. We compared the explanatory ability of allelic scores in terms of their capacity to proxy for the intermediate of interest, and the extent to which they associated with disease. We found that allelic scores derived from known variants and allelic scores derived from hundreds of thousands of genetic markers explained significant portions of the variance in biological intermediates of interest, and many of these scores showed expected correlations with disease. Genome-wide allelic scores, however, tended to lack specificity, suggesting that they should be used with caution and perhaps only to proxy biological intermediates for which there are no known individual variants. Power calculations confirm the feasibility of extending our strategy to the analysis of tens of thousands of molecular phenotypes in large genome-wide meta-analyses.
We conclude that our method represents a simple way in which potentially tens of thousands of molecular phenotypes could be screened for causal relationships with disease without having to expensively measure these variables in individual disease collections.

6.
Genome-wide association studies (GWAS) offer an unbiased means to understand the genetic basis of traits by identifying single nucleotide polymorphisms (SNPs) linked to causal variants of complex phenotypes. GWAS have identified a host of susceptibility SNPs associated with many important human diseases, including diseases associated with aging. In an effort to understand the genetics of broad resistance to age-associated diseases (i.e., 'wellness'), we performed a meta-analysis of human GWAS. Toward that end, we compiled 372 GWAS that identified 1775 susceptibility SNPs to 105 unique diseases and used these SNPs to create a genomic landscape of disease susceptibility. This map was constructed by partitioning the genome into 200 kb 'bins' and mapping the 1775 susceptibility SNPs to bins based on their genomic location. Investigation of these data revealed significant heterogeneity of disease association within the genome, with 92% of bins devoid of disease-associated SNPs. In contrast, 10 bins (0.06%) were significantly (P < 0.05) enriched for susceptibility to multiple diseases, 5 of which formed two highly significant peaks of disease association (P < 0.0001). These peaks mapped to the Major Histocompatibility (MHC) locus on 6p21 and the INK4/ARF (CDKN2A/B) tumor suppressor locus on 9p21.3. Provocatively, all 10 significantly enriched bins contained genes linked to either inflammation or cellular senescence pathways, and SNPs near regulators of senescence were particularly associated with diseases of aging (e.g., cancer, atherosclerosis, type 2 diabetes, glaucoma). This analysis suggests that germline genetic heterogeneity in the regulation of immunity and cellular senescence influences the human healthspan.
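The binning step described in this abstract (partitioning the genome into 200 kb bins and assigning each SNP to a bin by position) can be sketched as follows. This is only an illustration of the idea: the function name, the fixed-width integer binning and the example positions are assumptions, not the authors' pipeline or data.

```python
from collections import Counter

BIN_SIZE = 200_000  # 200 kb, the bin width stated in the abstract


def bin_snps(snps, bin_size=BIN_SIZE):
    """Count SNPs per (chromosome, bin index).

    `snps` is an iterable of (chromosome, base-pair position) pairs.
    The bin index is the position divided by the bin size, rounded down.
    """
    counts = Counter()
    for chrom, pos in snps:
        counts[(chrom, pos // bin_size)] += 1
    return counts


# Hypothetical SNP positions, chosen only to exercise the function.
example_snps = [("6", 31_000_000), ("6", 31_100_000), ("9", 22_000_000)]
example_counts = bin_snps(example_snps)
```

Bins with no entry in the counter correspond to bins devoid of disease-associated SNPs; testing a bin for enrichment would need a separate statistical model that is not shown here.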

7.
A considerable and unanticipated plasticity of the human genome, manifested as inter-individual copy number variation, has been discovered. These structural changes constitute a major source of inter-individual genetic variation that could explain variable penetrance of inherited (Mendelian and polygenic) diseases and variation in the phenotypic expression of aneuploidies and sporadic traits, and might represent a major factor in the aetiology of complex, multifactorial traits. For these reasons, an effort should be made to discover all common and rare copy number variants (CNVs) in the human population. This will also enable systematic exploration of both SNPs and CNVs in association studies to identify the genomic contributors to the common disorders and complex traits.

8.
9.
10.
Recent technological progress has permitted the efficient performance of genome-wide association studies (GWAS) to map genetic variants associated with common diseases. Here, we analyzed the genomic location of 2,893 single nucleotide polymorphisms (SNPs) that have been identified in 593 published GWAS as associated with a disease phenotype. In absolute numbers, most significant SNPs are located in intergenic regions and introns. When compared to their representation on the chips, there is an overrepresentation of nonsynonymous coding SNPs (nsSNPs), synonymous coding SNPs, and SNPs in untranscribed regions upstream of genes among the disease-associated SNPs. A Gene Ontology term analysis showed that genes putatively causing a phenotype often code for membrane-associated proteins or are involved in signal transduction.

11.
We conducted a comprehensive study of copy number variants (CNVs) well-tagged by SNPs (r² ≥ 0.8) by analyzing their effect on gene expression and their association with disease susceptibility and other complex human traits. We tested whether these CNVs were more likely to be functional than frequency-matched SNPs as trait-associated loci or as expression quantitative trait loci (eQTLs) influencing phenotype by altering gene regulation. Our study found that CNV-tagging SNPs are significantly enriched for cis eQTLs; furthermore, we observed that trait associations from the NHGRI catalog show an overrepresentation of SNPs tagging CNVs relative to frequency-matched SNPs. We found that these SNPs tagging CNVs are more likely to affect multiple expression traits than frequency-matched variants. Given these findings on the functional relevance of CNVs, we created an online resource of expression-associated CNVs (eCNVs) using the most comprehensive population-based map of CNVs to inform future studies of complex traits. Although previous studies of common CNVs that can be typed on existing platforms and/or interrogated by SNPs in genome-wide association studies concluded that such CNVs appear unlikely to have a major role in the genetic basis of several complex diseases examined, our findings indicate that it would be premature to dismiss the possibility that even common CNVs may contribute to complex phenotypes and at least some common diseases.

12.
Li H. Human Genetics. 2012, 131(9): 1395-1401
Many common human diseases are complex and are expected to be highly heterogeneous, with multiple causative loci and multiple rare and common variants at some of the causative loci contributing to the risk of these diseases. Data from genome-wide association studies (GWAS) and metadata such as known gene functions and pathways provide the possibility of identifying genetic variants, genes and pathways that are associated with complex phenotypes. Single-marker-based tests have been very successful in identifying thousands of genetic variants for hundreds of complex phenotypes. However, these variants only explain very small percentages of the heritabilities. To account for locus and allelic heterogeneity, gene-based and pathway-based tests can be very useful in the next stage of the analysis of GWAS data. U-statistics, which summarize the genomic similarity between pairs of individuals and link the genomic similarity to phenotype similarity, have proved to be very useful for testing the associations between a set of single nucleotide polymorphisms and the phenotypes. Compared to single-marker analysis, the advantage afforded by U-statistics-based methods is large when the number of markers involved is large. We review several formulations of U-statistics in genetic association studies and point out the links of these statistics with other similarity-based tests of genetic association. Finally, potential applications of U-statistics in the analysis of next-generation sequencing data and rare-variant association studies are discussed.

13.
Stringer S, Wray NR, Kahn RS, Derks EM. PLoS ONE. 2011, 6(11): e27964
Complex diseases are often highly heritable. However, for many complex traits only a small proportion of the heritability can be explained by observed genetic variants in traditional genome-wide association (GWA) studies. Moreover, for some of those traits few significant SNPs have been identified. Single SNP association methods test for association at a single SNP, ignoring the effect of other SNPs. We show using a simple multi-locus odds model of complex disease that moderate to large effect sizes of causal variants may be estimated as relatively small effect sizes in single SNP association testing. This underestimation effect is most severe for diseases influenced by numerous risk variants. We relate the underestimation effect to the concept of non-collapsibility found in the statistics literature. As described, continuous phenotypes generated with linear genetic models are not affected by this underestimation effect. Since many GWA studies apply single SNP analysis to dichotomous phenotypes, previously reported results potentially underestimate true effect sizes, thereby impeding identification of true effect SNPs. Therefore, when a multi-locus model of disease risk is assumed, a multi SNP analysis may be more appropriate.
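The non-collapsibility of the odds ratio mentioned in this abstract can be illustrated with a small numeric sketch. It is not the authors' multi-locus model. It assumes two equally frequent background-risk strata that are independent of the SNP genotype, the same conditional odds ratio in each stratum, and hypothetical baseline risks of 0.1 and 0.5; all names are illustrative.

```python
def odds(p):
    """Convert a probability into odds."""
    return p / (1.0 - p)


def risk_from_odds(o):
    """Convert odds back into a probability."""
    return o / (1.0 + o)


def marginal_odds_ratio(baseline_risks, conditional_or):
    """Odds ratio seen when equally sized strata are pooled.

    Each stratum has its own baseline risk for non-carriers; carriers have
    their odds multiplied by the same conditional odds ratio in every stratum.
    """
    carrier_risks = [risk_from_odds(odds(p) * conditional_or) for p in baseline_risks]
    mean_ref = sum(baseline_risks) / len(baseline_risks)
    mean_carrier = sum(carrier_risks) / len(carrier_risks)
    return odds(mean_carrier) / odds(mean_ref)


CONDITIONAL_OR = 2.0         # hypothetical per-stratum effect of the risk allele
BASELINE_RISKS = [0.1, 0.5]  # hypothetical non-carrier risks in the two strata

marginal = marginal_odds_ratio(BASELINE_RISKS, CONDITIONAL_OR)
```

With these numbers the pooled odds ratio is about 1.72, smaller than the conditional value of 2.0 even though the strata do not confound the SNP effect. This is the direction of bias the abstract describes for single-SNP analysis of dichotomous phenotypes.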

14.
Linkage disequilibrium (LD) is an essential metric for selecting single-nucleotide polymorphisms (SNPs) to use in genetic studies and identifying causal variants from significant tag SNPs. The explosion in the number of polymorphisms that can now be genotyped by commercial arrays makes the interpretation of triangular correlation plots, commonly used for visualizing LD, extremely difficult, in particular when large genomic regions need to be considered or when SNPs in perfect LD are not adjacent but scattered across a genomic region. We developed ArchiLD, a user-friendly graphical application for the hierarchical visualization of LD in human populations. The software provides a powerful framework for analyzing LD patterns with a particular focus on blocks of SNPs in perfect linkage as defined by r². Thanks to its integration with the UCSC Genome Browser, LD plots can be easily overlaid with additional data on regulation, conservation and expression. ArchiLD is an intuitive solution for the visualization of LD across large or highly polymorphic genomic regions. Its ease of use and its integration with the UCSC Genome Browser annotation potential facilitate the interpretation of association results and enable a more informed selection of tag SNPs for genetic studies.
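For readers unfamiliar with the r² measure referred to in this abstract, the sketch below computes the standard pairwise LD statistic from haplotype and allele frequencies: D = p_AB - p_A * p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). It is independent of ArchiLD; the function name and the example frequencies are hypothetical.

```python
def ld_r_squared(p_ab, p_a, p_b):
    """Pairwise LD r^2 between two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and allele B at locus 2
    p_a:  frequency of allele A at locus 1
    p_b:  frequency of allele B at locus 2
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    return d * d / denom


# Hypothetical frequencies: A and B always occur together (perfect LD).
r2_perfect = ld_r_squared(0.3, 0.3, 0.3)
# Hypothetical frequencies: the haplotype frequency equals the product (no LD).
r2_none = ld_r_squared(0.09, 0.3, 0.3)
```

In this formulation, SNPs in perfect LD give r² = 1, which is the condition the abstract uses to define blocks.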

15.
Chen R, Davydov EV, Sirota M, Butte AJ. PLoS ONE. 2010, 5(10): e13574
Many DNA variants have been identified for more than 300 diseases and traits using genome-wide association studies (GWAS). Some have been validated using deep sequencing, but many fewer have been validated functionally, and functional work has focused primarily on non-synonymous coding SNPs (nsSNPs). It is an open question whether synonymous coding SNPs (sSNPs) and other non-coding SNPs can lead to odds ratios as high as those of nsSNPs. We conducted a broad survey across 21,429 disease-SNP associations curated from 2,113 publications studying human genetic association, and found that nsSNPs and sSNPs shared similar likelihood and effect size for disease association. The enrichment of disease-associated SNPs around the 80th base in the first introns might provide an effective way to prioritize intronic SNPs for functional studies. We further found that the likelihood of disease association was positively associated with the effect size across different types of SNPs, and SNPs in the 3' untranslated regions, such as the microRNA binding sites, might be under-investigated. Our results suggest that sSNPs are just as likely to be involved in disease mechanisms, so we recommend that sSNPs discovered from GWAS should also be examined with functional studies.

16.
Ongoing modernization in India has elevated the prevalence of many complex genetic diseases associated with a western lifestyle and diet to near-epidemic proportions. However, although India comprises more than one sixth of the world's human population, it has largely been omitted from genomic surveys that provide the backdrop for association studies of genetic disease. Here, by genotyping India-born individuals sampled in the United States, we carry out an extensive study of Indian genetic variation. We analyze 1,200 genome-wide polymorphisms in 432 individuals from 15 Indian populations. We find that populations from India, and populations from South Asia more generally, constitute one of the major human subgroups with increased similarity of genetic ancestry. However, only a relatively small amount of genetic differentiation exists among the Indian populations. Although caution is warranted due to the fact that United States–sampled Indian populations do not represent a random sample from India, these results suggest that the frequencies of many genetic variants are distinctive in India compared to other parts of the world and that the effects of population heterogeneity on the production of false positives in association studies may be smaller in Indians (and particularly in Indian-Americans) than might be expected for such a geographically and linguistically diverse subset of the human population.

17.
Genetic variation in the human population may lead to functional variants of genes that contribute to risk for common chronic diseases such as cancer. In an effort to detect such possible predisposing variants, we constructed haplotypes for a candidate gene and tested their efficacy in association studies. We developed haplotypes consisting of 14 biallelic neutral-sequence variants that span 142 kb of the ATM locus. ATM is the gene responsible for the autosomal recessive disease ataxia-telangiectasia (AT). These ATM noncoding single-nucleotide polymorphisms (SNPs) were genotyped in nine CEPH families (89 individuals) and in 260 DNA samples from four different ethnic origins. Analysis of these data with an expectation-maximization algorithm revealed 22 haplotypes at this locus, with three major haplotypes having frequencies ≥ 0.10. Tests for recombination and linkage disequilibrium (LD) show reduced recombination and extensive LD at the ATM locus, in all four ethnic groups studied. The most striking example was found in the study population of European ancestry, in which no evidence for recombination could be discerned. The potential of ATM haplotypes for detection of genetic variants through association studies was tested by analysis of 84 individuals carrying one of three ATM coding SNPs. Each coding SNP was detected by association with an ATM haplotype. We demonstrate that association studies with haplotypes for candidate genes have significant potential for the detection of genetic backgrounds that contribute to disease.

18.
Although the introduction of genome-wide association studies (GWAS) has greatly increased the number of genes associated with common diseases, only a small proportion of the predicted genetic contribution has so far been elucidated. Studying the cumulative variation of polymorphisms in multiple genes acting in functional pathways may provide a complementary approach to the more common single-SNP association approach in understanding genetic determinants of common disease. We developed a novel pathway-based method to assess the combined contribution of multiple genetic variants acting within canonical biological pathways and applied it to data from 14,000 UK individuals with 7 common diseases. We tested inflammatory pathways for association with Crohn's disease (CD), rheumatoid arthritis (RA) and type 1 diabetes (T1D) with 4 non-inflammatory diseases as controls. Using a variable selection algorithm, we identified variants responsible for the pathway association and evaluated their use for disease prediction using a 10-fold cross-validation framework in order to calculate the out-of-sample area under the receiver operating characteristic curve (AUC). The generalisability of these predictive models was tested on an independent birth cohort from Northern Finland. Multiple canonical inflammatory pathways showed highly significant associations (p from 10⁻³ to 10⁻²⁰) with CD, T1D and RA. Variable selection identified on average a set of 205 SNPs (149 genes) for T1D, 350 SNPs (189 genes) for RA and 493 SNPs (277 genes) for CD. The pattern of polymorphisms at these SNPs was found to be highly predictive of T1D (91% AUC) and RA (85% AUC), and weakly predictive of CD (60% AUC). The T1D model (without any parameter refitting) had good predictive ability (79% AUC) in the Finnish cohort. Our analysis suggests that genetic contribution to common inflammatory diseases operates through multiple genes interacting in functional pathways.

19.
Genome-wide association studies (GWAS) have detected many disease associations. However, the reported variants tend to explain small fractions of risk, and there are doubts about issues such as the portability of findings over different ethnic groups or the relative roles of rare versus common variants in the genetic architecture of complex disease. Studying the degree of sharing of disease-associated variants across populations can help in solving these issues. We present a comprehensive survey of GWAS replicability across 28 diseases. Most loci and SNPs discovered in Europeans for these conditions have been extensively replicated using peoples of European and East Asian ancestry, while the replication with individuals of African ancestry is much less common. We found a strong and significant correlation of Odds Ratios across Europeans and East Asians, indicating that underlying causal variants are common and shared between the two ancestries. Moreover, SNPs that failed to replicate in East Asians map into genomic regions where Linkage Disequilibrium patterns differ significantly between populations. Finally, we observed that GWAS with larger sample sizes have detected variants with weaker effects rather than with lower frequencies. Our results indicate that most GWAS results are due to common variants. In addition, the sharing of disease alleles and the high correlation in their effect sizes suggest that most of the underlying causal variants are shared between Europeans and East Asians and that they tend to map close to the associated marker SNPs.

20.
Substance abuse vulnerability loci: converging genome scanning data   (total citations: 4; self-citations: 0; citations by others: 4)
Classical genetic studies suggest strong complex genetic contributions to a predisposition to abuse multiple addictive substances. Until recently, there were no reproducible genome scanning data identifying chromosomal positions likely to contain allelic variants that predispose the carrier to illegal substance addiction. Nominal results of linkage-based genome scanning studies for ethanol and nicotine addictions failed to display much agreement. Our recent data from association-based genome scans for illegal addictions, and reanalyses of previous results now provide a substantial body of converging results. The 15 reproducible chromosomal loci identified here are good candidates to harbor allelic variants that alter human substance abuse vulnerabilities. We discuss several approaches to identifying the specific gene variants that underlie these convergent association and linkage observations, and the impact that these convergent observations should have on understanding important human addictive disorders.
