Similar literature
20 similar documents found.
1.
Genome-wide association studies (GWAS) have identified hundreds of associated loci across many common diseases. Most risk variants identified by GWAS will merely be tags for as-yet-unknown causal variants. It is therefore possible that identification of the causal variant, by fine mapping, will identify alleles with larger effects on genetic risk than those currently estimated from GWAS replication studies. We show that under plausible assumptions, whilst the majority of the per-allele relative risks (RR) estimated from GWAS data will be close to the true risk at the causal variant, some could be considerable underestimates. For example, for an estimated RR in the range 1.2-1.3, there is approximately a 38% chance that it exceeds 1.4 and a 10% chance that it is over 2. We show how these probabilities can vary depending on the true effects associated with low-frequency variants and on the minor allele frequency (MAF) of the most associated SNP. We investigate the consequences of the underestimation of effect sizes for predictions of an individual's disease risk and interpret our results for the design of fine mapping experiments. Although these effects mean that the amount of heritability explained by known GWAS loci is expected to be larger than current projections, this increase is likely to explain a relatively small amount of the so-called "missing" heritability.

2.
Late onset Alzheimer’s disease (LOAD) is a genetically complex and clinically heterogeneous disease. Recent large-scale genome wide association studies (GWAS) have identified more than twenty loci that modify risk for AD. Despite the identification of these loci, little progress has been made in identifying the functional variants that explain the association with AD risk. Thus, we sought to determine whether the novel LOAD GWAS single nucleotide polymorphisms (SNPs) alter expression of LOAD GWAS genes and whether expression of these genes is altered in AD brains. The majority of LOAD GWAS SNPs occur in gene dense regions under large linkage disequilibrium (LD) blocks, making it unclear which gene(s) are modified by the SNP. Thus, we tested for brain expression quantitative trait loci (eQTLs) between LOAD GWAS SNPs and SNPs in high LD with the LOAD GWAS SNPs in all of the genes within the GWAS loci. We found a significant eQTL between rs1476679 and PILRB and GATS, which occurs within the ZCWPW1 locus. PILRB and GATS expression levels, within the ZCWPW1 locus, were also associated with AD status. Rs7120548 was associated with MTCH2 expression, which occurs within the CELF1 locus. Additionally, expression levels of several genes within the CELF1 locus, including MTCH2, were highly correlated with one another and were associated with AD status. We further demonstrate that PILRB, as well as other genes within the GWAS loci, are most highly expressed in microglia. These findings, together with the function of PILRB as a DAP12 receptor, support the critical role of microglia and neuroinflammation in AD risk.

3.
4.
李以格  张丹丹 《遗传》2021,(3):203-214
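Colorectal cancer (CRC) is a complex disease influenced jointly by genetic and environmental factors, with genetic factors playing an important role. To date, genome-wide association studies (GWAS) have identified a large number of genetic variants associated with colorectal cancer risk. In the post-GWAS era that has followed, a growing number of studies focus on using multi-omics data and functional experiments to dissect potential causal loci. Analyses indicate that the vast majority of risk single nucleotide polymorphisms (SNPs) lie in non-coding regions and may regulate target gene expression by affecting transcription factor binding, epigenetic modification, chromatin accessibility, higher-order genome structure, and related processes. This article reviews mechanistic studies of colorectal cancer causal loci in the post-GWAS era, explains the importance of post-GWAS work for understanding the molecular mechanisms of colorectal cancer, and discusses the applications and prospects of colorectal cancer GWAS, providing a reference for translating GWAS findings into practice.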

5.
Large genome-wide association studies (GWAS) have identified many genetic loci associated with risk for myocardial infarction (MI) and coronary artery disease (CAD). Concurrently, efforts such as the National Institutes of Health (NIH) Roadmap Epigenomics Project and the Encyclopedia of DNA Elements (ENCODE) Consortium have provided unprecedented data on functional elements of the human genome. In the present study, we systematically investigate the biological link between genetic variants associated with this complex disease and their impacts on gene function. First, we examined the heritability of MI/CAD according to genomic compartments. We observed that single nucleotide polymorphisms (SNPs) residing within nearby regulatory regions show significant polygenicity and contribute 59–71% of the heritability for MI/CAD. Second, we showed that the polygenicity and heritability explained by these SNPs are enriched in histone modification marks in specific cell types. Third, we found that a statistically higher number of 45 MI/CAD-associated SNPs that have been identified from large-scale GWAS reside within certain functional elements of the genome, particularly in active enhancer and promoter regions. Finally, we observed significant heterogeneity of this signal across cell types, with strong signals observed within adipose nuclei, as well as brain and spleen cell types. These results suggest that the genetic etiology of MI/CAD is largely explained by tissue-specific regulatory perturbation within the human genome.

6.
Although genome-wide association studies (GWAS) have identified hundreds of complex trait loci, the pathomechanisms of most remain elusive. Studying the genetics of risk factors predisposing to disease is an attractive approach to identify targets for functional studies. Intracranial aneurysms (IA) are rupture-prone pouches at cerebral artery branching sites. IA is a complex disease for which GWAS have identified five loci with strong association and a further 14 loci with suggestive association. To decipher potential underlying disease mechanisms, we tested whether there are IA loci that convey their effect through elevating blood pressure (BP), a strong risk factor of IA. We performed a meta-analysis of four population-based Finnish cohorts (n(FIN) = 11 266) not selected for IA, to assess the association of previously identified IA candidate loci (n = 19) with BP. We defined systolic BP (SBP), diastolic BP, mean arterial pressure, and pulse pressure as quantitative outcome variables. The most significant result was further tested for association in the ICBP-GWAS cohort of 200 000 individuals. We found that the suggestive IA locus at 5q23.2 in PRDM6 was significantly associated with SBP in individuals of European descent (p(FIN) = 3.01E-05, p(ICBP-GWAS) = 0.0007, p(ALL) = 8.13E-07). The risk allele of IA was associated with higher SBP. PRDM6 encodes a protein predominantly expressed in vascular smooth muscle cells. Our study connects a complex disease (IA) locus with a common risk factor for the disease (SBP). We hypothesize that common variants in PRDM6 can contribute to altered vascular wall structure, hence increasing SBP and predisposing to IA. True positive associations often fail to reach genome-wide significance in GWAS. Our findings show that analysis of traditional risk factors as intermediate phenotypes is an effective tool for deciphering hidden heritability. Further, we demonstrate that common disease loci identified in a population isolate may bear wider significance.

7.
Genome-wide association studies (GWAS) have identified at least 133 ulcerative colitis (UC) associated loci. The role of genetic factors in clinical practice is not clearly defined. The relevance of genetic variants to disease pathogenesis is still uncertain because gene–gene and gene–environment interactions have not been characterized. We examined the predictive value of combining the 133 UC risk loci with genetic interactions in an ongoing inflammatory bowel disease (IBD) GWAS. The Wellcome Trust Case–Control Consortium (WTCCC) IBD GWAS was used as a replication cohort. We applied logic regression (LR), a novel adaptive regression methodology, to search for high-order interactions. Exploratory genotype correlations with UC sub-phenotypes [extent of disease, need of surgery, age of onset, extra-intestinal manifestations and primary sclerosing cholangitis (PSC)] were conducted. The combination of 133 UC loci yielded good UC risk predictability [area under the curve (AUC) of 0.86]. A higher cumulative allele score predicted higher UC risk. Through LR, several lines of evidence for genetic interactions were identified and successfully replicated in the WTCCC cohort. The genetic interactions combined with the gene-smoking interaction significantly improved predictability in the model (AUC, from 0.86 to 0.89, P = 3.26E-05). Explained UC variance increased from 37 to 42 % after adding the interaction terms. A within-case analysis found a suggestive genetic association with PSC. Our study demonstrates that the LR methodology allows the identification and replication of high-order genetic interactions in UC GWAS datasets. UC risk can be predicted by the 133 loci, and prediction is improved by adding gene–gene and gene–environment interactions.

8.
Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease.

9.
Maria Masotti  Bin Guo  Baolin Wu 《Biometrics》2019,75(4):1076-1085
Genetic variants associated with disease outcomes can be used to develop personalized treatment. To reach this precision medicine goal, hundreds of large-scale genome-wide association studies (GWAS) have been conducted in the past decade to search for promising genetic variants associated with various traits. They have successfully identified tens of thousands of disease-related variants. However, in total these identified variants explain only part of the variation for most complex traits. There remain many genetic variants with small effect sizes to be discovered, which calls for the development of (a) GWAS with more samples and more comprehensively genotyped variants, for example, the NHLBI Trans-Omics for Precision Medicine (TOPMed) Program is planning to conduct whole genome sequencing on over 100 000 individuals; and (b) novel and more powerful statistical analysis methods. The current dominating GWAS analysis approach is the “single trait” association test, despite the fact that many GWAS are conducted in deeply phenotyped cohorts including many correlated and well-characterized outcomes, which can help improve the power to detect novel variants if properly analyzed, as suggested by increasing evidence that pleiotropy, where a genetic variant affects multiple traits, is the norm in genome-phenome associations. We aim to develop pleiotropy informed powerful association test methods across multiple traits for GWAS. Since it is generally very hard to access individual-level GWAS phenotype and genotype data for those existing GWAS, due to privacy concerns and various logistical considerations, we develop rigorous statistical methods for pleiotropy informed adaptive multitrait association test methods that need only summary association statistics publicly available from most GWAS. We first develop a pleiotropy test, which has powerful performance for truly pleiotropic variants but is sensitive to the pleiotropy assumption. We then develop a pleiotropy informed adaptive test that has robust and powerful performance under various genetic models. We develop accurate and efficient numerical algorithms to compute the analytical P-value for the proposed adaptive test without the need of resampling or permutation. We illustrate the performance of proposed methods through application to joint association test of GWAS meta-analysis summary data for several glycemic traits. Our proposed adaptive test identified several novel loci missed by individual trait based GWAS meta-analysis. All the proposed methods are implemented in a publicly available R package.

10.
While genome-wide association studies (GWAS) have primarily examined populations of European ancestry, more recent studies often involve additional populations, including admixed populations such as African Americans and Latinos. In admixed populations, linkage disequilibrium (LD) exists both at a fine scale in ancestral populations and at a coarse scale (admixture-LD) due to chromosomal segments of distinct ancestry. Disease association statistics in admixed populations have previously considered SNP association (LD mapping) or admixture association (mapping by admixture-LD), but not both. Here, we introduce a new statistical framework for combining SNP and admixture association in case-control studies, as well as methods for local ancestry-aware imputation. We illustrate the gain in statistical power achieved by these methods by analyzing data of 6,209 unrelated African Americans from the CARe project genotyped on the Affymetrix 6.0 chip, in conjunction with both simulated and real phenotypes, as well as by analyzing the FGFR2 locus using breast cancer GWAS data from 5,761 African-American women. We show that, at typed SNPs, our method yields an 8% increase in statistical power for finding disease risk loci compared to the power achieved by standard methods in case-control studies. At imputed SNPs, we observe an 11% increase in statistical power for mapping disease loci when our local ancestry-aware imputation framework and the new scoring statistic are jointly employed. Finally, we show that our method increases statistical power in regions harboring the causal SNP in the case when the causal SNP is untyped and cannot be imputed. Our methods and our publicly available software are broadly applicable to GWAS in admixed populations.

11.
Genome-wide association studies (GWAS) have been fruitful in identifying disease susceptibility loci for common and complex diseases. A remaining question is whether we can quantify individual disease risk based on genotype data, in order to facilitate personalized prevention and treatment for complex diseases. Previous studies have typically failed to achieve satisfactory performance, primarily due to the use of only a limited number of confirmed susceptibility loci. Here we propose that sophisticated machine-learning approaches with a large ensemble of markers may improve the performance of disease risk assessment. We applied a Support Vector Machine (SVM) algorithm on a GWAS dataset generated on the Affymetrix genotyping platform for type 1 diabetes (T1D) and optimized a risk assessment model with hundreds of markers. We subsequently tested this model on an independent Illumina-genotyped dataset with imputed genotypes (1,008 cases and 1,000 controls), as well as a separate Affymetrix-genotyped dataset (1,529 cases and 1,458 controls), resulting in area under ROC curve (AUC) of ~0.84 in both datasets. In contrast, poor performance was achieved when limited to dozens of known susceptibility loci in the SVM model or logistic regression model. Our study suggests that improved disease risk assessment can be achieved by using algorithms that take into account interactions between a large ensemble of markers. We are optimistic that genotype-based disease risk assessment may be feasible for diseases where a notable proportion of the risk has already been captured by SNP arrays.

12.
More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ~27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating either from GWAS datasets and/or from smaller scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (P < 5 × 10^-8) association with disease risk: BST1, CCDC62/HIP1R, DGKQ/GAK, GBA, LRRK2, MAPT, MCCC1/LAMP3, PARK16, SNCA, STK39, and SYT11/RAB25. In addition, we identified novel evidence for genome-wide significant association with a polymorphism in ITGA8 (rs7077361, OR 0.88, P = 1.3 × 10^-8). All meta-analysis results are freely available on a dedicated online database (www.pdgene.org), which is cross-linked with a customized track on the UCSC Genome Browser. Our study provides an exhaustive and up-to-date summary of the status of PD genetics research that can be readily scaled to include the results of future large-scale genetics projects, including next-generation sequencing studies.

13.
He Y  Li C  Amos CI  Xiong M  Ling H  Jin L 《PloS one》2011,6(7):e22097
The genome-wide association study (GWAS) has become a routine approach for mapping disease risk loci with the advent of large-scale genotyping technologies. Multi-allelic haplotype markers can provide superior power compared with single-SNP markers in mapping disease loci. However, the application of haplotype-based analysis to GWAS is usually bottlenecked by prohibitive time cost for haplotype inference, also known as phasing. In this study, we developed an efficient approach to haplotype-based analysis in GWAS. By using a reference panel, our method accelerated the phasing process and reduced the potential bias generated by unrealistic assumptions in phasing process. The haplotype-based approach delivers great power and no type I error inflation for association studies. With only a medium-size reference panel, phasing error in our method is comparable to the genotyping error afforded by commercial genotyping solutions.

14.
It is hoped that an understanding of the genetic basis of Parkinson's disease (PD) will lead to an appreciation of the molecular pathogenesis of disease, which in turn will highlight potential points of therapeutic intervention. It is also hoped that such an understanding will allow identification of individuals at risk for disease prior to the onset of motor symptoms. A large amount of work has already been performed in the identification of genetic risk factors for PD and some of this work, particularly those efforts that focus on genes implicated in monogenic forms of PD, have been successful, although hard won. A new era of gene discovery has begun, with the application of genome wide association studies; these promise to facilitate the identification of common genetic risk loci for complex genetic diseases. This is the first of several high throughput technologies that promise to shed light on the (likely) myriad genetic factors involved in this complex, late-onset neurodegenerative disorder.

15.
Within the last 3 years, genome-wide association studies (GWAS) have had unprecedented success in identifying loci that are involved in common diseases. For example, more than 35 susceptibility loci have been identified for type 2 diabetes and 32 for obesity thus far. However, the causal gene and variant at a specific linkage disequilibrium block is often unclear. Using a combination of different mouse alleles, we can greatly facilitate the understanding of which candidate gene at a particular disease locus is associated with the disease in humans, and also provide functional analysis of variants through an allelic series, including analysis of hypomorph and hypermorph point mutations, and knockout and overexpression alleles. The phenotyping of these alleles for specific traits of interest, in combination with the functional analysis of the genetic variants, may reveal the molecular and cellular mechanism of action of these disease variants, and ultimately lead to the identification of novel therapeutic strategies for common human diseases. In this Commentary, we discuss the progress of GWAS in identifying common disease loci for metabolic disease, and the use of the mouse as a model to confirm candidate genes and provide mechanistic insights.

16.
Multiple sclerosis (MS) is a chronic inflammatory disease of the central nervous system. The observed type of heredity associated with MS is characteristic of polygenic diseases, which arise from a joint contribution of a number of independently acting or interacting polymorphic genes. Two main approaches have recently been applied to identify the genes responsible for genetic predisposition to MS: (1) analysis of the association of individual “candidate genes” with the disease and (2) analysis of linkage between a wide spectrum of chromosomal loci (whole genome screen) and the disease in families with several MS patients. In the last two years, a new method has been developed: genome-wide association screening (GWAS), which borrows the best approaches of the previous studies and is based on modern high-throughput DNA analysis. This review describes replicated (validated) results for individual genes and DNA loci located on the majority of chromosomes obtained using these three strategies, as well as data on the association of MS with allelic combinations of various genes.

17.
The increasing quantity and quality of functional genomic information motivate the assessment and integration of these data with association data, including data originating from genome-wide association studies (GWAS). We used previously described GWAS signals (“hits”) to train a regularized logistic model in order to predict SNP causality on the basis of a large multivariate functional dataset. We show how this model can be used to derive Bayes factors for integrating functional and association data into a combined Bayesian analysis. Functional characteristics were obtained from the Encyclopedia of DNA Elements (ENCODE), from published expression quantitative trait loci (eQTL), and from other sources of genome-wide characteristics. We trained the model using all GWAS signals combined, and also using phenotype specific signals for autoimmune, brain-related, cancer, and cardiovascular disorders. The non-phenotype specific and the autoimmune GWAS signals gave the most reliable results. We found SNPs with higher probabilities of causality from functional characteristics showed an enrichment of more significant p-values compared to all GWAS SNPs in three large GWAS studies of complex traits. We investigated the ability of our Bayesian method to improve the identification of true causal signals in a psoriasis GWAS dataset and found that combining functional data with association data improves the ability to prioritise novel hits. We used the predictions from the penalized logistic regression model to calculate Bayes factors relating to functional characteristics and supply these online alongside resources to integrate these data with association data.

18.
Penalized Multiple Regression (PMR) can be used to discover novel disease associations in GWAS datasets. In practice, proposed PMR methods have not been able to identify well-supported associations in GWAS that are undetectable by standard association tests and thus these methods are not widely applied. Here, we present a combined algorithmic and heuristic framework for PUMA (Penalized Unified Multiple-locus Association) analysis that solves the problems of previously proposed methods including computational speed, poor performance on genome-scale simulated data, and identification of too many associations for real data to be biologically plausible. The framework includes a new minorize-maximization (MM) algorithm for generalized linear models (GLM) combined with heuristic model selection and testing methods for identification of robust associations. The PUMA framework implements the penalized maximum likelihood penalties previously proposed for GWAS analysis (i.e. Lasso, Adaptive Lasso, NEG, MCP), as well as a penalty that has not been previously applied to GWAS (i.e. LOG). Using simulations that closely mirror real GWAS data, we show that our framework has high performance and reliably increases power to detect weak associations, while existing PMR methods can perform worse than single marker testing in overall performance. To demonstrate the empirical value of PUMA, we analyzed GWAS data for type 1 diabetes, Crohn's disease, and rheumatoid arthritis, three autoimmune diseases from the original Wellcome Trust Case Control Consortium. Our analysis replicates known associations for these diseases and we discover novel etiologically relevant susceptibility loci that are invisible to standard single marker tests, including six novel associations implicating genes involved in pancreatic function, insulin pathways and immune-cell function in type 1 diabetes; three novel associations implicating genes in pro- and anti-inflammatory pathways in Crohn's disease; and one novel association implicating a gene involved in apoptosis pathways in rheumatoid arthritis. We provide software for applying our PUMA analysis framework.

19.
A new approach for statistical association signal identification is developed in this paper. We consider a strategy for nonprecise signal identification by extending the well-known signal detection and signal identification methods applicable to the multiple testing problem. Collection of statistical instruments under the presented approach is much broader than under the traditional signal identification methods, allowing more efficient signal discovery. Further assessments of maximal value and average statistics in signal discovery are improved. While our method does not attempt to detect individual predictors, it instead detects sets of predictors that are jointly associated with the outcome. Therefore, an important application would be in genome wide association study (GWAS), where it can be used to detect genes which influence the phenotype but do not contain any individually significant single nucleotide polymorphism (SNP). We compare power of the signal identification method based on extremes of single p-values with the signal localization method based on average statistics for logarithms of p-values. A simulation analysis informs the application of signal localization using the average statistics for wide signals discovery in Gaussian white noise process. We apply average statistics and the localization method to GWAS to discover better gene influences of regulating loci in a Chinese cohort developed for risk of nasopharyngeal carcinoma (NPC).

20.
Hirschsprung's disease (HSCR) is a congenital disorder, defined by partial or complete loss of the neuronal ganglion cells in the intestinal tract, which is caused by the failure of neural crest cells to migrate completely during intestinal development in fetal life. HSCR has a multifactorial etiology, and genetic factors play a key role in its pathogenesis; these include mutations within several gene loci. These have been identified by screening candidate genes, or by conducting genome wide association studies (GWAS). However, only a small portion of them have been proposed as major genetic risk factors for HSCR. In this review, we focus on those genes that have been identified as either low penetrant or high penetrant variants that determine the risk of Hirschsprung's disease. J. Cell. Biochem. 119: 28–33, 2018. © 2017 Wiley Periodicals, Inc.
