Similar articles
 20 similar articles found.
1.
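Current status of genome-wide association studies   Total citations: 6, self-citations: 1, citations by others: 5
Han JW  Zhang XJ 《遗传》(Yi Chuan) 2011,33(1):25-35
Over the past five years, genome-wide association studies (GWAS) have proven to be an effective means of identifying genetic susceptibility variants for complex diseases and traits. Researchers worldwide have conducted numerous GWAS across many complex diseases and traits, achieving major results in identifying susceptibility genes for complex diseases such as cancer, diabetes, heart disease, neuropsychiatric disorders, and autoimmune and immune-related diseases, as well as for common traits (e.g., height, body weight, blood lipids, and pigmentation). As of September 11, 2010, GWAS had been carried out for nearly 200 complex diseases/traits and had identified more than 3,000 disease-associated genetic variants. This article reviews the development of GWAS and its application to complex diseases/traits.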

2.
Recent work has shown that much of the missing heritability of complex traits can be resolved by estimates of heritability explained by all genotyped SNPs. However, it is currently unknown how much heritability is missing due to poor tagging or additional causal variants at known GWAS loci. Here, we use variance components to quantify the heritability explained by all SNPs at known GWAS loci in nine diseases from WTCCC1 and WTCCC2. After accounting for expectation, we observed all SNPs at known GWAS loci to explain more heritability than GWAS-associated SNPs on average. For some diseases, this increase was individually significant: for Multiple Sclerosis (MS) and for Crohn's Disease (CD); all analyses of autoimmune diseases excluded the well-studied MHC region. Additionally, we found that GWAS loci from other related traits also explained significant heritability. The union of all autoimmune disease loci explained more MS heritability than known MS SNPs and more CD heritability than known CD SNPs, with an analogous increase for all autoimmune diseases analyzed. We also observed significant increases in an analysis of Rheumatoid Arthritis (RA) samples typed on ImmunoChip, with more heritability from all SNPs at GWAS loci and more heritability from all autoimmune disease loci compared to known RA SNPs (including those identified in this cohort). Our methods adjust for LD between SNPs, which can bias standard estimates of heritability from SNPs even if all causal variants are typed. By comparing adjusted estimates, we hypothesize that the genome-wide distribution of causal variants is enriched for low-frequency alleles, but that causal variants at known GWAS loci are skewed towards common alleles. These findings have important ramifications for fine-mapping study design and our understanding of complex disease architecture.

3.
Genome-wide association studies (GWAS) have been successful in identifying single nucleotide polymorphisms (SNPs) associated with many traits and diseases. However, at existing sample sizes, these variants explain only part of the estimated heritability. Leverage of GWAS results from related phenotypes may improve detection without the need for larger datasets. The Bayesian conditional false discovery rate (cFDR) constitutes an upper bound on the expected false discovery rate (FDR) across a set of SNPs whose p values for two diseases are both less than two disease-specific thresholds. Calculation of the cFDR requires only summary statistics and has several advantages over traditional GWAS analysis. However, existing methods require distinct control samples between studies. Here, we extend the technique to allow for some or all controls to be shared, increasing applicability. Several different SNP sets can be defined with the same cFDR value, and we show that the expected FDR across the union of these sets may exceed expected FDR in any single set. We describe a procedure to establish an upper bound for the expected FDR among the union of such sets of SNPs. We apply our technique to pairwise analysis of p values from ten autoimmune diseases with variable sharing of controls, enabling discovery of 59 SNP-disease associations which do not reach GWAS significance after genomic control in individual datasets. Most of the SNPs we highlight have previously been confirmed using replication studies or larger GWAS, a useful validation of our technique; we report eight SNP-disease associations across five diseases not previously declared. Our technique extends and strengthens the previous algorithm, and establishes robust limits on the expected FDR. This approach can improve SNP detection in GWAS, and give insight into shared aetiology between phenotypically related conditions.
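
The idea that the cFDR needs only summary statistics can be illustrated with a minimal sketch of the empirical estimator commonly used in the cFDR literature, p_i × #{p_j' ≤ p_j} / #{p_i' ≤ p_i and p_j' ≤ p_j}. This is an assumption about the general form only: it does not implement the shared-control adjustment or the FDR bound described in the abstract, and the function name `empirical_cfdr` and the example p values are hypothetical.

```python
from typing import Sequence


def empirical_cfdr(p_principal: Sequence[float],
                   p_conditional: Sequence[float],
                   index: int) -> float:
    """Empirical cFDR for one SNP, computed from two p-value vectors.

    p_principal:   p values for the trait of interest (one per SNP).
    p_conditional: p values for the related trait (same SNP order).
    index:         position of the SNP to evaluate.
    """
    p_i = p_principal[index]
    p_j = p_conditional[index]
    # SNPs at least as significant as this one for the conditional trait.
    n_cond = sum(1 for q in p_conditional if q <= p_j)
    # SNPs at least as significant as this one for both traits
    # (never zero, because the SNP itself is always counted).
    n_both = sum(1 for p, q in zip(p_principal, p_conditional)
                 if p <= p_i and q <= p_j)
    return min(1.0, p_i * n_cond / n_both)


if __name__ == "__main__":
    # Hypothetical summary statistics for five SNPs.
    p_trait = [1e-4, 0.2, 0.5, 0.8, 0.03]
    p_related = [1e-3, 0.6, 0.01, 0.9, 0.4]
    for k in range(len(p_trait)):
        print(k, empirical_cfdr(p_trait, p_related, k))
```

The estimate shrinks below the raw p value when SNPs that are significant for the related trait are also enriched for small p values in the principal trait, which is the sense in which a related phenotype is leveraged.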

4.
Although many methods are available to test sequence variants for association with complex diseases and traits, methods that specifically seek to identify causal variants are less developed. Here we develop and evaluate a Bayesian hierarchical regression method that incorporates prior information on the likelihood of variant causality through weighting of variant effects. By simulation studies using both simulated and real sequence variants, we compared a standard single variant test for analyzing variant-disease association with the proposed method using different weighting schemes. We found that by leveraging linkage disequilibrium of variants with known GWAS signals and sequence conservation (phastCons), the proposed method provides a powerful approach for detecting causal variants while controlling false positives.

5.
Genome-wide association studies (GWAS) provide an important approach to identifying common genetic variants that predispose to human disease. A typical GWAS may genotype hundreds of thousands of single nucleotide polymorphisms (SNPs) located throughout the human genome in a set of cases and controls. Logistic regression is often used to test for association between a SNP genotype and case versus control status, with corresponding odds ratios (ORs) typically reported only for those SNPs meeting selection criteria. However, when these estimates are based on the original data used to detect the variant, the results are affected by a selection bias sometimes referred to as the "winner's curse" (Capen and others, 1971). The actual genetic association is typically overestimated. We show that such selection bias may be severe in the sense that the conditional expectation of the standard OR estimator may be quite far away from the underlying parameter. Also, standard confidence intervals (CIs) may be far from the desired coverage rate for the selected ORs. We propose and evaluate 3 bias-reduced estimators, and also corresponding weighted estimators that combine corrected and uncorrected estimators, to reduce selection bias. Their corresponding CIs are also proposed. We study the performance of these estimators using simulated data sets and show that they reduce the bias and give CI coverage close to the desired level under various scenarios, even for associations having only small statistical power.
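
The selection bias described above can be reproduced with a small simulation. The sketch below is illustrative only and assumes a normally distributed log-OR estimate with a known standard error. The function name `simulate_winners_curse`, the true effect, the standard error, and the threshold are all hypothetical, and none of the bias-reduced estimators proposed in the abstract are implemented here.

```python
import random
import statistics


def simulate_winners_curse(true_log_or=0.10, se=0.05, z_threshold=4.0,
                           n_studies=100_000, seed=1):
    """Return (number selected, mean estimate among selected studies).

    Each simulated study draws a log-OR estimate from N(true_log_or, se^2)
    and is "selected" only if its |z| statistic exceeds z_threshold.
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_studies):
        estimate = rng.gauss(true_log_or, se)
        if abs(estimate) / se > z_threshold:
            selected.append(estimate)
    # statistics.mean raises StatisticsError if nothing was selected;
    # the default settings select roughly 2% of the studies.
    return len(selected), statistics.mean(selected)


if __name__ == "__main__":
    n_selected, mean_selected = simulate_winners_curse()
    print("true log-OR:", 0.10)
    print("studies passing the threshold:", n_selected)
    print("mean log-OR among those studies:", mean_selected)
```

Because a study is selected only when its estimate is at least `z_threshold * se` in magnitude, the average among selected studies has to sit well above the true value whenever power is low, which is the conditional-expectation effect the abstract refers to.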

6.
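Progress in and reflections on genome-wide association studies   Total citations: 1, self-citations: 0, citations by others: 1
Tu X  Shi LS  Wang F  Wang Q 《生理科学进展》(Sheng Li Ke Xue Jin Zhan) 2010,41(2):87-94
A genome-wide association study (GWAS) is a new strategy that uses millions of single nucleotide polymorphisms (SNPs) in the human genome as markers for case-control association analysis, with the aim of discovering genetic features that influence the development of complex diseases. In recent years, with the implementation of the Human Genome Project and the HapMap Project, GWAS has identified and characterized a large number of genetic variants associated with human traits or complex diseases, providing important clues for further understanding the genetic features that govern complex disease development. However, because many factors contribute to complex diseases/traits and the GWAS research system is itself complex, GWAS still has many problems. This article discusses the current status and potential problems of GWAS in terms of study design, study subjects, genetic markers, and statistical analysis, and looks ahead to future directions for GWAS.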

7.
Genome-wide association studies (GWAS) have identified thousands of genetic variants that are associated with complex traits. However, a stringent significance threshold is required to identify robust genetic associations. Leveraging relevant auxiliary covariates has the potential to boost statistical power to exceed the significance threshold. Particularly, abundant pleiotropy and the non-random distribution of SNPs across various functional categories suggest that leveraging GWAS test statistics from related traits and/or functional genomic data may boost GWAS discovery. While type 1 error rate control has become standard in GWAS, control of the false discovery rate can be a more powerful approach. The conditional false discovery rate (cFDR) extends the standard FDR framework by conditioning on auxiliary data to call significant associations, but current implementations are restricted to auxiliary data satisfying specific parametric distributions, typically GWAS p-values for related traits. We relax these distributional assumptions, enabling an extension of the cFDR framework that supports auxiliary covariates from arbitrary continuous distributions (“Flexible cFDR”). Our method can be applied iteratively, thereby supporting multi-dimensional covariate data. Through simulations we show that Flexible cFDR increases sensitivity whilst controlling FDR after one or several iterations. We further demonstrate its practical potential through application to an asthma GWAS, leveraging various functional genomic data to find additional genetic associations for asthma, which we validate in the larger, independent, UK Biobank data resource.

8.
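Zhang Tongyu  Zhu Caiye  Du Lixin  Zhao Fuping 《遗传》(Yi Chuan) 2017,39(6):491-500
Genome-wide association study (GWAS) is an analysis strategy for identifying functional genes underlying complex traits and has become an important means of mining candidate genes for economically important traits in livestock and poultry. The completion and release of the sheep and goat genomes, together with the launch and commercialization of SNP (single nucleotide polymorphism) chips of different densities, have not only greatly enriched the molecular markers available for marker-assisted selection in sheep and goats but also provided important technical support for exploring the molecular mechanisms of important traits. This article reviews the populations, main methods, and results of GWAS on important traits in sheep and goats, including horns, wool, milk, growth and development, meat quality, reproduction, and disease, and summarizes the current state of GWAS methodology, in order to provide a reference for further use of GWAS in studying the genetic basis of various traits in sheep and goats.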

9.
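Current status of interaction research in genome-wide association studies   Total citations: 2, self-citations: 0, citations by others: 2
Li FG  Wang ZP  Hu G  Li H 《遗传》(Yi Chuan) 2011,33(9):901-910
Using high-density single nucleotide polymorphism (SNP) markers to detect, on a genome-wide scale, chromosomal segments or genes that affect complex diseases/traits has become one of the new breakthroughs in genetics. Following the many results achieved by genome-wide association studies (GWAS), researchers have become highly enthusiastic about studying interactions genome-wide. In recent years, interaction research has developed rapidly, whether in method development, practical application, the translation of statistical interaction into biological interaction, or the integration of omics information. Many strategies and methods have been tried for genome-wide interaction analysis, and these studies have advanced understanding of the genetic mechanisms of complex diseases/traits. Based on the similarities and differences in the theory and algorithms of the data-processing methods currently used in genome-wide interaction analysis, this article reviews five widely used classes of methods: regression-based methods, machine learning methods, Bayesian model methods, SNP screening methods, and parallel-computing-based methods. It focuses on their algorithmic principles, computational efficiency, and differences, in order to provide a reference for researchers in related fields.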

10.
Kuo CL  Zaykin DV 《Genetics》2011,189(1):329-340
In recent years, genome-wide association studies (GWAS) have uncovered a large number of susceptibility variants. Nevertheless, GWAS findings provide only tentative evidence of association, and replication studies are required to establish their validity. Due to this uncertainty, researchers often focus on top-ranking SNPs, instead of considering strict significance thresholds to guide replication efforts. The number of SNPs for replication is often determined ad hoc. We show how the rank-based approach can be used for sample size allocation in GWAS as well as for deciding on a number of SNPs for replication. The basis of this approach is the "ranking probability": chances that at least j true associations will rank among top u SNPs, when SNPs are sorted by P-value. By employing simple but accurate approximations for ranking probabilities, we accommodate linkage disequilibrium (LD) and evaluate consequences of ignoring LD. Further, we relate ranking probabilities to the proportion of false discoveries among top u SNPs. A study-specific proportion can be estimated from P-values, and its expected value can be predicted for study design applications.
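
The "ranking probability" defined above can be estimated by brute-force simulation, which may help make the definition concrete. The sketch below is not the approximation the authors derive: it assumes independent SNPs (no LD), unit-variance normal test statistics, and a common effect size for all true associations. The function name `ranking_probability` and all parameter values are hypothetical.

```python
import random


def ranking_probability(n_snps, n_true, effect_z, top_u, at_least_j,
                        n_sims=1000, seed=7):
    """Monte Carlo estimate of P(at least j true associations in the top u).

    Sorting by two-sided P-value is equivalent to sorting by |z|, so the
    simulation ranks SNPs on the absolute value of their test statistic.
    Requires 1 <= top_u <= n_snps.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        true_stats = [abs(rng.gauss(effect_z, 1.0)) for _ in range(n_true)]
        null_stats = [abs(rng.gauss(0.0, 1.0))
                      for _ in range(n_snps - n_true)]
        # The u-th largest statistic is the entry cutoff for the top-u list.
        cutoff = sorted(true_stats + null_stats, reverse=True)[top_u - 1]
        n_true_in_top = sum(1 for s in true_stats if s >= cutoff)
        if n_true_in_top >= at_least_j:
            hits += 1
    return hits / n_sims


if __name__ == "__main__":
    # Hypothetical design: 1000 SNPs, 5 true associations, top 20 followed up.
    for j in (1, 2, 3):
        print(j, ranking_probability(1000, 5, 3.0, 20, j))
```

Re-running the function over a grid of `top_u` values gives a rough picture of how many SNPs would need to be carried forward to replication to capture a desired number of true associations under these simplifying assumptions.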

11.
Genome-wide association studies (GWAS) have rapidly become a powerful tool in genetic studies of complex diseases and traits. Traditionally, single marker-based tests have been used prevalently in GWAS and have uncovered tens of thousands of disease-associated SNPs. Network-assisted analysis (NAA) of GWAS data is an emerging area in which network-related approaches are developed and utilized to perform advanced analyses of GWAS data in order to study various human diseases or traits. Progress has been made in both methodology development and applications of NAA in GWAS data, and it has already been demonstrated that NAA results may enhance our interpretation and prioritization of candidate genes and markers. Inspired by the strong interest in and high demand for advanced GWAS data analysis, in this review article, we discuss the methodologies and strategies that have been reported for the NAA of GWAS data. Many NAA approaches search for subnetworks and assess the combined effects of multiple genes participating in the resultant subnetworks through a gene set analysis. With no restriction to pre-defined canonical pathways, NAA has the advantage of defining subnetworks with the guidance of the GWAS data under investigation. In addition, some NAA methods prioritize genes from GWAS data based on their interconnections in the reference network. Here, we summarize NAA applications to various diseases and discuss the available options and potential caveats related to their practical usage. Additionally, we provide perspectives regarding this rapidly growing research area.

12.
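Using single-nucleotide polymorphisms (SNPs) as genetic markers, genome-wide association studies (GWAS) have identified more than 3,800 susceptibility regions in more than 660 diseases (or traits). However, the most significantly associated or causal variants within these regions, and their biological functions, are not fully understood. Identifying these variants helps elucidate the biological mechanisms of complex diseases and discover new disease markers. One of the main tasks of the post-GWAS era is to use fine-mapping studies to find the most significantly associated or causal susceptibility variants within susceptibility regions of complex diseases and to clarify their biological functions. For common variants, SNP density can be increased by imputation or resequencing to find the most significantly associated SNPs, and functional SNPs and susceptibility genes can be sought through functional element analysis, expression quantitative trait locus (eQTL) analysis, and haplotype analysis. For rare variants, fine mapping can use resequencing, rare haplotype analysis, family-based analysis, and burden tests. This article reviews these strategies and the problems they face.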

13.
The genome-wide association study (GWAS) approach has discovered hundreds of genetic variants associated with diseases and quantitative traits. However, despite clinical overlap and statistical correlation between many phenotypes, GWAS are generally performed one-phenotype-at-a-time. Here we compare the performance of modelling multiple phenotypes jointly with that of the standard univariate approach. We introduce a new method and software, MultiPhen, that models multiple phenotypes simultaneously in a fast and interpretable way. By performing ordinal regression, MultiPhen tests the linear combination of phenotypes most associated with the genotypes at each SNP, and thus potentially captures effects hidden to single phenotype GWAS. We demonstrate via simulation that this approach provides a dramatic increase in power in many scenarios. There is a boost in power for variants that affect multiple phenotypes and for those that affect only one phenotype. While other multivariate methods have similar power gains, we describe several benefits of MultiPhen over these. In particular, we demonstrate that other multivariate methods that assume the genotypes are normally distributed, such as canonical correlation analysis (CCA) and MANOVA, can have highly inflated type-1 error rates when testing case-control or non-normal continuous phenotypes, while MultiPhen produces no such inflation. To test the performance of MultiPhen on real data we applied it to lipid traits in the Northern Finland Birth Cohort 1966 (NFBC1966). In these data MultiPhen discovers 21% more independent SNPs with known associations than the standard univariate GWAS approach, while applying MultiPhen in addition to the standard approach provides 37% increased discovery. The most associated linear combinations of the lipids estimated by MultiPhen at the leading SNPs accurately reflect the Friedewald Formula, suggesting that MultiPhen could be used to refine the definition of existing phenotypes or uncover novel heritable phenotypes.
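
For readers unfamiliar with it, the Friedewald Formula is itself a fixed linear combination of lipid measurements, which is why a method that estimates linear combinations of phenotypes could recover it. The sketch below states the conventional mg/dL form of the formula. It is not part of MultiPhen, and the function name `friedewald_ldl` and the example values are hypothetical.

```python
def friedewald_ldl(total_cholesterol, hdl_cholesterol, triglycerides):
    """Estimate LDL cholesterol (mg/dL) with the Friedewald Formula.

    LDL = TC - HDL - TG / 5, i.e. a linear combination of the three
    measured lipids with coefficients (1, -1, -0.2). The TG / 5 term
    assumes all inputs are in mg/dL.
    """
    return total_cholesterol - hdl_cholesterol - triglycerides / 5.0


if __name__ == "__main__":
    # Hypothetical lipid panel in mg/dL.
    print(friedewald_ldl(200.0, 50.0, 150.0))
```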

14.
The detrimental effects of the winner’s curse, including overestimation of the genetic effects of associated variants and underestimation of sufficient sample sizes for replication studies, are well recognized in genome-wide association studies (GWAS). These effects can be expected to worsen as the field moves from GWAS into whole genome sequencing. To date, few studies have reported statistical adjustments to the naive estimates, due to the lack of suitable statistical methods and computational tools. We have developed an efficient genome-wide non-parametric method that explicitly accounts for the threshold, ranking, and allele frequency effects in whole genome scans. Here, we implement the method to provide bias-reduced estimates via bootstrap re-sampling (BR-squared) for association studies of both disease status and quantitative traits, and we report the results of applying BR-squared to GWAS of psoriasis and HbA1c. We observed over 50% reduction in the genetic effect size estimation for many associated SNPs. This translates into a greater than fourfold increase in sample size requirements for successful replication studies, which in part explains some of the apparent failures in replicating the original signals. Our analysis suggests that adjusting for the winner’s curse is critical for interpreting findings from whole genome scans and planning replication and meta-GWAS studies, as well as in attempts to translate findings into the clinical setting.

15.
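In-depth analysis strategies for genome-wide association studies   Total citations: 1, self-citations: 1, citations by others: 1
Quan C  Zhang XJ 《遗传》(Yi Chuan) 2011,33(2):100-108
Since 2005, genome-wide association studies (GWAS) have identified a large number of variants associated with complex diseases/traits. Recently, attention has focused on how to analyze GWAS data in greater depth, in the hope of discovering more susceptibility genes for complex diseases/traits. Several new strategies and methods have been tried in follow-up studies of GWAS of complex diseases/traits, for example in-depth analysis of GWAS data; identification of new susceptibility genes/loci; international collaboration and meta-analysis; fine mapping and sequencing of susceptibility regions; studies of susceptibility genes shared across multiple diseases; and genotype imputation, pathway-based association analysis, gene-gene and gene-environment interaction, and epistasis studies. These strategies and methods compensate for some shortcomings of classical GWAS and further advance understanding of the genetic mechanisms of complex diseases/traits. This article reviews the strategies and methods of these studies and the problems and challenges they face, giving readers a brief framework for post-GWAS work.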

16.
Although approaches for performing genome-wide association studies (GWAS) are well developed, conventional GWAS requires high-density genotyping of large numbers of individuals from a diversity panel. Here we report a method for performing GWAS that does not require genotyping of large numbers of individuals. Instead XP-GWAS (extreme-phenotype GWAS) relies on genotyping pools of individuals from a diversity panel that have extreme phenotypes. This analysis measures allele frequencies in the extreme pools, enabling discovery of associations between genetic variants and traits of interest. This method was evaluated in maize (Zea mays) using the well-characterized kernel row number trait, which was selected to enable comparisons between the results of XP-GWAS and conventional GWAS. An exome-sequencing strategy was used to focus sequencing resources on genes and their flanking regions. A total of 0.94 million variants were identified and served as evaluation markers; comparisons among pools showed that 145 of these variants were statistically associated with the kernel row number phenotype. These trait-associated variants were significantly enriched in regions identified by conventional GWAS. XP-GWAS was able to resolve several linked QTL and detect trait-associated variants within a single gene under a QTL peak. XP-GWAS is expected to be particularly valuable for detecting genes or alleles responsible for quantitative variation in species for which extensive genotyping resources are not available, such as wild progenitors of crops, orphan crops, and other poorly characterized species such as those of ecological interest.

17.
18.
There is increasing evidence that pleiotropy, the association of multiple traits with the same genetic variants/loci, is a very common phenomenon. Cross-phenotype association tests are often used to jointly analyze multiple traits from a genome-wide association study (GWAS). The underlying methods, however, are often designed to test the global null hypothesis that there is no association of a genetic variant with any of the traits, the rejection of which does not implicate pleiotropy. In this article, we propose a new statistical approach, PLACO, for specifically detecting pleiotropic loci between two traits by considering an underlying composite null hypothesis that a variant is associated with none or only one of the traits. We propose testing the null hypothesis based on the product of the Z-statistics of the genetic variants across two studies and derive a null distribution of the test statistic in the form of a mixture distribution that allows for fractions of variants to be associated with none or only one of the traits. We borrow approaches from the statistical literature on mediation analysis that allow asymptotic approximation of the null distribution avoiding estimation of nuisance parameters related to mixture proportions and variance components. Simulation studies demonstrate that the proposed method can maintain type I error and can achieve major power gain over alternative simpler methods that are typically used for testing pleiotropy. PLACO allows correlation in summary statistics between studies that may arise due to sharing of controls between disease traits. Application of PLACO to publicly available summary data from two large case-control GWAS of Type 2 Diabetes and of Prostate Cancer implicated a number of novel shared genetic regions: 3q23 (ZBTB38), 6q25.3 (RGS17), 9p22.1 (HAUS6), 9p13.3 (UBAP2), 11p11.2 (RAPSN), 14q12 (AKAP6), 15q15 (KNL1) and 18q23 (ZNF236).

19.
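Wang Yuyan  Wang Zixing  Hu Yaoda  Wang Lei  Li Ning  Zhang Biao  Han Wei  Jiang Jingmei 《遗传》(Yi Chuan) 2017,39(8):707-716
Since the first genome-wide association study (GWAS) was published in 2005, GWAS have continually advanced understanding of the genetic mechanisms of disease. Combining systems biology with improved statistical methods is an important route to mining GWAS data in depth. Pathway analysis groups the genetic variants tested in a GWAS into sets according to biological meaning and analyzes them together, which helps discover variants that have small individual effects on disease but are interrelated within a pathway, and facilitates biological interpretation. Pathway analysis is already fairly widely applied to GWAS data and has yielded initial results, while its statistical methods continue to develop. This article introduces existing GWAS pathway analysis algorithms that operate directly on SNPs, divided into non-kernel and kernel algorithms according to whether a kernel function is used. Non-kernel algorithms mainly include gene set enrichment analysis (GSEA) and hierarchical Bayes prioritization (HBP); kernel algorithms include the linear kernel (LIN), the identity-by-status kernel (IBS), and the powered exponential kernel. By describing the computational principles, strengths, and weaknesses of these methods, the article aims to offer ideas for constructing new algorithms and a reference for choosing research methods in the GWAS field.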

20.
Recent advances in genotyping methodologies have allowed genome-wide association studies (GWAS) to accurately identify genetic variants that associate with common or pathological complex traits. Although most GWAS have focused on associations with single genetic variants, joint identification of multiple genetic variants, and how they interact, is essential for understanding the genetic architecture of complex phenotypic traits. Here, we propose an efficient stepwise method based on the Cochran-Mantel-Haenszel test (for stratified categorical data) to identify causal joint multiple genetic variants in GWAS. This method combines the CMH statistic with a stepwise procedure to detect multiple genetic variants associated with specific categorical traits, using a series of associated I × J contingency tables and a null hypothesis of no phenotype association. Through a new stratification scheme based on the sum of minor allele count criteria, we make the method more feasible for GWAS data having sample sizes of several thousands. We also examine the properties of the proposed stepwise method via simulation studies, and show that the stepwise CMH test performs better than other existing methods (e.g., logistic regression and detection of associations by Markov blanket) for identifying multiple genetic variants. Finally, we apply the proposed approach to two genomic sequencing datasets to detect linked genetic variants associated with bipolar disorder and obesity, respectively.
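
As background for the CMH statistic mentioned above, here is a minimal sketch of the classical Cochran-Mantel-Haenszel test for the special case of stratified 2 × 2 tables, without continuity correction. It does not implement the stepwise procedure, the I × J generalization, or the stratification scheme proposed in the abstract. The function name `cmh_test` and the example counts are hypothetical.

```python
import math


def cmh_test(strata):
    """Cochran-Mantel-Haenszel test for a series of 2 x 2 tables.

    strata: iterable of (a, b, c, d) counts, one tuple per stratum, laid out as
        [[a, b],    e.g. rows = carrier / non-carrier of an allele,
         [c, d]]         columns = case / control.
    Returns (chi-square statistic with 1 df, p value). Raises
    ZeroDivisionError if every stratum has a zero row or column margin.
    """
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum this small carries no information
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    statistic = (observed - expected) ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


if __name__ == "__main__":
    # Two hypothetical strata with the same direction of association.
    print(cmh_test([(20, 10, 10, 20), (15, 10, 10, 15)]))
```

Pooling the observed-minus-expected counts across strata before squaring is what lets the test pick up an association that is consistent in direction across strata while adjusting for the stratifying variable.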
