Similar articles
1.
There is increasing evidence that interindividual epigenetic variation is an etiological factor in common human diseases. Such epigenetic variation could be genetic or non-genetic in origin, and epigenome-wide association studies (EWASs) are underway for a wide variety of diseases/phenotypes. However, performing an EWAS raises a range of issues not typically encountered in genome-wide association studies (GWASs), such as the choice of tissue to be analyzed. In many EWASs, it is not possible to analyze the target tissue in large numbers of live humans, and consequently surrogate tissues are employed, most commonly blood. But there is as yet no evidence demonstrating that blood is more informative than buccal cells, the other easily accessible tissue. To assess the potential of buccal cells for use in EWASs, we performed a comprehensive analysis of a buccal cell methylome using whole-genome bisulfite sequencing. Strikingly, a buccal vs. blood comparison reveals more than six times as many hypomethylated regions in buccal. These tissue-specific differentially methylated regions (tDMRs) are strongly enriched for DNaseI hotspots. Almost 75% of these tDMRs are not captured by commonly used DNA methylome profiling platforms such as Reduced Representation Bisulfite Sequencing and the Illumina Infinium HumanMethylation450 BeadChip, and they also display distinct genomic properties. Buccal hypo-tDMRs show a statistically significant enrichment near disease-associated SNPs identified through GWASs. Finally, we find that, compared with blood, buccal hypo-tDMRs show significantly greater overlap with hypomethylated regions in other tissues. We propose that for non-blood based diseases/phenotypes, buccal will be a more informative tissue for EWASs.

2.
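Strategies for genome-wide association studies based on high-throughput sequencing   Total citations: 1, self-citations: 0, citations by others: 1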
周家蓬  裴智勇  陈禹保  陈润生 《遗传》2014,36(11):1099-1111
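Genome-wide association studies (GWAS) are an important component of research on complex human diseases; they detect, at the population level, genetic associations between genome-wide genetic variants and observable traits. Traditional GWAS relies on array technology to obtain high-density genetic variants. Although it has been highly productive, it also has a number of problems: the so-called "missing heritability", meaning that variants reaching genome-wide significance in association analyses explain only a small fraction of heritability; weak consistency between the results of different studies for some traits; and the difficulty of interpreting the function of significantly associated variants. High-throughput sequencing, also known as next-generation sequencing (NGS), can rapidly and accurately produce high-throughput variant data and offers a feasible way to address these problems. NGS-based GWAS (NGS-GWAS) can, to some extent, compensate for the shortcomings of traditional GWAS. This article systematically surveys NGS-GWAS strategies and methods, proposes implementation strategies and methods that are currently feasible, and discusses the prospects for applying NGS-GWAS to personalized medicine (PM).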

3.
Genome-wide association studies (GWASs) are critically dependent on detailed knowledge of the pattern of linkage disequilibrium (LD) in the human genome. GWASs generate lists of variants, usually SNPs, ranked according to the significance of their association to a trait. Downstream analyses generally focus on the gene or genes that are physically closest to these SNPs and ignore their LD profile with other SNPs. We have developed a flexible R package (LDsnpR) that efficiently assigns SNPs to genes on the basis of both their physical position and their pairwise LD with other SNPs. We used the positional-binning and LD-based-binning approaches to investigate whether including these "LD-based" SNPs would affect the interpretation of three published GWASs on bipolar affective disorder (BP) and of the imputed versions of two of these GWASs. We show how including LD can be important for interpreting and comparing GWASs. In the published, unimputed GWASs, LD-based binning effectively "recovered" 6.1%-8.3% of Ensembl-defined genes. It altered the ranks of the genes and resulted in nonnegligible differences between the lists of the top 2,000 genes emerging from the two binning approaches. It also improved the overall gene-based concordance between independent BP studies. In the imputed datasets, although the increases in coverage (>0.4%) and rank changes were more modest, even greater concordance between the studies was observed, attesting to the potential of LD-based binning on imputed data as well. Thus, ignoring LD can result in the misinterpretation of the GWAS findings and have an impact on subsequent genetic and functional studies.
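
The difference between positional binning and LD-based binning described in this abstract can be illustrated with a minimal Python sketch. This is not the LDsnpR implementation (LDsnpR is an R package, and its interface is not shown in the abstract); the function name `bin_snps`, the input layouts, and the `r2_min` and `flank` parameters are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def bin_snps(snp_pos, genes, ld_pairs, r2_min=0.8, flank=0):
    """Assign SNPs to genes by position, then extend each bin via pairwise LD.

    Hypothetical input layouts (assumptions, not LDsnpR's formats):
      snp_pos:  {snp_id: (chrom, position)}
      genes:    {gene_id: (chrom, start, end)}
      ld_pairs: iterable of (snp_a, snp_b, r2)
    """
    # Positional binning: a SNP belongs to a gene if it lies inside the
    # gene boundaries, optionally widened by `flank` base pairs.
    positional = defaultdict(set)
    for gene, (g_chrom, start, end) in genes.items():
        for snp, (s_chrom, pos) in snp_pos.items():
            if s_chrom == g_chrom and start - flank <= pos <= end + flank:
                positional[gene].add(snp)

    # Symmetric lookup of LD partners at or above the r2 threshold.
    partners = defaultdict(set)
    for a, b, r2 in ld_pairs:
        if r2 >= r2_min:
            partners[a].add(b)
            partners[b].add(a)

    # LD-based binning: positional SNPs plus every SNP in LD with them.
    ld_based = {}
    for gene in genes:
        members = set(positional[gene])
        for snp in positional[gene]:
            members |= partners[snp]
        ld_based[gene] = members
    return dict(positional), ld_based


snp_pos = {"rs1": ("1", 150), "rs2": ("1", 900), "rs3": ("1", 5000)}
genes = {"GENE_A": ("1", 100, 200), "GENE_B": ("1", 4000, 4500)}
ld_pairs = [("rs1", "rs2", 0.9), ("rs3", "rs2", 0.3)]
positional, ld_based = bin_snps(snp_pos, genes, ld_pairs)
```

In this toy input, rs2 lies outside GENE_A but is in strong LD with rs1, so it joins the gene's bin only under the LD-based approach.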

4.

Background

Genome-wide association studies (GWASs) and global profiling of gene expression (microarrays) are two major technological breakthroughs that allow hypothesis-free identification of candidate genes associated with tumorigenesis. It is not obvious whether there is a consistency between the candidate genes identified by GWAS (GWAS genes) and those identified by profiling gene expression (microarray genes).

Methodology/Principal Findings

We used the Cancer Genetic Markers Susceptibility database to retrieve single nucleotide polymorphisms from candidate genes for prostate cancer. In addition, we conducted a large meta-analysis of gene expression data in normal prostate and prostate tumor tissue. We identified 13,905 genes that were interrogated by both GWASs and microarrays. On the basis of P values from GWASs, we selected the 1,649 most significantly associated genes for functional annotation by the Database for Annotation, Visualization and Integrated Discovery. We also conducted functional annotation analysis using the same number of top genes identified in the meta-analysis of the gene expression data. We found that genes involved in cell adhesion were overrepresented among both the GWAS and microarray genes.

Conclusions/Significance

These results suggest that combining GWAS and microarray data is a more effective approach than analyzing either dataset alone and can help to refine the identification of candidate genes and functions associated with tumor development.

5.
Genome-wide association studies (GWASs) have recently revealed many genetic associations that are shared between different diseases. We propose a method, disPCA, for genome-wide characterization of shared and distinct risk factors between and within disease classes. It flips the conventional GWAS paradigm by analyzing the diseases themselves, across GWAS datasets, to explore their “shared pathogenetics”. The method applies principal component analysis (PCA) to gene-level significance scores across all genes and across GWASs, thereby revealing shared pathogenetics between diseases in an unsupervised fashion. Importantly, it adjusts for potential sources of heterogeneity between GWASs that can confound investigation of shared disease etiology. We applied disPCA to 31 GWASs, including autoimmune diseases, cancers, psychiatric disorders, and neurological disorders. The leading principal components separate these disease classes, as well as inflammatory bowel diseases from other autoimmune diseases. Generally, distinct diseases from the same class tend to be less separated, which is in line with their increased shared etiology. Enrichment analysis of genes contributing to leading principal components revealed pathways that are implicated in the immune system, while also pointing to pathways that have yet to be explored in this context. Our results point to the potential of disPCA in going beyond epidemiological findings of the co-occurrence of distinct diseases, to highlighting novel genes and pathways that unsupervised learning suggests to be key players in the variability across diseases.
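
The core step described here, PCA applied to a matrix of gene-level significance scores with one row per GWAS, might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the disPCA code: the function name `pca_across_gwas` and the toy score matrix are invented, and the heterogeneity adjustments mentioned in the abstract are omitted.

```python
import numpy as np


def pca_across_gwas(scores, n_components=2):
    """PCA of a (n_gwas, n_genes) matrix of gene-level significance scores.

    Returns the coordinates of each GWAS on the leading components, the
    per-gene loadings, and the fraction of variance each component explains.
    Assumes the scores are not identical across all GWASs.
    """
    X = np.asarray(scores, dtype=float)
    X = X - X.mean(axis=0)                      # center each gene across GWASs
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    coords = U[:, :n_components] * S[:n_components]   # one row per GWAS
    loadings = Vt[:n_components]                      # one row per component
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, loadings, explained[:n_components]


# Toy data (hypothetical): two GWASs from one disease class, two from another.
scores = [
    [5, 5, 1, 0, 0],
    [4, 6, 1, 0, 0],
    [0, 0, 1, 5, 5],
    [0, 0, 1, 6, 4],
]
coords, loadings, explained = pca_across_gwas(scores)
```

On the toy matrix, the first two rows land on one side of the first principal component and the last two on the other, which mirrors the separation of disease classes the abstract reports. The genes with the largest absolute loadings on a component are the ones an enrichment analysis would start from.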

6.
Variation in cystic fibrosis (CF) phenotypes, including lung disease severity, age of onset of persistent Pseudomonas aeruginosa (P. aeruginosa) lung infection, and presence of meconium ileus (MI), has been partially explained by genome-wide association studies (GWASs). It is not expected that GWASs alone are sufficiently powered to uncover all heritable traits associated with CF phenotypic diversity. Therefore, we utilized gene expression association from lymphoblastoid cell lines from 754 p.Phe508del CF-affected homozygous individuals to identify genes and pathways. LPAR6, a G protein coupled receptor, was associated with lung disease severity (false discovery rate q value = 0.0006). Additional pathway analyses, utilizing a stringent permutation-based approach, identified unique signals for all three phenotypes. Pathways associated with lung disease severity were annotated in three broad categories: (1) endomembrane function, containing p.Phe508del processing genes, providing evidence of the importance of p.Phe508del processing to explain lung phenotype variation; (2) HLA class I genes, extending previous GWAS findings in the HLA region; and (3) endoplasmic reticulum stress response genes. Expression pathways associated with lung disease were concordant, for some endosome and HLA pathways, with pathways identified using GWAS associations from 1,978 CF-affected individuals. Pathways associated with age of onset of persistent P. aeruginosa infection were enriched for HLA class II genes, and those associated with MI were related to oxidative phosphorylation. Formal testing demonstrated that genes showing differential expression associated with lung disease severity were enriched for heritable genetic variation and expression quantitative traits. Gene expression provided a powerful tool to identify unrecognized heritable variation, complementing ongoing GWASs in this rare disease.

7.
Elucidating the genetic basis of complex traits and diseases in non-European populations is particularly challenging because US minority populations have been under-represented in genetic association studies. We developed an empirical Bayes approach named XPEB (cross-population empirical Bayes), designed to improve the power for mapping complex-trait-associated loci in a minority population by exploiting information from genome-wide association studies (GWASs) from another ethnic population. Taking as input summary statistics from two GWASs—a target GWAS from an ethnic minority population of primary interest and an auxiliary base GWAS (such as a larger GWAS in Europeans)—our XPEB approach reprioritizes SNPs in the target population to compute local false-discovery rates. We demonstrated, through simulations, that whenever the base GWAS harbors relevant information, XPEB gains efficiency. Moreover, XPEB has the ability to discard irrelevant auxiliary information, providing a safeguard against inflated false-discovery rates due to genetic heterogeneity between populations. Applied to a blood-lipids study in African Americans, XPEB more than quadrupled the discoveries from the conventional approach, which used a target GWAS alone, bringing the number of significant loci from 14 to 65. Thus, XPEB offers a flexible framework for mapping complex traits in minority populations.

8.
The concentration of low-density lipoprotein (LDL) cholesterol (C) in plasma is a key determinant of cardiovascular disease risk and human genetic studies have long endeavoured to elucidate the pathways that regulate LDL metabolism. Massive genome-wide association studies (GWASs) of common genetic variation associated with LDL-C in the population have implicated SORT1 in LDL metabolism. Using experimental paradigms and standards appropriate for understanding the mechanisms by which common variants alter phenotypic expression, three recent publications have presented divergent and even contradictory findings. Interestingly, although these reports each linked SORT1 to LDL metabolism, they did not agree on a mechanism to explain the association. Here, we review recent mechanistic studies of SORT1 - the first gene identified by GWAS as a determinant of plasma LDL-C to be evaluated mechanistically.

9.
Only a small proportion of genetic variation in complex traits has been explained by SNPs from genome-wide association studies (GWASs). We report the results from two GWASs for serum markers of iron status (serum iron, serum transferrin, transferrin saturation with iron, and serum ferritin), which are important in iron overload (e.g., hemochromatosis) and deficiency (e.g., anemia) conditions. We performed two GWASs on samples of Australians of European descent. In the first GWAS, 411 adolescent twins and their siblings were genotyped with 100K SNPs. rs1830084, 10.8 kb 3′ of TF, was significantly associated with serum transferrin (p total association test = 1.0 × 10⁻⁹; p within-family test = 2.2 × 10⁻⁵). In the second GWAS on an independent sample of 459 female monozygotic (MZ) twin pairs genotyped with 300K SNPs, we found rs3811647 (within intron 11 of TF, HapMap CEU r² with rs1830084 = 0.86) was significantly associated with serum transferrin (p = 3.0 × 10⁻¹⁵). In the second GWAS, we found two additional and independent SNPs on TF (rs1799852 and rs2280673) and confirmed the known C282Y mutation in HFE to be independently associated with serum transferrin. The three variants in TF (rs3811647, rs1799852 and rs2280673) plus the HFE C282Y mutation explained ~40% of genetic variation in serum transferrin (p = 7.8 × 10⁻²⁵). These findings are potentially important for our understanding of iron metabolism and of regulation of hepatic protein secretion, and also strongly support the hypothesis that the genetic architecture of some endophenotypes may be simpler than that of disease.

10.
Although genome-wide association studies (GWASs) have discovered numerous novel genetic variants associated with many complex traits and diseases, those genetic variants typically explain only a small fraction of phenotypic variance. Factors that account for phenotypic variance include environmental factors and gene-by-environment interactions (GEIs). Recently, several studies have conducted genome-wide gene-by-environment association analyses and demonstrated important roles of GEIs in complex traits. One of the main challenges in these association studies is to control effects of population structure that may cause spurious associations. Many studies have analyzed how population structure influences statistics of genetic variants and developed several statistical approaches to correct for population structure. However, the impact of population structure on GEI statistics in GWASs has not been extensively studied, nor have methods been designed to correct GEI statistics for population structure. In this paper, we show both analytically and empirically that population structure may cause spurious GEIs and use both simulation and two GWAS datasets to support our finding. We propose a statistical approach based on mixed models to account for population structure on GEI statistics. We find that our approach effectively controls population structure on statistics for GEIs as well as for genetic variants.

11.
Genome-wide association studies (GWAS) have become a widely used approach for genetic association studies of various human traits. A few GWAS have been conducted with the goal of identifying novel loci for pigmentation traits, melanoma, and non-melanoma skin cancer. Nevertheless, the phenotype variation explained by the genetic markers identified so far is limited. In this review, we discuss GWAS design and its application in pigmentation and skin cancer research. Furthermore, we summarize recent developments in post-GWAS activities such as meta-analysis, pathway analysis, and risk prediction.

12.
To date, the widely used genome-wide association studies (GWASs) of the human genome have reported thousands of variants that are significantly associated with various human traits. However, in the vast majority of these cases, the causal variants responsible for the observed associations remain unknown. In order to facilitate the identification of causal variants, we designed a simple computational method called the "preferential linkage disequilibrium (LD)" approach, which follows the variants discovered by GWASs to pinpoint the causal variants, even if they are rare compared with the discovery variants. The approach is based on the hypothesis that the GWAS-discovered variant is better at tagging the causal variants than are most other variants evaluated in the original GWAS. Applying the preferential LD approach to the GWAS signals of five human traits for which the causal variants are already known, we successfully placed the known causal variants among the top ten candidates in the majority of these cases. Application of this method to additional GWASs, including those of hepatitis C virus treatment response, plasma levels of clotting factors, and late-onset Alzheimer disease, has led to the identification of a number of promising candidate causal variants. This method represents a useful tool for delineating causal variants by bringing together GWAS signals and the rapidly accumulating variant data from next-generation sequencing.

13.
14.
Many candidate genes have been studied for asthma, but replication has varied. Novel candidate genes have been identified for various complex diseases using genome-wide association studies (GWASs). We conducted a GWAS in 492 Mexican children with asthma, predominantly atopic by skin prick test, and their parents using the Illumina HumanHap 550 K BeadChip to identify novel genetic variation for childhood asthma. The 520,767 autosomal single nucleotide polymorphisms (SNPs) passing quality control were tested for association with childhood asthma using log-linear regression with a log-additive risk model. Eleven of the most significantly associated GWAS SNPs were tested for replication in an independent study of 177 Mexican case–parent trios with childhood-onset asthma and atopy using log-linear analysis. The chromosome 9q21.31 SNP rs2378383 (p = 7.10 × 10⁻⁶ in the GWAS), located upstream of transducin-like enhancer of split 4 (TLE4), gave a p-value of 0.03 and the same direction and magnitude of association in the replication study (combined p = 6.79 × 10⁻⁷). Ancestry analysis on chromosome 9q supported an inverse association between the rs2378383 minor allele (G) and childhood asthma. This work identifies chromosome 9q21.31 as a novel susceptibility locus for childhood asthma in Mexicans. Further, analysis of genome-wide expression data in 51 human tissues from the Novartis Research Foundation showed that median GWAS significance levels for SNPs in genes expressed in the lung differed most significantly from genes not expressed in the lung when compared to 50 other tissues, supporting the biological plausibility of our overall GWAS findings and the multigenic etiology of childhood asthma.

15.
Fisher’s partitioning of genotypic values and genetic variance is highly relevant in the current era of genome-wide association studies (GWASs). However, although this framework is more than a century old, a number of misconceptions about nonadditive genetic effects persist. We developed a user-friendly web tool, the Falconer ShinyApp, to show how the combination of gene action and allele frequencies at causal loci translates to genetic variance and genetic variance components for a complex trait. The app can be used to demonstrate the relationship between a SNP effect size estimated from GWAS and the variation the SNP generates in the population, i.e., how locus-specific effects lead to individual differences in traits. In addition, it can be used to demonstrate how within- and between-locus interactions (dominance and epistasis, respectively) usually do not lead to a large amount of nonadditive variance relative to additive variance, and therefore, that these interactions usually do not explain individual differences in a population.
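
The single-locus relationship between gene action, allele frequency, and variance components can be written out in a few lines of Python. This sketch uses the standard textbook parameterization (genotypic values +a, d, −a under Hardy-Weinberg equilibrium) and is not taken from the Falconer ShinyApp itself; the function name `variance_components` is an assumption made for illustration.

```python
def variance_components(p, a, d):
    """Additive and dominance variance at one biallelic locus.

    Textbook parameterization, assuming Hardy-Weinberg equilibrium:
      genotypic values are +a (A1A1), d (A1A2) and -a (A2A2),
      and p is the frequency of allele A1.
    """
    q = 1.0 - p
    alpha = a + d * (q - p)          # average effect of allele substitution
    v_a = 2 * p * q * alpha ** 2     # additive genetic variance
    v_d = (2 * p * q * d) ** 2       # dominance variance
    return v_a, v_d


# Complete dominance (d = a) at intermediate allele frequency.
v_a, v_d = variance_components(p=0.5, a=1.0, d=1.0)
```

With complete dominance and p = 0.5, this gives V_A = 0.5 and V_D = 0.25, so two thirds of the genetic variance at the locus is still additive, consistent with the abstract's point that dominance in gene action does not usually translate into large nonadditive variance.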

16.
Pathway-based analysis can provide information complementary to single-marker genome-wide association studies (GWASs), which typically ignore epistasis and lack sufficient power to detect rare variants. In this study, using genotypes from a genome-wide association study (GWAS), pathway-based association studies were carried out with a modified Gene Set Enrichment Algorithm (GSEA) method (GenGen) for triglyceride in 1028 unrelated European-American extremely obese females (BMI ≥ 35 kg/m²) and normal-weight controls (BMI < 25 kg/m²), and a second pathway association analysis (ICSNPathway) was used to verify the GenGen result in the same data. The GO0009110 pathway (vitamin anabolism) was among the strongest associations with triglyceride (empirical P < 0.001); the result remained significant after FDR correction (P = 0.022). MMAB, an obesity-related locus, is included in this pathway. The ABCG1 and BCL6 genes were found in several triglyceride-related pathways (empirical P < 0.05), which were also replicated by ICSNPathway (empirical P < 0.05, FDR < 0.05). We also performed a single-marker GWAS using PLINK for TG levels (log-transformed). Significant associations were found between ASTN2 gene SNPs and plasma triglyceride levels (rs7035794, P = 2.24 × 10⁻¹⁰). Our study suggests that the vitamin anabolism pathway, BCL6 gene pathways, and the ASTN2 gene may contribute to the genetic variation of plasma triglyceride concentrations.

17.
Chen R  Davydov EV  Sirota M  Butte AJ 《PloS one》2010,5(10):e13574
Many DNA variants have been identified for more than 300 diseases and traits using Genome-Wide Association Studies (GWASs). Some have been validated using deep sequencing, but many fewer have been validated functionally, and functional work has focused primarily on non-synonymous coding SNPs (nsSNPs). It is an open question whether synonymous coding SNPs (sSNPs) and other non-coding SNPs can lead to odds ratios as high as those of nsSNPs. We conducted a broad survey across 21,429 disease-SNP associations curated from 2,113 publications studying human genetic association, and found that nsSNPs and sSNPs shared similar likelihood and effect size for disease association. The enrichment of disease-associated SNPs around the 80th base in the first introns might provide an effective way to prioritize intronic SNPs for functional studies. We further found that the likelihood of disease association was positively associated with the effect size across different types of SNPs, and SNPs in the 3' untranslated regions, such as the microRNA binding sites, might be under-investigated. Our results suggest that sSNPs are just as likely to be involved in disease mechanisms, so we recommend that sSNPs discovered from GWAS should also be examined with functional studies.

18.
19.
Recent developments in sequencing technologies have made it possible to uncover both rare and common genetic variants. Genome-wide association studies (GWASs) can test for the effect of common variants, whereas sequence-based association studies can evaluate the cumulative effect of both rare and common variants on disease risk. Many groupwise association tests, including burden tests and variance-component tests, have been proposed for this purpose. Although such tests do not exclude common variants from their evaluation, they focus mostly on testing the effect of rare variants by upweighting rare-variant effects and downweighting common-variant effects and can therefore lose substantial power when both rare and common genetic variants in a region influence trait susceptibility. There is increasing evidence that the allelic spectrum of risk variants at a given locus might include novel, rare, low-frequency, and common genetic variants. Here, we introduce several sequence kernel association tests to evaluate the cumulative effect of rare and common variants. The proposed tests are computationally efficient and are applicable to both binary and continuous traits. Furthermore, they can readily combine GWAS and whole-exome-sequencing data on the same individuals, when available, and are also applicable to deep-resequencing data of GWAS loci. We evaluate these tests on data simulated under comprehensive scenarios and show that compared with the most commonly used tests, including the burden and variance-component tests, they can achieve substantial increases in power. We next show applications to sequencing studies for Crohn disease and autism spectrum disorders. The proposed tests have been incorporated into the software package SKAT.

20.
Genome‐wide association studies (GWASs) are essential to determine the genetic bases of either ecological or economic phenotypic variation across individuals within populations of model and nonmodel organisms. For this research question, it is common to replicate a GWAS with different parameters and models to validate the reproducibility of the results. However, straightforward methodologies that manage both replication and tetraploid data are still missing. To solve this problem, we designed MultiGWAS, a tool that performs GWAS for diploid and tetraploid organisms by executing in parallel four software packages, two designed for polyploid data (GWASpoly and SHEsis) and two designed for diploid data (GAPIT and TASSEL). MultiGWAS has several advantages. It runs either in the command line or in a graphical interface; it manages different genotype formats, including VCF. Moreover, it allows control for population structure, relatedness, and several quality control checks on genotype data. Besides, MultiGWAS can test for additive and dominant gene action models, and, through a proprietary scoring function, select the best model to report its associations. Finally, it generates several reports that facilitate identifying false associations from both the significant and the best‐ranked association Single Nucleotide Polymorphisms (SNPs) among the four software packages. We tested MultiGWAS with public tetraploid potato data for tuber shape and several simulated data under both additive and dominant models. These tests demonstrated that MultiGWAS is better at detecting reliable associations than using each of the four software packages individually. Moreover, the parallel analysis of polyploid and diploid software, which only MultiGWAS offers, demonstrates its utility in understanding the best genetic model behind the SNP association in tetraploid organisms. Therefore, MultiGWAS proved to be an excellent alternative for wrapping GWAS replication in diploid and tetraploid organisms in a single analysis environment.
