Similar literature
 Found 20 similar documents; search took 15 ms
1.
Pathway analyses of genome-wide association studies aggregate information over sets of related genes, such as genes in common pathways, to identify gene sets that are enriched for variants associated with disease. We develop a model-based approach to pathway analysis, and apply this approach to data from the Wellcome Trust Case Control Consortium (WTCCC) studies. Our method offers several benefits over existing approaches. First, our method not only interrogates pathways for enrichment of disease associations, but also estimates the level of enrichment, which yields a coherent way to promote variants in enriched pathways, enhancing discovery of genes underlying disease. Second, our approach allows for multiple enriched pathways, a feature that leads to novel findings in two diseases where the major histocompatibility complex (MHC) is a major determinant of disease susceptibility. Third, by modeling disease as the combined effect of multiple markers, our method automatically accounts for linkage disequilibrium among variants. Interrogation of pathways from eight pathway databases yields strong support for enriched pathways, indicating links between Crohn's disease (CD) and cytokine-driven networks that modulate immune responses; between rheumatoid arthritis (RA) and “Measles” pathway genes involved in immune responses triggered by measles infection; and between type 1 diabetes (T1D) and IL2-mediated signaling genes. Prioritizing variants in these enriched pathways yields many additional putative disease associations compared to analyses without enrichment. For CD and RA, 7 of 8 additional non-MHC associations are corroborated by other studies, providing validation for our approach. For T1D, prioritization of IL-2 signaling genes yields strong evidence for 7 additional non-MHC candidate disease loci, as well as suggestive evidence for several more. Of the 7 strongest associations, 4 are validated by other studies, and 3 (near IL-2 signaling genes RAF1, MAPK14, and FYN) constitute novel putative T1D loci for further study.

2.
Type 2 diabetes (T2D) is a complex metabolic disease that is more prevalent in ethnic groups such as Mexican Americans, and is strongly associated with the risk factors obesity and insulin resistance. The goal of this study was to perform whole genome gene expression profiling in adipose tissue to detect common patterns of gene regulation associated with obesity and insulin resistance. We used phenotypic and genotypic data from 308 Mexican American participants from the Veterans Administration Genetic Epidemiology Study (VAGES). Basal fasting RNA was extracted from adipose tissue biopsies from a subset of 75 unrelated individuals, and gene expression data were generated on the Illumina BeadArray platform. The number of gene probes with significant expression above baseline was approximately 31,000. We performed multiple regression analysis of all probes with 15 metabolic traits. Adipose tissue had 3,012 genes significantly associated with the traits of interest (false discovery rate, FDR ≤ 0.05). The significance of gene expression changes was used to select 52 genes with significant (FDR ≤ 10⁻⁴) gene expression changes across multiple traits. Gene set/pathway analysis identified one gene, alcohol dehydrogenase 1B (ADH1B), that was significantly enriched (P < 10⁻⁶⁰) as a prime candidate for involvement in multiple relevant metabolic pathways. Illumina BeadChip-derived ADH1B expression data were consistent with quantitative real-time PCR data. We observed significant inverse correlations with waist circumference (2.8 × 10⁻⁹), BMI (5.4 × 10⁻⁶), and fasting plasma insulin (P < 0.001). These findings are consistent with a central role for ADH1B in obesity and insulin resistance and provide evidence for a novel genetic regulatory mechanism for human metabolic diseases related to these traits.
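The FDR thresholds quoted in this abstract can be illustrated with a small sketch. The abstract does not say which FDR procedure was used; the Benjamini-Hochberg step-up procedure is assumed here, and the function name and toy p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return booleans marking which p-values pass FDR control at level alpha.

    Assumption: Benjamini-Hochberg step-up procedure (not stated in the abstract).
    """
    m = len(p_values)
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k whose p-value satisfies p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every hypothesis ranked at or below k is rejected.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values for eight probes; not data from the study.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(toy_p, alpha=0.05))
```

With these toy values only the two smallest p-values pass, because the third-ranked value (0.039) exceeds its rank-scaled threshold of 3/8 × 0.05.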

3.
4.
Type 2 diabetes (T2D) is a complex metabolic disease associated with obesity, insulin resistance and hypoinsulinemia due to pancreatic β-cell dysfunction. Reduced mitochondrial function is thought to be central to β-cell dysfunction. Mitochondrial dysfunction and reduced insulin secretion are also observed in β-cells of humans with the most common human genetic disorder, Down syndrome (DS, Trisomy 21). To identify regions of chromosome 21 that may be associated with perturbed glucose homeostasis we profiled the glycaemic status of different DS mouse models. The Ts65Dn and Dp16 DS mouse lines were hyperglycemic, while Tc1 and Ts1Rhr mice were not, providing us with a region of chromosome 21 containing genes that cause hyperglycemia. We then examined whether any of these genes were upregulated in a set of ~5,000 gene expression changes we had identified in a large gene expression analysis of human T2D β-cells. This approach produced a single gene, RCAN1, as a candidate gene linking hyperglycemia and functional changes in T2D β-cells. Further investigations demonstrated that RCAN1 methylation is reduced in human T2D islets at multiple sites, correlating with increased expression. RCAN1 protein expression was also increased in db/db mouse islets and in human and mouse islets exposed to high glucose. Mice overexpressing RCAN1 had reduced in vivo glucose-stimulated insulin secretion and their β-cells displayed mitochondrial dysfunction including hyperpolarised membrane potential, reduced oxidative phosphorylation and low ATP production. This lack of β-cell ATP had functional consequences by negatively affecting both glucose-stimulated membrane depolarisation and ATP-dependent insulin granule exocytosis. Thus, from amongst the myriad of gene expression changes occurring in T2D β-cells where we had little knowledge of which changes cause β-cell dysfunction, we applied a trisomy 21 screening approach which linked RCAN1 to β-cell mitochondrial dysfunction in T2D.

5.
Although they have demonstrated success in searching for common variants for complex diseases, genome-wide association (GWA) studies are less successful in detecting rare genetic variants because of the poor statistical power of most current methods. We developed a two-stage method that can be applied to GWA studies for detecting rare variants. Here we report the results of applying this two-stage method to the Wellcome Trust Case Control Consortium (WTCCC) dataset, which includes seven complex diseases: bipolar disorder, cardiovascular disease, hypertension (HT), rheumatoid arthritis, Crohn’s disease, type 1 diabetes and type 2 diabetes (T2D). We identified 24 genes or regions that reach genome-wide significance. Eight of them are novel and were not reported in the WTCCC study. The cumulative risk (or protective) haplotype frequency for each of the 8 genes or regions is small, being at most 11%. For each of the novel genes, the risk (or protective) haplotype set cannot be tagged by the common SNPs available in chips (r² < 0.32). The gene identified in HT was further replicated in the Framingham Heart Study, and is also significantly associated with T2D. Our analysis suggests that searching for rare genetic variants is feasible in current GWA studies and candidate gene studies, and the results can serve as guides to future resequencing studies to identify the underlying rare functional variants.

6.
Mitochondrial dysfunction has been observed in skeletal muscle of people with diabetes and insulin-resistant individuals. Furthermore, inherited mutations in mitochondrial DNA can cause a rare form of diabetes. However, it is unclear whether mitochondrial dysfunction is a primary cause of the common form of diabetes. To date, common genetic variants robustly associated with type 2 diabetes (T2D) are not known to affect mitochondrial function. One possibility is that multiple mitochondrial genes contain modest genetic effects that collectively influence T2D risk. To test this hypothesis we developed a method named Meta-Analysis Gene-set Enrichment of variaNT Associations (MAGENTA; http://www.broadinstitute.org/mpg/magenta). MAGENTA, in analogy to Gene Set Enrichment Analysis, tests whether sets of functionally related genes are enriched for associations with a polygenic disease or trait. MAGENTA was specifically designed to exploit the statistical power of large genome-wide association (GWA) study meta-analyses whose individual genotypes are not available. This is achieved by combining variant association p-values into gene scores and then correcting for confounders, such as gene size, variant number, and linkage disequilibrium properties. Using simulations, we determined the range of parameters for which MAGENTA can detect associations likely missed by single-marker analysis. We verified MAGENTA's performance on empirical data by identifying known relevant pathways in lipid and lipoprotein GWA meta-analyses. We then tested our mitochondrial hypothesis by applying MAGENTA to three gene sets: nuclear regulators of mitochondrial genes, oxidative phosphorylation genes, and ∼1,000 nuclear-encoded mitochondrial genes. The analysis was performed using the most recent T2D GWA meta-analysis of 47,117 people and meta-analyses of seven diabetes-related glycemic traits (up to 46,186 non-diabetic individuals). This well-powered analysis found no significant enrichment of associations to T2D or any of the glycemic traits in any of the gene sets tested. These results suggest that common variants affecting nuclear-encoded mitochondrial genes have at most a small genetic contribution to T2D susceptibility.
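The general idea of turning variant p-values into gene scores and then testing a gene set for enrichment can be sketched as follows. This is not the MAGENTA implementation: the sketch assumes the gene score is simply the smallest variant p-value, applies none of the confounder corrections the abstract describes (gene size, variant number, linkage disequilibrium), and uses a hypothetical top-5% cutoff with random gene sets as the null. All names and data are illustrative.

```python
import random


def gene_scores(variant_p_by_gene):
    """Score each gene by its smallest variant p-value (no confounder correction)."""
    return {gene: min(p_values) for gene, p_values in variant_p_by_gene.items()}


def enrichment_p(scores, gene_set, top_fraction=0.05, n_perm=2000, seed=0):
    """Permutation test: are gene_set members over-represented among top-scoring genes?

    Returns (observed count of set members in the top fraction, empirical p-value).
    """
    rng = random.Random(seed)
    genes = sorted(scores)
    n_top = max(1, round(len(genes) * top_fraction))
    # Genes with the smallest p-value scores form the "top" group.
    top = set(sorted(genes, key=lambda g: scores[g])[:n_top])
    members = [g for g in gene_set if g in scores]
    observed = sum(g in top for g in members)
    hits = 0
    for _ in range(n_perm):
        # Null distribution: random gene sets of the same size.
        sample = rng.sample(genes, len(members))
        if sum(g in top for g in sample) >= observed:
            hits += 1
    # Add-one correction keeps the empirical p-value above zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: 100 genes with two variant p-values each.
toy = {f"g{i}": [(i + 1) / 101, 0.9] for i in range(100)}
scores = gene_scores(toy)
enriched = enrichment_p(scores, ["g0", "g1", "g2", "g3", "g4"])
not_enriched = enrichment_p(scores, ["g50", "g60", "g70", "g80", "g90"])
print(enriched, not_enriched)
```

A set with no members among the top-scoring genes gets an empirical p-value of exactly 1.0, since every random set matches or exceeds an observed count of zero.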

7.

Objective

Candidate genes for non-alcoholic fatty liver disease (NAFLD) identified by a bioinformatics approach were examined for variant associations to quantitative traits of NAFLD-related phenotypes.

Research Design and Methods

By integrating public database text mining, trans-organism protein-protein interaction transferal, and information on liver protein expression, a protein-protein interaction network was constructed, and from this a smaller isolated interactome was identified. Five genes from this interactome were selected for genetic analysis. Twenty-one tag single-nucleotide polymorphisms (SNPs), which captured all common variation in these genes, were genotyped in 10,196 Danes and analyzed for association with NAFLD-related quantitative traits, type 2 diabetes (T2D), central obesity, and WHO-defined metabolic syndrome (MetS).

Results

A total of 273 genes were included in the protein-protein interaction analysis, and EHHADH, ECHS1, HADHA, HADHB, and ACADL were selected for further examination. A total of 10 nominally statistically significant associations (P<0.05) with quantitative metabolic traits were identified. Also, the case-control study showed associations between variation in the five genes and T2D, central obesity, and MetS, respectively. Bonferroni adjustments for multiple testing negated all associations.

Conclusions

Using a bioinformatics approach we identified five candidate genes for NAFLD. However, we failed to provide evidence of associations with major effects between SNPs in these five genes and NAFLD-related quantitative traits, T2D, central obesity, and MetS.

8.
9.
Type 2 diabetes (T2D) is a syndrome of multiple metabolic disorders and is genetically heterogeneous. India comprises one of the largest global populations with the highest number of reported type 2 diabetes cases. However, limited information about T2D-associated loci is available for Indian populations. It is, therefore, pertinent to evaluate the previously associated candidates as well as identify novel genetic variations in Indian populations to understand the extent of genetic heterogeneity. We used cost-effective, high-throughput mass-array genotyping to study candidate gene variations associated with T2D in the literature. In this case-control candidate gene association study, 91 SNPs from 55 candidate genes were analyzed in three geographically independent population groups from India. We report that genetic variants in five candidate genes (TCF7L2, HHEX, ENPP1, IDE and FTO) are significantly associated (after Bonferroni correction, p<5.5E−04) with T2D susceptibility in the combined population. Interestingly, SNP rs7903146 of the TCF7L2 gene passed the genome-wide significance threshold (combined P value = 2.05E−08) in the studied populations. We also observed the association of rs7903146 with blood glucose (fasting and postprandial) levels, supporting the role of the TCF7L2 gene in blood glucose homeostasis. Further, we noted that the moderate risk conferred by the independently associated loci in the combined population, with odds ratio (OR) < 1.38, increased to OR = 2.44 (95% CI = 1.67–3.59) when the risk-conferring genotypes of the TCF7L2, HHEX, ENPP1 and FTO genes were combined, suggesting the importance of evaluating gene-gene interactions in complex disorders like T2D.
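The Bonferroni threshold quoted above is consistent with dividing a significance level of 0.05 by the 91 SNPs tested; that reading is an inference from the numbers in the abstract, not something it states. The sketch below checks the arithmetic and also shows one common way to compute an odds ratio with a 95% confidence interval from a 2×2 table (Woolf's log method, assumed here). The genotype counts are hypothetical and are not the study's data.

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table (Woolf's log method).

    a: cases with risk genotype      b: cases without it
    c: controls with risk genotype   d: controls without it
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Assumed: alpha = 0.05 and one test per SNP (91 SNPs, as in the abstract).
threshold = bonferroni_threshold(0.05, 91)
print(f"{threshold:.2e}")

# Hypothetical counts for illustration only.
print(odds_ratio_ci(60, 40, 40, 60))
```

The computed threshold rounds to 5.5E−04, matching the value given in the abstract.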

10.
The Goto-Kakizaki (GK) rat, which has been developed by repeated inbreeding of glucose-intolerant Wistar rats, is the most widely studied rat model for Type 2 diabetes (T2D). However, the detailed genetic background of the T2D phenotype in GK rats is still largely unknown. We report a survey of T2D susceptibility variations based on high-quality whole genome sequencing of GK and Wistar rats, which generated a list of GK-specific variations (228 structural variations, 2660 CNV amplifications and 2834 CNV deletions, 1796 protein-affecting SNVs or indels) by comparative genome analysis and identified 192 potential T2D-associated genes. The genes with variants were further refined with prior knowledge and public resources, including variant polymorphism of rat strains, protein-protein interactions and differential gene expression. Finally, we identified 15 mutated genes, which include seven known T2D-related genes (Tnfrsf1b, Scg5, Fgb, Sell, Dpp4, Icam1, and Pkd2l1) and eight high-confidence new candidate genes (Ldlr, Ccl2, Erbb3, Akr1b1, Pik3c2a, Cd5, Eef2k, and Cpd). Our results reveal that the T2D phenotype may be caused by the accumulation of multiple variations in the GK rat, and that the mutated genes may affect biological functions including adipocytokine signaling, glycerolipid metabolism, PPAR signaling, T cell receptor signaling and insulin signaling pathways. We present the genomic difference between two closely related rat strains (GK and Wistar) and narrow down the scope of susceptibility loci. Further experimental study is required to understand and validate the relationship between our candidate variants and the T2D phenotype. Our findings highlight the importance of sequence-based comparative genomics for investigating disease susceptibility loci in inbred animal models.

11.
12.
Numerous prognostic gene expression signatures for breast cancer were generated previously, with little overlap and limited insight into the biology of the disease. Here we introduce a novel algorithm named SCoR (Survival analysis using Cox proportional hazard regression and Random resampling) that applies random resampling and clustering methods to identify gene features correlated with time-to-event data. This is shown to reduce overfitting to noise in microarray data analysis and to discover functional gene sets linked to patient survival. SCoR independently identified a common poor prognostic signature composed of cell proliferation genes from six out of eight breast cancer datasets. Furthermore, a sequential SCoR analysis on highly proliferative breast cancers repeatedly identified T/B cell markers as favorable prognosis factors. In glioblastoma, SCoR identified a common good prognostic signature of chromosome 10 genes from two gene expression datasets (TCGA and REMBRANDT), recapitulating the fact that loss of one copy of chromosome 10 (which harbors the tumor suppressor PTEN) is linked to poor survival in glioblastoma patients. SCoR also identified prognostic genes on sex chromosomes in lung adenocarcinomas, suggesting patient gender might be used to predict outcome in this disease. These results demonstrate the power of SCoR to identify common and biologically meaningful prognostic gene expression signatures.

13.
In recent years, the search for genetic determinants of type 2 diabetes (T2D) has changed dramatically. Although linkage and small-scale candidate gene studies were highly successful in the identification of genes, which, when mutated, caused monogenic forms of T2D, they were largely unsuccessful when applied to the more common forms of the disease. To date, these approaches have only identified two loci (PPARG, KCNJ11) robustly implicated in T2D susceptibility. The ability to perform large-scale association analysis, including genome-wide association studies (GWAS) in many thousands of samples from different populations, and subsequently, the shift to form large international collaborations to perform meta-analyses across many studies has taken the number of independent loci showing genome-wide significant associations with T2D to 44. This number includes six loci identified initially through the analysis of quantitative glycaemic phenotypes, illustrating the usefulness of this approach both to identify new disease genes and gain insight into the mechanisms leading to disease. Combined, these loci still only account for ~10% of the observed familial clustering in Europeans, leaving much of the variance unexplained. In this review, we will describe what GWAS have taught us about the genetic basis of T2D and discuss possible next steps to uncover the remaining heritability.

14.
Revealing mechanisms underlying complex diseases poses great challenges to biologists. Traditional linkage and linkage disequilibrium analyses, which have been successful in identifying genes responsible for Mendelian traits, have not led to similar success in discovering genes influencing the development of complex diseases. Emerging functional genomic and proteomic ('omic') resources and technologies provide great opportunities to develop new methods for systematic identification of genes underlying complex diseases. In this report, we propose a systems biology approach, which integrates omic data, to find genes responsible for complex diseases. This approach consists of five steps: (1) generate a set of candidate genes using gene-gene interaction data sets; (2) reconstruct a genetic network with the set of candidate genes from gene expression data; (3) identify differentially regulated genes between normal and abnormal samples in the network; (4) validate regulatory relationships between the genes in the network by perturbing the network using RNAi and monitoring the response using RT-PCR; and (5) genotype the differentially regulated genes and test their association with the diseases by direct association studies. As a proof of concept, the proposed approach is applied to genetic studies of the autoimmune disease scleroderma, or systemic sclerosis.

15.
16.
17.
The conventional approach of candidate gene studies in complex diseases is to look at the effect of one gene at a time. However, as the outcome of chronic diseases is influenced by a large number of alleles, simultaneous analysis is needed. We demonstrate the application of multivariate regression and cluster analysis to a multiple sclerosis (MS) dataset with genotypes for 489 patients at 11 candidate genes selected for their involvement in the immune response. Using multivariate regression, we observed that different sets of genes were associated with different disease characteristics that reflect different aspects of disease. Out of 15 polymorphisms, we identified one that contributed to the severity of disease. In addition, the set of 15 polymorphisms was predictive for yearly increase in lesion volume as seen on T1-weighted MRI (p=0.044). From this set, no individual polymorphisms could be identified after adjustment for multiple hypothesis testing. By means of a cluster analysis, we aimed to identify subgroups of patients with different pathogenic subtypes of MS on the basis of their genetic profile. We constructed genetic profiles from the genotypes at the 11 candidate genes. The approach proved to be feasible. We observed three clusters in the sample of patients. In this study, we observed no significant differences in the usual clinical and MRI outcome measures between the different clusters. However, a number of consistent trends indicated that this clustering might be related to the course of disease. With a larger number of genes regulating the course of disease, we may be able to identify clinically relevant clusters. The analyses are easily implemented and will be applicable to candidate gene studies of complex traits in general.

18.
There is a rapid rise in cases of type 2 diabetes mellitus (T2DM) globally, irrespective of geography, ethnicity or any other variable factors. The molecular mechanisms that could cause T2DM need to be analysed more thoroughly to understand the clinical manifestations and to derive better therapeutic regimes. Bioinformatics tools are used to identify the key causative gene elements and their possible therapeutic agents. Microarray datasets were retrieved from the Gene Expression Omnibus database and studied using R to derive differentially expressed genes (DEGs). After comparison of the expressed genes with disease-specific genes in DisGeNET, the final annotated genes were taken for analysis. Gene Ontology studies, protein–protein interaction (PPI) analysis, co-expression analysis and gene-drug interaction analysis were performed to narrow down the hub genes and to identify novelty among the genes analysed so far. In vivo and in vitro analysis of key genes and tracing of the interaction pathway are crucial to better understand the unique outcomes from the novel genes, forming the basis for understanding the pathway that ends up causing T2DM. Afterwards, docking was executed, enabling recognition of the interacting residues involved in inhibition. The docked complexes CCL5-265 and CD8A-40585 showed the best results, as is evident from their PCA analysis and MMGBSA calculation. There is now scope for deriving candidate drugs that could possibly lead to personalized therapies for T2DM.

19.
Insertional mutagenesis screens in mice are used to identify individual genes that drive tumor formation. In these screens, candidate cancer genes are identified if their genomic location is proximal to a common insertion site (CIS) defined by high rates of transposon or retroviral insertions in a given genomic window. In this article, we describe a new method for defining CISs based on a Poisson distribution, the Poisson Regression Insertion Model, and show that this new method is an improvement over previously described methods. We also describe a modification of the method that can identify pairs and higher orders of co-occurring common insertion sites. We apply these methods to two data sets, one generated in a transposon-based screen for gastrointestinal tract cancer genes and another based on the set of retroviral insertions in the Retroviral Tagged Cancer Gene Database. We show that the new methods identify more relevant candidate genes and candidate gene pairs than found using previous methods. Identification of the biologically relevant set of mutations that occur in a single cell and cause tumor progression will aid in the rational design of single and combinatorial therapies in the upcoming age of personalized cancer therapy.
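The idea of flagging a genomic window whose insertion count is improbably high under a Poisson background can be sketched as below. This is not the Poisson Regression Insertion Model described in the abstract; it is a much simpler tail test that assumes a uniform background rate across equal-sized windows and a Bonferroni cutoff. Function names and counts are hypothetical.

```python
import math


def poisson_sf(k, lam):
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def candidate_windows(counts, alpha=0.05):
    """Return indices of windows with more insertions than the background explains.

    Assumptions: equal-sized windows, a uniform background rate estimated as the
    mean count per window, and a Bonferroni-corrected cutoff.
    """
    n_windows = len(counts)
    lam = sum(counts) / n_windows
    threshold = alpha / n_windows
    return [i for i, k in enumerate(counts) if poisson_sf(k, lam) <= threshold]


# Hypothetical insertion counts in ten windows; window 4 is a hotspot.
toy_counts = [1, 0, 2, 1, 12, 0, 1, 1, 0, 2]
print(candidate_windows(toy_counts))
```

Note that the hotspot itself inflates the estimated background rate here, which makes this simple test conservative; a regression-based model can account for such effects more carefully.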

20.
Linkage analysis is a successful procedure to associate diseases with specific genomic regions. These regions are often large, containing hundreds of genes, which makes experimental methods employed to identify the disease gene arduous and expensive. We present two methods to prioritize candidates for further experimental study: Common Pathway Scanning (CPS) and Common Module Profiling (CMP). CPS is based on the assumption that common phenotypes are associated with dysfunction in proteins that participate in the same complex or pathway. CPS applies network data derived from protein–protein interaction (PPI) and pathway databases to identify relationships between genes. CMP identifies likely candidates using a domain-dependent sequence similarity approach, based on the hypothesis that disruption of genes of similar function will lead to the same phenotype. Both algorithms use two forms of input data: known disease genes or multiple disease loci. When using known disease genes as input, our combined methods have a sensitivity of 0.52 and a specificity of 0.97 and reduce the candidate list by 13-fold. Using multiple loci, our methods successfully identify disease genes for all benchmark diseases with a sensitivity of 0.84 and a specificity of 0.63. Our combined approach prioritizes good candidates and will accelerate the disease gene discovery process.
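For readers unfamiliar with the metrics quoted above, the following sketch shows how sensitivity and specificity are computed from classification counts. The counts are hypothetical, chosen only so that the arithmetic reproduces the reported rates of 0.52 and 0.97; they are not the benchmark data from the study.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of real disease genes that the method selects."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of non-disease genes that the method correctly leaves out."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only.
sens = sensitivity(true_pos=52, false_neg=48)
spec = specificity(true_neg=970, false_pos=30)
print(sens, spec)
```

High specificity with moderate sensitivity, as in the known-disease-gene setting, means few false candidates are kept at the cost of missing some true disease genes.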


Copyright © Beijing Qinyun Technology Development Co., Ltd.  ICP registration: 京ICP备09084417号