Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Genome-wide association studies (GWAS) for Autism Spectrum Disorder (ASD) have thus far met with limited success in the identification of common risk variants, consistent with the notion that variants with small individual effects cannot be detected individually in single SNP analysis. To further capture disease risk gene information from ASD association studies, we applied a network-based strategy to the Autism Genome Project (AGP) and the Autism Genetics Resource Exchange GWAS datasets, combining family-based association data with Human Protein-Protein interaction (PPI) data. Our analysis showed that autism-associated proteins at higher than conventional levels of significance (P<0.1) directly interact more than random expectation and are involved in a limited number of interconnected biological processes, indicating that they are functionally related. The functionally coherent networks generated by this approach contain ASD-relevant disease biology, as demonstrated by an improved positive predictive value and sensitivity in retrieving known ASD candidate genes relative to the top associated genes from either GWAS, as well as a higher gene overlap between the two ASD datasets. Analysis of the intersection between the networks obtained from the two ASD GWAS and six unrelated disease datasets identified fourteen genes exclusively present in the ASD networks. These are mostly novel genes involved in abnormal nervous system phenotypes in animal models, and in fundamental biological processes previously implicated in ASD, such as axon guidance, cell adhesion or cytoskeleton organization. Overall, our results highlighted novel susceptibility genes previously hidden within GWAS statistical “noise” that warrant further analysis for causal variants.
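The claim in the abstract above that associated proteins "directly interact more than random expectation" can be illustrated with a small permutation test. This is only a sketch under stated assumptions: the gene names, the toy edge list, and the null model (uniform random gene sets of the same size) are hypothetical and are not taken from the study, whose randomization scheme the abstract does not describe.

```python
import random

def count_direct_interactions(genes, ppi_edges):
    """Count PPI edges whose two endpoints are both in `genes`."""
    gene_set = set(genes)
    return sum(1 for a, b in ppi_edges if a in gene_set and b in gene_set)

def permutation_p_value(candidates, all_genes, ppi_edges, n_perm=1000, seed=0):
    """Empirical P-value: share of random gene sets at least as connected."""
    rng = random.Random(seed)
    observed = count_direct_interactions(candidates, ppi_edges)
    hits = 0
    for _ in range(n_perm):
        # Assumed null model: a random gene set of the same size.
        sample = rng.sample(all_genes, len(candidates))
        if count_direct_interactions(sample, ppi_edges) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Hypothetical toy data: 20 genes, a densely connected group G0..G3.
all_genes = ["G%d" % i for i in range(20)]
ppi_edges = [("G0", "G1"), ("G0", "G2"), ("G1", "G2"), ("G2", "G3"),
             ("G10", "G11"), ("G15", "G16")]
candidates = ["G0", "G1", "G2", "G3"]

observed, p_value = permutation_p_value(candidates, all_genes, ppi_edges)
print(observed, p_value)
```

Uniform sampling ignores the fact that some proteins have many more known partners than others; a degree-matched null would be a stricter choice for real data.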

2.
The concept of “housekeeping gene” has been used for four decades but remains loosely defined. Housekeeping genes are commonly described as “essential for cellular existence regardless of their specific function in the tissue or organism”, and “stably expressed irrespective of tissue type, developmental stage, cell cycle state, or external signal”. However, experimental support for the tenet that gene essentiality is linked to stable expression across cell types, conditions, and organisms has been limited. Here we use genome-scale functional genomic screens together with bulk and single-cell sequencing technologies to test this link and optimize a quantitative and experimentally validated definition of housekeeping gene. Using the optimized definition, we identify, characterize, and provide as resources, housekeeping gene lists extracted from several human datasets, and 10 other animal species that include primates, chicken, and C. elegans. We find that stably expressed genes are not necessarily essential, and that the individual genes that are essential and stably expressed can considerably differ across organisms; yet the pathways enriched among these genes are conserved. Further, the level of conservation of housekeeping genes across the analyzed organisms captures their taxonomic groups, showing evolutionary relevance for our definition. Therefore, we present a quantitative and experimentally supported definition of housekeeping genes that can contribute to better understanding of their unique biological and evolutionary characteristics.

3.
4.
5.
Large-scale systematic analysis of gene essentiality is an important step toward unraveling the complex relationship between genotypes and phenotypes. Such analysis cannot be accomplished without unbiased and accurate annotations of essential genes. In current genomic databases, most of the essential gene annotations are derived from whole-genome transposon mutagenesis (TM), the most frequently used experimental approach for determining essential genes in microorganisms under defined conditions. However, there are substantial systematic biases associated with TM experiments. In this study, we developed a novel Poisson model–based statistical framework to simulate the TM insertion process and subsequently correct the experimental biases. We first quantitatively assessed the effects of major factors that potentially influence the accuracy of TM and subsequently incorporated relevant factors into the framework. By iteratively optimizing parameters, we inferred the insertion events that actually occurred and expressed each gene’s essentiality as a probability. Evaluated against the definitive mapping of the essential gene profile in Escherichia coli, our model significantly improved the accuracy of the original TM datasets, resulting in more accurate annotations of essential genes. Our method also showed encouraging results in improving subsaturation-level TM datasets. To test our model’s broad applicability to other bacteria, we applied it to Pseudomonas aeruginosa PAO1 and Francisella tularensis novicida TM datasets. We validated our predictions against the literature as well as by allelic exchange experiments in PAO1. Our model was correct on six of the seven tested genes. Remarkably, in all three cases where our predictions contradicted the TM assignments, experimental validation supported our predictions. In summary, our method will be a promising tool for improving genomic annotations of essential genes and enabling large-scale explorations of gene essentiality.
Our contribution is timely considering the rapidly increasing number of essential gene sets. A web server has been set up to provide convenient access to this tool. All results and source code are available for download upon publication at http://research.cchmc.org/essentialgene/.
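To make the Poisson idea in the abstract above concrete, here is a much simplified sketch, not the authors' framework. It assumes that insertions land uniformly along the genome, that an essential gene never tolerates an insertion, and that a prior fraction of essential genes is known; the function names and the default prior of 0.1 are hypothetical.

```python
import math

def prob_no_insertion(gene_length_bp, n_insertions, genome_length_bp):
    """P(zero insertions in a non-essential gene) under a uniform Poisson model."""
    rate_per_bp = n_insertions / genome_length_bp  # assumed uniform insertion rate
    return math.exp(-rate_per_bp * gene_length_bp)

def posterior_essential(gene_length_bp, n_insertions, genome_length_bp, prior=0.1):
    """Bayes update for a gene observed with zero insertions.

    Assumes P(zero insertions | essential) = 1 and a hypothetical prior.
    """
    p0 = prob_no_insertion(gene_length_bp, n_insertions, genome_length_bp)
    return prior / (prior + (1.0 - prior) * p0)

# Hypothetical library: 4,000 insertions in a 4 Mb genome.
short_gene = posterior_essential(300, 4000, 4_000_000)
long_gene = posterior_essential(3000, 4000, 4_000_000)
print(short_gene, long_gene)
```

The sketch shows why short genes are a known source of bias: an insertion-free short gene is weak evidence of essentiality, because it could easily have been missed by chance.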

6.
7.
Functional environmental screening of metagenomic libraries is a powerful means to identify and assign function to novel genes and their encoded proteins without any prior sequence knowledge. In the current study we describe the identification and subsequent analysis of a salt-tolerant clone from a human gut metagenomic library. Following transposon mutagenesis we identified an unknown gene (stlA, for “salt tolerance locus A”) with no currently known homologues in the databases. Subsequent cloning and expression in Escherichia coli MKH13 revealed that stlA confers a salt tolerance phenotype in its surrogate host. Furthermore, a detailed in silico analysis was also conducted to gain additional information on the properties of the encoded StlA protein. The stlA gene is rare when searched against human metagenome datasets such as MetaHit and the Human Microbiome Project and represents a novel and unique salt tolerance determinant which appears to be found exclusively in the human gut environment.

8.

Background

Recent studies showed that polymorphisms in the Fat and Obesity-Associated (FTO) gene have robust effects on obesity, obesity-related traits and endophenotypes associated with Alzheimer's disease (AD).

Methods

We used 1,877 Caucasian cases and controls from the NIA-LOAD study and 1,093 Caribbean Hispanics to further explore the association of FTO with AD. Using logistic regression, we assessed 42 SNPs in introns 1 and 2, the region previously reported to be associated with AD endophenotypes, which had been derived by genome-wide screenings. In addition, we performed gene expression analyses of neuropathologically confirmed AD cases and controls of two independent datasets (19 AD cases, 10 controls; 176 AD cases, 188 controls) using within- and between-group factors ANOVA of log10 transformed rank invariant normalized expression data.

Results

In the NIA-LOAD study, one SNP was significantly associated with AD and three additional markers were close to significance (rs6499640, rs10852521, rs16945088, rs8044769, FDR p-value: 0.05<p<0.09). Two of the SNPs are in strong LD (D′>0.9) with the previously reported SNPs. In the Caribbean Hispanic dataset, we identified three SNPs (rs17219084, rs11075996, rs11075997, FDR p-value: 0.009<p<0.01) that were associated with AD. These results were confirmed by haplotype analyses and in a meta-analysis in which we included the ADNI dataset. FTO had a significantly lower expression in AD cases compared to controls in two independent datasets derived from human cortex and amygdala tissue, respectively (p = 2.18×10^-5 and p<0.0001).

Conclusions

Our data support the notion that genetic variation in introns 1 and 2 of the FTO gene may contribute to AD risk.

9.

Background

Despite the recent identification of several prognostic gene signatures, the lack of common genes among experimental cohorts has posed a considerable challenge in uncovering the molecular basis underlying hepatocellular carcinoma (HCC) recurrence for application in clinical purposes. To overcome the limitations of individual gene-based analysis, we applied a pathway-based approach for analysis of HCC recurrence.

Results

By implementing a permutation-based semi-supervised principal component analysis algorithm using the optimal principal component, we selected sixty-four pathways associated with hepatitis B virus (HBV)-positive HCC recurrence (p < 0.01), from our microarray dataset composed of 142 HBV-positive HCCs. In relation to the public HBV- and public hepatitis C virus (HCV)-positive HCC datasets, we detected 46 (71.9%) and 18 (28.1%) common recurrence-associated pathways, respectively. However, overlap of recurrence-associated genes between datasets was rare, further supporting the utility of the pathway-based approach for recurrence analysis between different HCC datasets. Non-supervised clustering of the 64 recurrence-associated pathways facilitated the classification of HCC patients into high- and low-risk subgroups, based on risk of recurrence (p < 0.0001). The pathways identified were additionally successfully applied to discriminate subgroups depending on recurrence risk within the public HCC datasets. Through multivariate analysis, these recurrence-associated pathways were identified as an independent prognostic factor (p < 0.0001) along with tumor number, tumor size and Edmondson’s grade. Moreover, the pathway-based approach had a clinical advantage in terms of discriminating the high-risk subgroup (N = 12) among patients (N = 26) with small HCC (<3 cm).

Conclusions

Using pathway-based analysis, we successfully identified the pathways involved in recurrence of HBV-positive HCC that may be effectively used as prognostic markers.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1472-x) contains supplementary material, which is available to authorized users.

10.
11.
Gene coexpression network analysis is a powerful “data-driven” approach essential for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise “meta-analysis” framework. We generated such a resource by exploring gene coexpression networks in 82 microarray datasets from 9 major human cancer types. The analysis was conducted using an elaborate weighted gene coexpression network (WGCNA) methodology and identified over 3,000 robust gene coexpression modules. The modules covered a range of known tumor features, such as proliferation, extracellular matrix remodeling, hypoxia, inflammation, angiogenesis, tumor differentiation programs, specific signaling pathways, genomic alterations, and biomarkers of individual tumor subtypes. To prioritize genes with respect to those tumor features, we ranked genes within each module by connectivity, leading to identification of module-specific functionally prominent hub genes. To showcase the utility of this network information, we positioned known cancer drug targets within the coexpression networks and predicted that Anakinra, an anti-rheumatoid therapeutic agent, may be promising for development in colorectal cancer. We offer a comprehensive, normalized and well documented collection of >3000 gene coexpression modules in a variety of cancers as a rich data resource to facilitate further progress in cancer research.
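Ranking genes within a module by connectivity, as the abstract above describes, can be sketched as follows. In weighted coexpression analysis the adjacency between two genes is commonly taken as the absolute Pearson correlation raised to a soft-threshold power, and a gene's connectivity is the sum of its adjacencies. The unsigned network, the power of 6, and the expression values below are assumptions for illustration; the abstract does not state the study's settings.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

def rank_by_connectivity(expression, beta=6):
    """Return (gene, connectivity) pairs, most connected first.

    expression: dict mapping gene name -> expression values across samples.
    beta: assumed soft-threshold power for the unsigned adjacency |r|**beta.
    """
    genes = list(expression)
    connectivity = {
        g: sum(abs(pearson(expression[g], expression[h])) ** beta
               for h in genes if h != g)
        for g in genes
    }
    return sorted(connectivity.items(), key=lambda item: item[1], reverse=True)

# Hypothetical module: A, B and C move together, D does not.
expression = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "C": [1.1, 2.0, 2.9, 4.2, 5.0],
    "D": [5.0, 1.0, 4.0, 2.0, 3.0],
}
ranked = rank_by_connectivity(expression)
print(ranked)
```

Genes at the top of such a ranking are the candidate hubs; the soft threshold suppresses weak correlations without discarding them outright.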

12.
13.
A main challenge of data-driven sciences is how to make maximal use of the progressively expanding databases of experimental datasets in order to keep research cumulative. We introduce the idea of a modeling-based dataset retrieval engine designed for relating a researcher's experimental dataset to earlier work in the field. The search is (i) data-driven to enable new findings, going beyond the state of the art of keyword searches in annotations, (ii) modeling-driven, to include both biological knowledge and insights learned from data, and (iii) scalable, as it is accomplished without building one unified grand model of all data. Assuming each dataset has been modeled beforehand, by the researchers or automatically by database managers, we apply a rapidly computable and optimizable combination model to decompose a new dataset into contributions from earlier relevant models. By using the data-driven decomposition, we identify a network of interrelated datasets from a large annotated human gene expression atlas. While tissue type and disease were major driving forces for determining relevant datasets, the found relationships were richer, and the model-based search was more accurate than the keyword search; moreover, it recovered biologically meaningful relationships that are not straightforwardly visible from annotations, for instance between cells in different developmental stages such as thymocytes and T-cells. Data-driven links and citations matched to a large extent; the data-driven links even uncovered corrections to the publication data, as two of the most linked datasets were not highly cited and turned out to have wrong publication entries in the database.

14.
15.
16.
17.

Background

Isocitrate dehydrogenase isoforms 1 and 2 (IDH1 and IDH2) mutations have received considerable attention since the discovery of their relation with human gliomas. The predictive value of IDH1 and IDH2 mutations in gliomas remains controversial. Here, we present the results of a meta-analysis of the associations between IDH mutations and both progression-free survival (PFS) and overall survival (OS) in gliomas. The interrelationship between the IDH mutations and MGMT promoter hypermethylation, EGFR amplification, codeletion of chromosomes 1p/19q and TP53 gene mutation were also revealed.

Methodology and Principal Findings

An electronic literature search of public databases (PubMed, Embase databases) was performed. In total, 10 articles, including 12 studies in English, with 2,190 total cases were included in the meta-analysis. The IDH mutations were frequent in WHO grade II and III glioma (59.5%) and secondary glioblastomas (63.4%) and were less frequent in primary glioblastomas (7.13%). Our study provides evidence that IDH mutations are tightly associated with MGMT promoter hypermethylation (P<0.001), 1p/19q codeletion (P<0.001) and TP53 gene mutation (P<0.001) but are mutually exclusive with EGFR amplification (P<0.001). This meta-analysis showed that the combined hazard ratio (HR) estimate for overall survival and progression-free survival in patients with IDH mutations was 0.33 (95% CI: 0.25–0.42) and 0.38 (95% CI: 0.21–0.68), compared with glioma patients whose tumours harboured the wild-type IDH. Subgroup analyses based on tumour grade also revealed that the presence of IDH mutations was associated with a better outcome.

Conclusion

Our study suggests that IDH mutations, which are closely linked to the genomic profile of gliomas, are potential prognostic biomarkers for gliomas.
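A combined hazard ratio such as the one reported in the abstract above is typically obtained by pooling per-study log hazard ratios. The sketch below uses fixed-effect inverse-variance weighting and recovers each standard error from the reported 95% confidence interval. Whether the meta-analysis used a fixed-effect or random-effects model is not stated in the abstract, so the model choice is an assumption, and the per-study numbers are hypothetical.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile

def se_from_ci(lower, upper):
    """Standard error of log(HR) recovered from a 95% confidence interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z95)

def pool_hazard_ratios(studies):
    """Fixed-effect inverse-variance pooling.

    studies: list of (hr, ci_lower, ci_upper) tuples.
    Returns (pooled_hr, ci_lower, ci_upper).
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for hr, lower, upper in studies:
        weight = 1.0 / se_from_ci(lower, upper) ** 2
        total_weight += weight
        weighted_sum += weight * math.log(hr)
    pooled_log_hr = weighted_sum / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    return (math.exp(pooled_log_hr),
            math.exp(pooled_log_hr - Z95 * pooled_se),
            math.exp(pooled_log_hr + Z95 * pooled_se))

# Hypothetical per-study estimates, not taken from the meta-analysis.
studies = [(0.30, 0.18, 0.50), (0.40, 0.25, 0.64), (0.35, 0.15, 0.82)]
pooled = pool_hazard_ratios(studies)
print(pooled)
```

A pooled hazard ratio below 1 with an interval that excludes 1 corresponds to the better outcome for mutation carriers described in the abstract. When studies disagree strongly, a random-effects model would give wider intervals.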

18.
19.
Human colonic mucosa altered by inflammation due to ulcerative colitis (UC) displays a drastically altered pattern of gene expression compared with healthy tissue. We aimed to understand the underlying molecular pathways influencing these differences by analyzing three publicly available, independently generated microarray datasets of gene expression from endoscopic biopsies of the colon. Gene set enrichment analysis (GSEA) revealed that all three datasets share 87 gene sets upregulated in UC lesions and 8 gene sets downregulated (false discovery rate <0.05). The upregulated pathways were dominated by gene sets involved in immune function and signaling, as well as the control of mitosis. We applied pathway analysis to genotype data derived from genome-wide association studies (GWAS) of UC, consisting of 5,584 cases and 11,587 controls assembled from eight European-ancestry cohorts. The upregulated pathways derived from the gene expression data showed a highly significant overlap with pathways derived from the genotype data (33 of 56 gene sets, hypergeometric P = 1.49×10^-19). This study supports the hypothesis that heritable variation in gene expression as measured by GWAS signals can influence key pathways in the development of disease, and that comparison of genetic susceptibility loci with gene expression signatures can differentiate key drivers of inflammation from secondary effects on gene expression of the inflammatory process.
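The hypergeometric overlap test mentioned in the abstract above asks how likely it is to see at least k shared gene sets by chance. A minimal sketch follows, using only the standard library. The abstract does not report the total number of gene sets tested or the sizes of both lists, so the numbers in the example are hypothetical and are not meant to reproduce the published P-value.

```python
from math import comb

def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: total number of gene sets considered.
    successes:  gene sets flagged by the first analysis.
    draws:      gene sets flagged by the second analysis.
    k:          size of the observed overlap.
    """
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(comb(successes, i) * comb(population - successes, draws - i)
               for i in range(k, upper + 1))
    return tail / total

# Hypothetical example: 1,000 gene sets in total, 80 flagged by one
# analysis, 50 by the other, 20 in common.
p_overlap = hypergeom_sf(20, 1000, 80, 50)
print(p_overlap)
```

Exact integer arithmetic with `comb` avoids the rounding problems that appear when very small tail probabilities are computed from floating-point factorials.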

20.
Xu Q, Lee C. Nucleic Acids Research, 2003, 31(19): 5635-5643
We report here a genome-wide analysis of alternative splicing in 2 million human expressed sequence tags (ESTs), to identify splice forms that are up-regulated in tumors relative to normal tissues. We found strong evidence (P < 0.01) of cancer-specific splice variants in 316 human genes. In total, 78% of the cancer-specific splice forms we detected are confirmed by human-curated mRNA sequences, indicating that our results are not due to random mis-splicing in tumors; 73% of the genes showed the same cancer-specific splicing changes in tissue-matched tumor versus normal datasets, indicating that the vast majority of these changes are associated with tumorigenesis, not tissue specificity. We have confirmed our EST results in an independent set of experimental data provided by human-curated mRNAs (P-value 10^-5.7). Moreover, the majority of the genes we detected have functions associated with cancer (P-value 0.0007), suggesting that their altered splicing may play a functional role in cancer. Analysis of the types of cancer-specific splicing shifts suggests that many of these shifts act by disrupting a tumor suppressor function. Surprisingly, our data show that for a large number (190 in this study) of cancer-associated genes cloned originally from tumors, there exists a previously uncharacterized splice form of the gene that appears to be predominant in normal tissue.
