Similar literature
 20 similar documents found.
1.
Yang P, Li X, Wu M, Kwoh CK, Ng SK. PLoS ONE, 2011, 6(7): e21502

Background

Phenotypically similar diseases have been found to be caused by functionally related genes, suggesting a modular organization of the genetic landscape of human diseases that mirrors the modularity observed in biological interaction networks. Protein complexes, as molecular machines that integrate multiple gene products to perform biological functions, express the underlying modular organization of protein-protein interaction networks. As such, protein complexes can be useful for interrogating the networks of phenome and interactome to elucidate gene-phenotype associations of diseases.

Methodology/Principal Findings

We proposed a technique called RWPCN (Random Walker on Protein Complex Network) for predicting and prioritizing disease genes. The basis of RWPCN is a protein complex network constructed using existing human protein complexes and protein interaction network. To prioritize candidate disease genes for the query disease phenotypes, we compute the associations between the protein complexes and the query phenotypes in their respective protein complex and phenotype networks. We tested RWPCN on predicting gene-phenotype associations using leave-one-out cross-validation; our method was observed to outperform existing approaches. We also applied RWPCN to predict novel disease genes for two representative diseases, namely, Breast Cancer and Diabetes.

Conclusions/Significance

Guilt-by-association prediction and prioritization of disease genes can be enhanced by fully exploiting the underlying modular organizations of both the disease phenome and the protein interactome. Our RWPCN uses a novel protein complex network as a basis for interrogating the human phenome-interactome network. As the protein complex network can capture the underlying modularity in the biological interaction networks better than simple protein interaction networks, RWPCN was found to be able to detect and prioritize disease genes better than traditional approaches that used only protein-phenotype associations.
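
The random-walk prioritization idea in this entry can be illustrated with a generic random walk with restart on a single undirected graph. This is a minimal sketch under stated assumptions, not the RWPCN method itself: RWPCN walks over a protein complex network linked to a phenotype network, whereas the function name `random_walk_with_restart`, the restart probability of 0.3, and the toy graph are all hypothetical choices made here for illustration.

```python
def random_walk_with_restart(adjacency, seeds, restart=0.3, tol=1e-10, max_iter=1000):
    """Score every node by its proximity to the seed nodes.

    adjacency: dict mapping node -> set of neighbours (assumed symmetric,
               and every neighbour is also a key of the dict).
    seeds:     nodes already associated with the query (e.g. known disease genes).
    restart:   probability of jumping back to a seed at each step (assumed value).
    """
    seeds = set(seeds)
    if not seeds or not seeds <= set(adjacency):
        raise ValueError("seeds must be a non-empty subset of the graph nodes")

    nodes = list(adjacency)
    # Initial distribution: probability mass spread evenly over the seeds.
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)

    for _ in range(max_iter):
        # Restart term: a fraction of the walker always returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        for u in nodes:
            moving = (1.0 - restart) * p[u]
            neighbours = adjacency[u]
            if neighbours:
                share = moving / len(neighbours)
                for v in neighbours:
                    nxt[v] += share
            else:
                # Isolated node: send its mass back to the seeds so that
                # the scores keep summing to one.
                for s in seeds:
                    nxt[s] += moving / len(seeds)
        change = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if change < tol:
            break
    return p
```

Candidates would then be ranked by descending score; on a path graph A-B-C-D seeded at A, the scores fall off with distance from A.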

2.
3.
4.
Cellular functions are always performed by protein complexes. Many approaches have been proposed to identify protein complexes from protein–protein interaction (PPI) networks. Some approaches detect locally dense subgraphs in PPI networks, which are regarded as protein‐complex cores, and then identify protein complexes by including local neighbors. However, gene expression profiles at different time points or in different tissues show that proteins are dynamic, so identifying dynamic protein complexes is important and meaningful. In this study, a novel core‐attachment–based method named CO‐DPC is presented for detecting dynamic protein complexes. First, CO‐DPC selects active proteins according to gene expression profiles and the 3‐sigma principle, and constructs dynamic PPI networks based on the co‐expression principle and PPI networks. Second, CO‐DPC detects locally dense subgraphs as the cores of protein complexes and then attaches close neighbors of these cores to form protein complexes. To evaluate the method, it and the existing algorithms are applied to yeast PPI networks. The experimental results show that CO‐DPC performs much better than the existing methods. In addition, the identified dynamic protein complexes can match very well and thus become more meaningful for future biological study.
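
The first step described in this entry (selecting active proteins, then building one PPI network per time point) can be sketched as follows. The abstract does not give the exact threshold or the co-expression rule, so this sketch assumes a plain `mean + k * sigma` cutoff with `k = 3` and keeps an interaction at a time point only when both partners are active there; the function names `active_time_points` and `dynamic_networks` are hypothetical.

```python
from statistics import mean, pstdev


def active_time_points(profile, k=3.0):
    """Return the time indices at which expression exceeds mean + k * sigma.

    profile: list of expression values, one per time point.
    k:       number of standard deviations above the mean (assumed rule).
    """
    threshold = mean(profile) + k * pstdev(profile)
    return {t for t, value in enumerate(profile) if value > threshold}


def dynamic_networks(edges, profiles, k=3.0):
    """Build one edge list per time point from a static PPI network.

    edges:    iterable of (protein_u, protein_v) interactions.
    profiles: dict mapping protein -> expression profile (equal lengths).
    An edge is kept at time t only if both proteins are active at t
    (an assumed reading of the co-expression principle).
    """
    active = {protein: active_time_points(values, k) for protein, values in profiles.items()}
    n_points = len(next(iter(profiles.values())))
    return [
        [(u, v) for u, v in edges
         if t in active.get(u, ()) and t in active.get(v, ())]
        for t in range(n_points)
    ]
```

Note that with a population standard deviation, a value can only exceed the mean by three sigma when the profile has at least eleven time points, so short profiles need a smaller `k`.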

5.
6.
7.
8.
9.
Increasing knowledge about the organization of proteins into complexes, systems, and pathways has led to a flowering of theoretical approaches for exploiting this knowledge in order to better learn the functions of proteins and their roles underlying phenotypic traits and diseases. Much of this body of theory has been developed and tested in model organisms, relying on their relative simplicity and genetic and biochemical tractability to accelerate the research. In this review, we discuss several of the major approaches for computationally integrating proteomics and genomics observations into integrated protein networks, then applying guilt-by-association in these networks in order to identify genes underlying traits. Recent trends in this field include a rising appreciation of the modular network organization of proteins underlying traits or mutational phenotypes, and how to exploit such protein modularity using computational approaches related to the internet search algorithm PageRank. Many protein network-based predictions have recently been experimentally confirmed in yeast, worms, plants, and mice, and several successful approaches in model organisms have been directly translated to analyze human disease, with notable recent applications to glioma and breast cancer prognosis.

10.

Background

The functions of a eukaryotic cell are largely performed by multi-subunit protein complexes that act as molecular machines or information processing modules in cellular networks. An important problem in systems biology is to understand how, in general, these molecular machines respond to perturbations.

Results

In yeast, genes that inhibit growth when their expression is reduced are strongly enriched amongst the subunits of multi-subunit protein complexes. This applies to both the core and peripheral subunits of protein complexes, and the subunits of each complex normally have the same loss-of-function phenotypes. In contrast, genes that inhibit growth when their expression is increased are not enriched amongst the core or peripheral subunits of protein complexes, and the behaviour of one subunit of a complex is not predictive for the other subunits with respect to over-expression phenotypes.

Conclusion

We propose the principle that the overall activity of a protein complex is in general robust to an increase, but not to a decrease in the expression of its subunits. This means that whereas phenotypes resulting from a decrease in gene expression can be predicted because they cluster on networks of protein complexes, over-expression phenotypes cannot be predicted in this way. We discuss the implications of these findings for understanding how cells are regulated, how they evolve, and how genetic perturbations connect to disease in humans.

11.
High‐throughput ‘‐omics’ data can be combined with large‐scale molecular interaction networks, for example, protein–protein interaction networks, to provide a unique framework for the investigation of human molecular biology. Interest in these integrative ‘‐omics’ methods is growing rapidly because of their potential to understand complexity and association with disease; such approaches have a focus on associations between phenotype and “network‐type.” The potential of this research is enticing, yet there remain a series of important considerations. Here, we discuss interaction data selection, data quality, the relative merits of using data from large high‐throughput studies versus a meta‐database of smaller literature‐curated studies, and possible issues of sociological or inspection bias in interaction data. Other work underway, especially international consortia to establish data formats, quality standards and address data redundancy, and the improvements these efforts are making to the field, is also evaluated. We present options for researchers intending to use large‐scale molecular interaction networks as a functional context for protein or gene expression data, including microRNAs, especially in the context of human disease.

12.
13.
Genome-wide association studies (GWAS) have defined over 150 genomic regions unequivocally containing variation predisposing to immune-mediated disease. Inferring disease biology from these observations, however, hinges on our ability to discover the molecular processes being perturbed by these risk variants. It has previously been observed that different genes harboring causal mutations for the same Mendelian disease often physically interact. We sought to evaluate the degree to which this is true of genes within strongly associated loci in complex disease. Using sets of loci defined in rheumatoid arthritis (RA) and Crohn's disease (CD) GWAS, we build protein-protein interaction (PPI) networks for genes within associated loci and find abundant physical interactions between protein products of associated genes. We apply multiple permutation approaches to show that these networks are more densely connected than chance expectation. To confirm biological relevance, we show that the components of the networks tend to be expressed in similar tissues relevant to the phenotypes in question, suggesting the network indicates common underlying processes perturbed by risk loci. Furthermore, we show that the RA and CD networks have predictive power by demonstrating that proteins in these networks, not encoded in the confirmed list of disease associated loci, are significantly enriched for association to the phenotypes in question in extended GWAS analysis. Finally, we test our method in 3 non-immune traits to assess its applicability to complex traits in general. We find that genes in loci associated to height and lipid levels assemble into significantly connected networks but did not detect excess connectivity among Type 2 Diabetes (T2D) loci beyond chance. Taken together, our results constitute evidence that, for many of the complex diseases studied here, common genetic associations implicate regions encoding proteins that physically interact in a preferential manner, in line with observations in Mendelian disease.
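
The claim that a set of genes is "more densely connected than chance expectation" can be illustrated with a simple permutation test. The abstract does not specify its permutation schemes, so the sketch below assumes the simplest possible null model: compare the number of interactions inside the gene set with the number inside random gene sets of the same size. The function names, the default of 1000 permutations, and the fixed random seed are hypothetical choices.

```python
import random


def edges_within(edges, genes):
    """Count interactions whose two endpoints both lie in the gene set."""
    genes = set(genes)
    return sum(1 for u, v in edges if u in genes and v in genes)


def connectivity_p_value(edges, all_genes, gene_set, n_perm=1000, seed=0):
    """Empirical p-value for the connectivity of gene_set.

    Null model (an assumption of this sketch): gene sets of the same size
    drawn uniformly at random from all_genes.
    Returns (observed_edge_count, p_value).
    """
    rng = random.Random(seed)          # fixed seed keeps the result reproducible
    pool = sorted(all_genes)           # random.sample needs a sequence
    size = len(set(gene_set))
    observed = edges_within(edges, gene_set)

    at_least_as_connected = 0
    for _ in range(n_perm):
        if edges_within(edges, rng.sample(pool, size)) >= observed:
            at_least_as_connected += 1

    # Add-one correction so the empirical p-value is never exactly zero.
    return observed, (at_least_as_connected + 1) / (n_perm + 1)
```

A uniform null of this kind ignores the fact that well-studied proteins have more recorded interactions; a degree-preserving permutation would be a stricter alternative.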

14.
Chronic obstructive pulmonary disease (COPD) is a complex disease with both environmental and genetic determinants, the most important of which is cigarette smoking. There is marked heterogeneity in the development of COPD among persons with similar cigarette smoking histories, which is likely partially explained by genetic variation. Genomic approaches such as genomewide association studies and gene expression studies have been used to discover genes and molecular pathways involved in COPD pathogenesis; however, these “first generation” omics studies have limitations. Integrative genomic studies are emerging which can combine genomic datasets to further examine the molecular underpinnings of COPD. Future research in COPD genetics will likely use network-based approaches to integrate multiple genomic data types in order to model the complex molecular interactions involved in COPD pathogenesis. This article reviews the genomic research to date and offers a vision for the future of integrative genomic research in COPD.

15.

Background

Increasingly available multilayered omics data on large populations has opened exciting analytic opportunities and posed unique challenges to robust estimation of causal effects in the setting of complex disease phenotypes. The GAW20 Causal Modeling Working Group has applied complementary approaches (eg, Mendelian randomization, structural equations modeling, Bayesian networks) to discover novel causal effects of genomic and epigenomic variation on lipid phenotypes, as well as to validate prior findings from observational studies.

Results

Two Mendelian randomization studies have applied novel approaches to instrumental variable selection in methylation data, identifying bidirectional causal effects of CPT1A and triglycerides, as well as of RNMT and C6orf42, on high-density lipoprotein cholesterol response to fenofibrate. The CPT1A finding also emerged in a Bayesian network study. The Mendelian randomization studies have implemented both existing and novel steps to account for pleiotropic effects, which were independently detected in the GAW20 data via a structural equation modeling approach. Two studies estimated indirect effects of genomic variation (via DNA methylation and/or correlated phenotypes) on lipid outcomes of interest. Finally, a novel weighted R2 measure was proposed to complement other causal inference efforts by controlling for the influence of outlying observations.

Conclusions

The GAW20 contributions illustrate the diversity of possible approaches to causal inference in the multi-omic context, highlighting the promises and assumptions of each method and the benefits of integrating both across methods and across omics layers for the most robust and comprehensive insights into disease processes.

16.
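Li Gaolei (李高磊), Huang Wei (黄玮), Sun Hao (孙浩), Li Yudong (李余动). Acta Microbiologica Sinica (《微生物学报》), 2021, 61(9): 2581-2593
With the arrival of the big-data era, turning massive biological omics data into understandable, visualizable knowledge is one of the major challenges currently facing bioinformatics. To handle complex, high-dimensional microbiome data, machine learning algorithms have been applied to human microbiome research to reveal the complex mechanisms behind disease. This article first briefly describes microbiome data processing methods and commonly used machine learning algorithms, such as support vector machines (SVM), random forests (RF), and artificial neural networks (ANN). It then explains the machine learning workflow and its key points, and discusses applications of machine learning algorithms to predicting host phenotypes from microbiome data. Finally, using the prediction of oral malodor from salivary microbiome data as an example, it builds and evaluates machine learning models and provides R/Python code for use in microbiome research practice (https://github.com/LiLabZSU/microbioML).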

17.
As with other complex disease traits, alcoholism and alcohol abuse are influenced by the combined effects of many genes that alter susceptibility, phenotypic expression, and associated morbidity. Many genetic studies, in both animal models and humans, have identified genetic intervals containing genes that influence alcoholism or behavioral responses to ethanol. Concurrently, a growing number of microarray studies have identified gene expression differences related to ethanol drinking or other ethanol behaviors. However, concerns about the statistical power of these experiments, combined with the complexity of the underlying phenotypes, have greatly hampered the identification of candidate genes underlying ethanol behaviors. Meta-analysis approaches using recent compilations of large datasets of microarray, behavioral and genetic data promise improved statistical power for detecting the genes or gene networks affecting ethanol behaviors and other complex traits.

18.
Many complex diseases such as cancer are associated with changes in biological pathways and molecular networks rather than being caused by single gene alterations. A major challenge in the diagnosis and treatment of such diseases is to identify characteristic aberrancies in the biological pathways and molecular network activities and elucidate their relationship to the disease. This review presents recent progress in using high-throughput biological assays to decipher aberrant pathways and network activities. In particular, this review provides specific examples in which high-throughput data have been applied to identify relationships between diseases and aberrant pathways and network activities. The achievements in this field have been remarkable, but many challenges have yet to be addressed.

19.
20.
Genome-wide association studies have been instrumental in identifying genetic variants associated with complex traits such as human disease or gene expression phenotypes. It has been proposed that extending existing analysis methods by considering interactions between pairs of loci may uncover additional genetic effects. However, the large number of possible two-marker tests presents significant computational and statistical challenges. Although several strategies to detect epistasis effects have been proposed and tested for specific phenotypes, so far there has been no systematic attempt to compare their performance using real data. We made use of thousands of gene expression traits from linkage and eQTL studies, to compare the performance of different strategies. We found that using information from marginal associations between markers and phenotypes to detect epistatic effects yielded a lower false discovery rate (FDR) than a strategy solely using biological annotation in yeast, whereas results from human data were inconclusive. For future studies whose aim is to discover epistatic effects, we recommend incorporating information about marginal associations between SNPs and phenotypes instead of relying solely on biological annotation. Improved methods to discover epistatic effects will result in a more complete understanding of complex genetic effects.
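
Because the number of two-marker tests is so large, some control of the false discovery rate is needed when calling epistatic effects. The abstract does not say how its FDR was estimated, so the sketch below shows one standard option, the Benjamini-Hochberg step-up procedure; the function name `benjamini_hochberg` and the default level of 0.05 are choices made here for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Benjamini-Hochberg step-up rule: sort the m p-values, find the largest
    rank r with p_(r) <= alpha * r / m, and reject the r smallest p-values.
    """
    m = len(p_values)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= alpha * rank / m:
            cutoff_rank = rank  # keep the largest rank that passes

    rejected = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= cutoff_rank:
            rejected[index] = True
    return rejected
```

The step-up behaviour matters: a p-value that fails its own threshold is still rejected when a larger p-value passes a later threshold, which is why the loop records the largest passing rank rather than stopping at the first failure.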
