首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 62 毫秒
1.
Despite much recent interest, it remains unclear what determines the rate of evolution of gene expression. To study this issue we develop a new measure, called "Expression Conservation Index" (ECI), to quantify the degree of tissue-expression conservation between two homologous genes. Applying this measure to a large set of gene expression data from human and mouse, we show that tissue expression tends to evolve rapidly for genes that are expressed in only a limited number of tissues, whereas tissue expression can be conserved for a long time for genes expressed in a large number of tissues. Therefore, expression breadth is an important determinant for evolutionary conservation of tissue expression. In addition, we find a rapid decrease in ECI with the synonymous divergence between duplicate genes, suggesting fast divergence in tissue expression between duplicate genes.  相似文献   

2.
3.

Background

Selective gene duplicability, the extensive expansion of a small number of gene families, is universal. Quantitatively, the number of genes (P(K)) with K duplicates in a genome decreases precipitously as K increases, and often follows a power law (P(k)∝k). Functional diversification, either neo- or sub-functionalization, is a major evolution route for duplicate genes.

Results

Using three lines of genomic datasets, we studied the relationship between gene duplicability and diversifiability in the topology of biochemical networks. First, we explored scenario where two pathways in the biochemical networks antagonize each other. Synthetic knockout of respective genes for the two pathways rescues the phenotypic defects of each individual knockout. We identified duplicate gene pairs with sufficient divergences that represent this antagonism relationship in the yeast S. cerevisiae. Such pairs overwhelmingly belong to large gene families, thus tend to have high duplicability. Second, we used distances between proteins of duplicate genes in the protein interaction network as a metric of their diversification. The higher a gene’s duplicate count, the further the proteins of this gene and its duplicates drift away from one another in the networks, which is especially true for genetically antagonizing duplicate genes. Third, we computed a sequence-homology-based clustering coefficient to quantify sequence diversifiability among duplicate genes – the lower the coefficient, the more the sequences have diverged. Duplicate count (K) of a gene is negatively correlated to the clustering coefficient of its duplicates, suggesting that gene duplicability is related to the extent of sequence divergence within the duplicate gene family.

Conclusion

Thus, a positive correlation exists between gene diversifiability and duplicability in the context of biochemical networks – an improvement of our understanding of gene duplicability.  相似文献   

4.
The impact of the biological network structures on the divergence between the two copies of one duplicate gene pair involved in the networks has not been documented on a genome scale. Having analyzed the most recently updated Database of Interacting Proteins (DIP) by incorporating the information for duplicate genes of the same age in yeast, we find that there was a highly significantly positive correlation between the level of connectivity of ancient genes and the number of shared partners of their duplicates in the protein-protein interaction networks. This suggests that duplicate genes with a low ancestral connectivity tend to provide raw materials for functional novelty, whereas those duplicate genes with a high ancestral connectivity tend to create functional redundancy for a genome during the same evolutionary period. Moreover, the difference in the number of partners between two copies of a duplicate pair was found to follow a power-law distribution. This suggests that loss and gain of interacting partners for most duplicate genes with a lower level of ancestral connectivity is largely symmetrical, whereas the "hub duplicate genes" with a higher level of ancient connectivity display an asymmetrical divergence pattern in protein-protein interactions. Thus, it is clear that the protein-protein interaction network structures affect the divergence pattern of duplicate genes. Our findings also provide insights into the origin and development of biological networks.  相似文献   

5.

Background  

While gene duplication is known to be one of the most common mechanisms of genome evolution, the fates of genes after duplication are still being debated. In particular, it is presently unknown whether most duplicate genes preserve (or subdivide) the functions of the parental gene or acquire new functions. One aspect of gene function, that is the expression profile in gene coexpression network, has been largely unexplored for duplicate genes.  相似文献   

6.
7.
8.
The low prevalence rate of orphan diseases (OD) requires special combined efforts to improve diagnosis, prevention, and discovery of novel therapeutic strategies. To identify and investigate relationships based on shared genes or shared functional features, we have conducted a bioinformatic-based global analysis of all orphan diseases with known disease-causing mutant genes. Starting with a bipartite network of known OD and OD-causing mutant genes and using the human protein interactome, we first construct and topologically analyze three networks: the orphan disease network, the orphan disease-causing mutant gene network, and the orphan disease-causing mutant gene interactome. Our results demonstrate that in contrast to the common disease-causing mutant genes that are predominantly nonessential, a majority of orphan disease-causing mutant genes are essential. In confirmation of this finding, we found that OD-causing mutant genes are topologically important in the protein interactome and are ubiquitously expressed. Additionally, functional enrichment analysis of those genes in which mutations cause ODs shows that a majority result in premature death or are lethal in the orthologous mouse gene knockout models. To address the limitations of traditional gene-based disease networks, we also construct and analyze OD networks on the basis of shared enriched features (biological processes, cellular components, pathways, phenotypes, and literature citations). Analyzing these functionally-linked OD networks, we identified several additional OD-OD relations that are both phenotypically similar and phenotypically diverse. Surprisingly, we observed that the wiring of the gene-based and other feature-based OD networks are largely different; this suggests that the relationship between ODs cannot be fully captured by the gene-based network alone.  相似文献   

9.
Learning about the roles that duplicate genes play in the origins of novel phenotypes requires an understanding of how their functions evolve. A previous method for achieving this goal, CDROM, employs gene expression distances as proxies for functional divergence and then classifies the evolutionary mechanisms retaining duplicate genes from comparisons of these distances in a decision tree framework. However, CDROM does not account for stochastic shifts in gene expression or leverage advances in contemporary statistical learning for performing classification, nor is it capable of predicting the parameters driving duplicate gene evolution. Thus, here we develop CLOUD, a multi-layer neural network built on a model of gene expression evolution that can both classify duplicate gene retention mechanisms and predict their underlying evolutionary parameters. We show that not only is the CLOUD classifier substantially more powerful and accurate than CDROM, but that it also yields accurate parameter predictions, enabling a better understanding of the specific forces driving the evolution and long-term retention of duplicate genes. Further, application of the CLOUD classifier and predictor to empirical data from Drosophila recapitulates many previous findings about gene duplication in this lineage, showing that new functions often emerge rapidly and asymmetrically in younger duplicate gene copies, and that functional divergence is driven by strong natural selection. Hence, CLOUD represents a major advancement in classifying retention mechanisms and predicting evolutionary parameters of duplicate genes, thereby highlighting the utility of incorporating sophisticated statistical learning techniques to address long-standing questions about evolution after gene duplication.  相似文献   

10.
ABSTRACT: A central idea in biology is the hierarchical organization of cellular processes. A commonly used method to identify the hierarchical modular organization of network relies on detecting a global signature known as variation of clustering coefficient (so-called modularity scaling). Although several studies have suggested other possible origins of this signature, it is still widely used nowadays to identify hierarchical modularity, especially in the analysis of biological networks. Therefore, a further and systematical investigation of this signature for different types of biological networks is necessary. RESULTS: We analyzed a variety of biological networks and found that the commonly used signature of hierarchical modularity is actually the reflection of spoke-like topology, suggesting a different view of network architecture. We proved that the existence of super-hubs is the origin that the clustering coefficient of a node follows a particular scaling law with degree k in metabolic networks. To study the modularity of biological networks, we systematically investigated the relationship between repulsion of hubs and variation of clustering coefficient. We provided direct evidences for repulsion between hubs being the underlying origin of the variation of clustering coefficient, and found that for biological networks having no anti-correlation between hubs, such as gene co-expression network, the clustering coefficient doesn't show dependence of degree. CONCLUSIONS: Here we have shown that the variation of clustering coefficient is neither sufficient nor exclusive for a network to be hierarchical. Our results suggest the existence of spoke-like modules as opposed to "deterministic model" of hierarchical modularity, and suggest the need to reconsider the organizational principle of biological hierarchy.  相似文献   

11.
12.
13.
Geometric interpretation of gene coexpression network analysis   总被引:1,自引:0,他引:1  
THE MERGING OF NETWORK THEORY AND MICROARRAY DATA ANALYSIS TECHNIQUES HAS SPAWNED A NEW FIELD: gene coexpression network analysis. While network methods are increasingly used in biology, the network vocabulary of computational biologists tends to be far more limited than that of, say, social network theorists. Here we review and propose several potentially useful network concepts. We take advantage of the relationship between network theory and the field of microarray data analysis to clarify the meaning of and the relationship among network concepts in gene coexpression networks. Network theory offers a wealth of intuitive concepts for describing the pairwise relationships among genes, which are depicted in cluster trees and heat maps. Conversely, microarray data analysis techniques (singular value decomposition, tests of differential expression) can also be used to address difficult problems in network theory. We describe conditions when a close relationship exists between network analysis and microarray data analysis techniques, and provide a rough dictionary for translating between the two fields. Using the angular interpretation of correlations, we provide a geometric interpretation of network theoretic concepts and derive unexpected relationships among them. We use the singular value decomposition of module expression data to characterize approximately factorizable gene coexpression networks, i.e., adjacency matrices that factor into node specific contributions. High and low level views of coexpression networks allow us to study the relationships among modules and among module genes, respectively. We characterize coexpression networks where hub genes are significant with respect to a microarray sample trait and show that the network concept of intramodular connectivity can be interpreted as a fuzzy measure of module membership. We illustrate our results using human, mouse, and yeast microarray gene expression data. The unification of coexpression network methods with traditional data mining methods can inform the application and development of systems biologic methods.  相似文献   

14.
Conservation and coevolution in the scale-free human gene coexpression network   总被引:12,自引:0,他引:12  
The role of natural selection in biology is well appreciated. Recently, however, a critical role for physical principles of network self-organization in biological systems has been revealed. Here, we employ a systems level view of genome-scale sequence and expression data to examine the interplay between these two sources of order, natural selection and physical self-organization, in the evolution of human gene regulation. The topology of a human gene coexpression network, derived from tissue-specific expression profiles, shows scale-free properties that imply evolutionary self-organization via preferential node attachment. Genes with numerous coexpressed partners (the hubs of the coexpression network) evolve more slowly on average than genes with fewer coexpressed partners, and genes that are coexpressed show similar rates of evolution. Thus, the strength of selective constraints on gene sequences is affected by the topology of the gene coexpression network. This connection is strong for the coding regions and 3' untranslated regions (UTRs), but the 5' UTRs appear to evolve under a different regime. Surprisingly, we found no connection between the rate of gene sequence divergence and the extent of gene expression profile divergence between human and mouse. This suggests that distinct modes of natural selection might govern sequence versus expression divergence, and we propose a model, based on rapid, adaptation-driven divergence and convergent evolution of gene expression patterns, for how natural selection could influence gene expression divergence.  相似文献   

15.
16.

Background

Genome-wide expression data of gene microarrays can be used to infer gene networks. At a cellular level, a gene network provides a picture of the modules in which genes are densely connected, and of the hub genes, which are highly connected with other genes. A gene network is useful to identify the genes involved in the same pathway, in a protein complex or that are co-regulated. In this study, we used different methods to find gene networks in the ciliate Tetrahymena thermophila, and describe some important properties of this network, such as modules and hubs.

Methodology/Principal Findings

Using 67 single channel microarrays, we constructed the Tetrahymena gene network (TGN) using three methods: the Pearson correlation coefficient (PCC), the Spearman correlation coefficient (SCC) and the context likelihood of relatedness (CLR) algorithm. The accuracy and coverage of the three networks were evaluated using four conserved protein complexes in yeast. The CLR network with a Z-score threshold 3.49 was determined to be the most robust. The TGN was partitioned, and 55 modules were found. In addition, analysis of the arbitrarily determined 1200 hubs showed that these hubs could be sorted into six groups according to their expression profiles. We also investigated human disease orthologs in Tetrahymena that are missing in yeast and provide evidence indicating that some of these are involved in the same process in Tetrahymena as in human.

Conclusions/Significance

This study constructed a Tetrahymena gene network, provided new insights to the properties of this biological network, and presents an important resource to study Tetrahymena genes at the pathway level.  相似文献   

17.
18.
In a research environment dominated by reductionist approaches to brain disease mechanisms, gene network analysis provides a complementary framework in which to tackle the complex dysregulations that occur in neuropsychiatric and other neurological disorders. Gene–gene expression correlations are a common source of molecular networks because they can be extracted from high‐dimensional disease data and encapsulate the activity of multiple regulatory systems. However, the analysis of gene coexpression patterns is often treated as a mechanistic black box, in which looming ‘hub genes’ direct cellular networks, and where other features are obscured. By examining the biophysical bases of coexpression and gene regulatory changes that occur in disease, recent studies suggest it is possible to use coexpression networks as a multi‐omic screening procedure to generate novel hypotheses for disease mechanisms. Because technical processing steps can affect the outcome and interpretation of coexpression networks, we examine the assumptions and alternatives to common patterns of coexpression analysis and discuss additional topics such as acceptable datasets for coexpression analysis, the robust identification of modules, disease‐related prioritization of genes and molecular systems and network meta‐analysis. To accelerate coexpression research beyond modules and hubs, we highlight some emerging directions for coexpression network research that are especially relevant to complex brain disease, including the centrality–lethality relationship, integration with machine learning approaches and network pharmacology .  相似文献   

19.
Raquel Assis 《Fly》2014,8(2):91-94
Gene duplication is thought to play a key role in phenotypic innovation. While several processes have been hypothesized to drive the retention and functional evolution of duplicate genes, their genomic contributions have never been determined. We recently developed the first genome-wide method to classify these processes by comparing distances between expression profiles of duplicate genes and their ancestral single-copy orthologs. Application of our approach to spatial gene expression profiles in two Drosophila species revealed that a majority of young duplicate genes possess new functions, and that new functions are acquired rapidly—often within a few million years. Surprisingly, new functions tend to arise in younger copies of duplicate gene pairs. Moreover, we found that young duplicates are often specifically expressed in testes, whereas old duplicates are broadly expressed across several tissues, providing strong support for the hypothetical “out-of-testes” origin of new genes. In this Extra View, I discuss our findings in the context of theoretical predictions about gene duplication, with a particular emphasis on the importance of natural selection in the evolution of novel phenotypes.  相似文献   

20.
Gene coexpression network analysis is a powerful “data-driven” approach essential for understanding cancer biology and mechanisms of tumor development. Yet, despite the completion of thousands of studies on cancer gene expression, there have been few attempts to normalize and integrate co-expression data from scattered sources in a concise “meta-analysis” framework. We generated such a resource by exploring gene coexpression networks in 82 microarray datasets from 9 major human cancer types. The analysis was conducted using an elaborate weighted gene coexpression network (WGCNA) methodology and identified over 3,000 robust gene coexpression modules. The modules covered a range of known tumor features, such as proliferation, extracellular matrix remodeling, hypoxia, inflammation, angiogenesis, tumor differentiation programs, specific signaling pathways, genomic alterations, and biomarkers of individual tumor subtypes. To prioritize genes with respect to those tumor features, we ranked genes within each module by connectivity, leading to identification of module-specific functionally prominent hub genes. To showcase the utility of this network information, we positioned known cancer drug targets within the coexpression networks and predicted that Anakinra, an anti-rheumatoid therapeutic agent, may be promising for development in colorectal cancer. We offer a comprehensive, normalized and well documented collection of >3000 gene coexpression modules in a variety of cancers as a rich data resource to facilitate further progress in cancer research.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号