Similar articles
 20 similar articles found (search time: 15 ms)
1.
Zhou Y, Wang R, Li L, Xia X, Sun Z. Journal of Molecular Biology 2006, 359(4):1150-1159
Identifying potential protein interactions is of great importance in understanding the topologies of cellular networks, which is much needed and valued in current systematic biological studies. The development of computational methods to predict protein-protein interactions has been spurred on by the massive sequencing efforts of the genomic revolution. Among these methods is phylogenetic profiling, which assumes that proteins under similar evolutionary pressures with similar phylogenetic profiles might be functionally related. Here, we introduce a method for inferring functional linkages between proteins from their evolutionary scenarios. The term evolutionary scenario refers to a series of events that occurred in speciation over time, which can be reconstructed given a phylogenetic profile and a species tree. Common evolutionary pressures on two proteins can then be inferred by comparing their evolutionary scenarios, which is a direct indication of their functional linkage. This scenario method performs better than the classical phylogenetic profile method when applied to the same test set. In addition, the predicted results of the two methods are found to be fairly different, suggesting the possibility of merging them in order to achieve better performance. We analyzed the influence of the topology of the phylogenetic tree on the performance of this method, and found it to be robust to perturbations in the topology of the tree. However, if a completely random tree is incorporated, performance declines significantly. The evolutionary scenario method was used for inferring functional linkages in 67 species, and 40,006 linkages were predicted. We examine our predictions for budding yeast and find that almost all predicted linkages are supported by further evidence.

2.
An important element of the developing field of proteomics is to understand protein-protein interactions and other functional links amongst genes. Across-species correlation methods for detecting functional links work on the premise that functionally linked proteins will tend to show a common pattern of presence and absence across a range of genomes. We describe a maximum likelihood statistical model for predicting functional gene linkages. The method detects independent instances of the correlated gain or loss of pairs of proteins on phylogenetic trees, reducing the high rates of false positives observed in conventional across-species methods that do not explicitly incorporate a phylogeny. We show, in a dataset of 10,551 protein pairs, that the phylogenetic method improves by up to 35% on across-species analyses at identifying known functionally linked proteins. The method shows that protein pairs with at least two to three correlated events of gain or loss are almost certainly functionally linked. Contingent evolution, in which one gene's presence or absence depends upon the presence of another, can also be detected phylogenetically, and may identify genes whose functional significance depends upon their interaction with other genes. Incorporating phylogenetic information improves the prediction of functional linkages. The improvement derives from having a lower rate of false positives and from detecting trends that across-species analyses miss. Phylogenetic methods can easily be incorporated into the screening of large-scale bioinformatics datasets to identify sets of protein links and to characterise gene networks.

3.
Song B, Wang F, Guo Y, Sang Q, Liu M, Li D, Fang W, Zhang D. Proteins 2012, 80(7):1736-1743
Although functionally similar proteins across species have been widely studied, functionally similar proteins within species showing low sequence similarity have not been examined in detail. Identification of these proteins is of significant importance for understanding biological functions, the evolution of protein families, the progression of co-evolution, convergent evolution, and other phenomena that cannot be studied by detecting functionally similar proteins across species. Here, we explored a method of detecting functionally similar proteins within species based on graph theory. After representing protein-protein interaction networks as graphs, we split the graphs into subgraphs using the 1-hop method. Proteins with functional similarities in a species were detected using a modified shortest-path method to compare these subgraphs and to find the eligible optimal results. Using seven protein-protein interaction networks and this method, some functionally similar proteins with low sequence similarity that cannot be detected by sequence alignment were identified. By analyzing the results, we found that it is sometimes difficult to separate homologous from convergent evolution. Evaluation of the performance of our method by gene ontology term overlap showed that the precision of our method was excellent.
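
The abstract above does not define the 1-hop method. As a minimal sketch, assuming it means taking each protein together with its direct interaction partners and the interactions among them, a 1-hop subgraph could be extracted as follows. The function name, the edge-list representation and the protein labels are illustrative assumptions, not taken from the paper.

```python
def one_hop_subgraph(edges, center):
    """Return (nodes, edges) of the 1-hop neighbourhood of `center`.

    `edges` is an iterable of (u, v) pairs from an undirected interaction
    network. Assumption: "1-hop" means the center protein, its direct
    partners, and every interaction among those proteins.
    """
    edges = list(edges)
    neighbours = set()
    for u, v in edges:
        if u == center:
            neighbours.add(v)
        elif v == center:
            neighbours.add(u)
    nodes = neighbours | {center}
    # Keep only interactions whose two endpoints both lie in the neighbourhood.
    sub_edges = {
        frozenset((u, v))
        for u, v in edges
        if u in nodes and v in nodes and u != v
    }
    return nodes, sub_edges


# Hypothetical toy network: A-B, A-C, B-C, C-D, D-E
toy_edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"), ("D", "E")]
nodes, sub_edges = one_hop_subgraph(toy_edges, "A")
print(sorted(nodes))   # ['A', 'B', 'C']
print(len(sub_edges))  # 3
```

Comparing such subgraphs (the modified shortest-path step in the abstract) is not sketched here, since the abstract gives no detail on it.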

4.
The identification of the whole set of protein interactions taking place in an organism is one of the main tasks in genomics, proteomics and systems biology. One of the computational techniques used by many investigators for studying and predicting protein interactions is the comparison of evolutionary histories (phylogenetic trees), under the hypothesis that interacting proteins would be subject to a similar evolutionary pressure resulting in a similar topology of the corresponding trees. Here, we present a new approach to predict protein interactions from phylogenetic trees, which incorporates information on the overall evolutionary histories of the species (i.e. the canonical "tree of life") in order to correct for the expected background similarity due to the underlying speciation events. We test the new approach on the largest set of annotated interacting proteins for Escherichia coli. This assessment of co-evolution in the context of the tree of life leads to a highly significant improvement (P(N) by sign test approximately 10E-6) in predicting interaction partners with respect to the previous technique, which does not incorporate information on the overall speciation tree. For half of the proteins we found a real interactor among the top 6.4% of scores, compared with the top 16.5% for the previous method. We applied the new method to the whole E. coli proteome and propose functions for some hypothetical proteins based on their predicted interactors. The new approach also allows us to detect non-canonical evolutionary events, in particular horizontal gene transfers. We also show that taking these non-canonical evolutionary events into account when assessing the similarity between evolutionary trees improves the performance of the method in predicting interactions.

5.
The gene composition of present-day genomes has been shaped by a complicated evolutionary history, resulting in diverse distributions of genes across genomes. The pattern of presence and absence of a gene in different genomes is called its phylogenetic profile. It has been shown that proteins whose encoding genes have highly similar profiles tend to be functionally related: as these genes were gained and lost together, their encoded proteins can probably only perform their full function if both are present. However, a large proportion of genes encoding interacting proteins do not have matching profiles. In this study, we analysed one possible reason for this, namely that phylogenetic profiles can be affected by multi-functional proteins such as shared subunits of two or more protein complexes. We found that by considering triplets of proteins, of which one protein is multi-functional, a large fraction of disturbed co-occurrence patterns can be explained.
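
To make the notion of a phylogenetic profile concrete, here is a minimal sketch that encodes each gene as a presence/absence vector over a fixed list of genomes and compares two profiles. The Jaccard index is used as the similarity measure purely as an illustrative choice; the abstract does not say which measure the authors used, and the gene names and profiles below are made up.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard similarity of two presence/absence profiles (sequences of 0/1).

    Both profiles must list the same genomes in the same order.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    both = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    # Two genes absent from every genome share no evidence; score them 0.
    return both / either if either else 0.0


# Hypothetical profiles across six genomes (1 = gene present, 0 = absent)
gene_x = [1, 1, 0, 1, 0, 0]
gene_y = [1, 1, 0, 1, 0, 1]
gene_z = [0, 0, 1, 0, 1, 1]

print(jaccard_similarity(gene_x, gene_y))  # 0.75
print(jaccard_similarity(gene_x, gene_z))  # 0.0
```

Under the premise described in the abstract, the pair with the higher score (gene_x, gene_y) would be the more likely candidate for a functional link.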

6.
Whatever criteria are used to measure evolutionary success – species numbers, geographic range, ecological abundance, ecological and life history diversity, background diversification rates, or the presence of rapidly evolving clades – the legume family is one of the most successful lineages of flowering plants. Despite this, we still know rather little about the dynamics of lineage and species diversification across the family through the Cenozoic, or about the underlying drivers of diversification. There have been few attempts to estimate net species diversification rates or underlying speciation and extinction rates for legume clades, to test whether among-lineage variation in diversification rates deviates from null expectations, or to locate species diversification rate shifts on specific branches of the legume phylogenetic tree. In this study, time-calibrated phylogenetic trees for a set of species-rich legume clades – Calliandra, Indigofereae, Lupinus, Mimosa and Robinieae – and for the legume family as a whole are used to explore how we might approach these questions. These clades are analysed using recently developed maximum likelihood and Bayesian methods to detect species diversification rate shifts and test for among-lineage variation in speciation, extinction and net diversification rates. Possible explanations for rate shifts in terms of extrinsic factors and/or intrinsic trait evolution are discussed. In addition, several methodological issues and limitations associated with these analyses are highlighted, emphasizing the potential to improve our understanding of the evolutionary dynamics of legume diversification by using much more densely sampled phylogenetic trees that integrate information across broad taxonomic, geographical and temporal levels.

7.
Functional diversification in large protein families is a major mechanism driving expansion of cellular networks, providing organisms with new metabolic capabilities and thus adding to their evolutionary success. However, our understanding of the evolutionary mechanisms of functional diversity in such families is very limited, which, among many other reasons, is due to the lack of functionally well-characterized sets of proteins. Here, using the FGGY carbohydrate kinase family as an example, we built a confidently annotated reference set (CARS) of proteins by propagating experimentally verified functional assignments to a limited number of homologous proteins that are supported by their genomic and functional contexts. Then, we analyzed, on both the phylogenetic and the molecular levels, the evolution of different functional specificities in this family. The results show that the different functions (substrate specificities) encoded by FGGY kinases have emerged only once in the evolutionary history, following an apparently simple divergent evolutionary model. At the same time, on the molecular level, one isofunctional group (L-ribulokinase, AraB) evolved at least two independent solutions that employed distinct specificity-determining residues for the recognition of the same substrate (L-ribulose). Our analysis provides a detailed model of the evolution of the FGGY kinase family. It also shows that only combined molecular and phylogenetic approaches can help reconstruct a full picture of functional diversification in such diverse families.

8.
The evolutionary history of living species is usually inferred through the phylogenetic analysis of molecular and morphological information using various mathematical models. New challenges in phylogenetic analysis are centered mostly on the search for accurate and efficient methods to handle the huge amounts of sequence data generated by newer genome sequencing. The next major challenge is the determination of relationships between the evolution of structural elements and their functional implementation, which has been largely ignored in previous analyses. Here, we describe the discovery of structural elements in metazoan mitochondrial genomes, termed key K-strings, that can serve as a basis for phylogenetic tree construction. Although comprising only a small fraction (0.73%) of all K-strings, these key K-strings are pivotal to tree construction because they allow for a significant reduction in the computational time required to construct phylogenetic trees and, more importantly, they significantly improve the results of phylogenetic inference. The trees constructed from the key K-strings were consistent overall with our current view of metazoan phylogeny and exhibited a more rational topology than the trees constructed using other conventional methods. Surprisingly, the key K-strings tended to accumulate in the conserved regions of the original sequences, which was most likely due to strong selection pressure. Furthermore, the special structural features of the key K-strings should have some potential applications in the study of the structure-function relationships of proteins and in the determination of the evolutionary trajectory of species. The novelty and potential importance of key K-strings lead us to believe that they are essential evolutionary elements. As such, they may play important roles in the process of species evolution and their physical existence. Further studies could lead to discoveries regarding the relationship between evolution and processes of speciation.
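
The abstract above does not define a K-string or say how the "key" ones are selected. As a rough sketch, assuming a K-string is simply a substring of length K read from overlapping windows of a sequence (what is often called a k-mer), the counting step could look like this. The function name and the toy sequence are illustrative assumptions; the selection of key K-strings is not attempted.

```python
from collections import Counter


def count_k_strings(sequence, k):
    """Count every overlapping substring of length k in `sequence`."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    # A sequence shorter than k yields an empty range and an empty Counter.
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Toy sequence, not real mitochondrial data
counts = count_k_strings("ATATAT", 2)
print(counts["AT"])  # 3
print(counts["TA"])  # 2
```

Count vectors of this kind are a common starting point for alignment-free comparisons between genomes, which is the general setting the abstract describes.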

9.
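Relationship between the hydrophobicity of cytochrome molecules and evolution   Cited by: 1 (self-citations: 1, citations by others: 0)
Building on our earlier results, this paper computes the hydrophobic similarity between the one-dimensional structures of cytochrome molecules, constructs the corresponding molecular phylogenetic tree, and discusses the evolutionary relationships among cytochrome molecules. The results show that studying the evolutionary relationships between molecules on the basis of the hydrophobic similarity and nonlinear three-dimensional structure of proteins not only yields conclusions broadly consistent with those obtained by other methods, but also overcomes, to some extent, the limitations of some of those methods and gives better results.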

10.
Comprehensive sampling of genomic biodiversity is fast becoming a reality for some genomic regions and complete organelle genomes. Genomic biodiversity is defined as large genomic sequences from many species, and here some recent work is reviewed that demonstrates the potential benefits of genomic biodiversity for molecular evolutionary analysis and phylogenetic reconstruction. This work shows that using likelihood-based approaches, taxon addition can dramatically improve phylogenetic reconstruction. Features or dynamics of the evolutionary process are much more easily inferred with large numbers of taxa, and large numbers are essential for discriminating differences in evolutionary patterns between sites. Accurate prediction of site-specific patterns can improve phylogenetic reconstruction by an amount equivalent to quadrupling sequence length. Genomic biodiversity is particularly central to research relating patterns of evolution, adaptation and coevolution to structural and functional features of proteins. Research on detecting coevolution between amino acid residues in proteins demonstrates a clear need for much greater numbers of closely related taxa to better discriminate site-specific patterns of interaction, and to allow more detailed analysis of coevolutionary interactions between subunits in protein complexes. It is argued that parsing out coevolutionary and other context-dependent substitution probabilities is essential for discriminating between coevolution and adaptation, and for more realistically modelling the evolution of proteins. Also reviewed is research that argues for increasing the efficiency of acquiring genomic biodiversity, and suggests that this might be done by simultaneously shotgun cloning and sequencing genomic mixtures from many species. Increased efficiency is a prerequisite if genomic biodiversity levels are to rapidly increase by orders of magnitude, and thus lead to dramatically improved understanding of interactions between protein structure, function and sequence evolution.

11.
Protein co-evolution, co-adaptation and interactions   Cited by: 2 (self-citations: 0, citations by others: 2)
Pazos F, Valencia A. The EMBO Journal 2008, 27(20):2648-2655
Co-evolution has an important function in the evolution of species and it is clearly manifested in certain scenarios such as host–parasite and predator–prey interactions, symbiosis and mutualism. The extrapolation of the concepts and methodologies developed for the study of species co-evolution to the molecular level has prompted the development of a variety of computational methods able to predict protein interactions through the characteristics of co-evolution. Particularly successful have been those methods that predict interactions at the genomic level based on the detection of pairs of protein families with similar evolutionary histories (similarity of phylogenetic trees: mirrortree). Future advances in this field will require a better understanding of the molecular basis of the co-evolution of protein families. Thus, it will be important to decipher the molecular mechanisms underlying the similarity observed in phylogenetic trees of interacting proteins, distinguishing direct specific molecular interactions from other general functional constraints. In particular, it will be important to separate the effects of physical interactions within protein complexes ('co-adaptation') from other forces that, in a less specific way, can also create general patterns of co-evolution.

12.
Experimental approaches for the identification of functionally important regions on the surface of a protein involve mutagenesis, in which exposed residues are replaced one after another while the change in binding to other proteins or changes in activity are recorded. However, practical considerations limit the use of these methods to small-scale studies, precluding a full mapping of all the functionally important residues on the surface of a protein. We present here an alternative approach involving the use of evolutionary data in the form of a multiple-sequence alignment for a protein family to identify hot spots and surface patches that are likely to be in contact with other proteins, domains, peptides, DNA, RNA or ligands. The underlying assumption in this approach is that key residues that are important for binding should be conserved throughout evolution, just like residues that are crucial for maintaining the protein fold, i.e. buried residues. A main limitation in the implementation of this approach is that the sequence space of a protein family may be unevenly sampled, e.g. mammals may be overly represented. Thus, a seemingly conserved position in the alignment may reflect a taxonomically uneven sampling, rather than being indicative of structural or functional importance. To avoid this problem, we present here a novel methodology based on evolutionary relations among proteins as revealed by inferred phylogenetic trees, and demonstrate its capabilities for mapping binding sites in SH2 and PTB signaling domains. A computer program that implements these ideas is freely available at: http://ashtoret.tau.ac.il/~rony Copyright 2001 Academic Press.

13.
Evolution of the Rab family of small GTP-binding proteins.   Cited by: 33 (self-citations: 0, citations by others: 33)
Rab proteins are small GTP-binding proteins that form the largest family within the Ras superfamily. Rab proteins regulate vesicular trafficking pathways, behaving as membrane-associated molecular switches. Here, we have identified the complete Rab families in Caenorhabditis elegans (29 members), Drosophila melanogaster (29), Homo sapiens (60) and Arabidopsis thaliana (57), and we defined criteria for annotation of this protein family in each organism. We studied sequence conservation patterns and observed that the RabF motifs and the RabSF regions previously described in mammalian Rabs are conserved across species. This is consistent with conserved recognition mechanisms by general regulators and specific effectors. We used phylogenetic analysis and other approaches to reconstruct the multiplication of the Rab family and observed that this family shows a strict phylogeny of function as opposed to a phylogeny of species. Furthermore, we observed that Rabs co-segregating in phylogenetic trees show a pattern of similar cellular localisation and/or function. Therefore, animal and fungal Rab proteins can be grouped into "Rab functional groups" according to their segregating patterns in phylogenetic trees. These functional groups reflect similarity of sequence, localisation and/or function, and may also represent shared ancestry. Rab functional groups can help the understanding of the functional evolution of the Rab family in particular and vesicular transport in general, and may be used to predict general functions for novel Rab sequences.

14.
15.
Recently, an unexpected positive correlation between the rate of evolution of mitochondrial proteins and longevity was reported. Here we re-analyze this relationship in various mammalian lineages using a Bayesian phylogenetic analysis of amino-acid sequences, allowing for variable evolutionary rates across sites and species. A negative relationship between protein evolutionary rate and species longevity is reported for all oxidative phosphorylation complexes. A detailed analysis of cytochrome b in 528 mammals reinforced this result, which contradicts previous publications. Repeating the analysis in birds yielded similar results. We explain the discrepancy between this and previous reports by our improved taxon sampling and more appropriate methodology: unlike distance-based methods, the tree-based Bayesian approach can take into account the high variation of substitution rate across amino-acid sites, and the resulting multiple substitution events. We discuss how our analysis contradicts Rottenberg’s rationale, but does not dismiss his proposal of a longevity-dependent selective pressure on mitochondrial mutation rate in mammals and birds. This is because his interpretation invokes adaptation as the single evolutionary force at work, disregarding the effects of mutation, genetic drift, and purifying selection.

16.
Evolutionary studies commonly model single nucleotide substitutions and assume that they occur as independent draws from a unique probability distribution across the sequence studied. This assumption is violated for protein-coding sequences, and we consider modeling approaches where codon positions (CPs) are treated as separate categories of sites because within each category the assumption is more reasonable. Such "codon-position" models have been shown to explain the evolution of codon data better than homogeneous models in previous studies. This paper examines the ways in which codon-position models outperform homogeneous models and characterizes the differences in estimates of model parameters across CPs. Using the PANDIT database of multiple species DNA sequence alignments, we quantify the differences in the evolutionary processes at the 3 CPs in a systematic and comprehensive manner, characterizing previously undescribed features of protein evolution. We relate our findings to the functional constraints imposed by the genetic code, protein function, and the types of mutation that cause synonymous and nonsynonymous codon changes. The results increase our understanding of selective constraints and could be incorporated into phylogenetic analyses or gene-finding techniques in the future. The methods used are extended to an overlapping reading frame data set, and we discover that overlapping reading frames do not necessarily cause more stringent evolutionary constraints.
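
The partitioning that underlies a codon-position model can be illustrated with a short sketch: an in-frame coding sequence is split into three site categories, one per codon position, so that each category can be given its own substitution parameters. This only shows the partitioning step; the function name and the example sequence are illustrative assumptions, and nothing here reproduces the models fitted in the paper.

```python
def split_by_codon_position(cds):
    """Split an in-frame coding sequence into its three codon-position classes.

    Returns a tuple (first, second, third), where each element holds the
    nucleotides found at that position of every codon.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Slicing with a step of 3 picks out one position from every codon.
    return tuple(cds[offset::3] for offset in range(3))


# Toy sequence of three codons: ATG GCC AAA
first, second, third = split_by_codon_position("ATGGCCAAA")
print(first)   # AGA
print(second)  # TCA
print(third)   # GCA
```

In an alignment, the same slicing would be applied to every sequence so that columns of each position class stay together.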

17.
Integrating phylogenetic information can potentially improve our ability to explain species' traits, patterns of community assembly, the network structure of communities, and ecosystem function. In this study, we use mathematical models to explore the ecological and evolutionary factors that modulate the explanatory power of phylogenetic information for communities of species that interact within a single trophic level. We find that phylogenetic relationships among species can influence trait evolution and rates of interaction among species, but only under particular models of species interaction. For example, when interactions within communities are mediated by a mechanism of phenotype matching, phylogenetic trees make specific predictions about trait evolution and rates of interaction. In contrast, if interactions within a community depend on a mechanism of phenotype differences, phylogenetic information has little, if any, predictive power for trait evolution and interaction rate. Together, these results make clear and testable predictions for when and how evolutionary history is expected to influence contemporary rates of species interaction.

18.
Rapid evolution of reproductive proteins has been documented in a wide variety of taxa. In internally fertilized species, knowledge about the evolutionary dynamics of these proteins between closely related taxa is primarily limited to accessory gland proteins in the semen of Drosophila. Investigation of additional taxa and functional classes of proteins is necessary in order to determine if there is a general pattern of adaptive evolution of reproductive proteins between recently diverged species. We performed an evolutionary analysis of 2 egg coat proteins, ZP2 and ZP3, in 15 species of deer mice (genus Peromyscus). Both of these proteins are involved in egg-sperm binding, a critical step in maintaining species-specific fertilization. Here, we show that Zp2 and Zp3 gene trees are not consistent with trees based on nonreproductive genes, Mc1r and Lcat, where species formed monophyletic clades. In fact, for both of the reproductive genes, intraspecific amino acid variation was extensive and alleles were sometimes shared across species. We document positive selection acting on ZP2 and ZP3 and identify specific amino acid sites that are likely targets of selection using both maximum likelihood approaches and patterns of parallel amino acid change. In ZP3, positively selected sites are clustered in and around the region implicated in sperm binding in Mus, suggesting changes may impact egg-sperm binding and fertilization potential. Finally, we identify lineages with significantly elevated rates of amino acid substitution using a Bayesian mapping approach. These findings demonstrate that the pattern of adaptive reproductive protein evolution found at higher taxonomic levels can be documented between closely related mammalian species, where reproductive isolation has evolved recently.

19.
Mitochondrial (mt) genes and genomes are among the major sources of data for evolutionary studies in birds. This places mitogenomic studies in birds at the core of intense debates in avian evolutionary biology. Indeed, complete mt genomes are actively being used to unveil the phylogenetic relationships among major orders, whereas single genes (e.g., cytochrome c oxidase I [COX1]) are considered standard for species identification and defining species boundaries (DNA barcoding). In this investigation, we study the time of origin and evolutionary relationships among Neoaves orders using complete mt genomes. First, we were able to resolve polytomies previously observed at the deep nodes of the Neoaves phylogeny by analyzing 80 mt genomes, including 17 new sequences reported in this investigation. As an example, we found evidence indicating that columbiforms and charadriiforms are sister groups. Overall, our analyses indicate that by improving the taxonomic sampling, complete mt genomes can resolve the evolutionary relationships among major bird groups. Second, we used our phylogenetic hypotheses to estimate the time of origin of major avian orders as a way to test whether their diversification took place prior to the Cretaceous/Tertiary (K/T) boundary. Such timetrees were estimated using several molecular dating approaches and conservative calibration points. Whereas we found time estimates slightly younger than those reported by others, most of the major orders originated prior to the K/T boundary. Finally, we used our timetrees to estimate the rate of evolution of each mt gene. We found great variation in the mutation rates among mt genes and within different bird groups. COX1 was the gene with the least variation among Neoaves orders and the one with the least amount of rate heterogeneity across lineages. Such findings support the choice of COX1 among mt genes as the target for developing DNA barcoding approaches in birds.

20.
Deciphering the network of protein interactions that underlies cellular operations has become one of the main tasks of proteomics and computational biology. Recently, a set of bioinformatics approaches has emerged for the prediction of possible interactions by combining sequence and genomic information. Even though the initial results are very promising, the current methods are still far from perfect. We propose here a new way of discovering possible protein-protein interactions based on the comparison of the evolutionary distances between the sequences of the associated protein families, an idea based on previous observations of correspondence between the phylogenetic trees of associated proteins in systems such as ligands and receptors. Here, we extend the approach to different test sets, including the statistical evaluation of their capacity to predict protein interactions. To demonstrate the possibilities of the system to perform large-scale predictions of interactions, we present the application to a collection of more than 67,000 pairs of E. coli proteins, of which 2742 are predicted to correspond to interacting proteins.
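
Comparing the evolutionary distances of two protein families, as described above, can be sketched as a correlation between their pairwise distance matrices over a shared set of species. The version below uses the Pearson correlation of the upper triangles, which is an assumption made here for illustration; the abstract does not state the exact statistic. All function names and the toy matrices are hypothetical.

```python
from itertools import combinations
from math import sqrt


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i, j in combinations(range(n), 2)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / sqrt(var_x * var_y)


def tree_similarity(dist_a, dist_b):
    """Correlate two families' distance matrices (same species, same order)."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy distance matrices over four species; family_b is family_a scaled by 2,
# so the two families have evolved in perfect proportion to each other.
family_a = [[0, 1, 2, 3], [1, 0, 2, 3], [2, 2, 0, 3], [3, 3, 3, 0]]
family_b = [[2 * d for d in row] for row in family_a]
print(round(tree_similarity(family_a, family_b), 6))  # 1.0
```

A score near 1 would be read as similar evolutionary histories. A threshold for calling an interaction is deliberately left out, since the abstract gives none.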

