Similar articles
1.
Recent work has shown that the network of structural similarity between protein domains exhibits a power-law distribution of edges per node. The scale-free nature of this graph, termed the protein domain universe graph or PDUG, may be reproduced via a divergent model of structural evolution. The performance of this model, however, does not preclude the existence of a successful convergent model. To further resolve the issue of protein structural evolution, we explore the predictions of both convergent and divergent models directly. We show that when nodes from the PDUG are partitioned into subgraphs on the basis of their occurrence in the proteomes of particular organisms, these subgraphs exhibit a scale-free nature as well. We explore a simple convergent model of structural evolution and find that the implications of this model are inconsistent with features of these organismal subgraphs. Importantly, we find that biased convergent models are inconsistent with our data. We find that when speciation mechanisms are added to a simple divergent model, subgraphs similar to the organismal subgraphs are produced, demonstrating that dynamic models can easily explain the distributions of structural similarity that exist within proteomes. We show that speciation events must be included in a divergent model of structural evolution to account for the non-random overlap of structural proteomes. These findings have implications for the long-standing debate over convergent and divergent models of protein structural evolution, and for the study of the evolution of organisms as a whole.
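
The edges-per-node distribution described in this abstract can be tabulated from an edge list roughly as follows. This is a minimal sketch, not the authors' procedure: the edge list, the function names, and the use of a maximum-likelihood exponent estimate (the discrete approximation given by Clauset, Shalizi and Newman) are all assumptions introduced here for illustration.

```python
import math
from collections import Counter

def degree_histogram(edges):
    """Map each degree k to the number of nodes that have exactly k edges."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return Counter(degree.values())

def powerlaw_exponent(degrees, k_min=1):
    """Approximate maximum-likelihood exponent for degrees >= k_min.

    Uses alpha = 1 + n / sum(ln(k / (k_min - 0.5))), an approximation
    for discrete data; it is an assumption here, not the paper's method.
    """
    tail = [k for k in degrees if k >= k_min]
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum

# Hypothetical toy graph: one hub domain linked to three others.
toy_edges = [("hub", "a"), ("hub", "b"), ("hub", "c")]
toy_histogram = degree_histogram(toy_edges)
toy_alpha = powerlaw_exponent([1, 1, 1, 3], k_min=1)
```

A four-node graph is far too small for the estimate to mean anything; the example only shows the shape of the calculation. A real analysis would also need a goodness-of-fit check before calling the distribution scale-free.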

2.
Using structural similarity clustering of protein domains, the protein domain universe graph (PDUG), and a hierarchical functional annotation, the gene ontology (GO), as two evolutionary lenses, we find that each structural cluster (domain fold) exhibits a distribution of functions that is unique to it. These functional distributions are functional fingerprints that are specific to characteristic structural clusters and vary from cluster to cluster. Furthermore, as the structural similarity threshold for domain clustering in the PDUG is relaxed, we observe an influx of earlier-diverged domains into clusters. These domains join clusters without destroying the functional fingerprint. These results can be understood in light of a divergent evolution scenario that posits correlated divergence of structural and functional traits in protein domains from one or few progenitors.
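
The effect of relaxing the similarity threshold can be illustrated with a small clustering sketch. It assumes, hypothetically, that clusters are the connected components of a graph whose edges are domain pairs scoring at or above the threshold; the scores, domain names, and function name are invented for illustration and are not taken from the paper.

```python
def clusters_at_threshold(scores, threshold):
    """Group domains into connected components of the thresholded graph.

    scores maps (domain_a, domain_b) pairs to a similarity score; a pair is
    linked only when its score is >= threshold. Uses a simple union-find.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        root_a, root_b = find(a), find(b)
        if score >= threshold and root_a != root_b:
            parent[root_a] = root_b

    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    # Largest clusters first, ties broken alphabetically.
    return sorted((sorted(g) for g in groups.values()), key=lambda g: (-len(g), g))

# Hypothetical pairwise similarity scores between five domains.
toy_scores = {
    ("d1", "d2"): 12.0,
    ("d2", "d3"): 9.5,
    ("d3", "d4"): 6.0,
    ("d4", "d5"): 11.0,
}
strict = clusters_at_threshold(toy_scores, 9.0)
relaxed = clusters_at_threshold(toy_scores, 5.0)
```

With the strict threshold the toy data splits into two clusters; lowering it admits the weaker d3-d4 link and merges them, which is the kind of influx the abstract describes.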

3.
It has recently been discovered that many biological systems, when represented as graphs, exhibit a scale-free topology. One such system is the set of structural relationships among protein domains. The scale-free nature of this and other systems has previously been explained using network growth models that, although motivated by biological processes, do not explicitly consider the underlying physics or biology. In this work we explore a sequence-based model for the evolution of protein structures and demonstrate that this model is able to recapitulate the scale-free nature observed in graphs of real protein structures. We find that this model also reproduces other statistical features of the protein domain graph. This represents, to our knowledge, the first such microscopic, physics-based evolutionary model for a scale-free network of biological importance and as such has strong implications for our understanding of the evolution of protein structures and of other biological networks.

4.
Recent analyses of biological and artificial networks have revealed a common network architecture, called scale-free topology. The origin of the scale-free topology has been explained by using growth and preferential attachment mechanisms. In a cell, proteins are the most important carriers of function, and are composed of domains as elemental units responsible for the physical interaction between protein pairs. Here, we propose a model for protein–protein interaction networks that reveals the emergence of two possible topologies. We show that depending on the number of randomly selected interacting domain pairs, the connectivity distribution follows either a scale-free distribution, even in the absence of the preferential attachment, or a normal distribution. This new approach only requires an evolutionary model of proteins (nodes) but not for the interactions (edges). The edges are added by means of random interaction of domain pairs. As a result, this model offers a new mechanistic explanation for understanding complex networks with a direct biological interpretation because only protein structures and their functions evolved through genetic modifications of amino acid sequences. These findings are supported by numerical simulations as well as experimental data.

5.
The problem of functional annotation based on homology modeling is primary to current bioinformatics research. Researchers have noted regularities in sequence, structure and even chromosome organization that allow valid functional cross-annotation. However, these methods produce many false negatives due to limited specificity inherent in the system. We want to create an evolutionarily inspired organization of data that would approach the issue of structure-function correlation from a new, probabilistic perspective. Such organization has possible applications in phylogeny, modeling of functional evolution and structural determination. ELISA (Evolutionary Lineage Inferred from Structural Analysis, http://romi.bu.edu/elisa) is an online database that combines functional annotation with structure and sequence homology modeling to place proteins into sequence-structure-function "neighborhoods". The atomic unit of the database is a set of sequences and structural templates that those sequences encode. A graph that is built from the structural comparison of these templates is called PDUG (protein domain universe graph). We introduce a method of functional inference through a probabilistic calculation done on an arbitrary set of PDUG nodes. Further, all PDUG structures are mapped onto all fully sequenced proteomes allowing an easy interface for evolutionary analysis and research into comparative proteomics. ELISA is the first database with applicability to evolutionary structural genomics explicitly in mind. Availability: The database is available at http://romi.bu.edu/elisa.

6.
Many proteins consist of several structural domains. These multi-domain proteins have likely been generated by selective genome growth dynamics during evolution to perform new functions as well as to create structures that fold on a biologically feasible time scale. Domain units frequently evolved through a variety of genetic shuffling mechanisms. Here we examine the protein domain statistics of more than 1000 organisms including eukaryotic, archaeal and bacterial species. The analysis extends earlier findings on asymmetric statistical laws for the proteome to a wider variety of species. While proteins are composed of a wide range of domains, displaying a power-law decay, the computation of domain families for each protein reveals an exponential distribution, characterizing a protein universe composed of a thin number of unique families. Structural studies in proteomics have shown that domain repeats, or internal duplicated domains, represent a small but significant fraction of the genome. In spite of its importance, this observation has been largely overlooked until recently. We model the evolutionary dynamics of the proteome and demonstrate that these distinct distributions are in fact rooted in an internal duplication mechanism. This process generates the contemporary protein structural domain universe, determines its reduced thickness, and tames its growth. These findings have important implications, ranging from protein interaction network modeling to evolutionary studies based on fundamental mechanisms governing genome expansion.

7.
The difficulty involved in following mandrills in the wild means that very little is known about social structure in this species. Most studies initially considered mandrill groups to be an aggregation of one-male/multifemale units, with males occupying central positions in a structure similar to those observed in the majority of baboon species. However, a recent study hypothesized that mandrills form stable groups with only two or three permanent males, and that females occupy more central positions than males within these groups. We used social network analysis methods to examine how a semi-free ranging group of 19 mandrills is structured. We recorded all dyads of individuals that were in contact as a measure of association. The betweenness and the eigenvector centrality for each individual were calculated and correlated to kinship, age and dominance. Finally, we performed a resilience analysis by simulating the removal of individuals displaying the highest betweenness and eigenvector centrality values. We found that related dyads were more frequently associated than unrelated dyads. Moreover, our results showed that the cumulative distribution of individual betweenness and eigenvector centrality followed a power function, which is characteristic of scale-free networks. This property showed that some group members, mostly females, occupied a highly central position. Finally, the resilience analysis showed that the removal of the two most central females split the network into small subgroups and increased the network diameter. Critically, this study confirms that females appear to occupy more central positions than males in mandrill groups. Consequently, these females appear to be crucial for group cohesion and probably play a pivotal role in this species.

8.
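Domain recombination, together with sequence duplication and mutation, drives the evolution of life. This article applies complex network theory to compare the evolutionary patterns of protein domain sets in eukaryotes of differing complexity. The results show that a large share of domains (about 34%) is shared across genomes, whereas adjacent pairwise domain combinations are highly species-specific. The domain combination network is scale-free, and its power-law distribution and average connectivity reflect, to some extent, the complexity of the species. The network's clustering coefficient is far higher than that of random networks with the same degree distribution (P = 0.0096), and the clustering coefficient follows a power-law relationship with degree, indicating that the network is organized in a modular, hierarchical fashion. Finally, taking the human genome as an example, the relationship between network modules and function is explored in a preliminary way; domains within a network module show varying degrees of functional consistency.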

9.
Schnell S, Fortunato S, Roy S. Proteomics 2007, 7(6):961-964
In protein-protein interaction (PPI) networks certain topological properties appear to be recurrent: network maps are considered scale-free. It is possible that this topology is reflected in the protein structure. In this paper, we investigate the role of protein disorder in the network topology. We find that the disorder of a protein (or of its neighbors) is independent of its number of PPIs. This result suggests that protein disorder does not play a role in the scale-free architecture of protein networks.

10.
Global Versus Local Centrality in Evolution of Yeast Protein Network
It is shown here that in the yeast protein interaction network the global centrality measure (betweenness) depends on the protein evolutionary age (i.e., on historic contingency) more weakly than the local centrality measure (degree). This phenomenon is not observed in mutational duplication-and-divergence models. The network domains responsible for this difference deal with DNA/RNA information processing, regulation, and cell cycle. A selection vector can operate in these domains, which integrates the network activity and thus compensates for the process of mutational divergence.

11.
Most eukaryotic proteins are multi-domain proteins that are created from fusions of genes, deletions and internal repetitions. An investigation of such evolutionary events requires a method to find the domain architecture from which each protein originates. Therefore, we defined a novel measure, domain distance, which is calculated as the number of domains that differ between two domain architectures. Using this measure, the evolutionary events that distinguish a protein from its closest ancestor have been studied, and it was found that indels are more common than internal repetition and that the exchange of a domain is rare. Indels and repetitions are common at both the N- and C-termini, while they are rare between domains. The evolution of the majority of multi-domain proteins can be explained by the stepwise insertion of single domains, with the exception of repeats, which are sometimes duplicated as several domains in tandem. We show that domain distances agree with sequence similarity and semantic similarity based on gene ontology annotations. In addition, we demonstrate the use of the domain distance measure to build evolutionary trees. Finally, the evolution of multi-domain proteins is exemplified by a closer study of the evolution of two protein families, non-receptor tyrosine kinases and RhoGEFs.
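
One way to read "the number of domains that differ between two domain architectures" is as a multiset difference, sketched below. The abstract does not say whether domain order matters or how repeats are counted, so ignoring order and counting each repeat separately are assumptions of this sketch; the function names and the example architectures are hypothetical.

```python
from collections import Counter

def domain_distance(arch_a, arch_b):
    """Count domains present in one architecture but not the other.

    Architectures are lists of domain names. Order is ignored and repeated
    domains are counted individually (both are assumptions of this sketch).
    """
    count_a, count_b = Counter(arch_a), Counter(arch_b)
    only_in_a = count_a - count_b  # Counter subtraction drops non-positive counts
    only_in_b = count_b - count_a
    return sum(only_in_a.values()) + sum(only_in_b.values())

def closest_architecture(query, candidates):
    """Return the candidate architecture with the smallest domain distance."""
    return min(candidates, key=lambda arch: domain_distance(query, arch))

# Hypothetical architectures for illustration only.
query = ["SH3", "SH2", "Kinase"]
candidates = [["SH2", "Kinase"], ["Kinase"], ["PH", "SH3", "SH2", "Kinase", "RhoGEF"]]
nearest = closest_architecture(query, candidates)
```

Under these assumptions, a single-domain insertion or deletion gives a distance of 1, and a domain exchange gives a distance of 2.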

12.
The architecture of the network of protein–protein physical interactions in Saccharomyces cerevisiae is exposed through the combination of two complementary theoretical network measures, betweenness centrality and ‘Q-modularity’. The yeast interactome is characterized by well-defined topological modules connected via a small number of inter-module protein interactions. Should such topological inter-module connections turn out to constitute a form of functional coordination between the modules, we speculate that this coordination is occurring typically in a pairwise fashion, rather than by way of high-degree hub proteins responsible for coordinating multiple modules. The unique non-hub-centric hierarchical organization of the interactome is not reproduced by gene duplication-and-divergence stochastic growth models that disregard global selective pressures.

13.
In eukaryotes, the Src homology domain 3 (SH3) is a very important motif in signal transduction. SH3 domains recognize poly-proline-rich peptides and are involved in protein-protein interactions. Until now, the existence of SH3 domains has not been demonstrated in prokaryotes. However, the structure of the C-terminal domain of DtxR clearly shows that the fold of this domain is very similar to that of the SH3 domain. In addition, there is evidence that the C-terminal domain of DtxR binds to poly-proline-rich regions. Other bacterial proteins have domains that are structurally similar to the SH3 domain but whose functions are unknown or differ from that of the SH3 domain. The observed similarities between the structures of the C-terminal domain of DtxR and the SH3 domain constitute a perfect system to gain insight into their function and information about their evolution. Our results show that the C-terminal domain of DtxR shares a number of conserved key hydrophobic positions not recognizable from sequence comparison that might be responsible for the integrity of the SH3-like fold. Structural alignment of an ensemble of such domains from unrelated proteins shows a common structural core that seems to be conserved despite the lack of sequence similarity. This core constitutes the minimal requirements of protein architecture for the SH3-like fold.

14.
Protein evolution within a structural space
Understanding of the evolutionary origins of protein structures represents a key component of the understanding of molecular evolution as a whole. Here we seek to elucidate how the features of an underlying protein structural “space” might impact protein structural evolution. We approach this question using lattice polymers as a completely characterized model of this space. We develop a measure of structural comparison of lattice structures that is analogous to the one used to understand structural similarities between real proteins. We use this measure of structural relatedness to create a graph of lattice structures and compare this graph (in which nodes are lattice structures and edges are defined using structural similarity) to the graph obtained for real protein structures. We find that the graph obtained from all compact lattice structures exhibits a distribution of structural neighbors per node consistent with a random graph. We also find that subgraphs of 3500 nodes chosen either at random or according to physical constraints also represent random graphs. We develop a divergent evolution model based on the lattice space which produces graphs that, within certain parameter regimes, recapitulate the scale-free behavior observed in similar graphs of real protein structures.

15.
Domains are basic evolutionary units of proteins and most proteins have more than one domain. Advances in domain modeling and collection are making it possible to annotate a large fraction of known protein sequences by a linear ordering of their domains, yielding their architecture. Protein domain architectures link evolutionarily related proteins and underscore their shared functions. Here, we attempt to better understand this association by identifying the evolutionary pathways by which extant architectures may have evolved. We propose a model of evolution in which architectures arise through rearrangements of inferred precursor architectures and acquisition of new domains. These pathways are ranked using a parsimony principle, whereby scenarios requiring the fewest number of independent recombination events, namely fission and fusion operations, are assumed to be more likely. Using a data set of domain architectures present in 159 proteomes that represent all three major branches of the tree of life allows us to estimate the history of over 85% of all architectures in the sequence database. We find that the distribution of rearrangement classes is robust with respect to alternative parsimony rules for inferring the presence of precursor architectures in ancestral species. Analyzing the most parsimonious pathways, we find 87% of architectures to gain complexity over time through simple changes, among which fusion events account for 5.6 times as many architectures as fission. Our results may be used to compute domain architecture similarities, for example, based on the number of historical recombination events separating them. Domain architecture "neighbors" identified in this way may lead to new insights about the evolution of protein function.

16.
Protein networks, describing physical interactions as well as functional associations between proteins, have been unravelled for many organisms in the recent past. Databases such as STRING provide excellent resources for the analysis of such networks. In this contribution, we revisit the organisation of protein networks, particularly the centrality–lethality hypothesis, which hypothesises that nodes with higher centrality in a network are more likely to produce lethal phenotypes on removal, compared to nodes with lower centrality. We consider the protein networks of a diverse set of 20 organisms, with essentiality information available in the Database of Essential Genes, and assess the relationship between centrality measures and lethality. For each of these organisms, we obtained networks of high-confidence interactions from the STRING database, and computed network parameters such as degree, betweenness centrality, closeness centrality and pairwise disconnectivity indices. We observe that the networks considered here are predominantly disassortative. Further, we observe that essential nodes in a network have a significantly higher average degree and betweenness centrality, compared to the network average. Most previous studies have evaluated the centrality–lethality hypothesis for Saccharomyces cerevisiae and Escherichia coli; we here observe that the centrality–lethality hypothesis holds good for a large number of organisms, with certain limitations. Betweenness centrality may also be a useful measure to identify essential nodes, but measures like closeness centrality and pairwise disconnectivity are not significantly higher for essential nodes.
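
The comparison between essential nodes and the network average can be sketched for the degree measure as follows. This is an illustrative toy, not the study's pipeline: the adjacency data, the protein names, and both function names are hypothetical, and no significance test is performed.

```python
def mean_degree(adjacency, nodes=None):
    """Average number of neighbours over the given nodes (default: all nodes)."""
    selected = list(adjacency) if nodes is None else list(nodes)
    return sum(len(adjacency[n]) for n in selected) / len(selected)

def degree_enrichment(adjacency, essential):
    """Ratio of the essential nodes' mean degree to the network-wide mean."""
    return mean_degree(adjacency, essential) / mean_degree(adjacency)

# Hypothetical four-protein network; "p1" is marked essential.
toy_network = {
    "p1": {"p2", "p3", "p4"},
    "p2": {"p1", "p3"},
    "p3": {"p1", "p2"},
    "p4": {"p1"},
}
toy_ratio = degree_enrichment(toy_network, {"p1"})
```

A ratio above 1 means the essential set is better connected than average. Whether the difference is significant would need a statistical test, for example against randomly drawn node sets of the same size.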

17.
Here, we present an automatic assignment of potential cognate ligands to domains of enzymes in the CATH and SCOP protein domain classifications on the basis of structural data available in the wwPDB. This procedure involves two steps; firstly, we assign the binding of particular ligands to particular domains; secondly, we compare the chemical similarity of the PDB ligands to ligands in KEGG in order to assign cognate ligands. We find that use of the Enzyme Commission (EC) numbers is necessary to enable efficient and accurate cognate ligand assignment. The PROCOGNATE database currently has cognate ligand mapping for 3277 (4118) protein structures and 351 (302) superfamilies, as described by the CATH and (SCOP) databases, respectively. We find that just under half of all ligands are only and always bound by a single domain, with 16% bound by more than one domain and the remainder of the ligands showing a variety of binding modes. This finding has implications for domain recombination and the evolution of new protein functions. Domain architecture or context is also found to affect substrate specificity of particular domains, and we discuss example cases. The most popular PDB ligands are all found to be generic components of crystallisation buffers, highlighting the non-cognate ligand problem inherent in the PDB. In contrast, the most popular cognate ligands are all found to be universal cellular currencies of reducing power and energy such as NADH, FADH2 and ATP, respectively, reflecting the fact that the vast majority of enzymatic reactions utilise one of these popular co-factors. These ligands all share a common adenine ribonucleotide moiety, suggesting that many different domain superfamilies have converged to bind this chemical framework.

18.
Functional magnetic resonance data acquired in a task-absent condition (“resting state”) require new data analysis techniques that do not depend on an activation model. In this work, we introduce an alternative assumption- and parameter-free method based on a particular form of node centrality called eigenvector centrality. Eigenvector centrality attributes a value to each voxel in the brain such that a voxel receives a large value if it is strongly correlated with many other nodes that are themselves central within the network. Google's PageRank algorithm is a variant of eigenvector centrality. Thus far, other centrality measures - in particular “betweenness centrality” - have been applied to fMRI data using a pre-selected set of nodes consisting of several hundred elements. Eigenvector centrality is computationally much more efficient than betweenness centrality and does not require thresholding of similarity values, so it can be applied to thousands of voxels in a region of interest covering the entire cerebrum, which would have been infeasible using betweenness centrality. Eigenvector centrality can be used on a variety of different similarity metrics. Here, we present applications based on linear correlations and on spectral coherences between fMRI time series. This latter approach allows us to draw conclusions about connectivity patterns in different spectral bands. We apply this method to fMRI data in task-absent conditions where subjects were in states of hunger or satiety. We show that eigenvector centrality is modulated by the state that the subjects were in. Our analyses demonstrate that eigenvector centrality is a computationally efficient tool for capturing intrinsic neural architecture on a voxel-wise level.
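
The definition of eigenvector centrality given in this abstract can be made concrete with power iteration, sketched below. This is a generic textbook approach and not necessarily how the authors computed it. The sketch assumes a symmetric similarity matrix with non-negative entries, and the 3x3 matrix and function name are hypothetical.

```python
import math

def eigenvector_centrality(similarity, iterations=200, tol=1e-10):
    """Approximate the leading eigenvector of a similarity matrix.

    similarity is a square list of lists, assumed symmetric with
    non-negative entries. The result is normalised to unit length.
    """
    n = len(similarity)
    vec = [1.0 / math.sqrt(n)] * n
    for _ in range(iterations):
        # Each node's new score is the similarity-weighted sum of its neighbours' scores.
        nxt = [sum(similarity[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            raise ValueError("similarity matrix has no non-zero entries")
        nxt = [x / norm for x in nxt]
        if max(abs(a - b) for a, b in zip(nxt, vec)) < tol:
            return nxt
        vec = nxt
    return vec

# Hypothetical similarity values between three voxels (diagonal = self-similarity).
toy_similarity = [
    [1.0, 0.8, 0.6],
    [0.8, 1.0, 0.1],
    [0.6, 0.1, 1.0],
]
toy_centrality = eigenvector_centrality(toy_similarity)
```

In the toy matrix, voxel 0 is strongly similar to both other voxels and ends up with the largest score. Raw correlations can be negative, so a real pipeline would first need some mapping to non-negative values; the abstract does not specify one.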

19.
Social network analysis offers new tools to study the social structure of primate groups. We used social network analysis to investigate the cohesiveness of a grooming network in a captive chimpanzee group (N = 17) and the role that individuals may play in it. Using data from a year-long observation, we constructed an unweighted social network of preferred grooming interactions by retaining only those dyads that groomed above the group mean. This choice of criterion was validated by the finding that the properties of the unweighted network correlated with the properties of a weighted network (i.e. a network representing the frequency of grooming interactions) constructed from the same data. To investigate group cohesion, we tested the resilience of the unweighted grooming network to the removal of central individuals (i.e. individuals with high betweenness centrality). The network fragmented more after the removal of individuals with high betweenness centrality than after the removal of random individuals. Central individuals played a pivotal role in maintaining the network's cohesiveness, and we suggest that this may be a typical property of affiliative networks like grooming networks. We found that the grooming network correlated with kinship and age, and that individuals with higher social status occupied more central positions in the network. Overall, the grooming network showed a heterogeneous structure, yet did not exhibit scale-free properties similar to many other primate networks. We discuss our results in light of recent findings on animal social networks and chimpanzee grooming.
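
The two steps described here, thresholding dyads at the group mean and then testing resilience to the removal of a high-betweenness individual, can be sketched as below. The grooming counts, individual labels, and function names are hypothetical. Taking the mean over the listed dyads is an assumption, and betweenness is computed with Brandes' algorithm for unweighted graphs, which the abstract does not name.

```python
from collections import deque

def preferred_network(dyad_counts):
    """Keep only dyads whose grooming count exceeds the mean over all listed dyads."""
    mean = sum(dyad_counts.values()) / len(dyad_counts)
    adj = {}
    for (a, b), count in dyad_counts.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if count > mean:
            adj[a].add(b)
            adj[b].add(a)
    return adj

def betweenness(adj):
    """Brandes' algorithm for an unweighted, undirected graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        stack, queue = [], deque([s])
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return {v: b / 2 for v, b in bc.items()}  # each pair was counted twice

def count_components(adj, removed=()):
    """Number of connected components once the removed individuals are taken out."""
    removed, seen, components = set(removed), set(), 0
    for start in adj:
        if start in removed or start in seen:
            continue
        components += 1
        seen.add(start)
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in removed and w not in seen:
                    seen.add(w)
                    queue.append(w)
    return components

# Hypothetical grooming counts per dyad.
toy_counts = {("A", "C"): 10, ("B", "C"): 9, ("C", "D"): 8,
              ("D", "E"): 7, ("A", "B"): 1, ("B", "D"): 1}
toy_adj = preferred_network(toy_counts)
toy_bc = betweenness(toy_adj)
most_central = max(toy_bc, key=toy_bc.get)
fragments = count_components(toy_adj, removed=[most_central])
```

In the toy group, individual C lies on most shortest paths, and removing C breaks the network into three pieces, whereas removing a peripheral individual leaves it connected. The study compares this fragmentation against the removal of random individuals, which the sketch does not do.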

20.
Diversity and evolution of the thyroglobulin type-1 domain superfamily
Multidomain proteins are gaining increasing consideration for their puzzling, flexible utilization in nature. The presence of the characteristic thyroglobulin type-1 (Tg1) domain as a protein module in a variety of multicellular organisms suggests pivotal roles for this building block. To gain insight into the evolution of Tg1 domains, we performed searches of protein, expressed sequence tag, and genome databases. Tg1 domains were found to be Metazoa specific, and we retrieved a total of 170 Tg1 domain-containing protein sequences. Their architectures revealed a wide taxonomic distribution of proteins containing Tg1 domains followed or preceded by secreted protein, acidic, rich in cysteines (SPARC)-type extracellular calcium-binding domains. Other proteins contained lineage-specific domain combinations of peptidase inhibitory modules or domains with different biological functions. Phylogenetic analysis showed that Tg1 domains are highly conserved within protein structures, whereas insertion into novel proteins is followed by rapid diversification. Seven different basic types of protein architecture containing the Tg1 domain were identified in vertebrates. We examined the evolution of these protein groups by combining Tg1 domain phylogeny with additional analyses based on other characteristic domains. Testicans and secreted modular calcium binding protein (SMOCs) evolved from invertebrate homologs by introduction of vertebrate-specific domains, nidogen evolved by insertion of a Tg1 domain into a preexisting architecture, and the remaining four have unique architectures. Thyroglobulin, Trops, and the major histocompatibility complex class II-associated invariant chain are vertebrate specific, while an insulin-like growth factor-binding protein and nidogen were also identified in urochordates. Among vertebrates, we observed differences in protein repertoires, which result from gene duplication and domain duplication. 
Members of five groups have been characterized at the molecular level. All exhibit subtle differences in their specificities and function either as peptidase inhibitors (thyropins), substrates, or both. As far as the sequence is concerned, only a few conserved residues were identified. In combination with structural data, our analysis shows that the Tg1 domain fold is highly adaptive and comprises a relatively well-conserved core surrounded by highly variable loops that account for its multipurpose function in the animal kingdom.
