Similar literature
1.
Protein interaction networks are known to exhibit remarkable structures: scale-free, small-world, and modular. To explain the evolutionary processes that give rise to scale-free and small-world structures, preferential attachment and duplication-divergence models have been proposed as mathematical models. Modular structure is a further notable characteristic of these networks. How did protein interaction networks come to exhibit modularity during their evolution? Here, we propose a hypothesis of modularity in the evolution of the yeast protein interaction network based on molecular evolutionary evidence. We assigned yeast proteins to six evolutionary ages by constructing a phylogenetic profile. We found that almost half of the hub proteins are evolutionarily new. Examining the evolutionary processes of protein complexes, functional modules and topological modules, we also found that member proteins of these modules tend to appear in one or two evolutionary ages. Moreover, proteins in protein complexes and topological modules show significantly lower evolutionary rates than those not in these modules. Our results suggest a hypothesis of modularity in the evolution of the yeast protein interaction network as systems evolution.
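The age assignment described in the abstract above can be illustrated with a minimal sketch. The abstract states only that proteins were placed into six evolutionary ages using a phylogenetic profile; the clade labels, the `assign_age` helper, and the rule "oldest clade in which a homolog is detected" are assumptions made here for illustration, not the authors' procedure.

```python
from typing import Dict, List

# Hypothetical age classes, ordered from the broadest (oldest) clade to the
# narrowest (youngest). The source says there are six ages but does not name them.
AGE_CLASSES: List[str] = [
    "cellular organisms",
    "eukaryotes",
    "fungi",
    "ascomycetes",
    "Saccharomyces",
    "S. cerevisiae only",
]


def assign_age(profile: Dict[str, bool]) -> str:
    """Return the oldest age class in which a homolog is detected.

    `profile` maps a clade label to True when a homolog of the yeast protein
    was found at that taxonomic level (an assumed encoding). Proteins with no
    detected homolog fall into the youngest class.
    """
    for clade in AGE_CLASSES[:-1]:
        if profile.get(clade, False):
            return clade
    return AGE_CLASSES[-1]


if __name__ == "__main__":
    example = {"eukaryotes": False, "fungi": True}
    print(assign_age(example))
```

With ages assigned this way, one could tabulate how the members of each complex or module are distributed across ages, which is the kind of comparison the abstract reports.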

2.
Comparisons of proteins show that they evolve through the movement of domains. However, in many cases, the underlying mechanisms remain unclear. Here, we observed the movements of DNA recognition domains between non-orthologous proteins within a prokaryote genome. Restriction–modification (RM) systems, consisting of a sequence-specific DNA methyltransferase and a restriction enzyme, contribute to maintenance/evolution of genomes/epigenomes. RM systems limit horizontal gene transfer but are themselves mobile. We compared Type III RM systems in Helicobacter pylori genomes and found that target recognition domain (TRD) sequences are mobile, moving between different orthologous groups that occupy unique chromosomal locations. Sequence comparisons suggested that a likely underlying mechanism is movement through homologous recombination of similar DNA sequences that encode amino acid sequence motifs that are conserved among Type III DNA methyltransferases. Consistent with this movement, incongruence was observed between the phylogenetic trees of TRD regions and other regions in proteins. Horizontal acquisition of diverse TRD sequences was suggested by detection of homologs in other Helicobacter species and distantly related bacterial species. One of these RM systems in H. pylori was inactivated by insertion of another RM system that likely transferred from an oral bacterium. TRD movement represents a novel route for diversification of DNA-interacting proteins.

3.
4.
5.
The construction of fitness landscapes has broad implications for understanding molecular evolution, cellular epigenetic state, and protein structures. We studied the problem of constructing the fitness landscape of inverse protein folding, or protein design, with the aim of generating amino acid sequences that would fold into an a priori determined structural fold, which would enable engineering of novel or enhanced biochemistry. For this task, an effective fitness function should allow identification of correct sequences that would fold into the desired structure. In this study, we showed that a nonlinear fitness function for protein design can be constructed using a rectangular kernel with a basis set of proteins and decoys chosen a priori. The full landscape for a large number of protein folds can be captured using only 480 native proteins and 3,200 non-protein decoys via a finite Newton method. A blind test of a simplified version of the fitness function for sequence design was carried out to discriminate simultaneously 428 native sequences not homologous to any training proteins from 11 million challenging protein-like decoys. This simplified function correctly classified 408 native sequences (20 misclassifications, 95% correct rate), which outperforms several other statistical linear scoring functions and an optimized linear function. Our results further suggested that for the task of global sequence design of 428 selected proteins, the search space of protein shape and sequence can be effectively parametrized with a basis set of just about 3,680 carefully chosen proteins and decoys, and we showed in addition that the overall landscape is not overly sensitive to the specific choice of this set. Our results can be generalized to construct other types of fitness landscape.

6.
Type II restriction endonucleases (REs) are highly sequence-specific compared with other classes of nucleases. PD-(D/E)XK nucleases, initially represented by only type II REs, now comprise a large and extremely diverse superfamily of proteins and, although sharing a structurally conserved core, typically display little or no detectable sequence similarity except for the active site motifs. Sequence similarity can only be observed in methylases and a few isoschizomers. As a consequence, REs are classified according to combinations of functional properties rather than on the basis of genetic relatedness. New alignment matrices and classification systems based on structural core connectivity and cleavage mechanisms have been developed to characterize new REs and related proteins. REs recognizing more than 300 distinct specificities have been identified in the RE database (REBASE), but the need for newer specificities is still increasing due to advances in molecular biology and its applications. The enzymes have undergone constant evolution through structural changes in protein scaffolds, which include random mutations, homologous recombinations, insertions, and deletions of coding DNA sequences, whereas rational mutagenesis or directed evolution delivers protein variants with new functions in accordance with defined biochemical or environmental pressures. Redesigning through random mutation, addition or deletion of amino acids, methylation-based selection, synthetic molecules, combining recognition and cleavage domains from different enzymes, or combination with domains of additional functions changes the cleavage specificity or substrate preference and stability. There is a growing number of patents awarded for the creation of engineered REs with new and enhanced properties.

7.
8.
The regulators of complement activation (RCA) are critical to health and disease because their role is to ensure that a complement-mediated immune response to infection is proportionate and targeted. Each protein contains an uninterrupted array of from four to 30 examples of the very widely occurring complement control protein (CCP, or sushi) module. The CCP modules mediate specific protein-protein and protein-carbohydrate interactions that are key to the biological function of the RCA and, paradoxically, provide binding sites for numerous pathogens. Although structural and mutagenesis studies of CCP modules have addressed some aspects of molecular recognition, there have been no studies of the role of molecular dynamics in the interaction of CCP modules with their binding partners. NMR has now been used in the first full characterization of the backbone dynamics of CCP modules. The dynamics of two individual modules (the 16th of the 30 modules of complement receptor type 1 (CD35) and the N-terminal module of membrane cofactor protein (CD46)), as well as their solution structures, are compared. Although both examples share broadly similar three-dimensional structures, many structurally equivalent residues exhibit different amplitudes and timescales of local backbone motion. In each case, however, regions of the module surface implicated by mutagenesis as sites of interactions with other proteins include several mobile residues. This observation suggests further experiments to explore binding mechanisms and identify new binding sites.

9.
The complexity of modern biochemistry developed gradually on early Earth as new molecules and structures populated the emerging cellular systems. Here, we generate a historical account of the gradual discovery of primordial proteins, cofactors, and molecular functions using phylogenomic information in the sequence of 420 genomes. We focus on structural and functional annotations of the 54 most ancient protein domains. We show how primordial functions are linked to folded structures and how their interaction with cofactors expanded the functional repertoire. We also reveal that protocell membranes played a crucial role in early protein evolution and show that translation started with RNA and thioester cofactor-mediated aminoacylation. Our findings allow elaboration of an evolutionary model of early biochemistry that is firmly grounded in phylogenomic information and biochemical, biophysical, and structural knowledge. The model describes how primordial α-helical bundles stabilized membranes, how these were decorated by layered arrangements of β-sheets and α-helices, and how these arrangements became globular. Ancient forms of aminoacyl-tRNA synthetase (aaRS) catalytic domains and ancient non-ribosomal protein synthetase (NRPS) modules gave rise to primordial protein synthesis and the ability to generate a code for specificity in their active sites. These structures diversified, producing cofactor-binding molecular switches and barrel structures. Accretion of domains and molecules gave rise to modern aaRSs, NRPS, and ribosomal ensembles, first organized around novel emerging cofactors (tRNA and carrier proteins) and then more complex cofactor structures (rRNA). The model explains how the generation of protein structures acted as a scaffold for nucleic acids and resulted in crystallization of modern translation.

10.
Naturally occurring proteins comprise a special subset of all plausible sequences and structures selected through evolution. Simulating protein evolution with simplified and all-atom models has shed light on the evolutionary dynamics of protein populations, the nature of evolved sequences and structures, and the extent to which today's proteins are shaped by selection pressures on folding, structure and function. Extensive mapping of the native structure, stability and folding rate in sequence space using lattice proteins has revealed organizational principles of the sequence/structure map important for evolutionary dynamics. Evolutionary simulations with lattice proteins have highlighted the importance of fitness landscapes, evolutionary mechanisms, population dynamics and sequence space entropy in shaping the generic properties of proteins. Finally, evolutionary-like simulations with all-atom models, in particular computational protein design, have helped identify the dominant selection pressures on naturally occurring protein sequences and structures.

11.
Exon-intron structures of eukaryotic genes were examined closely in their relation to primary and tertiary structures of the proteins they encode. Specific attention was given to the introns of genes encoding proteins having no repeats in their amino acid sequences. Such introns have been shown to be located at sites corresponding to inter-domain or inter-module junctions of proteins identified in their three-dimensional structures. "Modules," compact structural units in globular domains of proteins, are identified by drawing a distance map. Intron positions are found to correspond to inter-module junctions in various proteins whose X-ray crystallographic data are available: the globin family, CEWL, ovomucoid, cytochrome c, ADH, and trypsin-like serine proteinases. The good correspondence between intron positions and inter-module junctions excludes a mechanism of random insertion of introns, because the probability of intron insertion at each inter-module junction is extraordinarily small. Intron positions have been very stable and well conserved during evolution. However, at some inter-module junctions no introns are found. Modules in small proteins having no core modules buried in their interior have a character suitable for recruitment through their assembly into a stable domain; one side of them is rich in hydrophobic residues and the other in hydrophilic residues. Functionally important residues are scattered on different modules in the proteins examined. Based on these observations, the role of modules in the precellular period was conjectured: some of them might be functionally active by themselves, but most modules might be only segments that could function as an active protein only in an assembly. The origin of introns might be traced back prior to the divergence of prokaryotes and eukaryotes.

12.
Many highly specialised parasites have adapted to their environments by simplifying different aspects of their morphology or biochemistry. One interesting case is the mitochondrion, which has been subject to strong reductive evolution in parallel in several different parasitic groups. In extreme cases, mitochondria have degenerated so much in physical size and functional complexity that they were not immediately recognised as mitochondria, and are now referred to as 'cryptic'. Cryptic mitochondrion-derived organelles can be classified as either hydrogenosomes or mitosomes. In nearly all cases they lack a genome and all organellar proteins are nucleus-encoded and expressed in the cytosol. The same is true for the majority of proteins in canonical mitochondria, where the proteins are directed to the organelle by specific targeting sequences (transit peptides) that are recognised by translocases in the mitochondrial membrane. In this review, we compare targeting sequences of different parasitic systems with highly reduced mitochondria and give an overview of how the import machinery has been modified in hydrogenosomes and mitosomes.

13.
Analysis of the evolution of paralogous genes in a genome is central to our understanding of genome evolution. Comparison of closely related bacterial genomes, which has provided clues as to how genome sequences evolve under natural conditions, would help in such an analysis. For the species Staphylococcus aureus, whole-genome sequences have been decoded for seven strains. We compared their DNA sequences to detect large genome polymorphisms and to deduce the mechanisms of genome rearrangement that formed each of them. We first compared strains N315 and Mu50, which make one of the most closely related strain pairs, at single-nucleotide resolution to catalogue all the middle-sized (more than 10 bp) to large genome polymorphisms such as indels and substitutions. These polymorphisms include two paralogous gene sets, one in a tandem paralogue gene cluster for toxins in a genomic island and the other in a ribosomal RNA operon. We also focused on two other tandem paralogue gene clusters and type I restriction-modification (RM) genes on the genomic islands. Then we reconstructed the rearrangement events responsible for these polymorphisms, in the paralogous genes and the others, with reference to the other five genomes. For the tandem paralogue gene clusters, we were able to infer sequences for homologous recombination generating the change in the repeat number. These sequences were conserved among the repeated paralogous units, likely because of their functional importance. The sequence specificity (S) subunit of type I RM systems showed recombination, likely at the homology of a conserved region, between the two variable regions for sequence specificity. We also noticed novel alleles in the ribosomal RNA operons and suggested a role for illegitimate recombination in their formation. These results revealed the importance of recombination involving long conserved sequences in the evolution of paralogous genes in the genome.

14.
Homing endonucleases are highly specific enzymes, capable of recognizing and cleaving unique DNA sequences in complex genomes. Since such DNA cleavage events can result in targeted allele-inactivation and/or allele-replacement in vivo, the ability to engineer homing endonucleases matched to specific DNA sequences of interest would enable powerful and precise genome manipulations. We have taken a step-wise genetic approach in analyzing individual homing endonuclease I-CreI protein/DNA contacts, and describe here novel interactions at four distinct target site positions. Crystal structures of two mutant endonucleases reveal the molecular interactions responsible for their altered DNA target specificities. We also combine novel contacts to create an endonuclease with the predicted target specificity. These studies provide important insights into engineering homing endonucleases with novel target specificities, as well as into the evolution of DNA recognition by this fascinating family of proteins.

15.
A survey of the already characterized and potential two-component protein sequences that exist in the nine complete and seven partially annotated cyanobacterial genome sequences available (as of May 2005) showed that the cyanobacteria possess a much larger repertoire of such proteins than most other bacteria. By analysis of the domain structure of the 1,171 potential histidine kinases, response regulators, and hybrid kinases, a wide variety of arrangements of about thirty different modules could be distinguished. The number of two-component proteins is related in part to genome size but also to the variety of physiological properties and ecophysiologies of the different strains. Groups of orthologues were defined, only a few of which have representatives with known physiological functions. Based on comparisons with the proposed phylogenetic relationships between the strains, the orthology groups show that (i) a few genes, some of them clustered on the genome, have been conserved by all species, suggesting their very ancient origin and an essential role for the corresponding proteins, and (ii) duplications, fusions, gene losses, insertions, and deletions, as well as domain shuffling, occurred during evolution, leading to the extant repertoire. These mechanisms are put in perspective with the different genetic properties that cyanobacteria have to achieve genome plasticity. This review is designed to serve as a basis for orienting further research aimed at defining the most ancient regulatory mechanisms and understanding how evolution worked to select and keep the most appropriate systems for cyanobacteria to develop in the quite different environments that they have successfully colonized.

16.
A universal scale of sequence conservation has recently been introduced, based on the omnipresence of protein sequence motifs across species. A large spectrum of short sequences, up to eight residues long, has been found to reside in all or almost all prokaryotic organisms. This discovery introduces a fundamentally novel quantitative approach to the problem of reconstructing the last universal common ancestor (LUCA). The most conserved elements (protein modules) with defined structures and sequences harboring the omnipresent motifs are outlined in this work by combining sequence and protein crystal structure data. The structurally conserved modules involve 25–30 amino acid residues and have the appearance of closed loops (loop-n-lock structures). This confirms earlier conclusions on the loop-fold structure of globular proteins. Many of the topmost conserved modules represent the primary closed loop prototypes that have been derived by whole-genome sequence searches. The data presented thus provide a basis for further developments toward the earliest stages of protein evolution. [Reviewing Editor: Dr. Martin Kreitman]

17.
Type II restriction endonucleases (ENases) have served as models for understanding the enzyme-based site-specific cleavage of DNA. Using the knowledge gained from the available crystal structures, a number of attempts have been made to alter the specificity of ENases by mutagenesis. The negative results of these experiments argue that the three-dimensional structure of DNA-ENase complexes does not provide enough information to enable us to understand the interactions between DNA and ENases in detail. This conclusion calls for alternative approaches to the study of structure-function relationships related to the specificity of ENases. Comparative analysis of ENases that manifest divergent substrate specificities, but at the same time are evolutionarily related to each other, may be helpful in this respect. The success of such studies depends to a great extent on the availability of related ENases that recognise partially overlapping nucleotide sequences (e.g. sets of enzymes that bind to recognition sites of increasing length). In this study we report the cloning and sequence analysis of genes for three Type IIS restriction-modification (RM) systems. The genes encoding the ENases Alw26I, Eco31I and Esp3I (whose recognition sequences are 5'-GTCTC-3', 5'-GGTCTC-3' and 5'-CGTCTC-3', respectively) and their accompanying methyltransferases (MTases) have been cloned and the deduced amino acid sequences of their products have been compared. In pairwise comparisons, the degree of sequence identity between Alw26I, Eco31I and Esp3I ENases is higher than that observed hitherto among ENases that recognise partially overlapping nucleotide sequences. The sequences of Alw26I, Eco31I and Esp3I also reveal identical mosaic patterns of sequence conservation, which supports the idea that they are evolutionarily related and suggests that they should show a high level of structural similarity. 
Thus these ENases represent very attractive models for the study of the molecular basis of variation in the specific recognition of DNA targets. The corresponding MTases are represented by proteins of unusual structural and functional organisation. Both M. Alw26I and M. Esp3I are represented by a single bifunctional protein, which is composed of an m(6)A-MTase domain fused to an m(5)C-MTase domain. In contrast, two separate genes encode the m(6)A-MTase and m(5)C-MTase in the Eco31I RM system. Among the known bacterial m(5)C-MTases, the m(5)C-MTases of M. Alw26I, M. Eco31I and M. Esp3I represent unique examples of the circular permutation of their putative target recognition domains together with the conserved motifs IX and X.
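The "partially overlapping" recognition sequences quoted in this abstract (5'-GTCTC-3', 5'-GGTCTC-3' and 5'-CGTCTC-3') can be made concrete with a short sketch. The three site strings come from the abstract; the helper names and the example DNA string are made up for illustration, and the search is a plain string scan over both strands rather than any tool used by the authors.

```python
from typing import Dict, List

# Recognition sequences as given in the abstract (written 5'->3').
SITES: Dict[str, str] = {
    "Alw26I": "GTCTC",
    "Eco31I": "GGTCTC",
    "Esp3I": "CGTCTC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str) -> List[int]:
    """Return sorted 0-based start positions where `site` occurs on either strand.

    Hits on the bottom strand are reported as the top-strand position of the
    site's reverse complement.
    """
    seq = seq.upper()
    hits: List[int] = []
    for pattern in {site, reverse_complement(site)}:
        start = seq.find(pattern)
        while start != -1:
            hits.append(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


if __name__ == "__main__":
    dna = "AAGGTCTCTTCGTCTCAA"  # made-up example sequence
    for enzyme, site in SITES.items():
        print(enzyme, site, find_sites(dna, site))
```

Because the five-base Alw26I site is contained in both six-base sites, every Eco31I or Esp3I site in a sequence is also an Alw26I site, which is what makes the three enzymes a nested set for comparing recognition specificity.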

19.
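Protein function prediction based on functional modules in protein networks   Total citations: 1 (self-citations: 0; citations by others: 1)
In the post-genomic era, now that gene sequences have been deciphered, the rapid development of systems biology experiments has produced large amounts of protein interaction data. Using these data to find functional modules and predict protein function is of great importance in functional genomics research. Breaking with the traditional clustering paradigm based on similarity between proteins, this work starts directly from protein functional groups, considers first-order and second-order interactions between functional groups, and proposes a modular clustering method (MCM) that clusters experimental data in order to predict the functions of unknown proteins within modules. The predictive ability and stability of the clustering results were analyzed using the hypergeometric distribution P-value method and by adding, deleting, and altering interactions. The results show that the modular clustering method has high prediction accuracy and coverage, as well as good fault tolerance and stability. In addition, modular clustering analysis yielded high-accuracy predictions for some proteins of unknown function, which should help guide biological experiments, and the algorithm is also generally applicable to other networks with similar structure.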
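The abstract names a hypergeometric-distribution P-value for assessing how well a module predicts function but gives no formula. A common formulation, assumed here rather than taken from the paper, is the upper-tail probability of seeing at least k proteins with a given annotation in a module of size n, when K of the N proteins in the network carry that annotation. All parameter names and the example numbers are illustrative.

```python
from math import comb


def hypergeom_pvalue(N: int, K: int, n: int, k: int) -> float:
    """Upper-tail hypergeometric probability P(X >= k).

    N: total number of proteins in the network
    K: proteins annotated with the function of interest
    n: number of proteins in the module
    k: annotated proteins observed in the module
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= n):
        raise ValueError("expected 0 <= K <= N, 0 <= n <= N and 0 <= k <= n")
    upper = min(K, n)
    # math.comb returns 0 when the lower index exceeds the upper one,
    # so impossible terms drop out of the sum automatically.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


if __name__ == "__main__":
    # Made-up example: 1000 proteins, 40 share an annotation,
    # and 6 of them fall in a module of 20 proteins.
    print(hypergeom_pvalue(N=1000, K=40, n=20, k=6))
```

A small P-value indicates that the annotation is over-represented in the module relative to chance, so the annotation becomes a candidate function for the module's unannotated members.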

20.
Advances in sequence genomics have resulted in an accumulation of a huge number of protein sequences derived from genome sequences. However, the functions of a large portion of them cannot be inferred based on the current methods of sequence homology detection to proteins of known functions. Three-dimensional structure can have an important impact in providing inference of molecular function (physical and chemical function) of a protein of unknown function. Structural genomics centers worldwide have been determining many 3-D structures of the proteins of unknown functions, and possible molecular functions of them have been inferred based on their structures. Combined with bioinformatics and enzymatic assay tools, the successful acceleration of the process of protein structure determination through high throughput pipelines enables the rapid functional annotation of a large fraction of hypothetical proteins. We present a brief summary of the process we used at the Berkeley Structural Genomics Center to infer molecular functions of proteins of unknown function.
