Similar articles
 Found 20 similar articles; search time 484 ms
1.
Myxozoans are enigmatic endoparasitic organisms sharing morphological features with bilateria, protists and cnidarians. This, coupled with their highly divergent gene sequences, has greatly obscured their phylogenetic affinities. Here we report the sequencing and characterization of a minicollagen homologue (designated Tb-Ncol-1) in the myxozoan Tetracapsuloides bryosalmonae. Minicollagens are phylum-specific genes encoding cnidarian nematocyst proteins. Sequence analysis revealed a cysteine-rich domain (CRD) architecture and genomic organization similar to group 1 minicollagens. Homology modelling predicted similar three-dimensional structures to Hydra CRDs despite deviations from the canonical pattern of group 1 minicollagens. The discovery of this minicollagen gene strongly supports myxozoans as cnidarians that have radiated as endoparasites of freshwater, marine and terrestrial hosts. It also reveals novel protein sequence variation of relevance to understanding the evolution of nematocyst complexity, and indicates a molecular/morphological link between myxozoan polar capsules and cnidarian nematocysts. Our study is the first to illustrate the power of using genes related to a taxon-specific novelty for phylogenetic inference within the Metazoa, and it exemplifies how the evolutionary relationships of other metazoans characterized by extreme sequence divergence could be similarly resolved.  相似文献   

2.
Protein folding involves the formation of secondary structural elements from the primary sequence and their association with tertiary assemblies. The relation of this primary sequence to a specific folded protein structure remains a central question in structural biology. An increasing body of evidence suggests that variations in homologous sequence ranging from point mutations to substantial insertions or deletions can yield stable proteins with markedly different folds. Here we report the structural characterization of domain IV (D4) and ΔD4 (polypeptides with 222 and 160 amino acids, respectively) that differ by virtue of an N-terminal deletion of 62 amino acids (28% of the overall D4 sequence). The high-resolution crystal structures of the monomeric D4 and the dimeric ΔD4 reveal substantially different folds despite an overall conservation of secondary structure. These structures show that the formation of tertiary structures, even in extended polypeptide sequences, can be highly context dependent, and they serve as a model for structural plasticity in protein isoforms.  相似文献   

3.
The proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. These protein domains are essentially the main components of globular proteins and are the most principal level at which protein function and protein interactions can be understood. An important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. Changes in this information may bring about new folds, functions and protein architectures. With the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. Such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. In turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. Additionally, these studies can be used for the design of new and optimized protein domains for therapy. In this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks.  相似文献   

4.
The generation of biological complexity by the acquisition of novel modular units is an emerging concept in evolutionary dynamics. Here, we review the coordinate evolution of cnidarian nematocysts, secretory organelles used for capture of prey, and of minicollagens, proteins constituting the nematocyst capsule. Within the Cnidaria there is an increase in nematocyst complexity from Anthozoa to Medusozoa and a parallel increase in the number and complexity of minicollagen proteins. This complexity is primarily manifest in a diversification of N- and C-terminal cysteine-rich domains (CRDs) involved in minicollagen polymerization. We hypothesize that novel CRD motifs alter minicollagen networks, leading to novel capsule structures and nematocyst types.  相似文献   

5.
Detection of similarity is particularly difficult for small proteins and thus connections between many of them remain unnoticed. Structure and sequence analysis of several metal-binding proteins reveals unexpected similarities in structural domains classified as different protein folds in SCOP and suggests unification of seven folds that belong to two protein classes. The common motif, termed treble clef finger in this study, forms the protein structural core and is 25-45 residues long. The treble clef motif is assembled around the central zinc ion and consists of a zinc knuckle, loop, beta-hairpin and an alpha-helix. The knuckle and the first turn of the helix each incorporate two zinc ligands. Treble clef domains constitute the core of many structures such as ribosomal proteins L24E and S14, RING fingers, protein kinase cysteine-rich domains, nuclear receptor-like fingers, LIM domains, phosphatidylinositol-3-phosphate-binding domains and His-Me finger endonucleases. The treble clef finger is a uniquely versatile motif adaptable for various functions. This small domain with a 25 residue structural core can accommodate eight different metal-binding sites and can have many types of functions from binding of nucleic acids, proteins and small molecules, to catalysis of phosphodiester bond hydrolysis. Treble clef motifs are frequently incorporated in larger structures or occur in doublets. Present analysis suggests that the treble clef motif defines a distinct structural fold found in proteins with diverse functional properties and forms one of the major zinc finger groups.  相似文献   

6.
The dramatically increasing number of new protein sequences arising from genomics and proteomics creates a need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins, and to infer molecular functions of the proteins based on the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at the Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information complements sequence information in deducing the functions of proteins with unknown molecular functions. Two of the major premises of structural genomics are to discover a complete repertoire of protein folds in nature and to find molecular functions of the proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html).
Although infrastructure building and technology development are still the main focus of structural genomics programs [1–6], a considerable number of protein structures have already been produced, some of them coming directly out of semi-automated structure determination pipelines [6–10]. The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.  相似文献   

7.
Typically, protein spatial structures are more conserved in evolution than amino acid sequences. However, the recent explosion of sequence and structure information accompanied by the development of powerful computational methods led to the accumulation of examples of homologous proteins with globally distinct structures. Significant sequence conservation, local structural resemblance, and functional similarity strongly indicate evolutionary relationships between these proteins despite pronounced structural differences at the fold level. Several mechanisms such as insertions/deletions/substitutions, circular permutations, and rearrangements in beta-sheet topologies account for the majority of detected structural irregularities. The existence of evolutionarily related proteins that possess different folds brings new challenges to the homology modeling techniques and the structure classification strategies and offers new opportunities for protein design in experimental studies.  相似文献   

8.
Accurately assigning folds for divergent protein sequences is a major obstacle to structural studies. Herein, we outline an effective method for fold recognition using sets of PSSMs, each of which is constructed for different protein folds. Our analyses demonstrate that FSL (Fold-specific Position Specific Scoring Matrix Libraries) can predict/relate structures given only their amino acid sequences of highly divergent proteins. This ability to detect distant relationships is dependent on low-identity sequence alignments obtained from FSL. Results from our experiments demonstrate that FSL perform well in recognizing folds from the "twilight-zone" SABmark dataset. Further, this method is capable of accurate fold prediction in newly determined structures. We suggest that by building complete PSSM libraries for all unique folds within the Protein Database (PDB), FSL can be used to rapidly and reliably annotate a large subset of protein folds at proteomic level. The related programs and fold-specific PSSMs for our FSL are publicly available at: http://ccp.psu.edu/download/FSLv1.0/.  相似文献   
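
The idea of scoring a query sequence against a library of fold-specific PSSMs can be illustrated with a minimal sketch. This is not the FSL implementation: the function names (`build_pssm`, `best_window_score`, `assign_fold`), the uniform background frequencies, the pseudocount, and the ungapped sliding-window scoring are all simplifying assumptions made here for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_seqs, pseudocount=1.0):
    """Build a log-odds PSSM from equal-length, gap-free aligned sequences.

    Assumes a uniform background frequency (1/20) for every residue.
    """
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(aligned_seqs[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for seq in aligned_seqs:
            counts[seq[pos]] += 1.0
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def best_window_score(pssm, query):
    """Slide the PSSM along the query and return the best ungapped score."""
    width = len(pssm)
    if len(query) < width:
        return float("-inf")
    return max(
        sum(pssm[i][query[start + i]] for i in range(width))
        for start in range(len(query) - width + 1)
    )


def assign_fold(library, query):
    """Return the best-scoring fold name and the score for every fold."""
    scores = {fold: best_window_score(p, query) for fold, p in library.items()}
    return max(scores, key=scores.get), scores


# Toy library with two hypothetical folds.
library = {
    "foldA": build_pssm(["ACDE", "ACDE", "ACDF"]),
    "foldB": build_pssm(["WWYY", "WWYF", "WYYY"]),
}
best_fold, fold_scores = assign_fold(library, "GGACDEGG")
print(best_fold)
```

A production method would typically derive the matrices from iterative profile searches and allow gapped alignment; the sketch only shows how a per-fold matrix library turns fold assignment into a "highest-scoring profile wins" decision.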

9.
Yo Matsuo, Ken Nishikawa. Proteins 1995, 23(3):370-375
A protein fold recognition method was tested by the blind prediction of the structures of a set of proteins. The method evaluates the compatibility of an amino acid sequence with a three-dimensional structure using four evaluation functions: side-chain packing, solvation, hydrogen-bonding, and local conformation functions. The structures of 14 proteins containing 19 sequences were predicted. The predictions were compared with the experimental structures. The experimental results showed that 9 of the 19 target sequences have known folds or portions of known folds. Among them, the folds of Klebsiella aerogenes urease β subunit (KAUB) and pyruvate phosphate dikinase domain 4 (PPDK4) were successfully recognized; our method predicted that KAUB and PPDK4 would adopt the folds of macromomycin (Ig-fold) and phosphoribosylanthranilate isomerase:indoleglycerol-phosphate synthase (TIM barrel), respectively, and the experimental structure revealed that they actually adopt the predicted folds. The predictions for the other targets were not successful, but they often gave secondary structural patterns similar to those of the experimental structures. © 1995 Wiley-Liss, Inc.

10.
Stecrisp from Trimeresurus stejnegeri snake venom belongs to a family of cysteine-rich secretory proteins (CRISP) that have various functions related to sperm-egg fusion, innate host defense, and the blockage of ion channels. Here we present the crystal structure of stecrisp refined to 1.6-angstrom resolution. It shows that stecrisp contains three regions, namely a PR-1 (pathogenesis-related proteins of group1) domain, a hinge, and a cysteine-rich domain (CRD). A conformation of solvent-exposed and -conserved residues (His60, Glu75, Glu96, and His115) in the PR-1 domain similar to that of their counterparts in homologous structures suggests they may share some molecular mechanism. Three flexible loops of hypervariable sequence surrounding the possible substrate binding site in the PR-1 domain show an evident difference in homologous structures, implying that a great diversity of species- and substrate-specific interactions may be involved in recognition and catalysis. The hinge is fixed by two crossed disulfide bonds formed by four of ten characteristic cysteines in the carboxyl-terminal region and is important for stabilizing the N-terminal PR-1 domain. Spatially separated from the PR-1 domain, CRD possesses a similar fold with two K+ channel inhibitors (Bgk and Shk). Several candidates for the possible functional sites of ion channel blocking are located in a solvent-exposed loop in the CRD. The structure of stecrisp will provide a prototypic architecture for a structural and functional exploration of the diverse members of the CRISP family.  相似文献   

11.
Overview of structural genomics: from structure to function   Cited by: 7 (self-citations: 0; cited by others: 7)
The unprecedented increase in the number of new protein sequences arising from genomics and proteomics highlights directly the need for methods to rapidly and reliably determine the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds, thereby providing three-dimensional portraits for all proteins in a living organism, and to infer molecular functions of the proteins. The goal of obtaining protein structures on a genomic scale has motivated the development of high-throughput technologies for macromolecular structure determination, which have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional and evolutionary relationships that were hidden at the sequence level.

12.
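Proteins are highly diverse in sequence, structure, and function. Numerous studies have shown that a protein's structure is related to the order of its amino acid sequence, and that the local amino acid sequence environment has some influence on protein structure. This paper proposes a new method for predicting protein structural class based on the statistical torsion-angle preferences of amino acids in 5-mers; the method predicts the structural class from the statistical torsion-angle preferences of the central amino acid of 5-mers in the PDB database. The new method can, by means of computer...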

13.
Vav family proteins are members of the Dbl family of guanine nucleotide exchange factors and activators of Rho family small GTPases. In addition to the Dbl homology (DH) domain important for guanine nucleotide exchange factor catalytic function, all Dbl family proteins contain an adjacent pleckstrin homology (PH) domain that serves to regulate DH domain activity. Although the role of the PH domain in Vav function has been evaluated extensively, its precise role and whether it serves a distinct role in different Vav proteins remain unresolved. Additionally, the precise role of an adjacent cysteine-rich domain (CRD) in regulating DH domain function is also unclear. In this study, we evaluated the contribution of these putative protein-protein or protein-lipid interaction domains to Vav signaling and transforming activity. In contrast to previous observations, we found that the PH domain is critical for Vav transforming activity. Similarly, the CRD was also essential and served a function distinct from that of the PH domain. Although mutation of either domain reduced Vav membrane association, addition of plasma membrane targeting sequences to either the CRD or PH domain mutant proteins did not restore Vav transforming activity. This result contrasts with other Dbl family proteins, where a membrane targeting sequence alone was sufficient to restore the loss of function caused by mutation of the PH domain. Furthermore, green fluorescent protein fusion proteins containing the PH domain or CRD, or both, failed to target to the plasma membrane, suggesting that these two domains also serve regulatory functions independent of promoting membrane localization. Finally, we found that phosphatidylinositol 3-kinase activation may promote Vav membrane association via phosphatidylinositol 3,4,5-triphosphate binding to the PH domain.  相似文献   

14.
Worldwide structural genomics projects are increasing structure coverage of sequence space but have not significantly expanded the protein structure space itself (i.e., the number of unique structural folds) since 2007. Discovering new structural folds experimentally by directed evolution and random recombination of secondary-structure blocks has also proved rarely successful. Meanwhile, previous computational efforts at large-scale mapping of protein structure space have been limited to simple model proteins and have led to an inconclusive answer on the completeness of the existing observed protein structure space. Here, we build novel protein structures by extending naturally occurring circular (single-loop) permutation to multiple loop permutations (MLPs). These structures are clustered by a structural similarity measure called TM-score. The computational technique allows us to produce different structural clusters on the same naturally occurring, packed, stable core but with alternatively connected secondary-structure segments. A large-scale MLP of 2936 domains from the structural classification of protein domains reproduces those existing structural clusters (63%) mostly as hubs for many nonredundant sequences and illustrates newly discovered novel clusters as islands adopted by only a few sequences. Results further show that there exist a significant number of novel, potentially stable clusters for medium-size or large-size single-domain proteins, in particular those with > 100 amino acid residues, that are either not yet adopted by nature or adopted only by a few sequences. This study suggests that MLP provides a simple yet highly effective tool for engineering and design of novel protein structures (including naturally knotted proteins). The implication of recovering new-fold targets from critical assessment of structure prediction techniques (CASP) by MLP on template-based structure prediction is also discussed.
Our MLP structures are available for download at the publication page of the Web site http://sparks.informatics.iupui.edu.  相似文献   
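
For readers unfamiliar with the TM-score used for clustering above, the following is a minimal sketch of the commonly cited formula, TM = (1/L) Σ 1/(1 + (d_i/d0)²) with d0 = 1.24·(L − 15)^(1/3) − 1.8, where L is the target length and d_i are distances between aligned residue pairs. It assumes the superposition and alignment are already given (the real measure maximizes over superpositions), and the lower bound of 0.5 Å on d0 for short chains is an assumption added here to keep the expression well defined. The function names are illustrative only.

```python
def d0_for_length(target_length):
    """Length-dependent distance scale d0 (in angstroms)."""
    if target_length <= 15:
        return 0.5  # assumed floor for very short chains
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score(aligned_distances, target_length):
    """TM-score for a fixed superposition.

    aligned_distances: distances (angstroms) between aligned residue pairs;
    unaligned target residues simply contribute nothing to the sum.
    """
    d0 = d0_for_length(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in aligned_distances)
    return total / target_length


# A perfect match over half of a 100-residue target.
print(tm_score([0.0] * 50, 100))
```

Because the sum is normalized by the target length rather than by the number of aligned pairs, partial matches are penalized, which is one reason the score is convenient for grouping structures into clusters.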

15.
Evolution of Chitin-Binding Proteins in Invertebrates   Cited by: 11 (self-citations: 0; cited by others: 11)
Analysis of a group of invertebrate proteins, including chitinases and peritrophic matrix proteins, reveals the presence of chitin-binding domains that share significant amino acid sequence similarity. The data suggest that these domains evolved from a common ancestor which may be a protein containing a single chitin-binding domain. The duplication and transposition of this chitin-binding domain may have contributed to the functional diversification of chitin-binding proteins. Sequence comparisons indicated that invertebrate and plant chitin binding domains do not share significant amino acid sequence similarity, suggesting that they are not coancestral. However, both the invertebrate and the plant chitin-binding domains are cysteine-rich and have several highly conserved aromatic residues. In plants, cysteines have been elucidated in maintaining protein folding and aromatic amino acids in interacting with saccharides [Wright HT, Sanddrasegaram G, Wright CS (1991) J Mol Evol 33:283–294]. It is likely that these residues perform similar functions in invertebrates. We propose that the invertebrate and the plant chitin-binding domains share similar mechanisms for folding and saccharide binding and that they evolved by convergent evolution. Furthermore, we propose that the disulfide bonds and aromatic residues are hallmarks for saccharide-binding proteins. Received: 2 March 1998 / Accepted: 17 July 1998  相似文献   

16.
Frenkel ZM, Trifonov EN. Proteins 2007, 67(2):271-284
A new method is proposed to reveal apparent evolutionary relationships between protein fragments with similar 3D structures by finding "intermediate" sequences in the proteomic database. Instead of looking for homologies and intermediates for a whole protein domain, we build a chain of intermediate short sequences, which allows one to link similar structural modules of proteins belonging to the same or different families. Several such chains of intermediates can be combined into an evolutionary tree of structural protein modules. All calculations were made for protein fragments of 20 aa residues. Three evolutionary trees for different module structures are described. The aim of the paper is to introduce the new method and to demonstrate its potential for protein structural predictions. The approach also opens new perspectives for protein evolution studies.  相似文献   
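
The notion of linking two fragments through a chain of "intermediate" sequences can be sketched as a shortest-path search. This is only an illustration under stated assumptions, not the authors' procedure: the linking criterion (fraction of identical positions at or above `min_identity`), the breadth-first search, and the names `identity` and `find_chain` are all hypothetical choices made here.

```python
from collections import deque


def identity(a, b):
    """Fraction of identical positions between two equal-length fragments."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def find_chain(start, goal, database, min_identity=0.6):
    """Breadth-first search for a shortest chain start -> ... -> goal.

    Two fragments are considered linked when they have the same length and
    their identity is at least min_identity (an assumed criterion).
    Returns the chain as a list, or None if no chain exists.
    """
    candidates = sorted(set(database) | {goal})
    parents = {start: None}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            chain = []
            while current is not None:
                chain.append(current)
                current = parents[current]
            return chain[::-1]
        for frag in candidates:
            if (
                frag not in parents
                and len(frag) == len(current)
                and identity(current, frag) >= min_identity
            ):
                parents[frag] = current
                queue.append(frag)
    return None


# Toy 5-residue fragments; the study itself works with 20-residue fragments.
print(find_chain("AAAAA", "CCCCC", ["AAACC", "ACCCC"], min_identity=0.5))
```

Several such chains that share fragments could then be merged into a tree, which is the step the abstract describes as building an evolutionary tree of structural modules.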

17.
The question of whether novel, structurally different protein folds might have arisen from existing ones is crucial to understanding protein evolution. Recent work on cysteine-rich domains in Hydra proteins illuminates how evolutionary transitions between dramatically different structures might occur.  相似文献   

18.
19.
Baoqiang Cao, Ron Elber. Proteins 2010, 78(4):985-1003
We investigate small sequence adjustments (of one or a few amino acids) that induce large conformational transitions between distinct and stable folds of proteins. Such transitions are intriguing from evolutionary and protein‐design perspectives. They make it possible to search for ancient protein structures or to design protein switches that flip between folds and functions. A network of sequence flow between protein folds is computed for representative structures of the Protein Data Bank. The computed network is dense: on average, each structure is connected to tens of other folds. Proteins that attract sequences from a higher than expected number of neighboring folds are more likely to be enzymes and to have an alpha/beta fold. The large number of connections between folds may reflect the need of enzymes to adjust their structures for alternative substrates. The network of the Cro family is discussed, and we speculate that capacity is an important factor (but not the only one) that determines protein evolution. The experimentally observed flip from an all alpha to an alpha + beta fold is examined by the network tools. A kinetic model for the transition of sequences between the folds (with only protein stability in mind) is proposed. Proteins 2010. © 2009 Wiley‐Liss, Inc.

20.
We present an approach that is able to detect native folds amongst a large number of non-native conformations. The method is based on the compilation of potentials of mean force of the interactions of the C beta atoms of all amino acid pairs from a database of known three-dimensional protein structures. These potentials are used to calculate the conformational energy of amino acid sequences in a number of different folds. For a substantial number of proteins we find that the conformational energy of the native state is lowest amongst the alternatives. Exceptions are proteins containing large prosthetic groups, Fe-S clusters or polypeptide chains that do not adopt globular folds. We discuss briefly potential applications in various fields of protein structural research.  相似文献   
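
The compilation of potentials of mean force from a structure database can be sketched with the inverse Boltzmann relation, E_ab(r) = −kT ln[f_ab(r) / f_ref(r)], where f_ab is the distance distribution for residue pair (a, b) and f_ref is the distribution over all pairs. The sketch below is a simplified illustration, not the published procedure: it ignores sequence separation, uses a plain pseudocount for sparse data, reports energies in units of kT, and all function names and parameters are assumptions introduced here.

```python
import math
from collections import defaultdict

KT = 1.0  # energies are reported in units of kT (assumption)


def derive_potentials(observations, n_bins, bin_width, pseudocount=1.0):
    """Derive binned pair potentials from (aa1, aa2, distance) observations."""
    pair_counts = defaultdict(lambda: [pseudocount] * n_bins)
    ref_counts = [pseudocount] * n_bins
    for a, b, dist in observations:
        k = int(dist // bin_width)
        if dist < 0 or k >= n_bins:
            continue  # outside the tabulated distance range
        pair_counts[tuple(sorted((a, b)))][k] += 1
        ref_counts[k] += 1
    ref_total = sum(ref_counts)
    potentials = {}
    for pair, counts in pair_counts.items():
        total = sum(counts)
        potentials[pair] = [
            -KT * math.log((c / total) / (r / ref_total))
            for c, r in zip(counts, ref_counts)
        ]
    return potentials


def conformational_energy(potentials, contacts, bin_width):
    """Sum pair energies over the (aa1, aa2, distance) contacts of one fold."""
    energy = 0.0
    for a, b, dist in contacts:
        pair = tuple(sorted((a, b)))
        k = int(dist // bin_width)
        if dist >= 0 and pair in potentials and k < len(potentials[pair]):
            energy += potentials[pair][k]
    return energy


# Toy database: L-V pairs are mostly close, K-K pairs mostly far apart.
database = (
    [("L", "V", 5.0)] * 8 + [("L", "V", 9.0)] * 2
    + [("K", "K", 5.0)] * 2 + [("K", "K", 9.0)] * 8
)
pots = derive_potentials(database, n_bins=2, bin_width=6.0)
native = [("L", "V", 5.0), ("K", "K", 9.0)]
decoy = [("L", "V", 9.0), ("K", "K", 5.0)]
print(conformational_energy(pots, native, 6.0) < conformational_energy(pots, decoy, 6.0))
```

Threading one sequence through many candidate folds and ranking them by this energy is the sense in which the native fold is expected to come out lowest.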
