Similar literature
 20 similar articles found
1.
Proteins that contain similar structural elements often have analogous functions regardless of the degree of sequence similarity or structure connectivity in space. In general, protein structure comparison (PSC) provides a straightforward methodology for biologists to determine critical aspects of structure and function. Here, we developed a novel PSC technique based on angle-distance image (A-D image) transformation and matching, which is independent of sequence similarity and connectivity of secondary structure elements (SSEs). An A-D image is constructed by utilizing protein secondary structure information. According to various types of SSEs, the mutual SSE pairs of the query protein are classified into three different types of sub-images. Subsequently, corresponding sub-images between query and target protein structures are compared using modified cross-correlation approaches to identify the similarity of various patterns. Structural relationships among proteins are displayed by hierarchical clustering trees, which facilitate the establishment of the evolutionary relationships between structure and function of various proteins. Four standard testing datasets and one newly created dataset were used to evaluate the proposed method. The results demonstrate that proteins from these five datasets can be categorized in conformity with their spatial distribution of SSEs. Moreover, for proteins with low sequence identity that share high structure similarity, the proposed algorithms provide an efficient and effective method for structural comparison.
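The sub-image comparison step in this abstract can be illustrated with a plain zero-mean normalized cross-correlation. This is only a minimal sketch: the abstract speaks of "modified" cross-correlation without giving details, so the unmodified textbook form is swapped in here, and the function name and the small example grids are hypothetical stand-ins for real A-D sub-images.

```python
import math


def normalized_cross_correlation(a, b):
    """Zero-mean normalized cross-correlation of two equally sized 2D grids.

    Assumption: plain textbook form, not the authors' modified variant.
    """
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    if len(flat_a) != len(flat_b):
        raise ValueError("images must have the same number of cells")
    mean_a = sum(flat_a) / len(flat_a)
    mean_b = sum(flat_b) / len(flat_b)
    # Numerator: covariance-like sum of centred products.
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(flat_a, flat_b))
    # Denominator: product of the centred norms.
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in flat_a)
        * sum((y - mean_b) ** 2 for y in flat_b)
    )
    return num / den if den else 0.0


# Hypothetical 3x3 "sub-images" (cell counts of SSE pairs per angle/distance bin).
query = [[0, 2, 1], [3, 0, 0], [0, 1, 4]]
target_similar = [[0, 4, 2], [6, 0, 0], [0, 2, 8]]  # same pattern, doubled counts
target_other = [[4, 0, 0], [0, 1, 3], [2, 0, 0]]    # different pattern

score_similar = normalized_cross_correlation(query, target_similar)
score_other = normalized_cross_correlation(query, target_other)
```

Because the score is normalized, a target whose pattern matches the query up to a constant factor scores close to 1, while a differently distributed pattern scores lower.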

2.
Proteins form arguably the most significant link between genotype and phenotype. Understanding the relationship between protein sequence and structure, and applying this knowledge to predict function, is difficult. One way to investigate these relationships is by considering the space of protein folds and how one might move from fold to fold through similarity, or potential evolutionary relationships. The many individual characterisations of fold space presented in the literature can tell us a lot about how well the current Protein Data Bank represents protein fold space, how convergence and divergence may affect protein evolution, how proteins affect the whole of which they are part, and how proteins themselves function. A synthesis of these different approaches and viewpoints seems the most likely way to further our knowledge of protein structure evolution and thus facilitate improved protein structure design and prediction.

3.
The genes encoding the leucine binding proteins in E. coli have been cloned and their DNA sequences have been determined. One of the binding proteins (LIV-BP) binds leucine, isoleucine, valine, threonine, and alanine, whereas the other (LS-BP) binds only the D- and L-isomers of leucine. These proteins bind their solutes as they enter the periplasm, then interact with three membrane components, livH, livG, and livM, to achieve the translocation of the solute across the bacterial cell membrane. Another feature of the binding proteins is that they must be secreted into the periplasmic space where they carry out their function. The amino acid sequences of the two binding proteins are 80% homologous, indicating that they are the products of an ancestral gene duplication. Because of these characteristics of the leucine binding proteins, we are using them as models for studying the relationships between protein structure and function.

4.
Phylogeny as a guide to structure and function of membrane transport proteins   Total citations: 10, self-citations: 0, citations by others: 10
Protein phylogeny, based on primary amino acid sequence relatedness, reflects the evolutionary process and therefore provides a guide to structure, mechanism and function. Any two proteins that are related by common descent are expected to exhibit similar structures and functions to a degree proportional to the degree of their sequence similarity; but two independently evolving proteins should not. This principle provides the impetus to define protein phylogenetic relationships and interrelate families when possible. In this mini-review, we summarize the computational approaches and criteria we use to establish common evolutionary origin. We apply these tools to define distant superfamily relationships between several previously recognized transport protein families. In some cases, available structural and functional data are evaluated in order to substantiate our claim that molecular phylogeny provides a reliable guide to protein structure and function.

5.
6.
Naturally occurring proteins comprise a special subset of all plausible sequences and structures selected through evolution. Simulating protein evolution with simplified and all-atom models has shed light on the evolutionary dynamics of protein populations, the nature of evolved sequences and structures, and the extent to which today's proteins are shaped by selection pressures on folding, structure and function. Extensive mapping of the native structure, stability and folding rate in sequence space using lattice proteins has revealed organizational principles of the sequence/structure map important for evolutionary dynamics. Evolutionary simulations with lattice proteins have highlighted the importance of fitness landscapes, evolutionary mechanisms, population dynamics and sequence space entropy in shaping the generic properties of proteins. Finally, evolutionary-like simulations with all-atom models, in particular computational protein design, have helped identify the dominant selection pressures on naturally occurring protein sequences and structures.

7.
Background

Membrane proteins play important roles in cell survival and cell communication, as they function as transporters, receptors, anchors and enzymes. They are also potential targets for drugs that block receptors or inhibit enzymes related to diseases. Although the number of known structures of membrane proteins is still small relative to the size of the proteome as a whole, many new membrane protein structures have been determined recently.

Scope of the article

We compared and analyzed the widely used membrane protein databases, mpstruc, Orientations of Proteins in Membranes (OPM), and PDBTM, as well as the extended dataset of mpstruc based on sequence similarity, the PDB structures whose classification field indicates that they are “membrane proteins” and the proteins with Structural Classification of Proteins (SCOP) class-f domains. We evaluated the relationships between these databases or datasets based on the overlap in their contents and the degree of consistency in the structural, topological, and functional classifications and in the transmembrane domain assignment.

Major conclusions

The membrane databases differ from each other in their coverage, and in the criteria that they use for annotation and classification. To ensure the efficient use of these databases, it is important to understand their differences and similarities. The establishment of more detailed and consistent annotations for the sequence, structure, membrane association, and function of membrane proteins is still required.

General significance

Considering the recent growth of experimentally determined structures, a broad survey and cumulative analysis of the sum of knowledge as presented in the membrane protein structure databases can be helpful to elucidate structures and functions of membrane proteins. We also aim to provide a framework for future research and classification of membrane proteins.

8.
We suggest a new approach to the generation of candidate structures (decoys) for ab initio prediction of protein structures. Our method is based on random sampling of conformation space and subsequent local energy minimization. At the core of this approach lies the design of a novel type of energy function. This energy function has local minima with native structure characteristics and wide basins of attraction. The current work presents our motivation for deriving such an energy function and also tests the derived energy function. Our approach is novel in that it takes advantage of the inherently rough energy landscape of proteins, which is generally considered a major obstacle for protein structure prediction. When local minima have wide basins of attraction, the protein's conformation space can be greatly reduced by the convergence of large regions of the space into single points, namely the local minima corresponding to these funnels. We have implemented this concept by an iterative process. The potential is first used to generate decoy sets and then we study these sets of decoys to guide further development of the potential. A key feature of our potential is the use of cooperative multi-body interactions that mimic the role of the entropic and solvent contributions to the free energy. The validity and value of our approach are demonstrated by applying it to 14 diverse, small proteins. We show that, for these proteins, the size of conformation space is considerably reduced by the new energy function. In fact, the reduction is so substantial as to allow efficient conformational sampling. As a result, we are able to find a significant number of near-native conformations in random searches performed with limited computational resources.
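The idea of random sampling followed by local minimization, with wide basins collapsing many starting points onto a few minima, can be sketched on a toy problem. Everything below is an assumption made for illustration: a one-dimensional double-well function stands in for the energy function, plain gradient descent stands in for the minimizer, and the step size, iteration count and seed are arbitrary. It is not the multi-body potential described in the abstract.

```python
import random


def energy(x):
    """Toy double-well energy with minima at x = -1 and x = +1 (assumption)."""
    return (x * x - 1.0) ** 2


def gradient(x):
    """Analytical derivative of the toy energy."""
    return 4.0 * x * (x * x - 1.0)


def minimize(x, step=0.01, iterations=2000):
    """Plain gradient descent from a starting 'conformation' x."""
    for _ in range(iterations):
        x -= step * gradient(x)
    return x


# Random sampling of the (one-dimensional) conformation space.
rng = random.Random(0)
starts = [rng.uniform(-2.0, 2.0) for _ in range(50)]

# Local minimization maps each wide basin onto a single point.
minima = {round(minimize(x), 3) for x in starts}
```

Fifty distinct starting points end up in only two minima, which is the reduction of conformation space the abstract describes, here in its simplest possible form.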

9.
Recent studies have begun to yield some insight into the structural and regulatory components of centromeres, and new assays have been developed that promise to be of use in advancing our understanding of centromere structure and function. In the budding yeast Saccharomyces cerevisiae new proteins that are required for centromere function have been identified and an in vitro microtubule-binding assay that should assist in dissecting the process of centromere microtubule attachment has been developed. The centromere-specific DNA sequences in the fission yeast Schizosaccharomyces pombe have been identified and partially characterized. In addition, several mammalian centromere proteins have been further characterized, and localization and inhibition studies suggest roles for these proteins in the regulation and assembly of a functional kinetochore.

10.
Prediction of protein function from protein sequence and structure   Total citations: 1, self-citations: 0, citations by others: 1
The sequence of a genome contains the plans of the possible life of an organism, but implementation of genetic information depends on the functions of the proteins and nucleic acids that it encodes. Many individual proteins of known sequence and structure present challenges to the understanding of their function. In particular, a number of genes responsible for diseases have been identified but their specific functions are unknown. Whole-genome sequencing projects are a major source of proteins of unknown function. Annotation of a genome involves assignment of functions to gene products, in most cases on the basis of amino-acid sequence alone. 3D structure can aid the assignment of function, motivating the challenge of structural genomics projects to make structural information available for novel uncharacterized proteins. Structure-based identification of homologues often succeeds where sequence-alone-based methods fail, because in many cases evolution retains the folding pattern long after sequence similarity becomes undetectable. Nevertheless, prediction of protein function from sequence and structure is a difficult problem, because homologous proteins often have different functions. Many methods of function prediction rely on identifying similarity in sequence and/or structure between a protein of unknown function and one or more well-understood proteins. Alternative methods include inferring conservation patterns in members of a functionally uncharacterized family for which many sequences and structures are known. However, these inferences are tenuous. Such methods provide reasonable guesses at function, but are far from foolproof. It is therefore fortunate that the development of whole-organism approaches and comparative genomics permits other approaches to function prediction when the data are available. 
These include the use of protein-protein interaction patterns, and correlations between occurrences of related proteins in different organisms, as indicators of functional properties. Even if it is possible to ascribe a particular function to a gene product, the protein may have multiple functions. A fundamental problem is that function is in many cases an ill-defined concept. In this article we review the state of the art in function prediction and describe some of the underlying difficulties and successes.

11.
A number of recent advances have been made in deriving function information from protein structure. A fold relationship to an already characterized protein will often allow general information about function to be deduced. More detailed information can be obtained using sequence relationships to already studied proteins. Methods of deducing function directly from structure, without the use of evolutionary relationships, are developing rapidly. All such methods may be used with models of protein structure, rather than with experimentally determined ones, but model accuracy imposes limitations. The rapid expansion of the structural genomics field has created a new urgency for improved methods of structure-based annotation of function.

12.
Axe DD, Dixon BW, Lu P. PLoS ONE 2008, 3(6): e2246
The study of protein evolution is complicated by the vast size of protein sequence space, the huge number of possible protein folds, and the extraordinary complexity of the causal relationships between protein sequence, structure, and function. Much simpler model constructs may therefore provide an attractive complement to experimental studies in this area. Lattice models, which have long been useful in studies of protein folding, have found increasing use here. However, while these models incorporate actual sequences and structures (albeit non-biological ones), they incorporate no actual functions--relying instead on largely arbitrary structural criteria as a proxy for function. In view of the central importance of function to evolution, and the impossibility of incorporating real functional constraints without real function, it is important that protein-like models be developed around real structure-function relationships. Here we describe such a model and introduce open-source software that implements it. The model is based on the structure-function relationship in written language, where structures are two-dimensional ink paths and functions are the meanings that result when these paths form legible characters. To capture something like the hierarchical complexity of protein structure, we use the traditional characters of Chinese origin. Twenty coplanar vectors, encoded by base triplets, act like amino acids in building the character forms. This vector-world model captures many aspects of real proteins, including life-size sequences, a life-size structural repertoire, a realistic genetic code, secondary, tertiary, and quaternary structure, structural domains and motifs, operon-like genetic structures, and layered functional complexity up to a level resembling bacterial genomes and proteomes. Stylus is a full-featured implementation of the vector world for Unix systems. 
To demonstrate the utility of Stylus, we generated a sample set of homologous vector proteins by evolving successive lines from a single starting gene. These homologues show sequence and structure divergence resembling those of natural homologues in many respects, suggesting that the system may be sufficiently life-like for informative comparison to biology.

13.
14.

Background

Protein structure comparison plays an important role in the in silico functional prediction of a new protein. It is also used for understanding the evolutionary relationships among proteins. A variety of methods have been proposed in the literature for comparing protein structures, but they have their own limitations in terms of accuracy and of complexity with respect to computational time and space. There is a need to improve the computational complexity of protein comparison/alignment by incorporating important biological and structural properties into the existing techniques.

Results

An efficient algorithm has been developed for comparing protein structures using elastic shape analysis, in which the sequence of 3D coordinates of the atoms of protein structures, supplemented by auxiliary information from side-chain properties, is incorporated. The protein structure is represented by a special function called the square-root velocity function. Furthermore, singular value decomposition and dynamic programming have been employed for optimal rotation and optimal matching of the proteins, respectively. Also, the geodesic distance has been calculated and used as the dissimilarity score between two protein structures. The performance of the developed algorithm was tested and it was found to be more efficient, i.e., running time was reduced by 80–90% without compromising accuracy of comparison when compared with the existing methods. Source code for the different functions has been developed in R. Also, a user-friendly web-based application called ProtSComp has been developed using the above algorithm for comparing protein 3D structures and is freely accessible.

Conclusions

The methodology and algorithm developed in this study take considerably less computational time without loss of accuracy (Table 2). The proposed algorithm considers different criteria for representing protein structures, using the 3D coordinates of atoms and including residue-wise molecular properties as auxiliary information.
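The square-root velocity function (SRVF) mentioned in the Results can be sketched for a discrete backbone trace. For a curve f(t) the SRVF is commonly defined as q(t) = f'(t) / sqrt(|f'(t)|). The code below approximates f' by finite differences between consecutive points. It is a minimal sketch under several assumptions: unit parameter spacing, no side-chain auxiliary information, and no SVD rotation or dynamic-programming matching, so the distance it returns is a plain L2 distance between SRVFs rather than the geodesic score of the paper. The authors' implementation is in R; Python and all function names here are illustrative choices.

```python
import math


def srvf(points):
    """Discrete square-root velocity function of a 3D polyline.

    Each segment vector v is scaled by 1 / sqrt(|v|), so |q|^2 equals |v|.
    """
    q = []
    for (x0, y0, z0), (x1, y1, z1) in zip(points, points[1:]):
        v = (x1 - x0, y1 - y0, z1 - z0)
        speed = math.sqrt(sum(c * c for c in v))
        scale = math.sqrt(speed) if speed > 0 else 1.0  # zero segment stays zero
        q.append(tuple(c / scale for c in v))
    return q


def srvf_distance(points_a, points_b):
    """L2 distance between two SRVFs (assumption: no rotation or re-matching)."""
    qa, qb = srvf(points_a), srvf(points_b)
    if len(qa) != len(qb):
        raise ValueError("curves must have the same number of points")
    return math.sqrt(
        sum((a - b) ** 2 for u, w in zip(qa, qb) for a, b in zip(u, w))
    )


# Hypothetical four-point traces standing in for backbone coordinates.
backbone = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (1, 1, 1)]
shifted = [(5, -2, 3), (6, -2, 3), (6, -1, 3), (6, -1, 4)]  # backbone translated
straight = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (3, 0, 0)]

d_shifted = srvf_distance(backbone, shifted)
d_straight = srvf_distance(backbone, straight)
```

Since the SRVF depends only on differences between consecutive points, a translated copy of a curve has distance zero to the original, while a differently shaped curve does not.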

15.
Understanding the relationship between the amino-acid sequence of a protein and its ability to fold and to function is one of the major challenges of protein science. Here, cases are reviewed in which mutagenesis, biochemistry, structure determination, protein engineering, and single-molecule biophysics have illuminated the sequence determinants of folding, binding specificity, and biological function for DNA-binding proteins and ATP-fueled machines that forcibly unfold native proteins as a prelude to degradation. In addition to structure-function relationships, these studies provide information about folding intermediates, mutations that accelerate folding, slow unfolding, and stabilize proteins against denaturation, show how new binding specificities and folds can evolve, and reveal strategies that proteolytic machines use to recognize, unfold, and degrade thousands of distinct substrates.

16.
The introduction of disulfide crosslinks is a generally useful method by which to identify regions of a protein that are close together in space. Here we describe the use of disulfide crosslinks to investigate the structure and flexibility of a family of designed 4-helix bundle proteins. The results of these analyses lend support to our working model of the proteins' structure and suggest that the proteins have limited main-chain flexibility.

17.
Vibrational Raman optical activity (ROA), measured as a small difference in the intensity of Raman scattering from chiral molecules in right and left-circularly polarized incident light, or as the intensity of a small circularly polarized component in the scattered light, is a powerful probe of the aqueous solution structure of proteins. On account of the large number of structure-sensitive bands in protein ROA spectra, multivariate analysis techniques such as non-linear mapping (NLM) are especially favourable for determining structural relationships between different proteins. Here NLM is used to map a dataset of 80 polypeptide, protein and virus ROA spectra, considered as points in a multidimensional space with axes representing the digitized wavenumbers, into readily visualizable two and three-dimensional spaces in which points close to or distant from each other, respectively, represent similar or dissimilar structures. Discrete clusters are observed which correspond to the seven structure classes all alpha, mainly alpha, alphabeta, mainly beta, all beta, mainly disordered/irregular and all disordered/irregular. The average standardised ROA spectra of the proteins falling within each structure class have distinct features characteristic of each class. A distinct cluster containing the wheat protein A-gliadin and the plant viruses potato virus X, narcissus mosaic virus, papaya mosaic virus and tobacco rattle virus, all of which appear in the mainly alpha cluster in the two-dimensional representation, becomes clearly separated in the direction of increasing disorder in the three-dimensional representation. This suggests that the corresponding five proteins, none of which to date has yielded high-resolution X-ray structures, consist mainly of alpha-helix and disordered structure with little or no beta-sheet. 
This combination of structural elements may have functional significance, such as facilitating disorder-to-order transitions (and vice versa) and suppressing aggregation, in these proteins and also in sequences within other proteins. The use of ROA to identify proteins containing significant amounts of disordered structure will, inter alia, be valuable in structural genomics/proteomics since disordered regions often inhibit crystallization.

18.
Cai XH, Jaroszewski L, Wooley J, Godzik A. Proteins 2011, 79(8): 2389-2402
The protein universe can be organized in families that group proteins sharing common ancestry. Such families display variable levels of structural and functional divergence, from homogeneous families, where all members have the same function and very similar structure, to very divergent families, where large variations in function and structure are observed. For practical purposes of structure and function prediction, it would be beneficial to identify sub-groups of proteins with highly similar structures (iso-structural) and/or functions (iso-functional) within divergent protein families. We compared three algorithms in their ability to cluster large protein families and discuss whether any of these methods could reliably identify such iso-structural or iso-functional groups. We show that clustering using profile-sequence and profile-profile comparison methods closely reproduces clusters based on similarities between 3D structures or clusters of proteins with similar biological functions. In contrast, the still commonly used sequence-based methods with fixed thresholds result in vast overestimates of structural and functional diversity in protein families. As a result, these methods also overestimate the number of protein structures that have to be determined to fully characterize the structural space of such families. The fact that one can build reliable models based on apparently distantly related templates is crucial for extracting the maximal amount of information from new sequencing projects.

19.
MOTIVATION: Evolutionary relationships of proteins have long been derived from the alignment of protein sequences. But from the view of function, most restraints of evolutionary divergence operate at the level of tertiary structure. It has been demonstrated that quantitative measures of dissimilarity in families of structurally similar proteins can be applied to the construction of trees from a comparison of their three-dimensional structures. However, no convenient tool is publicly available to carry out such analyses. RESULTS: We developed STRUCLA (STRUcture CLAssification), a WWW tool for generation of trees based on evolutionary distances inferred from protein structures according to various methods. The server takes as input a list of PDB files or the initial alignment of protein coordinates provided by the user (for instance, exported from SWISS PDB VIEWER). The user specifies the distance cutoff and selects the distance measures. The server returns a series of unrooted trees in the NEXUS format and corresponding distance matrices, as well as a consensus tree. The results can be used as an alternative and a complement to the fixed hierarchy of current protein structure databases. They can also complement sequence-based phylogenetic analysis in the 'twilight zone of homology', where amino acid sequences are too diverged to provide reliable relationships.
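Building a tree from a matrix of structural distances can be sketched with average-linkage (UPGMA) clustering. This is an illustration only: the abstract does not say which tree-building methods STRUCLA uses, UPGMA produces a rooted tree rather than the unrooted trees the server returns, and the output here is a bare Newick-style string rather than a NEXUS file. The protein labels and distance values are invented for the example.

```python
def upgma(names, dist):
    """Average-linkage clustering of a symmetric distance matrix.

    Returns a Newick-style topology string without branch lengths.
    Assumption: stand-in technique, not the STRUCLA algorithm.
    """
    # Cluster id -> (label, number of leaves in the cluster).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    d = {
        (i, j): dist[i][j]
        for i in range(len(names))
        for j in range(i + 1, len(names))
    }
    next_id = len(names)
    while len(clusters) > 1:
        # Pick the closest pair of clusters.
        (a, b), _ = min(d.items(), key=lambda kv: kv[1])
        (label_a, size_a), (label_b, size_b) = clusters.pop(a), clusters.pop(b)
        # Size-weighted average distance from the merged cluster to the rest.
        new_d = {}
        for c in clusters:
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            new_d[(c, next_id)] = (size_a * dac + size_b * dbc) / (size_a + size_b)
        # Drop every distance that involved the two merged clusters.
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        clusters[next_id] = ("(%s,%s)" % (label_a, label_b), size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0] + ";"


# Hypothetical structural distances between four proteins A-D.
names = ["A", "B", "C", "D"]
dist = [
    [0, 1, 4, 5],
    [1, 0, 4, 5],
    [4, 4, 0, 2],
    [5, 5, 2, 0],
]
tree = upgma(names, dist)
```

With these values A pairs with B and C pairs with D before the two pairs are joined, so the topology groups the structurally closest proteins first.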

20.
Due to advances in molecular biology the DNA sequences of structural genes coding for proteins are often known before a protein is characterized or even isolated. The function of a protein whose amino acid sequence has been deduced from a DNA sequence may not even be known. This has created greater interest in the development of methods to predict the tertiary structures of proteins. The a priori prediction of a protein's structure from its amino acid sequence is not yet possible. However, since proteins with similar amino acid sequences are observed to have similar three-dimensional structures, it is possible to use an analogy with a protein of known structure to draw some conclusions about the structure and properties of an uncharacterized protein. The process of predicting the tertiary structure of a protein relies very much upon computer modeling and analysis of the structure. The prediction of the structure of the bacteriophage 434 cro repressor is used as an example illustrating current procedures.
