首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
We propose a classification of DNA structures formed from 1 to 4 strands, based only on relative strand directions, base to strand orientation and base pairing geometries. This classification and its associated notation enable all nucleic acids to be grouped into structural families and bring to light possible structures which have not yet been observed experimentally. It also helps in understanding transitions between families and can assist in the design of multistrand structures.  相似文献   

2.
A family of global geometric measures is constructed for protein structure classification. These measures originate from integral formulas of Vassiliev knot invariants and give rise to a unique classification scheme. Our measures can better discriminate between many known protein structures than the simple measures of the secondary structure content of these protein structures.  相似文献   

3.

Background  

Formal classification of a large collection of protein structures aids the understanding of evolutionary relationships among them. Classifications involving manual steps, such as SCOP and CATH, face the challenge of increasing volume of available structures. Automatic methods such as FSSP or Dali Domain Dictionary, yield divergent classifications, for reasons not yet fully investigated. One possible reason is that the pairwise similarity scores used in automatic classification do not adequately reflect the judgments made in manual classification. Another possibility is the difference between manual and automatic classification procedures. We explore the degree to which these two factors might affect the final classification.  相似文献   

4.
We present new structural classification and parameter estimation results that are applicable to multi-input nonlinear systems. The mathematical relationships between the self- and cross-(Volterra and Wiener) kernels are derived for a basic two-input nonlinear structure. These results are then used to develop classification methods for more complicated two-input structures. Algorithms for estimating the parameters (linear and nonlinear subsystems) of these structures are also presented.  相似文献   

5.
A systematic classification of beta-hairpin structures which takes into account the polypeptide chain length and hydrogen bonding between the two antiparallel beta-strands is described. We have used this classification of beta-hairpin structures and their specific sequence pattern to derive rules which demonstrate its usefulness in assisting modelling beta-hairpins. These rules can be applied to comparative model building, modelling into electron density and in the prediction of conformation of beta-hairpins to aid protein engineering.  相似文献   

6.
We report a classification of the crystallographic structures of bovine and squid rhodopsins corresponding to different stages of their photocycles. Using the resource Protein (Structure) Comparison, Knowledge, Similarity, and Information server (ProCKSI, ), selected spatial structures were compared on the basis of classification schemes (dendrograms). To compare the spatial structures of transmembrane proteins, optimal consensus was developed from methods implemented in ProCKSI. Structures were also clustered using principal component analysis, resulting in good agreement with the classification based on the ProCKSI consensus method. Analysis of the results revealed the basic movements of individual transmembrane domains of these proteins that we were able to relate to different stages of the photoactivation of rhodopsin. A combination of methods identified in this study can be used as an up-to-date analytical tool to study the conformational dynamics of membrane receptors.  相似文献   

7.
Many raw biological sequence data have been generated by the human genome project and related efforts. The understanding of structural information encoded by biological sequences is important to acquire knowledge of their biochemical functions but remains a fundamental challenge. Recent interest in RNA regulation has resulted in a rapid growth of deposited RNA secondary structures in varied databases. However, a functional classification and characterization of the RNA structure have only been partially addressed. This article aims to introduce a novel interval-based distance metric for structure-based RNA function assignment. The characterization of RNA structures relies on distance vectors learned from a collection of predicted structures. The distance measure considers the intersected, disjoint, and inclusion between intervals. A set of RNA pseudoknotted structures with known function are applied and the function of the query structure is determined by measuring structure similarity. This not only offers sequence distance criteria to measure the similarity of secondary structures but also aids the functional classification of RNA structures with pesudoknots.  相似文献   

8.
A classification of possible routes of Darwinian evolution   总被引:1,自引:0,他引:1  
A classification of four possible routes of Darwinian evolution is presented. These are serial direct evolution, parallel direct evolution, elimination of functional redundancy, and adoption from a different function. This classification provides a conceptual framework within which to investigate the accessibility by Darwinian evolution of complex biological structures.  相似文献   

9.
Classification is central to many studies of protein structure, function, and evolution. This article presents a strategy for classifying protein three-dimensional structures. Methods for and issues related to secondary structure, domain, and class assignment are discussed, in addition to methods for the comparison of protein three-dimensional structures. Strategies for assigning protein domains to particular folds and homologous superfamilies are then described in the context of the currently available classification schemes. Two examples (adenylate cyclase/DNA polymerase and glycogen phosphorylase/β-glucosyltransferase) are presented to illustrate problems associated with protein classification.  相似文献   

10.
In order to support the structural genomic initiatives, both by rapidly classifying newly determined structures and by suggesting suitable targets for structure determination, we have recently developed several new protocols for classifying structures in the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath). These aim to increase the speed of classification of new structures using fast algorithms for structure comparison (GRATH) and to improve the sensitivity in recognising distant structural relatives by incorporating sequence information from relatives in the genomes (DomainFinder). In order to ensure the integrity of the database given the expected increase in data, the CATH Protein Family Database (CATH-PFDB), which currently includes 25,320 structural domains and a further 160,000 sequence relatives has now been installed in a relational ORACLE database. This was essential for developing more rigorous validation procedures and for allowing efficient querying of the database, particularly for genome analysis. The associated Dictionary of Homologous Superfamilies [Bray,J.E., Todd,A.E., Pearl,F.M.G., Thornton,J.M. and Orengo,C.A. (2000) Protein Eng., 13, 153-165], which provides multiple structural alignments and functional information to assist in assigning new relatives, has also been expanded recently and now includes information for 903 homologous superfamilies. In order to improve coverage of known structures, preliminary classification levels are now provided for new structures at interim stages in the classification protocol. Since a large proportion of new structures can be rapidly classified using profile-based sequence analysis [e.g. PSI-BLAST: Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389-3402], this provides preliminary classification for easily recognisable homologues, which in the latest release of CATH (version 1.7) represented nearly three-quarters of the non-identical structures.  相似文献   

11.
Protein structural annotation and classification is an important and challenging problem in bioinformatics. Research towards analysis of sequence-structure correspondences is critical for better understanding of a protein's structure, function, and its interaction with other molecules. Clustering of protein domains based on their structural similarities provides valuable information for protein classification schemes. In this article, we attempt to determine whether structure information alone is sufficient to adequately classify protein structures. We present an algorithm that identifies regions of structural similarity within a given set of protein structures, and uses those regions for clustering. In our approach, called STRALCP (STRucture ALignment-based Clustering of Proteins), we generate detailed information about global and local similarities between pairs of protein structures, identify fragments (spans) that are structurally conserved among proteins, and use these spans to group the structures accordingly. We also provide a web server at http://as2ts.llnl.gov/AS2TS/STRALCP/ for selecting protein structures, calculating structurally conserved regions and performing automated clustering.  相似文献   

12.
Single molecule tracking of membrane proteins by fluorescence microscopy is a promising method to investigate dynamic processes in live cells. Translating the trajectories of proteins to biological implications, such as protein interactions, requires the classification of protein motion within the trajectories. Spatial information of protein motion may reveal where the protein interacts with cellular structures, because binding of proteins to such structures often alters their diffusion speed. For dynamic diffusion systems, we provide an analytical framework to determine in which diffusion state a molecule is residing during the course of its trajectory. We compare different methods for the quantification of motion to utilize this framework for the classification of two diffusion states (two populations with different diffusion speed). We found that a gyration quantification method and a Bayesian statistics-based method are the most accurate in diffusion-state classification for realistic experimentally obtained datasets, of which the gyration method is much less computationally demanding. After classification of the diffusion, the lifetime of the states can be determined, and images of the diffusion states can be reconstructed at high resolution. Simulations validate these applications. We apply the classification and its applications to experimental data to demonstrate the potential of this approach to obtain further insights into the dynamics of cell membrane proteins.  相似文献   

13.
Single molecule tracking of membrane proteins by fluorescence microscopy is a promising method to investigate dynamic processes in live cells. Translating the trajectories of proteins to biological implications, such as protein interactions, requires the classification of protein motion within the trajectories. Spatial information of protein motion may reveal where the protein interacts with cellular structures, because binding of proteins to such structures often alters their diffusion speed. For dynamic diffusion systems, we provide an analytical framework to determine in which diffusion state a molecule is residing during the course of its trajectory. We compare different methods for the quantification of motion to utilize this framework for the classification of two diffusion states (two populations with different diffusion speed). We found that a gyration quantification method and a Bayesian statistics-based method are the most accurate in diffusion-state classification for realistic experimentally obtained datasets, of which the gyration method is much less computationally demanding. After classification of the diffusion, the lifetime of the states can be determined, and images of the diffusion states can be reconstructed at high resolution. Simulations validate these applications. We apply the classification and its applications to experimental data to demonstrate the potential of this approach to obtain further insights into the dynamics of cell membrane proteins.  相似文献   

14.
We discuss the taxonomy of Enterobacteriaceae in the light of classification by minimization of stochastic complexity (SC). A classification which minimizes SC is optimal from the point of view of information theory. It was found that the SC-minimizing classification of a large database of strains of Enterobacteriaceae resulted in structures which correspond well to the conclusions of experts on the taxonomy of Enterobacteriaceae. The approach based on minimization of SC can therefore be considered as useful in bacterial taxonomy.  相似文献   

15.
16.

Background  

Domain experts manually construct the Structural Classification of Protein (SCOP) database to categorize and compare protein structures. Even though using the SCOP database is believed to be more reliable than classification results from other methods, it is labor intensive. To mimic human classification processes, we develop an automatic SCOP fold classification system to assign possible known SCOP folds and recognize novel folds for newly-discovered proteins.  相似文献   

17.
Comparison of protein structures is important for revealing the evolutionary relationship among proteins, predicting protein functions and predicting protein structures. Many methods have been developed in the past to align two or multiple protein structures. Despite the importance of this problem, rigorous mathematical or statistical frameworks have seldom been pursued for general protein structure comparison. One notable issue in this field is that with many different distances used to measure the similarity between protein structures, none of them are proper distances when protein structures of different sequences are compared. Statistical approaches based on those non-proper distances or similarity scores as random variables are thus not mathematically rigorous. In this work, we develop a mathematical framework for protein structure comparison by treating protein structures as three-dimensional curves. Using an elastic Riemannian metric on spaces of curves, geodesic distance, a proper distance on spaces of curves, can be computed for any two protein structures. In this framework, protein structures can be treated as random variables on the shape manifold, and means and covariance can be computed for populations of protein structures. Furthermore, these moments can be used to build Gaussian-type probability distributions of protein structures for use in hypothesis testing. The covariance of a population of protein structures can reveal the population-specific variations and be helpful in improving structure classification. With curves representing protein structures, the matching is performed using elastic shape analysis of curves, which can effectively model conformational changes and insertions/deletions. We show that our method performs comparably with commonly used methods in protein structure classification on a large manually annotated data set.  相似文献   

18.
To study local structures in proteins, we previously developed an autoassociative artificial neural network (autoANN) and clustering tool to discover intrinsic features of macromolecular structures. The hidden unit activations computed by the trained autoANN are a convenient low-dimensional encoding of the local protein backbone structure. Clustering these activation vectors results in a unique classification of protein local structural features called Structural Building Blocks (SBBs). Here we describe application of this method to a larger database of proteins, verification of the applicability of this method to structure classification, and subsequent analysis of amino acid frequencies and several commonly occurring patterns of SBBs. The SBB classification method has several interesting properties: 1) it identifies the regular secondary structures, α helix and β strand; 2) it consistently identifies other local structure features (e.g., helix caps and strand caps); 3) strong amino acid preferences are revealed at some positions in some SBBs; and 4) distinct patterns of SBBs occur in the “random coil” regions of proteins. Analysis of these patterns identifies interesting structural motifs in the protein backbone structure, indicating that SBBs can be used as “building blocks” in the analysis of protein structure. This type of pattern analysis should increase our understanding of the relationship between protein sequence and local structure, especially in the prediction of protein structures. © 1997 Wiley-Liss, Inc.  相似文献   

19.
Abstract.  Ensign wasps (Hymenoptera: Evaniidae) are colourful, frequently collected and easily distinguished from other parasitic Hymenoptera. Despite many fascinating biological attributes, this group of insects has been overlooked by ecologists and systematists. An imposing obstacle inhibiting research on these wasps is the current state of their chaotic and potentially flawed classification, which has more than 50% of all described species assigned to the genus Evania – a taxon long suspected of being polyphyletic. The generic classification has recently been redefined on the basis of morphological characters. We tested this reinterpreted classification by analysing sequence data from three genes [28S ribosomal RNA (rRNA), 16S rRNA and cytochrome oxidase I (COI)] under parsimony and Bayesian criteria. For the 28S and 16S rRNAs, we illustrate the predicted secondary structures and provide a series of summary statistics for them; information pertaining to these structures was incorporated into our phylogenetic analyses where appropriate. Phylogenetically, our results indicate that this new generic classification is relatively sound, but that more data are required to understand intergeneric relationships.  相似文献   

20.
We present a novel topological classification of RNA secondary structures with pseudoknots. It is based on the topological genus of the circular diagram associated to the RNA base-pair structure. The genus is a positive integer number whose value quantifies the topological complexity of the folded RNA structure. In such a representation, planar diagrams correspond to pure RNA secondary structures and have zero genus, whereas non-planar diagrams correspond to pseudoknotted structures and have higher genus. The topological genus allows for the definition of topological folding motifs, similar in spirit to those introduced and commonly used in protein folding. We analyze real RNA structures from the databases Worldwide Protein Data Bank and Pseudobase and classify them according to their topological genus. For simplicity, we limit our analysis by considering only Watson-Crick complementary base pairs and G-U wobble base pairs. We compare the results of our statistical survey with existing theoretical and numerical models. We also discuss possible applications of this classification and show how it can be used for identifying new RNA structural motifs.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号