Similar articles
 Found 20 similar articles; search took 15 ms
1.
BACKGROUND: Several methods of structural classification have been developed to introduce some order to the large amount of data present in the Protein Data Bank. Such methods facilitate structural comparisons and provide a greater understanding of structure and function. The most widely used and comprehensive databases are SCOP, CATH and FSSP, which represent three unique methods of classifying protein structures: purely manual, a combination of manual and automated, and purely automated, respectively. In order to develop reliable template libraries and benchmarks for protein-fold recognition, a systematic comparison of these databases has been carried out to determine their overall agreement in classifying protein structures. RESULTS: Approximately two-thirds of the protein chains in each database are common to all three databases. Despite employing different methods, and basing their systems on different rules of protein structure and taxonomy, SCOP, CATH and FSSP agree on the majority of their classifications. Discrepancies and inconsistencies are accounted for by a small number of explanations. Other interesting features have been identified, and various differences between manual and automatic classification methods are presented. CONCLUSIONS: Using these databases requires an understanding of the rules upon which they are based; each method offers certain advantages depending on the biological requirements and knowledge of the user. The degree of discrepancy between the systems also has an impact on reliability of prediction methods that employ these schemes as benchmarks. To generate accurate fold templates for threading, we extract information from a consensus database, encompassing agreements between SCOP, CATH and FSSP.

2.
Getz G  Vendruscolo M  Sachs D  Domany E 《Proteins》2002,46(4):405-415
We present an automated procedure to assign CATH and SCOP classifications to proteins whose FSSP score is available. CATH classification is assigned down to the topology level, and SCOP classification is assigned to the fold level. Because the FSSP database is updated weekly, this method also makes it possible to update CATH and SCOP with the same frequency. Our predictions have a nearly perfect success rate when ambiguous cases are discarded. These ambiguous cases are intrinsic in any protein structure classification that relies on structural information alone. Hence, we introduce the "twilight zone for structure classification." We further suggest that to resolve these ambiguous cases, other criteria of classification, based also on information about sequence and function, must be used.
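The idea of assigning a class from structural similarity scores while discarding ambiguous cases can be sketched as follows. This is only an illustrative nearest-neighbour rule, not the procedure of the paper: the function name `assign_fold` and the thresholds `z_min` and `margin` are assumptions made for the example.

```python
def assign_fold(z_scores, known_folds, z_min=2.0, margin=1.0):
    """Return a fold label for a query, or None when the case is ambiguous.

    z_scores:    {neighbour_id: similarity Z-score against the query}
    known_folds: {neighbour_id: fold label of an already classified protein}
    z_min and margin are hypothetical thresholds, not values from the paper.
    """
    best = {}  # best Z-score seen for each candidate fold
    for neighbour, z in z_scores.items():
        fold = known_folds.get(neighbour)
        if fold is not None and z >= z_min:
            best[fold] = max(best.get(fold, z), z)
    if not best:
        return None  # no sufficiently similar classified neighbour
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < margin:
        return None  # two folds score too closely: treat as ambiguous
    return ranked[0][0]
```

Returning `None` for close calls mirrors the observation that the success rate is high only once ambiguous cases are set aside.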

3.
The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of different domain classification schemes on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent in a nonredundant set. Therefore, combining both classifications gives an increase of between 23.6 and 37.3% in domain-domain interfaces. It is suggested that if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces, by an estimated 6.5%, the false negative rate of predictive methods that employ homology matching to structural data to predict protein-protein interactions.
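Percentages of this kind can be computed with plain set operations once interfaces from both schemes are mapped to a shared identifier. The sketch below assumes such a mapping already exists; the function name `coverage_report` and the integer identifiers are hypothetical, and the toy sets are not data from the study.

```python
def coverage_report(scop_interfaces, cath_interfaces):
    """Summarise the overlap between two sets of interface identifiers."""
    only_scop = scop_interfaces - cath_interfaces
    only_cath = cath_interfaces - scop_interfaces
    return {
        # fraction of SCOP interfaces with no CATH equivalent
        "scop_without_cath": len(only_scop) / len(scop_interfaces),
        # fraction of CATH interfaces with no SCOP equivalent
        "cath_without_scop": len(only_cath) / len(cath_interfaces),
        # number of distinct interfaces when both schemes are combined
        "combined": len(scop_interfaces | cath_interfaces),
    }

# Toy identifiers, invented for illustration only.
report = coverage_report(set(range(8)), set(range(3, 10)))
print(report)
```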

4.
The F2CS server provides access to the software, F2CS2.00, which implements an automated prediction method of SCOP and CATH classifications of proteins, based on their FSSP Z-scores. AVAILABILITY: Free at http://www.weizmann.ac.il/physics/complex/compphys/f2cs/ SUPPLEMENTARY INFORMATION: The site contains links to additional figures and tables.

5.
Tobi D 《Proteins》2012,80(4):1167-1176
A novel methodology for comparison of protein dynamics is presented. Protein dynamics is calculated using the Gaussian network model and the modes of motion are globally aligned using the dynamic programming algorithm of Needleman and Wunsch, commonly used for sequence alignment. The alignment is fast and can be used to analyze large sets of proteins. The methodology is applied to the four major classes of the SCOP database: "all alpha proteins," "all beta proteins," "alpha and beta proteins," and "alpha/beta proteins". We show that different domains may have similar global dynamics. In addition, we report that the dynamics of "all alpha proteins" domains are less specific to structural variations within a given fold or superfamily compared with the other classes. We report that domain pairs with the most similar and the least similar global dynamics tend to be of similar length. The significance of the methodology is that it suggests a new and efficient way of mapping between the global structural features of protein families/subfamilies and their encoded dynamics.
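The Needleman and Wunsch dynamic programming step mentioned above can be sketched generically. The abstract does not say how pairs of mode entries are scored, so the `score` callback and the `gap` penalty below are assumptions; the Gaussian network model calculation itself is not shown.

```python
def needleman_wunsch(a, b, score, gap=-1.0):
    """Global alignment of sequences a and b.

    Returns (total score, list of (i, j) index pairs); None marks a gap.
    """
    n, m = len(a), len(b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]),  # align a[i-1] with b[j-1]
                dp[i - 1][j] + gap,                            # gap in b
                dp[i][j - 1] + gap,                            # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + score(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return dp[n][m], pairs
```

For mode comparison, `a` and `b` could hold per-residue values of a slow mode and `score` could reward similar values; that choice is left open here because the source does not specify it.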

6.
We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web.
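A pairwise comparison of two classifications can be sketched as below: for every pair of shared domains, check whether both systems make the same "same fold / different fold" call. This is an illustration under simplifying assumptions (identical domain definitions in both systems, hypothetical function name `pairwise_agreement`, invented labels), not the consensus method of the paper.

```python
from itertools import combinations

def pairwise_agreement(labels_a, labels_b):
    """Fraction of domain pairs on which two classifications agree.

    labels_a, labels_b: {domain_id: fold label} for each system.
    A pair counts as agreement when both systems place the two domains
    in the same fold, or both place them in different folds.
    """
    shared = sorted(set(labels_a) & set(labels_b))
    pairs = list(combinations(shared, 2))
    if not pairs:
        raise ValueError("need at least two shared domains")
    agree = sum(
        (labels_a[x] == labels_a[y]) == (labels_b[x] == labels_b[y])
        for x, y in pairs
    )
    return agree / len(pairs)

# Toy labels, invented for illustration only.
system_1 = {"d1": "a.1", "d2": "a.1", "d3": "b.2", "d4": "c.3"}
system_2 = {"d1": "1.10", "d2": "1.10", "d3": "1.10", "d4": "3.40"}
print(pairwise_agreement(system_1, system_2))
```

In the toy data the second system uses one broad fold where the first uses two narrower ones, which is the kind of breadth difference the abstract describes.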

7.
MOTIVATION: The PFDB (Protein Family Database) is a new database designed to integrate protein family-related data with relevant functional and genomic data. It currently manages biological data for three projects: the CATH protein domain database (Orengo et al., 1997; Pearl et al., 2001), the VIDA virus domains database (Albà et al., 2001) and the Gene3D database (Buchan et al., 2001). The PFDB has been designed to accommodate protein families identified by a variety of sequence based or structure based protocols and provides a generic resource for biological research by enabling mapping between different protein families and diverse biochemical and genetic data, including complete genomes. RESULTS: A characteristic feature of the PFDB is that it has a number of meta-level entities (for example aggregation, collection and inclusion) represented as base tables in the final design. The explicit representation of relationships at the meta-level has a number of advantages, including flexibility, both in terms of the range of queries that can be formulated and the ability to integrate new biological entities within the existing design. A potential drawback with this approach, poor performance caused by the number of joins across meta-level tables, is avoided by implementing the PFDB with materialized views using the mature relational database technology of Oracle 8i. The resultant database is both fast and flexible. This paper presents the principles on which the database has been designed and implemented, and describes the current status of the database and query facilities supported.

8.
Systematic comparison of surface coatings for protein microarrays   Cited by: 4 (self-citations: 0; citations by others: 4)
Processing large numbers of samples in parallel is one potential strength of protein microarrays in research and diagnostics. However, the application of protein arrays is currently hampered by the lack of comprehensive technological knowledge about the suitability of 2-D and 3-D slide surface coatings. We have performed a systematic study to analyze how both surface types perform in combination with different fluorescent dyes to generate significant and reproducible data. In total, we analyzed more than 100 slides containing 1152 spots each. Slides were probed against different monoclonal antibodies (mAbs) and recombinant fusion proteins. We found two surface coatings to be most suitable for protein and antibody (Ab) immobilization. These were further subjected to quantitative analyses by evaluating intraslide and slide-to-slide reproducibilities, and the linear range of target detection. In summary, we demonstrate that only suitable combinations of surface and fluorescent dyes allow the generation of highly reproducible data.

9.
MOTIVATION: Evolutionary relationships of proteins have long been derived from the alignment of protein sequences. But from the view of function, most restraints of evolutionary divergence operate at the level of tertiary structure. It has been demonstrated that quantitative measures of dissimilarity in families of structurally similar proteins can be applied to the construction of trees from a comparison of their three-dimensional structures. However, no convenient tool is publicly available to carry out such analyses. RESULTS: We developed STRUCLA (STRUcture CLAssification), a WWW tool for generation of trees based on evolutionary distances inferred from protein structures according to various methods. The server takes as input a list of PDB files or the initial alignment of protein coordinates provided by the user (for instance exported from SWISS PDB VIEWER). The user specifies the distance cutoff and selects the distance measures. The server returns a series of unrooted trees in the NEXUS format and corresponding distance matrices, as well as a consensus tree. The results can be used as an alternative and a complement to a fixed hierarchy of current protein structure databases. It can complement sequence-based phylogenetic analysis in the 'twilight zone of homology', where amino acid sequences are too diverged to provide reliable relationships.

10.
From 1189 colposcopy referrals in 1997 at a single cervical screening centre, 88 women who had no biopsy taken at colposcopy (negative colposcopy) were identified. We followed up these women for a maximum of 4 years and calculated the positive predictive value (PPV) of a single smear before and after follow-up. Using slide review we attempted to correlate the grade of smear leading to colposcopy referral with final outcome. Our results showed that long-term follow-up alters the PPV of cervical cytology. Analysis showed a strong correlation between the review grade of the referring smear and the final outcome after follow-up. From these results we suggest an evidence-based protocol for cervical screening follow-up after negative colposcopy.
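For readers unfamiliar with the measure, the positive predictive value is the share of positive test results that are confirmed as true disease, PPV = TP / (TP + FP). The short sketch below shows how reclassifying cases after follow-up shifts the PPV; the counts are invented for illustration and are not data from the study.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): the share of positive tests confirmed as disease."""
    positives = true_positives + false_positives
    if positives == 0:
        raise ValueError("PPV is undefined without positive results")
    return true_positives / positives

# Invented counts, for illustration only (not data from the study).
ppv_initial = positive_predictive_value(70, 30)   # outcome judged at first colposcopy
ppv_followed = positive_predictive_value(80, 20)  # 10 cases reclassified after follow-up
print(ppv_initial, ppv_followed)
```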

11.
The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath_new) currently contains 34 287 domain structures classified into 1383 superfamilies and 3285 sequence families. Each structural family is expanded with domain sequence relatives recruited from GenBank using a variety of efficient sequence search protocols and reliable thresholds. This extended resource, known as the CATH-protein family database (CATH-PFDB), contains a total of 310 000 domain sequences classified into 26 812 sequence families. New sequence search protocols have been designed, based on these intermediate sequence libraries, to allow more regular updating of the classification. Further developments include the adaptation of a recently developed method for rapid structure comparison, based on secondary structure matching, for domain boundary assignment. The philosophy behind CATHEDRAL is the recognition of recurrent folds already classified in CATH. Benchmarking of CATHEDRAL, using manually validated domain assignments, demonstrated that 43% of domain boundaries could be completely automatically assigned. This is an improvement on a previous consensus approach for which only 10-20% of domains could be reliably processed in a completely automated fashion. Since domain boundary assignment is a significant bottleneck in the classification of new structures, CATHEDRAL will also help to increase the frequency of CATH updates.

12.

Background  

Recently a new class of methods for fast protein structure comparison has emerged. We call the methods in this class projection methods as they rely on a mapping of protein structure into a high-dimensional vector space. Once the mapping is done, the structure comparison is reduced to distance computation between corresponding vectors. As structural similarity is approximated by distance between projections, the success of any projection method depends on how well its mapping function is able to capture the salient features of protein structure. There is no agreement on what constitutes a good projection technique and the three currently known projection methods utilize very different approaches to the mapping construction, both in terms of what structural elements are included and how this information is integrated to produce a vector representation.
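The two-stage idea (map each structure to a vector once, then compare vectors) can be illustrated with a deliberately simple mapping. The histogram of pairwise distances used here is an assumption chosen for brevity, not one of the three methods the abstract refers to; `project`, `projection_distance`, `bin_width` and `n_bins` are hypothetical names and parameters.

```python
import math

def project(coords, bin_width=4.0, n_bins=8):
    """Map a list of (x, y, z) points to a normalised histogram of pairwise distances."""
    hist = [0.0] * n_bins
    count = 0
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            d = math.dist(coords[i], coords[j])
            # distances beyond the last bin edge are collected in the final bin
            hist[min(int(d // bin_width), n_bins - 1)] += 1.0
            count += 1
    return [h / count for h in hist] if count else hist

def projection_distance(u, v):
    """Euclidean distance between two projection vectors."""
    return math.dist(u, v)

# Toy coordinates, invented for illustration only.
extended = [(0, 0, 0), (3, 0, 0), (6, 0, 0), (9, 0, 0)]
compact = [(0, 0, 0), (3, 0, 0), (3, 3, 0), (0, 3, 0)]
print(projection_distance(project(extended), project(compact)))
```

Because the projection is computed once per structure, comparing a query against a large library costs only one vector distance per entry, which is the speed advantage the abstract points to. How well such a distance tracks structural similarity depends entirely on the mapping chosen.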

13.

Background

HutZ is the sole heme storage protein identified in the pathogenic bacterium Vibrio cholerae and is required for optimal heme utilization. However, no heme oxygenase activity has been observed with this protein. Thus far, HutZ's structure and heme-binding mechanism are unknown.

Results

We report the first crystal structure of HutZ in a homodimer determined at 2.0 Å resolution. The HutZ structure adopted a typical split-barrel fold. Through a docking study and site-directed mutagenesis, a heme-binding model for the HutZ dimer is proposed. Very interestingly, structural superimposition of HutZ and its homologous protein HugZ, a heme oxygenase from Helicobacter pylori, exhibited a structural mismatch of one amino acid residue in β6 of HutZ, although residues involved in this region are highly conserved in both proteins. Derived homologous models of different single point variants with model evaluations suggested that Pro140 of HutZ, corresponding to Phe215 of HugZ, might have been the main contributor to the structural mismatch. This mismatch initiates more divergent structural characteristics towards their C-terminal regions, which are essential features for the heme-binding of HugZ as a heme oxygenase.

Conclusions

HutZ's deficiency in heme oxygenase activity might derive from its residue shift relative to the heme oxygenase HugZ. This residue shift also emphasized a limitation of the traditional template selection criterion for homology modeling.

14.
Added-value is the additional information that a model carries with respect to the template structure used for model building. Thousands of single-template models, corresponding to proteins of known structure, were analyzed. The accuracy of structure-derived properties, such as residue accessibility, surface area, electrostatic potential, and others, was determined as a function of template:target sequence identity by comparing the models with their corresponding experimental structures. Added-value was determined by comparing the accuracy in models with that from templates. Geometry-dependent properties such as neighborhood of buried residues and accessible surface area showed low added-value. Properties that also depend on the protein sequence, such as presence of polar areas and electrostatic potential, showed high added-value. In general added-value increases when template:target sequence identity decreases, but it is also affected by alignment errors. This study justifies the use of models instead of the use of templates to estimate structure-derived properties of a target protein.

15.

Background

Protein structure comparison plays an important role in the in silico functional prediction of a new protein. It is also used for understanding the evolutionary relationships among proteins. A variety of methods have been proposed in the literature for comparing protein structures, but they have their own limitations in terms of accuracy and of complexity with respect to computational time and space. There is a need to improve the computational complexity of protein comparison/alignment by incorporating important biological and structural properties into the existing techniques.

Results

An efficient algorithm has been developed for comparing protein structures using elastic shape analysis, in which the sequence of 3D atomic coordinates of a protein structure is supplemented by auxiliary information from side-chain properties. The protein structure is represented by a special function called the square-root velocity function. Furthermore, singular value decomposition and dynamic programming have been employed for optimal rotation and optimal matching of the proteins, respectively. Also, the geodesic distance has been calculated and used as the dissimilarity score between two protein structures. The developed algorithm was tested and found to be more efficient than existing methods: running time was reduced by 80–90 % without compromising the accuracy of comparison. Source code for the different functions has been developed in R. In addition, a user-friendly web-based application called ProtSComp has been developed using the above algorithm for comparing protein 3D structures and is freely accessible.

Conclusions

The methodology and algorithm developed in this study take considerably less computational time without loss of accuracy (Table 2). The proposed algorithm considers different criteria for representing protein structures, using the 3D coordinates of atoms and including residue-wise molecular properties as auxiliary information.
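The square-root velocity function used in elastic shape analysis is q(t) = β'(t) / sqrt(|β'(t)|) for a curve β. A minimal discrete sketch follows, written in Python rather than the R of the original work. It covers only the representation step plus a plain L2 distance between two representations; the optimal rotation (SVD), the optimal matching (dynamic programming), the side-chain auxiliary information and the geodesic distance described in the abstract are not implemented. The function names are hypothetical.

```python
import math

def srvf(curve):
    """Discrete square-root velocity function of a polyline of (x, y, z) points."""
    q = []
    for p0, p1 in zip(curve, curve[1:]):
        velocity = [b - a for a, b in zip(p0, p1)]  # forward difference
        speed = math.sqrt(sum(c * c for c in velocity))
        # q = v / sqrt(|v|); a zero-length step contributes a zero vector
        q.append([c / math.sqrt(speed) if speed > 0 else 0.0 for c in velocity])
    return q

def l2_distance(q1, q2):
    """Plain L2 distance between two SRVFs with the same number of samples."""
    if len(q1) != len(q2):
        raise ValueError("resample the curves to equal length first")
    return math.sqrt(
        sum((a - b) ** 2 for u, v in zip(q1, q2) for a, b in zip(u, v))
    )
```

Because the representation is built from differences between consecutive points, it is unchanged by translating the structure; rotation and reparametrisation still have to be handled separately.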

16.
SUMMARY: DaliLite is a program for pairwise structure comparison and for structure database searching. It is a standalone version of the search engine of the popular Dali server. A web interface is provided to view the results, multiple alignments and 3D superimpositions of structures.

17.
With the advent of technologies for the derivation of embryonic stem cells and reprogrammed stem cells, use of the term “pluripotent” has become widespread. Despite its increased scientific and political importance, there are ambiguities with this designation and a common standard for experimental approaches that precisely define this state in human cells remains elusive. Recent studies have revealed that reprogramming may occur via many pathways which do not always lead to pluripotency. In addition, the pluripotent state itself appears to be highly dynamic, leading to significant variability in the results of molecular studies. Establishment of a stringent set of criteria for defining pluripotency will be vital for biological studies and potential clinical applications in this rapidly evolving field. In this review, we explore the various definitions of pluripotency, examine the current status of pluripotency testing in the field and provide an analysis of how these assays have been used to establish pluripotency in the scientific literature. J. Cell. Physiol. 220: 21–29, 2009. © 2009 Wiley‐Liss, Inc.

18.
On account of its sensitivity to chirality, Raman optical activity (ROA), measured here as the intensity of a small, circularly polarized component in the scattered light using unpolarized incident light, is a powerful probe of protein structure and behavior. Protein ROA spectra provide information on secondary and tertiary structures of polypeptide backbones, backbone hydration, and side chain conformations, and on structural elements present in unfolded states. This article describes the ROA technique and presents ROA spectra, recorded with a commercial instrument of novel design, of a selection of proteins to demonstrate how ROA may be used to readily distinguish between the main classes of protein structure. A principal component analysis illustrates how the many structure-sensitive bands in protein ROA spectra are favorable for applying pattern recognition techniques to determine structural relationships between different proteins.

19.
20.

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号