1.

A systematic comparison of protein structure classifications: SCOP, CATH and FSSP. Total citations: 7 (self-citations: 0; citations by others: 7)

C Hadley D T Jones 《Structure (London, England : 1993)》1999,7(9):1099-1112

BACKGROUND: Several methods of structural classification have been developed to introduce some order to the large amount of data present in the Protein Data Bank. Such methods facilitate structural comparisons and provide a greater understanding of structure and function. The most widely used and comprehensive databases are SCOP, CATH and FSSP, which represent three unique methods of classifying protein structures: purely manual, a combination of manual and automated, and purely automated, respectively. In order to develop reliable template libraries and benchmarks for protein-fold recognition, a systematic comparison of these databases has been carried out to determine their overall agreement in classifying protein structures. RESULTS: Approximately two-thirds of the protein chains in each database are common to all three databases. Despite employing different methods, and basing their systems on different rules of protein structure and taxonomy, SCOP, CATH and FSSP agree on the majority of their classifications. Discrepancies and inconsistencies are accounted for by a small number of explanations. Other interesting features have been identified, and various differences between manual and automatic classification methods are presented. CONCLUSIONS: Using these databases requires an understanding of the rules upon which they are based; each method offers certain advantages depending on the biological requirements and knowledge of the user. The degree of discrepancy between the systems also has an impact on reliability of prediction methods that employ these schemes as benchmarks. To generate accurate fold templates for threading, we extract information from a consensus database, encompassing agreements between SCOP, CATH and FSSP.

2.

Automated assignment of SCOP and CATH protein structure classifications from FSSP scores

Getz G Vendruscolo M Sachs D Domany E 《Proteins》2002,46(4):405-415

We present an automated procedure to assign CATH and SCOP classifications to proteins whose FSSP score is available. CATH classification is assigned down to the topology level, and SCOP classification is assigned to the fold level. Because the FSSP database is updated weekly, this method makes it possible to update CATH and SCOP with the same frequency. Our predictions have a nearly perfect success rate when ambiguous cases are discarded. These ambiguous cases are intrinsic in any protein structure classification that relies on structural information alone. Hence, we introduce the "twilight zone for structure classification." We further suggest that to resolve these ambiguous cases, other criteria of classification, based also on information about sequence and function, must be used.

3.

A comparison of SCOP and CATH with respect to domain-domain interactions

Jefferson ER Walsh TP Barton GJ 《Proteins》2008,70(1):54-62

The analysis and prediction of protein-protein interaction sites from structural data are restricted by the limited availability of structural complexes that represent the complete protein-protein interaction space. The domain classification schemes CATH and SCOP are normally used independently in the analysis and prediction of protein domain-domain interactions. In this article, the effect of different domain classification schemes on the number and type of domain-domain interactions observed in structural data is systematically evaluated for the SCOP and CATH hierarchies. Although there is a large overlap in domain assignments between SCOP and CATH, 23.6% of CATH interfaces had no SCOP equivalent and 37.3% of SCOP interfaces had no CATH equivalent in a nonredundant set. Therefore, combining both classifications gives an increase of between 23.6 and 37.3% in domain-domain interfaces. It is suggested that if possible, both domain classification schemes should be used together, but if only one is selected, SCOP provides better coverage than CATH. Employing both SCOP and CATH reduces the false negative rate of predictive methods, which employ homology matching to structural data to predict protein-protein interaction by an estimated 6.5%.

4.

F2CS: FSSP to CATH and SCOP prediction server

Getz G Starovolsky A Domany E 《Bioinformatics (Oxford, England)》2004,20(13):2150-2152

The F2CS server provides access to the software, F2CS2.00, which implements an automated prediction method of SCOP and CATH classifications of proteins, based on their FSSP Z-scores. AVAILABILITY: Free at http://www.weizmann.ac.il/physics/complex/compphys/f2cs/ SUPPLEMENTARY INFORMATION: The site contains links to additional figures and tables.

5.

Dynamics alignment: comparison of protein dynamics in the SCOP database

Tobi D 《Proteins》2012,80(4):1167-1176

A novel methodology for comparison of protein dynamics is presented. Protein dynamics is calculated using the Gaussian network model and the modes of motion are globally aligned using the dynamic programming algorithm of Needleman and Wunsch, commonly used for sequence alignment. The alignment is fast and can be used to analyze large sets of proteins. The methodology is applied to the four major classes of the SCOP database: "all alpha proteins," "all beta proteins," "alpha and beta proteins," and "alpha/beta proteins". We show that different domains may have similar global dynamics. In addition, we report that the dynamics of "all alpha proteins" domains are less specific to structural variations within a given fold or superfamily compared with the other classes. We report that domain pairs with the most similar and the least similar global dynamics tend to be of similar length. The significance of the methodology is that it suggests a new and efficient way of mapping between the global structural features of protein families/subfamilies and their encoded dynamics.
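The global alignment step described in this abstract can be illustrated with a minimal Needleman and Wunsch sketch over two numeric mode profiles. This is not the author's implementation: the function name `needleman_wunsch`, the similarity `1 - |x - y|`, and the gap penalty of -0.5 are all assumptions chosen for illustration, and the profiles are assumed to have been computed beforehand (for example from a Gaussian network model).

```python
def needleman_wunsch(a, b, gap=-0.5):
    """Globally align two numeric profiles; return (score, aligned index pairs).

    The similarity function and gap penalty are hypothetical choices,
    not taken from the paper.
    """
    def sim(x, y):
        # Identical amplitudes score 1; the score falls off linearly.
        return 1.0 - abs(x - y)

    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    # Fill the dynamic programming table.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]),  # match
                score[i - 1][j] + gap,                          # gap in b
                score[i][j - 1] + gap,                          # gap in a
            )

    # Trace back from the bottom-right corner to recover the alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sim(a[i - 1], b[j - 1]):
            pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return score[n][m], pairs
```

Calling `needleman_wunsch(profile_a, profile_b)` returns the optimal global score and a list of index pairs, with `None` marking a gap on one side.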

6.

A consensus view of fold space: combining SCOP, CATH, and the Dali Domain Dictionary

Day R Beck DA Armen RS Daggett V 《Protein science : a publication of the Protein Society》2003,12(10):2150-2160

We have determined consensus protein-fold classifications on the basis of three classification methods, SCOP, CATH, and Dali. These classifications make use of different methods of defining and categorizing protein folds that lead to different views of protein-fold space. Pairwise comparisons of domains on the basis of their fold classifications show that much of the disagreement between the classification systems is due to differing domain definitions rather than assigning the same domain to different folds. However, there are significant differences in the fold assignments between the three systems. These remaining differences can be explained primarily in terms of the breadth of the fold classifications. Many structures may be defined as having one fold in one system, whereas far fewer are defined as having the analogous fold in another system. By comparing these folds for a nonredundant set of proteins, the consensus method breaks up broad fold classifications and combines restrictive fold classifications into metafolds, creating, in effect, an averaged view of fold space. This averaged view requires that the structural similarities between proteins having the same metafold be recognized by multiple classification systems. Thus, the consensus map is useful for researchers looking for fold similarities that are relatively independent of the method used to compare proteins. The 30 most populated metafolds, representing the folds of about half of a nonredundant subset of the PDB, are presented here. The full list of metafolds is presented on the Web.
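The idea of pairwise agreement between classification systems can be sketched as follows. This is a simplified illustration, not the published consensus method: it only keeps domain pairs that every system places in the same fold, and the function names (`same_fold_pairs`, `consensus_pairs`), the domain identifiers and the fold labels are all hypothetical.

```python
from itertools import combinations


def same_fold_pairs(assignments):
    """Return the unordered domain pairs that share a fold label in one system."""
    pairs = set()
    for a, b in combinations(sorted(assignments), 2):
        if assignments[a] == assignments[b]:
            pairs.add((a, b))
    return pairs


def consensus_pairs(*classifications):
    """Return the domain pairs placed in the same fold by every system.

    Only domains present in all systems are considered, because fold
    labels are not comparable across systems but pair relations are.
    """
    common = set.intersection(*(set(c) for c in classifications))
    restricted = [{d: c[d] for d in common} for c in classifications]
    return set.intersection(*(same_fold_pairs(r) for r in restricted))


# Toy data with made-up identifiers and labels.
scop = {"d1": "A", "d2": "A", "d3": "B"}
cath = {"d1": "x", "d2": "x", "d3": "x"}
dali = {"d1": 1, "d2": 1, "d3": 2, "d4": 2}
agreed = consensus_pairs(scop, cath, dali)
```

Comparing pair relations rather than labels avoids having to map one system's fold names onto another's, which is where a broad fold in one system (here the toy `cath` labels) would otherwise swallow narrower folds elsewhere.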

7.

PFDB: a generic protein family database integrating the CATH domain structure database with sequence based protein family resources

Shepherd AJ Martin NJ Johnson RG Kellam P Orengo CA 《Bioinformatics (Oxford, England)》2002,18(12):1666-1672

MOTIVATION: The PFDB (Protein Family Database) is a new database designed to integrate protein family-related data with relevant functional and genomic data. It currently manages biological data for three projects: the CATH protein domain database (Orengo et al., 1997; Pearl et al., 2001), the VIDA virus domains database (Albà et al., 2001) and the Gene3D database (Buchan et al., 2001). The PFDB has been designed to accommodate protein families identified by a variety of sequence based or structure based protocols and provides a generic resource for biological research by enabling mapping between different protein families and diverse biochemical and genetic data, including complete genomes. RESULTS: A characteristic feature of the PFDB is that it has a number of meta-level entities (for example aggregation, collection and inclusion) represented as base tables in the final design. The explicit representation of relationships at the meta-level has a number of advantages, including flexibility, both in terms of the range of queries that can be formulated and the ability to integrate new biological entities within the existing design. A potential drawback with this approach, poor performance caused by the number of joins across meta-level tables, is avoided by implementing the PFDB with materialized views using the mature relational database technology of Oracle 8i. The resultant database is both fast and flexible. This paper presents the principles on which the database has been designed and implemented, and describes the current status of the database and query facilities supported.

8.

Systematic comparison of surface coatings for protein microarrays. Total citations: 4 (self-citations: 0; citations by others: 4)

Guilleaume B Buness A Schmidt C Klimek F Moldenhauer G Huber W Arlt D Korf U Wiemann S Poustka A 《Proteomics》2005,5(18):4705-4712

Processing large numbers of samples in parallel is one potential strength of protein microarrays for research and diagnostics. However, the application of protein arrays is currently hampered by the lack of comprehensive technological knowledge about the suitability of 2-D and 3-D slide surface coatings. We have performed a systematic study to analyze how both surface types perform in combination with different fluorescent dyes to generate significant and reproducible data. In total, we analyzed more than 100 slides containing 1152 spots each. Slides were probed against different monoclonal antibodies (mAbs) and recombinant fusion proteins. We found two surface coatings to be most suitable for protein and antibody (Ab) immobilization. These were further subjected to quantitative analyses by evaluating intraslide and slide-to-slide reproducibilities, and the linear range of target detection. In summary, we demonstrate that only suitable combinations of surface and fluorescent dyes allow the generation of highly reproducible data.

9.

STRUCLA: a WWW meta-server for protein structure comparison and evolutionary classification

Sasin JM Kurowski MA Bujnicki JM 《Bioinformatics (Oxford, England)》2003,19(Z1):i252-i254

MOTIVATION: Evolutionary relationships of proteins have long been derived from the alignment of protein sequences. But from the view of function, most restraints of evolutionary divergence operate at the level of tertiary structure. It has been demonstrated that quantitative measures of dissimilarity in families of structurally similar proteins can be applied to the construction of trees from a comparison of their three-dimensional structures. However, no convenient tool is publicly available to carry out such analyses. RESULTS: We developed STRUCLA (STRUcture CLAssification), a WWW tool for generation of trees based on evolutionary distances inferred from protein structures according to various methods. The server takes as an input a list of PDB files or the initial alignment of protein coordinates provided by the user (for instance exported from SWISS PDB VIEWER). The user specifies the distance cutoff and selects the distance measures. The server returns a series of unrooted trees in the NEXUS format and corresponding distance matrices, as well as a consensus tree. The results can be used as an alternative and a complement to a fixed hierarchy of current protein structure databases. It can complement sequence-based phylogenetic analysis in the 'twilight zone of homology', where amino acid sequences are too diverged to provide reliable relationships.

10.

Long-term follow-up of patients following negative colposcopy: a new gold standard and its implications for cervical screening

P. D. Da Forno M. R. Holbrook D. Nunns P. A. V. Shaw 《Cytopathology》2003,14(5):281-286

From 1189 colposcopy referrals in 1997 at a single cervical screening centre, 88 women who had no biopsy taken at colposcopy (negative colposcopy) were identified. We followed up these women for a maximum of 4 years and calculated the positive predictive value (PPV) of a single smear before and after follow-up. Using slide review we attempted to correlate the grade of smear leading to colposcopy referral with final outcome. Our results showed that long-term follow-up alters the PPV of cervical cytology. Analysis showed a strong correlation between the review grade of the referring smear and the final outcome after follow-up. From these results we suggest an evidence-based protocol for cervical screening follow-up after negative colposcopy.

11.

The CATH database: an extended protein family resource for structural and functional genomics

Pearl FM Bennett CF Bray JE Harrison AP Martin N Shepherd A Sillitoe I Thornton J Orengo CA 《Nucleic acids research》2003,31(1):452-455

The CATH database of protein domain structures (http://www.biochem.ucl.ac.uk/bsm/cath_new) currently contains 34 287 domain structures classified into 1383 superfamilies and 3285 sequence families. Each structural family is expanded with domain sequence relatives recruited from GenBank using a variety of efficient sequence search protocols and reliable thresholds. This extended resource, known as the CATH-protein family database (CATH-PFDB), contains a total of 310 000 domain sequences classified into 26 812 sequence families. New sequence search protocols have been designed, based on these intermediate sequence libraries, to allow more regular updating of the classification. Further developments include the adaptation of a recently developed method for rapid structure comparison, based on secondary structure matching, for domain boundary assignment. The philosophy behind CATHEDRAL is the recognition of recurrent folds already classified in CATH. Benchmarking of CATHEDRAL, using manually validated domain assignments, demonstrated that 43% of domain boundaries could be completely automatically assigned. This is an improvement on a previous consensus approach for which only 10-20% of domains could be reliably processed in a completely automated fashion. Since domain boundary assignment is a significant bottleneck in the classification of new structures, CATHEDRAL will also help to increase the frequency of CATH updates.

12.

Secondary structure spatial conformation footprint: a novel method for fast protein structure comparison and classification

Elena Zotenko Dianne P O'Leary Teresa M Przytycka 《BMC structural biology》2006,6(1):12

Background

Recently a new class of methods for fast protein structure comparison has emerged. We call the methods in this class projection methods as they rely on a mapping of protein structure into a high-dimensional vector space. Once the mapping is done, the structure comparison is reduced to distance computation between corresponding vectors. As structural similarity is approximated by distance between projections, the success of any projection method depends on how well its mapping function is able to capture the salient features of protein structure. There is no agreement on what constitutes a good projection technique and the three currently known projection methods utilize very different approaches to the mapping construction, both in terms of what structural elements are included and how this information is integrated to produce a vector representation.
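The two-stage pattern described above (map each structure to a vector once, then compare vectors by distance) can be sketched with a deliberately simple projection. The mapping used here, a normalized histogram of pairwise inter-atom distances, is an assumption for illustration only and is not the footprint method of the paper; the names `project` and `projection_distance`, the bin count and the 40 Å cap are likewise hypothetical.

```python
import math
from itertools import combinations


def project(coords, bins=8, max_dist=40.0):
    """Map a list of 3D points to a fixed-length vector.

    Hypothetical mapping: a normalized histogram of all pairwise
    distances, with distances above max_dist counted in the last bin.
    """
    hist = [0.0] * bins
    pairs = 0
    for p, q in combinations(coords, 2):
        d = math.dist(p, q)
        k = min(int(d / max_dist * bins), bins - 1)
        hist[k] += 1.0
        pairs += 1
    return [h / pairs for h in hist] if pairs else hist


def projection_distance(u, v):
    """Euclidean distance between two projection vectors."""
    return math.dist(u, v)


# Toy coordinates: a short straight chain and a translated copy of it.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
moved = [(x + 1.0, y + 2.0, z + 3.0) for x, y, z in chain]
d_same = projection_distance(project(chain), project(moved))
```

Because the vectors are computed once per structure, a database search reduces to cheap vector comparisons; the quality of the result then depends entirely on how much structural information the chosen mapping retains.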

13.

Evolutionarily consistent families in SCOP: sequence, structure and function

Ralph B Pethica Michael Levitt Julian Gough 《BMC structural biology》2012,12(1):1-10

Background

HutZ is the sole heme storage protein identified in the pathogenic bacterium Vibrio cholerae and is required for optimal heme utilization. However, no heme oxygenase activity has been observed with this protein. Thus far, HutZ's structure and heme-binding mechanism are unknown.

Results

We report the first crystal structure of HutZ in a homodimer determined at 2.0 Å resolution. The HutZ structure adopted a typical split-barrel fold. Through a docking study and site-directed mutagenesis, a heme-binding model for the HutZ dimer is proposed. Very interestingly, structural superimposition of HutZ and its homologous protein HugZ, a heme oxygenase from Helicobacter pylori, exhibited a structural mismatch of one amino acid residue in β6 of HutZ, although residues involved in this region are highly conserved in both proteins. Derived homologous models of different single point variants with model evaluations suggested that Pro¹⁴⁰ of HutZ, corresponding to Phe²¹⁵ of HugZ, might have been the main contributor to the structural mismatch. This mismatch initiates more divergent structural characteristics towards their C-terminal regions, which are essential features for the heme-binding of HugZ as a heme oxygenase.

Conclusions

HutZ's deficiency in heme oxygenase activity might derive from its residue shift relative to the heme oxygenase HugZ. This residue shift also emphasized a limitation of the traditional template selection criterion for homology modeling.

14.

Systematic analysis of added-value in simple comparative models of protein structure

Chakravarty S Sanchez R 《Structure (London, England : 1993)》2004,12(8):1461-1470

Added-value is the additional information that a model carries with respect to the template structure used for model building. Thousands of single-template models, corresponding to proteins of known structure, were analyzed. The accuracy of structure-derived properties, such as residue accessibility, surface area, electrostatic potential, and others, was determined as a function of template:target sequence identity by comparing the models with their corresponding experimental structures. Added-value was determined by comparing the accuracy in models with that from templates. Geometry-dependent properties such as neighborhood of buried residues and accessible surface area showed low added-value. Properties that also depend on the protein sequence, such as presence of polar areas and electrostatic potential, showed high added-value. In general added-value increases when template:target sequence identity decreases, but it is also affected by alignment errors. This study justifies the use of models instead of the use of templates to estimate structure-derived properties of a target protein.

15.

An efficient algorithm for protein structure comparison using elastic shape analysis

S. Srivastava S. B. Lal D. C. Mishra U. B. Angadi K. K. Chaturvedi S. N. Rai A. Rai 《Algorithms for molecular biology : AMB》2016,11(1):27

Background

Protein structure comparison plays an important role in in silico functional prediction of a new protein. It is also used for understanding the evolutionary relationships among proteins. A variety of methods have been proposed in the literature for comparing protein structures, but they have their own limitations in terms of accuracy and complexity with respect to computational time and space. There is a need to improve the computational complexity of protein comparison/alignment through incorporation of important biological and structural properties into the existing techniques.

Results

An efficient algorithm has been developed for comparing protein structures using elastic shape analysis, in which the sequence of 3D coordinates of the atoms of protein structures is supplemented by auxiliary information from side-chain properties. The protein structure is represented by a special function called the square-root velocity function. Furthermore, singular value decomposition and dynamic programming have been employed for optimal rotation and optimal matching of the proteins, respectively. Also, geodesic distance has been calculated and used as the dissimilarity score between two protein structures. The performance of the developed algorithm was tested and it was found to be more efficient, i.e., running time was reduced by 80–90 % without compromising accuracy of comparison when compared with the existing methods. Source code for the different functions has been developed in R. Also, a user-friendly web-based application called ProtSComp has been developed using the above algorithm for comparing protein 3D structures and is freely accessible.

Conclusions

The methodology and algorithm developed in this study take considerably less computational time without loss of accuracy (Table 2). The proposed algorithm considers different criteria for representing protein structures, using 3D coordinates of atoms and including residue-wise molecular properties as auxiliary information.
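Two of the ingredients named in this abstract, the square-root velocity function and a geodesic distance between curves, can be sketched in a few lines. This is a reduced illustration and not the ProtSComp algorithm (whose source is in R): it omits the optimal rotation by singular value decomposition, the dynamic programming matching and the side-chain information, and it assumes both curves have the same number of points, no repeated consecutive points, and a uniform parameter grid. The function names `srvf` and `geodesic_distance` are hypothetical.

```python
import math


def srvf(points):
    """Discrete square-root velocity function of a 3D polyline.

    For each segment the velocity v is divided by sqrt(|v|); the result
    is then scaled to unit L2 norm, which removes overall curve scale.
    """
    n = len(points) - 1          # number of segments
    dt = 1.0 / n                 # uniform parameter step on [0, 1]
    q = []
    for a, b in zip(points, points[1:]):
        v = [(y - x) / dt for x, y in zip(a, b)]
        speed = math.sqrt(sum(c * c for c in v))
        q.append([c / math.sqrt(speed) if speed > 0 else 0.0 for c in v])
    norm = math.sqrt(sum(c * c for row in q for c in row) * dt)
    return [[c / norm for c in row] for row in q]


def geodesic_distance(p1, p2):
    """Arc length on the unit sphere between two unit-norm SRVFs.

    No alignment over rotation or reparametrization is performed here.
    """
    q1, q2 = srvf(p1), srvf(p2)
    dt = 1.0 / len(q1)
    inner = sum(a * b for r1, r2 in zip(q1, q2) for a, b in zip(r1, r2)) * dt
    return math.acos(max(-1.0, min(1.0, inner)))


# Toy curves: a bent chain, a scaled copy of it, and a straight chain.
bent = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
scaled = [(2 * x, 2 * y, 2 * z) for x, y, z in bent]
straight = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
```

In this sketch `geodesic_distance(bent, scaled)` is numerically zero because of the unit-norm scaling, while `geodesic_distance(bent, straight)` is clearly positive; a full elastic comparison would additionally minimize the distance over rotations and reparametrizations.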


16.

DaliLite workbench for protein structure comparison

Holm L Park J 《Bioinformatics (Oxford, England)》2000,16(6):566-567

SUMMARY: DaliLite is a program for pairwise structure comparison and for structure database searching. It is a standalone version of the search engine of the popular Dali server. A web interface is provided to view the results, multiple alignments and 3D superimpositions of structures.

17.

Pluripotency: Toward a gold standard for human ES and iPS cells

Kelly P. Smith Mai X. Luong Gary S. Stein 《Journal of cellular physiology》2009,220(1):21-29

With the advent of technologies for the derivation of embryonic stem cells and reprogrammed stem cells, use of the term “pluripotent” has become widespread. Despite its increased scientific and political importance, there are ambiguities with this designation and a common standard for experimental approaches that precisely define this state in human cells remains elusive. Recent studies have revealed that reprogramming may occur via many pathways which do not always lead to pluripotency. In addition, the pluripotent state itself appears to be highly dynamic, leading to significant variability in the results of molecular studies. Establishment of a stringent set of criteria for defining pluripotency will be vital for biological studies and potential clinical applications in this rapidly evolving field. In this review, we explore the various definitions of pluripotency, examine the current status of pluripotency testing in the field and provide an analysis of how these assays have been used to establish pluripotency in the scientific literature. J. Cell. Physiol. 220: 21–29, 2009. © 2009 Wiley‐Liss, Inc.

18.

Raman optical activity: a tool for protein structure analysis

Zhu F Isaacs NW Hecht L Barron LD 《Structure (London, England : 1993)》2005,13(10):1409-1419

On account of its sensitivity to chirality, Raman optical activity (ROA), measured here as the intensity of a small, circularly polarized component in the scattered light using unpolarized incident light, is a powerful probe of protein structure and behavior. Protein ROA spectra provide information on secondary and tertiary structures of polypeptide backbones, backbone hydration, and side chain conformations, and on structural elements present in unfolded states. This article describes the ROA technique and presents ROA spectra, recorded with a commercial instrument of novel design, of a selection of proteins to demonstrate how ROA may be used to readily distinguish between the main classes of protein structure. A principal component analysis illustrates how the many structure-sensitive bands in protein ROA spectra are favorable for applying pattern recognition techniques to determine structural relationships between different proteins.

19.

Setting a gold standard for integrative evolutionary biology


Rebecca J. Safran 《Evolution; international journal of organic evolution》2017,71(1):199-200


20.

Common Structural Cliques: a tool for protein structure and function analysis

Milik M Szalma S Olszewski KA 《Protein engineering》2003,16(8):543-552
