
1.

SCOPmap: Automated assignment of protein structures to evolutionary superfamilies

Sara Cheek Yuan Qi S Sri Krishna Lisa N Kinch Nick V Grishin 《BMC bioinformatics》2004,5(1):197

Background

Inference of remote homology between proteins is very challenging and remains a prerogative of an expert. Thus a significant drawback to the use of evolutionary-based protein structure classifications is the difficulty in assigning new proteins to unique positions in the classification scheme with automatic methods. To address this issue, we have developed an algorithm to map protein domains to an existing structural classification scheme and have applied it to the SCOP database.

2.

A high level interface to SCOP and ASTRAL implemented in Python

James A Casbon Gavin E Crooks Mansoor AS Saqi 《BMC bioinformatics》2006,7(1):10-4

Background

Benchmarking algorithms in structural bioinformatics often involves the construction of datasets of proteins with given sequence and structural properties. The SCOP database is a manually curated structural classification which groups together proteins on the basis of structural similarity. The ASTRAL compendium provides non-redundant subsets of SCOP domains on the basis of sequence similarity, such that no two domains in a given subset share more than a defined degree of sequence similarity. Taken together, these two resources provide a 'ground truth' for assessing structural bioinformatics algorithms. We present a small and easy-to-use API written in Python to enable construction of datasets from these resources.
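To make the idea of a non-redundant subset concrete, here is a minimal Python sketch of a greedy filter that keeps a domain only if it is not too similar to any domain already kept. It is an illustration under stated assumptions, not ASTRAL's actual selection procedure and not the API described in the abstract: the function names, the toy position-by-position identity measure (no alignment), and the example sequences are all hypothetical.

```python
from typing import Callable, Dict, List


def percent_identity(a: str, b: str) -> float:
    """Toy similarity (assumption): percent of identical positions, no alignment."""
    if not a or not b:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / max(len(a), len(b))


def non_redundant_subset(
    domains: Dict[str, str],
    threshold: float,
    similarity: Callable[[str, str], float] = percent_identity,
) -> List[str]:
    """Greedily keep domains so that no kept pair exceeds `threshold` similarity."""
    kept: Dict[str, str] = {}
    for name, seq in domains.items():
        # Keep this domain only if it is dissimilar enough to every kept domain.
        if all(similarity(seq, other) <= threshold for other in kept.values()):
            kept[name] = seq
    return list(kept)


# Hypothetical domain identifiers and sequences, for illustration only.
example = {
    "d1": "ACDEFGHIKL",
    "d2": "ACDEFGHIKV",  # differs from d1 at one position
    "d3": "WWWWWWWWWW",
}
subset = non_redundant_subset(example, threshold=40.0)
```

The result of a greedy filter depends on the order in which domains are visited, so a real pipeline would need an explicit rule for which representative to prefer.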

3.

Exploring protein structural dissimilarity to facilitate structure classification

Pooja Jain Jonathan D Hirst 《BMC structural biology》2009,9(1):60-16

Background

Classification of newly resolved protein structures is important in understanding their architectural, evolutionary and functional relatedness to known protein structures. Among various efforts to improve the database of Structural Classification of Proteins (SCOP), automation has received particular attention. Herein, we predict the deepest SCOP structural level that an unclassified protein shares with classified proteins with an equal number of secondary structure elements (SSEs).
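The notion of the "deepest SCOP level shared" by two domains can be illustrated with SCOP's dotted classification strings (sccs), which encode class, fold, superfamily and family in that order, for example `a.1.1.2`. The sketch below only compares the labels of two domains that are already classified; it does not implement the prediction method of the abstract, and the function name is hypothetical.

```python
from typing import Optional

# Order of the fields in a SCOP sccs string such as "a.1.1.2".
LEVELS = ("class", "fold", "superfamily", "family")


def deepest_shared_level(sccs_a: str, sccs_b: str) -> Optional[str]:
    """Return the deepest SCOP level at which two sccs strings agree, or None."""
    shared = None
    for level, a, b in zip(LEVELS, sccs_a.split("."), sccs_b.split(".")):
        if a != b:
            break  # deeper levels cannot be shared once a level differs
        shared = level
    return shared
```

For instance, `deepest_shared_level("a.1.1.2", "a.1.1.3")` returns `"superfamily"`, while two domains from different classes share no level and the function returns `None`.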

4.

Systematic comparison of SCOP and CATH: a new gold standard for protein structure analysis

Gergely Csaba Fabian Birzele Ralf Zimmer 《BMC structural biology》2009,9(1):23

Background

SCOP and CATH are widely used as gold standards to benchmark novel protein structure comparison methods as well as to train machine learning approaches for protein structure classification and prediction. The two hierarchies arise from different protocols, which may lead to differing classifications of the same protein. Ignoring such differences causes problems when the hierarchies are used to train or benchmark automatic structure classification methods. Here, we propose a method to compare SCOP and CATH in detail and discuss possible applications of this analysis.

5.

ASH structure alignment package: Sensitivity and selectivity in domain classification

Daron M Standley Hiroyuki Toh Haruki Nakamura 《BMC bioinformatics》2007,8(1):116

Background

Structure alignment methods offer the possibility of measuring distant evolutionary relationships between proteins that are not visible by sequence-based analysis. However, the question of how structural differences and similarities ought to be quantified in this regard remains open. In this study we construct a training set of sequence-unique CATH and SCOP domains, from which we develop a scoring function that can reliably identify domains with the same CATH topology and SCOP fold classification. The score is implemented in the ASH structure alignment package, for which the source code and a web service are freely available from the PDBj website.

6.

Support Vector Machines for predicting protein structural class

Yu-Dong Cai Xiao-Jun Liu Xue-biao Xu Guo-Ping Zhou 《BMC bioinformatics》2001,2(1):3-5

Background

We apply a new machine learning method, the so-called Support Vector Machine method, to predict the protein structural class. The Support Vector Machine method is applied to a database derived from SCOP, in which protein domains are classified based on known structures, their evolutionary relationships, and the principles that govern their 3-D structure.

7.

A method for probabilistic mapping between protein structure and function taxonomies through cross training

Kshitiz Gupta Vivek Sehgal Andre Levchenko 《BMC structural biology》2008,8(1):1-12

Background

Prediction of the function of proteins on the basis of structure, and vice versa, is a partially solved problem, largely in the domain of biophysics and biochemistry. This underscores the need for computational and bioinformatics approaches to solve the problem. Large and organized latent knowledge on protein classification exists in the form of independently created protein classification databases. By creating probabilistic maps between classes of structural classification databases (e.g. SCOP [1]) and classes of functional classification databases (e.g. PROSITE [2]), the structure and function of proteins could be probabilistically related.

Results

We demonstrate that PROSITE and SCOP have significant semantic overlap, in spite of independent classification schemes. By training classifiers of SCOP using classes of PROSITE as attributes and vice versa, the accuracy of Support Vector Machine classifiers for both SCOP and PROSITE was improved. Novel attributes, 2-D elastic profiles and Blocks, were used to improve time complexity and accuracy. Many relationships were extracted between classes of SCOP and PROSITE using decision trees.

Conclusion

We demonstrate that the presented approach can discover new probabilistic relationships between classes of different taxonomies and render a more accurate classification. Extensive mappings between existing protein classification databases can be created to link the large amount of organized data. Probabilistic maps were created between classes of SCOP and PROSITE, allowing predictions of structure using function, and vice versa. In our experiments, we also found that function is indeed more strongly related to structure than structure is to function.

8.

SUPFAM: A database of sequence superfamilies of protein domains

Shashi B Pandit Rana Bhadra VS Gowri S Balaji B Anand N Srinivasan 《BMC bioinformatics》2004,5(1):28

Background

The SUPFAM database is a compilation of superfamily relationships between protein domain families of either known or unknown 3-D structure. In SUPFAM, sequence families from Pfam and structural families from SCOP are associated, using profile matching, to result in sequence superfamilies of known structure. Subsequently, all-against-all family profile matches are made to deduce a list of new potential superfamilies of yet unknown structure.

9.

Towards an automatic classification of protein structural domains based on structural similarity

Vichetra Sam Chin-Hsien Tai Jean Garnier Jean-Francois Gibrat Byungkook Lee Peter J Munson 《BMC bioinformatics》2008,9(1):74

Background

Formal classification of a large collection of protein structures aids the understanding of evolutionary relationships among them. Classifications involving manual steps, such as SCOP and CATH, face the challenge of an increasing volume of available structures. Automatic methods such as FSSP or Dali Domain Dictionary yield divergent classifications, for reasons not yet fully investigated. One possible reason is that the pairwise similarity scores used in automatic classification do not adequately reflect the judgments made in manual classification. Another possibility is the difference between manual and automatic classification procedures. We explore the degree to which these two factors might affect the final classification.

10.

Automatic structure classification of small proteins using random forest

Pooja Jain Jonathan D Hirst 《BMC bioinformatics》2010,11(1):364


11.

ROC and confusion analysis of structure comparison methods identify the main causes of divergence from manual protein classification

Vichetra Sam Chin-Hsien Tai Jean Garnier Jean-Francois Gibrat Byungkook Lee Peter J Munson 《BMC bioinformatics》2006,7(1):206-20

Background

Current classifications of protein folds are based, ultimately, on visual inspection of similarities. Previous attempts to use computerized structure comparison methods show only partial agreement with curated databases, but have failed to provide detailed statistical and structural analysis of the causes of these divergences.

12.

A framework for protein structure classification and identification of novel protein structures

You Jung Kim Jignesh M Patel 《BMC bioinformatics》2006,7(1):456-13

Background

Protein structure classification plays a central role in understanding the function of a protein molecule with respect to all known proteins in a structure database. With the rapid increase in the number of new protein structures, the need for automated and accurate methods for protein classification is increasingly important.

13.

Insertions and the emergence of novel protein structure: a structure-based phylogenetic study of insertions

Haiyan Jiang Christian Blouin 《BMC bioinformatics》2007,8(1):444

Background

In protein evolution, the mechanism of the emergence of novel protein domains is still an open question. The incremental growth of protein variable regions, produced by stochastic insertions, has the potential to generate large and complex sub-structures. In this study, a deterministic methodology is proposed to reconstruct phylogenies from protein structures, and to infer insertion events in protein evolution. The analysis was performed on a broad range of SCOP domain families.

14.

Sequence and structural analysis of the Asp-box motif and Asp-box beta-propellers; a widespread propeller-type characteristic of the Vps10 domain family and several glycoside hydrolase families

Esben M Quistgaard Søren S Thirup 《BMC structural biology》2009,9(1):46-18

Background

The Asp-box is a short sequence and structure motif that folds as a well-defined β-hairpin. It is present in different folds, but occurs most prominently as repeats in β-propellers. Asp-box β-propellers are known to be characteristically irregular and to occur in many medically important proteins, most of which are glycosidase enzymes, but they are otherwise not well characterized and are only rarely treated as a distinct β-propeller family. We have analyzed the sequence, structure, function and occurrence of the Asp-box and the s-Asp-box, a related shorter variant, and provide a comprehensive classification and computational analysis of the Asp-box β-propeller family.

15.

dConsensus: a tool for displaying domain assignments by multiple structure-based algorithms and for construction of a consensus assignment

Kieran Alden Stella Veretnik Philip E Bourne 《BMC bioinformatics》2010,11(1):310

Background

Partitioning of a protein into structural components, known as domains, is an important initial step in protein classification and for functional and evolutionary studies. While systematic assignments of domains by human experts exist (CATH and SCOP), the introduction of high-throughput technologies for structure determination threatens to overwhelm expert approaches. A variety of algorithmic methods have been developed to expedite this process, allowing almost instant structural decomposition into domains. The performance of algorithmic methods can approach 85% agreement on the number of domains with the consensus reached by experts. However, each algorithm takes a somewhat different conceptual approach, each with unique strengths and weaknesses. Currently there is no simple way to automatically compare assignments from different structure-based domain assignment methods, thereby providing a comprehensive understanding of possible structure partitioning as well as providing some insight into the tendencies of particular algorithms. Most importantly, a consensus assignment drawn from multiple assignment methods can provide a singular and presumably more accurate view.
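As a rough illustration of what a consensus over several assignment methods can look like, the Python sketch below takes the number of domains proposed by each method and returns the majority value together with the fraction of methods that agree with it. This is a simple majority vote assumed for illustration, not the consensus procedure used by dConsensus; the method names and the tie-breaking rule (prefer the smaller domain count) are hypothetical.

```python
from collections import Counter
from typing import Mapping, Tuple


def consensus_domain_count(assignments: Mapping[str, int]) -> Tuple[int, float]:
    """Majority vote on the number of domains; returns (count, agreement fraction)."""
    if not assignments:
        raise ValueError("at least one assignment is required")
    votes = Counter(assignments.values())
    top = max(votes.values())
    # Tie-breaking rule (assumption): prefer the smaller number of domains.
    winner = min(n for n, c in votes.items() if c == top)
    return winner, top / len(assignments)


# Hypothetical method names and domain counts for a single protein chain.
count, agreement = consensus_domain_count(
    {"method_a": 2, "method_b": 2, "method_c": 3}
)
```

Agreement on the number of domains says nothing about where the domain boundaries fall, so a fuller comparison would also need to compare the residue ranges assigned to each domain.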

16.

Enhancing navigation in biomedical databases by community voting and database-driven text classification

Timo Duchrow Timur Shtatland Daniel Guettler Misha Pivovarov Stefan Kramer Ralph Weissleder 《BMC bioinformatics》2009,10(1):317

Background

The breadth of biological databases and their information content continue to increase exponentially. Unfortunately, our ability to query such sources is still often suboptimal. Here, we introduce and apply community voting, database-driven text classification, and visual aids as a means to incorporate distributed expert knowledge, to automatically classify database entries and to efficiently retrieve them.

17.

Improved profile HMM performance by assessment of critical algorithmic features in SAM and HMMER

Markus Wistrand Erik LL Sonnhammer 《BMC bioinformatics》2005,6(1):99

Background

Profile hidden Markov model (HMM) techniques are among the most powerful methods for protein homology detection. Yet, the critical features for successful modelling are not fully known. In the present work we approached this by using two of the most popular HMM packages: SAM and HMMER. The programs' abilities to build models and score sequences were compared on a SCOP/Pfam based test set. The comparison was done separately for local and global HMM scoring.

18.

Functional evolution of two subtly different (similar) folds

Vishal Agrawal Radha KV Kishan 《BMC structural biology》2001,1(1):5-6

Background

The function of proteins is a direct consequence of their three-dimensional structure. The structural classification of proteins describes the folding patterns that proteins can adopt. Although protein folds have been described in many ways, the functional properties of individual folds have not been studied.

Results

We have analyzed two β-barrel folds, generally adopted by small proteins, that look similar but have different topologies. On the basis of topology, they can be divided into two different folds, named the SH3-fold and the OB-fold. There was no sequence homology between any of the proteins considered. Sequence diversity and loop variability were found to be important for various binding functions.

Conclusions

The function of oligonucleotide/oligosaccharide-binding (OB) fold proteins was restricted to either DNA/RNA binding or sugar binding, whereas the Src homology 3 (SH3) domain-like proteins bind to a variety of ligands through loop modulations. This raises the question of whether these two folds evolved through DNA shuffling.

19.

Identification and characterization of subfamily-specific signatures in a large protein superfamily by a hidden Markov model approach

Kevin Truong Mitsuhiko Ikura 《BMC bioinformatics》2002,3(1):1-14

Background

Most profile and motif databases strive to classify protein sequences into a broad spectrum of protein families. The next step of such database studies should include the development of classification systems capable of distinguishing between subfamilies within a structurally and functionally diverse superfamily. This would be helpful in elucidating sequence-structure-function relationships of proteins.

20.

Columba: an integrated database of proteins, structures, and annotations

Silke Trißl Kristian Rother Heiko Müller Thomas Steinke Ina Koch Robert Preissner Cornelius Frömmel Ulf Leser 《BMC bioinformatics》2005,6(1):81

Background

Structural and functional research often requires the computation of sets of protein structures based on certain properties of the proteins, such as sequence features, fold classification, or functional annotation. Compiling such sets using current web resources is tedious because the necessary data are spread over many different databases. To facilitate this task, we have created COLUMBA, an integrated database of annotations of protein structures.