1.

CMASA: an accurate algorithm for detecting local protein structural similarity and its application to enzyme catalytic site annotation

Gong-Hua Li Jing-Fei Huang 《BMC bioinformatics》2010,11(1):439

Background

The rapid development of structural genomics has resulted in many "unknown function" proteins being deposited in the Protein Data Bank (PDB); thus, the functional prediction of these proteins has become a challenge for structural bioinformatics. Several sequence-based and structure-based methods have been developed to predict protein function, but these methods need further improvement in accuracy, sensitivity, and computational speed. Here, an accurate algorithm, CMASA (Contact MAtrix based local Structural Alignment algorithm), has been developed to predict unknown functions of proteins based on local protein structural similarity. This algorithm has been evaluated on a test set of 164 enzyme families and compared with other methods.

2.

SplitTester: software to identify domains responsible for functional divergence in protein family

Xiang Gao Kent A Vander Velden Daniel F Voytas Xun Gu 《BMC bioinformatics》2005,6(1):137

Background

Many protein families have undergone functional divergence after gene duplications such that current subgroups of the family carry out overlapping but distinct biological roles. For the protein families with known functional subtypes (a functional split), we developed the software, SplitTester, to identify potential regions that are responsible for the observed distinct functional subtypes within the same protein family.

3.

Partially-supervised protein subclass discovery with simultaneous annotation of functional residues

Benjamin Georgi Jörg Schultz Alexander Schliep 《BMC structural biology》2009,9(1):68-14

Background

The study of functional subfamilies of protein domain families and the identification of the residues which determine substrate specificity is an important question in the analysis of protein domains. One way to address this question is the use of clustering methods for protein sequence data and approaches to predict functional residues based on such clusterings. The locations of putative functional residues in known protein structures provide insights into how different substrate specificities are reflected on the protein structure level.

4.

Inferring functional modules of protein families with probabilistic topic models

Sebastian GA Konietzny Laura Dietz Alice C McHardy 《BMC bioinformatics》2011,12(1):141

Background

Genome and metagenome studies have identified thousands of protein families whose functions are poorly understood and for which techniques for functional characterization provide only partial information. For such proteins, the genome context can give further information about their functional context.

5.

Using context to improve protein domain identification

Alejandro Ochoa Manuel Llinás Mona Singh 《BMC bioinformatics》2011,12(1):90

Background

Identifying domains in protein sequences is an important step in protein structural and functional annotation. Existing domain recognition methods typically evaluate each domain prediction independently of the rest. However, the majority of proteins are multidomain, and pairwise domain co-occurrences are highly specific and non-transitive.

6.

Predicting conserved protein motifs with Sub-HMMs

Kevin Horan Christian R Shelton Thomas Girke 《BMC bioinformatics》2010,11(1):205

Background

Profile HMMs (hidden Markov models) provide effective methods for modeling the conserved regions of protein families. A limitation of the resulting domain models is the difficulty of pinpointing their much shorter functional sub-features, such as catalytically relevant sequence motifs in enzymes or ligand binding signatures of receptor proteins.

7.

Predicting the effect of missense mutations on protein function: analysis with Bayesian networks

Chris J Needham James R Bradford Andrew J Bulpitt Matthew A Care David R Westhead 《BMC bioinformatics》2006,7(1):405-14

Background

A number of methods that use both protein structural and evolutionary information are available to predict the functional consequences of missense mutations. However, many of these methods break down if either of the two types of data is missing. Furthermore, there is a lack of rigorous assessment of how important the different factors are to prediction.

8.

A simplified approach to disulfide connectivity prediction from protein sequences

Marc Vincent Andrea Passerini Matthieu Labbé Paolo Frasconi 《BMC bioinformatics》2008,9(1):20

Background

Prediction of disulfide bridges from protein sequences is useful for characterizing structural and functional properties of proteins. Several methods based on different machine learning algorithms have been applied to solve this problem, and public domain prediction services exist. These methods, however, could still be improved significantly, both in prediction accuracy and in overall architectural complexity.

9.

Automatic discovery of cross-family sequence features associated with protein function

Markus Brameier Josien Haan Andrea Krings Robert M MacCallum 《BMC bioinformatics》2006,7(1):16-19

Background

Methods for predicting protein function directly from amino acid sequences are useful tools in the study of uncharacterised protein families and in comparative genomics. Until now, this problem has been approached using machine learning techniques that attempt to predict membership, or otherwise, to predefined functional categories or subcellular locations. A potential drawback of this approach is that the human-designated functional classes may not accurately reflect the underlying biology, and consequently important sequence-to-function relationships may be missed.

10.

Low-complexity regions within protein sequences have position-dependent roles (total citations: 1; self-citations: 0; citations by others: 1)

Alain Coletta John W Pinney David Y Weiss Solís James Marsh Steve R Pettifer Teresa K Attwood 《BMC systems biology》2010,4(1):43

Background

Regions of protein sequences with biased amino acid composition (so-called Low-Complexity Regions (LCRs)) are abundant in the protein universe. A number of studies have revealed that i) these regions show significant divergence across protein families; ii) the genetic mechanisms from which they arise lend them remarkable degrees of compositional plasticity. They have therefore proved difficult to compare using conventional sequence analysis techniques, and functions remain to be elucidated for most of them. Here we undertake a systematic investigation of LCRs in order to explore their possible functional significance, placed in the particular context of Protein-Protein Interaction (PPI) networks and Gene Ontology (GO)-term analysis.

11.

Ensemble approach to predict specificity determinants: benchmarking and validation

Saikat Chakrabarti Anna R Panchenko 《BMC bioinformatics》2009,10(1):207

Background

It is extremely important and challenging to identify the sites that are responsible for functional specification or diversification in protein families. In this study, a rigorous comparative benchmarking protocol was employed to provide a reliable evaluation of methods which predict the specificity determining sites. Subsequently, the three best-performing methods were applied to identify new potential specificity determining sites through an ensemble approach and common agreement of their prediction results.

12.

Incorporating functional inter-relationships into protein function prediction algorithms

Gaurav Pandey Chad L Myers Vipin Kumar 《BMC bioinformatics》2009,10(1):142-22

Background

Functional classification schemes (e.g. the Gene Ontology) that serve as the basis for annotation efforts in several organisms are often the source of gold standard information for computational efforts at supervised protein function prediction. While successful function prediction algorithms have been developed, few previous efforts have utilized more than the protein-to-functional class label information provided by such knowledge bases. For instance, the Gene Ontology not only captures protein annotations to a set of functional classes, but it also arranges these classes in a DAG-based hierarchy that captures rich inter-relationships between different classes. These inter-relationships present both opportunities, such as the potential for additional training examples for small classes from larger related classes, and challenges, such as a harder-to-learn distinction between similar GO terms, for standard classification-based approaches.

13.

Super paramagnetic clustering of protein sequences

Igor V Tetko Axel Facius Andreas Ruepp Hans-Werner Mewes 《BMC bioinformatics》2005,6(1):82

Background

Detection of sequence homologues represents a challenging task that is important for the discovery of protein families and the reliable application of automatic annotation methods. The presence of domains in protein families of diverse function, inhomogeneity and different sizes of protein families create considerable difficulties for the application of published clustering methods.

14.

Length-dependent prediction of protein intrinsic disorder (total citations: 2; self-citations: 0; citations by others: 2)

Kang Peng Predrag Radivojac Slobodan Vucetic A Keith Dunker Zoran Obradovic 《BMC bioinformatics》2006,7(1):208-17

Background

Due to the functional importance of intrinsically disordered proteins or protein regions, prediction of intrinsic protein disorder from amino acid sequence has become an area of active research, as witnessed in the 6th experiment on Critical Assessment of Techniques for Protein Structure Prediction (CASP6). Since the initial work by Romero et al. (Identifying disordered regions in proteins from amino acid sequences, IEEE Int. Conf. Neural Netw., 1997), our group has developed several predictors optimized for long disordered regions (>30 residues) with prediction accuracy exceeding 85%. However, these predictors are less successful on short disordered regions (≤30 residues). A probable cause is the length dependence of the amino acid compositions and sequence properties of disordered regions.

15.

Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties

Natalia V Petrova Cathy H Wu 《BMC bioinformatics》2006,7(1):312

Background

The number of protein sequences deriving from genome sequencing projects is outpacing our knowledge about the function of these proteins. With the gap between experimentally characterized and uncharacterized proteins continuing to widen, it is necessary to develop new computational methods and tools for functional prediction. Knowledge of catalytic sites provides a valuable insight into protein function. Although many computational methods have been developed to predict catalytic residues and active sites, their accuracy remains low, with a significant number of false positives. In this paper, we present a novel method for the prediction of catalytic sites, using a carefully selected, supervised machine learning algorithm coupled with an optimal discriminative set of protein sequence conservation and structural properties.

16.

Variation in structural location and amino acid conservation of functional sites in protein domain families

Birgit Pils Richard R Copley Jörg Schultz 《BMC bioinformatics》2005,6(1):210

Background

The functional sites of a protein present important information for determining its cellular function and are fundamental in drug design. Accordingly, accurate methods for the prediction of functional sites are of immense value. Most available methods are based on a set of homologous sequences and structural or evolutionary information, and assume that functional sites are more conserved than the average. In the analysis presented here, we have investigated the conservation of location and type of amino acids at functional sites, and compared the behaviour of functional sites between different protein domains.

17.

Species-specific analysis of protein sequence motifs using mutual information

Jan Hummel Nima Keshvari Wolfram Weckwerth Joachim Selbig 《BMC bioinformatics》2005,6(1):164

18.

The distance-profile representation and its application to detection of distantly related protein families

Chin-Jen Ku Golan Yona 《BMC bioinformatics》2005,6(1):282

Background

Detecting homology between remotely related protein families is an important problem in computational biology since the biological properties of uncharacterized proteins can often be inferred from those of homologous proteins. Many existing approaches address this problem by measuring the similarity between proteins through sequence or structural alignment. However, these methods do not exploit collective aspects of the protein space and the computed scores are often noisy and frequently fail to recognize distantly related protein families.

19.

Selective prediction of interaction sites in protein structures with THEMATICS

Ying Wei Jaeju Ko Leonel F Murga Mary Jo Ondrechen 《BMC bioinformatics》2007,8(1):119

Background

Methods are now available for the prediction of interaction sites in protein 3D structures. While many of these methods report high success rates for site prediction, often these predictions are not very selective and have low precision. Precision in site prediction is addressed using Theoretical Microscopic Titration Curves (THEMATICS), a simple computational method for the identification of active sites in enzymes. Recall and precision are measured and compared with other methods for the prediction of catalytic sites.
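
The recall and precision mentioned in this abstract can be illustrated with a minimal sketch. It is not taken from the THEMATICS paper: the function name `precision_recall` and the residue numbers are hypothetical, and the definitions are the standard ones (precision is the fraction of predicted residues that are true site residues; recall is the fraction of true site residues that were predicted).

```python
def precision_recall(predicted, actual):
    """Return (precision, recall) for a set of predicted site residues.

    predicted: residue identifiers reported by a predictor (hypothetical input)
    actual:    residue identifiers annotated as true site residues
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    # Guard against empty sets so the ratios are always defined.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Illustrative residue numbers only, not data from any study.
predicted_site = {10, 25, 57, 102}
annotated_site = {25, 57, 140}
precision, recall = precision_recall(predicted_site, annotated_site)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A predictor that reports many residues can reach high recall while its precision stays low, which is the lack of selectivity the abstract describes.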

20.

A hybrid clustering approach to recognition of protein families in 114 microbial genomes

Timothy J Harlow J Peter Gogarten Mark A Ragan 《BMC bioinformatics》2004,5(1):45

Background

Grouping proteins into sequence-based clusters is a fundamental step in many bioinformatic analyses (e.g., homology-based prediction of structure or function). Standard clustering methods such as single-linkage clustering capture a history of cluster topologies as a function of threshold, but in practice their usefulness is limited because unrelated sequences join clusters before biologically meaningful families are fully constituted, e.g. as the result of matches to so-called promiscuous domains. Use of the Markov Cluster algorithm avoids this non-specificity, but does not preserve topological or threshold information about protein families.
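
The threshold behaviour of single-linkage clustering described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' hybrid method: the function `single_linkage`, the sequence labels, and the similarity scores are all hypothetical, and clusters are formed by joining any two sequences whose pairwise score meets the threshold.

```python
def single_linkage(items, scores, threshold):
    """Cluster items by joining every pair whose score is >= threshold.

    items:     iterable of sequence labels
    scores:    dict mapping (a, b) pairs to a similarity score
    threshold: minimum score for two sequences to be linked
    """
    parent = {x: x for x in items}

    def find(x):
        # Follow parent pointers to the cluster representative,
        # shortening the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), score in scores.items():
        if score >= threshold:
            parent[find(a)] = find(b)

    clusters = {}
    for x in items:
        clusters.setdefault(find(x), set()).add(x)
    # Sort for a deterministic result.
    return sorted(clusters.values(), key=lambda c: sorted(c))


# Hypothetical scores: two families plus one weak cross-family match,
# standing in for a hit to a shared promiscuous domain.
items = ["A1", "A2", "B1", "B2"]
scores = {("A1", "A2"): 0.9, ("B1", "B2"): 0.8, ("A2", "B1"): 0.4}

print(single_linkage(items, scores, threshold=0.5))  # families stay separate
print(single_linkage(items, scores, threshold=0.3))  # one weak link merges them
```

Lowering the threshold far enough to admit the single weak link merges the two families into one cluster, which is the non-specificity the abstract attributes to single-linkage clustering.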