1.

Enzyme classification with peptide programs: a comparative study

Daniel Faria António EN Ferreira André O Falcão 《BMC bioinformatics》2009,10(1):231-9

Background

Efficient and accurate prediction of protein function from sequence is one of the standing problems in Biology. The generalised use of sequence alignments for inferring function promotes the propagation of errors, and there are limits to its applicability. Several machine learning methods have been applied to predict protein function, but they lose much of the information encoded by protein sequences because they need to transform them to obtain data of fixed length.

2.

Prediction of antigenic epitopes on protein surfaces by consensus scoring

Shide Liang Dandan Zheng Chi Zhang Martin Zacharias 《BMC bioinformatics》2009,10(1):302

Background

Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods focus on protein sequences to predict continuous epitopes linear in sequence. Only a few structure-based epitope prediction algorithms are available and they have not yet shown satisfying performance.

3.

Assessing protein similarity with Gene Ontology and its use in subnuclear localization prediction

Zhengdeng Lei Yang Dai 《BMC bioinformatics》2006,7(1):491

Background

The accomplishment of the various genome sequencing projects resulted in the accumulation of a massive amount of gene sequence information. This calls for a large-scale computational method for predicting protein localization from sequence. The protein localization can provide valuable information about its molecular function, as well as the biological pathway in which it participates. The prediction of localization of a protein at subnuclear level is a challenging task. In our previous work we proposed an SVM-based system using protein sequence information for this prediction task. In this work, we assess protein similarity with Gene Ontology (GO) and then improve the performance of the system by adding a nearest neighbor classifier module using a similarity measure derived from the GO annotation terms for protein sequences.

4.

An SVM-based system for predicting protein subnuclear localizations

Zhengdeng Lei Yang Dai 《BMC bioinformatics》2005,6(1):291

Background

The large gap between the number of protein sequences in databases and the number of functionally characterized proteins calls for the development of a fast computational tool for the prediction of subnuclear and subcellular localizations generally applicable to protein sequences. The information on localization may reveal the molecular function of novel proteins, in addition to providing insight on the biological pathways in which they function. The bulk of past work has been focused on protein subcellular localizations. Furthermore, no specific tool has been dedicated to prediction at the subnuclear level, despite its high importance. In order to design a suitable predictive system, the extraction of subtle sequence signals that can discriminate among proteins with different subnuclear localizations is the key.

5.

Prediction of catalytic residues using Support Vector Machine with selected protein sequence and structural properties

Natalia V Petrova Cathy H Wu 《BMC bioinformatics》2006,7(1):312

Background

The number of protein sequences deriving from genome sequencing projects is outpacing our knowledge about the function of these proteins. With the gap between experimentally characterized and uncharacterized proteins continuing to widen, it is necessary to develop new computational methods and tools for functional prediction. Knowledge of catalytic sites provides a valuable insight into protein function. Although many computational methods have been developed to predict catalytic residues and active sites, their accuracy remains low, with a significant number of false positives. In this paper, we present a novel method for the prediction of catalytic sites, using a carefully selected, supervised machine learning algorithm coupled with an optimal discriminative set of protein sequence conservation and structural properties.

6.

TESTLoc: protein subcellular localization prediction from EST data

Yao-Qing Shen Gertraud Burger 《BMC bioinformatics》2010,11(1):563

Background

The eukaryotic cell has an intricate architecture with compartments and substructures dedicated to particular biological processes. Knowing the subcellular location of proteins not only indicates how bio-processes are organized in different cellular compartments, but also contributes to unravelling the function of individual proteins. Computational localization prediction is possible based on sequence information alone, and has been successfully applied to proteins from virtually all subcellular compartments and all domains of life. However, we realized that current prediction tools do not perform well on partial protein sequences such as those inferred from Expressed Sequence Tag (EST) data, limiting the exploitation of the large and taxonomically most comprehensive body of sequence information from eukaryotes.

7.

GASP: Gapped Ancestral Sequence Prediction for proteins

Richard J Edwards Denis C Shields 《BMC bioinformatics》2004,5(1):123

Background

The prediction of ancestral protein sequences from multiple sequence alignments is useful for many bioinformatics analyses. Predicting ancestral sequences is not a simple procedure and relies on accurate alignments and phylogenies. Several algorithms exist based on Maximum Parsimony or Maximum Likelihood methods but many current implementations are unable to process residues with gaps, which may represent insertion/deletion (indel) events or sequence fragments.

8.

Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction

Eric D Scheeff Philip E Bourne 《BMC bioinformatics》2006,7(1):410-17

Background

One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to the sensitivity of the resulting profile. The inclusion of highly diverse sequences will presumably produce a more powerful profile, but distantly related sequences can be difficult to align accurately using only sequence information. Therefore, it would be expected that the use of protein structure alignments to improve the selection and alignment of diverse sequence homologs might yield improved profiles. However, the actual utility of such an approach has remained unclear.

9.

Java GUI for InterProScan (JIPS): A tool to help process multiple InterProScans and perform ortholog analysis

Aijazuddin Syed Chris Upton 《BMC bioinformatics》2006,7(1):462

Background

Recent, rapid growth in the quantity of available genomic data has generated many protein sequences that are not yet biochemically classified. Thus, the prediction of biochemical function based on structural motifs is an important task in post-genomic analysis. The InterPro databases are a major resource for protein function information. For optimal results, these databases should be searched at regular intervals, since they are frequently updated.

10.

Prediction of enzyme function by combining sequence similarity and protein interactions

Jordi Espadaler Narayanan Eswar Enrique Querol Francesc X Avilés Andrej Sali Marc A Marti-Renom Baldomero Oliva 《BMC bioinformatics》2008,9(1):249

Background

A number of studies have used protein interaction data alone for protein function prediction. Here, we introduce a computational approach for annotation of enzymes, based on the observation that similar protein sequences are more likely to perform the same function if they share similar interacting partners.

11.

IdentiCS – Identification of coding sequence and in silico reconstruction of the metabolic network directly from unannotated low-coverage bacterial genome sequence

Jibin Sun An-Ping Zeng 《BMC bioinformatics》2004,5(1):112

Background

A necessary step for a genome-level analysis of cellular metabolism is the in silico reconstruction of the metabolic network from genome sequences. The available methods are mainly based on the annotation of genome sequences, which involves two successive steps: the prediction of coding sequences (CDS) and their function assignment. The annotation process takes time. The available methods often encounter difficulties when dealing with unfinished, error-containing genomic sequence.

12.

PSSM-based prediction of DNA binding sites in proteins

Shandar Ahmad Akinori Sarai 《BMC bioinformatics》2005,6(1):33

Background

Detection of DNA-binding sites in proteins is of enormous interest for technologies targeting gene regulation and manipulation. We have previously shown that a residue and its sequence neighbor information can be used to predict DNA-binding candidates in a protein sequence. This sequence-based prediction method is applicable even if no sequence homology with a previously known DNA-binding protein is observed. Here we implement a neural network based algorithm to utilize evolutionary information of amino acid sequences in terms of their position specific scoring matrices (PSSMs) for a better prediction of DNA-binding sites.

13.

Improving the accuracy of protein secondary structure prediction using structural alignment

Scott Montgomerie Shan Sundararaj Warren J Gallin David S Wishart 《BMC bioinformatics》2006,7(1):301-13

Background

The accuracy of protein secondary structure prediction has steadily improved over the past 30 years. Now many secondary structure prediction methods routinely achieve an accuracy (Q3) of about 75%. We believe this accuracy could be further improved by including structure (as opposed to sequence) database comparisons as part of the prediction process. Indeed, given the large size of the Protein Data Bank (>35,000 sequences), the probability of a newly identified sequence having a structural homologue is actually quite high.
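
For readers unfamiliar with the Q3 measure mentioned in this abstract, it is the fraction of residues whose three-state label (helix, strand, coil) is predicted correctly. The following is a minimal sketch only; the function name `q3_accuracy`, the one-letter state codes H/E/C, and the example strings are illustrative assumptions and are not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Return the fraction of positions where the three-state labels agree.

    Both arguments are strings over an assumed alphabet:
    H (helix), E (strand), C (coil), one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    # Count positions where the predicted state equals the observed state.
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical 12-residue example with one mismatching position.
predicted = "CCHHHHCCEEEC"
observed = "CCHHHHHCEEEC"
print(f"Q3 = {q3_accuracy(predicted, observed):.1%}")
```

A per-protein Q3 like this is usually averaged over a benchmark set; how the averaging is done (per residue or per protein) is a choice the sketch leaves open.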

14.

An adaptive bin framework search method for a beta-sheet protein homopolymer model

Alena Shmygelska Holger H Hoos 《BMC bioinformatics》2007,8(1):136

Background

The problem of protein structure prediction consists of predicting the functional or native structure of a protein given its linear sequence of amino acids. This problem has played a prominent role in the fields of biomolecular physics and algorithm design for over 50 years. Additionally, its importance increases continually as a result of an exponential growth over time in the number of known protein sequences in contrast to a linear increase in the number of determined structures. Our work focuses on the problem of searching an exponentially large space of possible conformations as efficiently as possible, with the goal of finding a global optimum with respect to a given energy function. This problem plays an important role in the analysis of systems with complex search landscapes, and particularly in the context of ab initio protein structure prediction.

15.

Predicting RNA secondary structure by the comparative approach: how to select the homologous sequences

Stéfan Engelen Fariza Tahi 《BMC bioinformatics》2007,8(1):464

Background

The secondary structure of an RNA must be known before the relationship between its structure and function can be determined. One way to predict the secondary structure of an RNA is to identify covarying residues that maintain the pairings (Watson-Crick, Wobble and non-canonical pairings). This "comparative approach" consists of identifying mutations from homologous sequence alignments. The sequences must covary enough for compensatory mutations to be revealed, but comparison is difficult if they are too different. Thus the choice of homologous sequences is critical. While many possible combinations of homologous sequences may be used for prediction, only a few will give good structure predictions. This can be due to poor quality alignment in stems or to the variability of certain sequences. This problem of sequence selection is currently unsolved.
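
The idea of covarying residues that maintain a pairing can be illustrated with a small sketch. This is not the authors' method: the function name `column_pair_support`, the toy alignment, and the restriction to Watson-Crick and wobble pairs (non-canonical pairings are ignored here) are all assumptions made for illustration.

```python
# Base pairs treated as compatible in this sketch (assumption: RNA alphabet ACGU).
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}


def column_pair_support(alignment, i, j):
    """Summarise how well alignment columns i and j behave like a base pair.

    Returns (fraction of sequences with a compatible pair at i/j,
             number of distinct compatible pair types observed).
    More than one distinct pair type suggests compensatory mutations.
    """
    if not alignment:
        raise ValueError("alignment must contain at least one sequence")
    pairs = [(seq[i], seq[j]) for seq in alignment]
    compatible = [p for p in pairs if p in CANONICAL or p in WOBBLE]
    return len(compatible) / len(pairs), len(set(compatible))


# Hypothetical gap-free alignment of four homologous sequences.
alignment = ["GAAAC", "CAAAG", "AAAAU", "GAAAU"]
print(column_pair_support(alignment, 0, 4))  # columns that covary
print(column_pair_support(alignment, 1, 3))  # columns that never pair
```

In this toy input, columns 0 and 4 change together while staying pairable, which is the signal the comparative approach looks for; sequences that are too similar would show only one pair type and reveal nothing.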

16.

Automatic annotation of protein motif function with Gene Ontology terms

Xinghua Lu Chengxiang Zhai Vanathi Gopalakrishnan Bruce G Buchanan 《BMC bioinformatics》2004,5(1):122

Background

Conserved protein sequence motifs are short stretches of amino acid sequence patterns that potentially encode the function of proteins. Several sequence pattern searching algorithms and programs exist for identifying candidate protein motifs at the whole genome level. However, a much needed and important task is to determine the functions of the newly identified protein motifs. The Gene Ontology (GO) project is an endeavor to annotate the function of genes or protein sequences with terms from a dynamic, controlled vocabulary and these annotations serve well as a knowledge base.

17.

Exploiting structural and topological information to improve prediction of RNA-protein binding sites

Stefan R Maetschke Zheng Yuan 《BMC bioinformatics》2009,10(1):341

Background

RNA-protein interactions are important for a wide range of biological processes. Current computational methods to predict interacting residues in RNA-protein interfaces predominately rely on sequence data. It is, however, known that interface residue propensity is closely correlated with structural properties. In this paper we systematically study information obtained from sequences and structures and compare their contributions in this prediction problem. Particularly, different geometrical and network topological properties of protein structures are evaluated to improve interface residue prediction accuracy.

18.

Human Pol II promoter recognition based on primary sequences and free energy of dinucleotides

Jian-Yi Yang Yu Zhou Zu-Guo Yu Vo Anh Li-Qian Zhou 《BMC bioinformatics》2008,9(1):113

19.

Analysis of superfamily specific profile-profile recognition accuracy

James A Casbon Mansoor AS Saqi 《BMC bioinformatics》2004,5(1):200

Background

Annotation of sequences that share little similarity to sequences of known function remains a major obstacle in genome annotation. Some of the best methods of detecting remote relationships between protein sequences are based on matching sequence profiles. We analyse the superfamily specific performance of sequence profile-profile matching. Our benchmark consists of a set of 16 protein superfamilies that are highly diverse at the sequence level. We relate the performance to the number of sequences in the profiles, the profile diversity and the extent of structural conservation in the superfamily.

20.

Species-specific analysis of protein sequence motifs using mutual information

Jan Hummel Nima Keshvari Wolfram Weckwerth Joachim Selbig 《BMC bioinformatics》2005,6(1):164
