Similar literature
 20 similar articles found (search time: 15 ms)
1.
B-cell epitope prediction facilitates the design and synthesis of short peptides for various immunological applications. Several algorithms have been developed to predict B-cell linear epitopes (LEs) from primary sequences of antigens, providing important information for immunobiological experiments and antibody design. This paper describes two robust methods, LE prediction with/without local peak extraction (LEP-LP and LEP-NLP), based on antigenicity scale and mathematical morphology for the prediction of B-cell LEs. Previous studies revealed that LEs could occur in regions with low-to-moderate but not globally high antigenicity scales. Hence, we developed a method adopting mathematical morphology to extract local peaks from a linear combination of the propensity scales of physico-chemical characteristics at each antigen residue. Comparison among LEP-LP/LEP-NLP, BepiPred and BEPITOPE revealed that our algorithms performed better in retrieving epitopes with low-to-moderate antigenicity and achieved comparable performance according to receiver operating characteristic (ROC) curve analysis. Of the identified LEs, over 30% could not be predicted by BepiPred and BEPITOPE employing an average threshold of antigenicity index or default settings. Our LEP-LP method provides a bioinformatics approach for predicting B-cell LEs with low-to-moderate antigenicity. The web-based server was established at http://biotools.cs.ntou.edu.tw/lepd_antigenicity.php for free use.
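
The local peak extraction described in this abstract can be illustrated with a white top-hat transform, a standard operation of mathematical morphology. The sketch below is only an assumption about how such a step might look: the abstract does not state which morphological operators, structuring element, propensity scales or weights the authors used, so the flat window, the function names and the example scores are all hypothetical.

```python
def erode(signal, half_width):
    """Grey-scale erosion with a flat window (truncated at the sequence ends)."""
    return [min(signal[max(0, i - half_width): i + half_width + 1])
            for i in range(len(signal))]


def dilate(signal, half_width):
    """Grey-scale dilation with a flat window (truncated at the sequence ends)."""
    return [max(signal[max(0, i - half_width): i + half_width + 1])
            for i in range(len(signal))]


def white_top_hat(signal, half_width):
    """Signal minus its morphological opening: keeps narrow local peaks only."""
    opened = dilate(erode(signal, half_width), half_width)
    return [s - o for s, o in zip(signal, opened)]


def local_peaks(signal, half_width, threshold):
    """Indices of residues whose top-hat value exceeds a (hypothetical) threshold."""
    return [i for i, v in enumerate(white_top_hat(signal, half_width))
            if v > threshold]


# Hypothetical per-residue scores: a small local peak (index 2) next to a
# broad, globally high plateau (indices 5-9).
scores = [0, 0, 2, 0, 0, 10, 10, 10, 10, 10]
print(white_top_hat(scores, 1))
print(local_peaks(scores, 1, 1))
```

In this toy input the broad plateau is removed by the opening, while the narrow peak of modest height survives, which is the kind of low-to-moderate region a global threshold would miss.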

2.
This paper describes a novel computer graphics tool for predicting protein structures. The method is based on structural profiles, which are plots of hydrophobicity, parameters used for secondary structure prediction, or other residue-specific traits against sequence number. Similar structural profiles can indicate similar tertiary structures, even in the absence of sequence homology. The profiles of reference proteins with known structure can be used for prediction. In the method presented here, structural profiles are compared by interactive computer graphics, using the program Multiplot. As a test, a structural profile comparison of several proteins known to have similar 3D structures is presented. Comparison of structural profiles detects similar folding of the two domains of rhodanese, which was not easily detected by sequence homology.
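
A structural profile of the kind described here can be sketched as a sliding-window average of a per-residue property. The example below assumes the Kyte-Doolittle hydropathy scale and a plain moving average; the abstract names neither the scale nor any smoothing, and the function name and window size are hypothetical rather than taken from Multiplot.

```python
# Kyte-Doolittle hydropathy values (an assumed choice of scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def structural_profile(sequence, scale=KYTE_DOOLITTLE, window=7):
    """Moving average of a residue property along the sequence.

    Returns one value per full window, so the result has
    len(sequence) - window + 1 entries.
    """
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [scale[aa] for aa in sequence]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


# Toy sequence: a hydrophobic stretch followed by a charged stretch.
print(structural_profile("IIIKKK", window=3))
```

Plotting such a list against sequence number gives the curve that would then be compared, by eye or otherwise, with the profiles of reference proteins.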

3.
Secondary structure predictions are increasingly becoming the workhorse for several methods aiming at predicting protein structure and function. Here we use ensembles of bidirectional recurrent neural network architectures, PSI-BLAST-derived profiles, and a large nonredundant training set to derive two new predictors: (a) the second version of the SSpro program for secondary structure classification into three categories and (b) the first version of the SSpro8 program for secondary structure classification into the eight classes produced by the DSSP program. We describe the results of three different test sets on which SSpro achieved a sustained performance of about 78% correct prediction. We report confusion matrices, compare PSI-BLAST to BLAST-derived profiles, and assess the corresponding performance improvements. SSpro and SSpro8 are implemented as web servers, available together with other structural feature predictors at: http://promoter.ics.uci.edu/BRNN-PRED/.

4.
MGraph: graphical models for microarray data analysis   Total citations: 2 (self-citations: 0, citations by others: 2)

5.
MOTIVATION: Disulfide bonds play an important role in protein folding. A precise prediction of disulfide connectivity can strongly reduce the conformational search space and increase the accuracy of protein structure prediction. Conventional disulfide connectivity predictions use sequence information, and their prediction accuracy is limited. Here, by using an alternative scheme with global information for disulfide connectivity prediction, higher performance is obtained with respect to other approaches. RESULTS: Cysteine separation profiles have been used to predict the disulfide connectivity of proteins. The separations among oxidized cysteine residues on a protein sequence have been encoded into vectors named cysteine separation profiles (CSPs). Through comparisons of their CSPs, the disulfide connectivity of a test protein is inferred from a non-redundant template set. For non-redundant proteins in SwissProt 39 (SP39) sharing less than 30% sequence identity, the prediction accuracy of a fourfold cross-validation is 49%. The prediction accuracy of disulfide connectivity for proteins in SwissProt 43 (SP43) is even higher (53%). The relationship between the similarity of CSPs and the prediction accuracy is also discussed. The method proposed in this work is relatively simple and achieves higher accuracy than conventional methods. It may also be combined with other algorithms for further improvements in protein structure prediction. AVAILABILITY: The program and datasets are available from the authors upon request. CONTACT: cykao@csie.ntu.edu.tw.
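
The CSP idea summarized above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' program: the abstract does not define how two CSPs are compared, so the sum of absolute differences used here is an assumed measure, and the template set, positions and function names are hypothetical.

```python
def csp(cysteine_positions):
    """Cysteine separation profile: gaps between consecutive oxidized cysteines."""
    pos = sorted(cysteine_positions)
    return [b - a for a, b in zip(pos, pos[1:])]


def divergence(profile_a, profile_b):
    """Assumed dissimilarity: sum of absolute differences of the separations."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    return sum(abs(a - b) for a, b in zip(profile_a, profile_b))


def predict_connectivity(query_positions, templates):
    """Copy the bonding pattern of the template with the most similar CSP.

    `templates` is a list of (cysteine_positions, pattern) pairs, where a
    pattern lists bonded cysteines by ordinal, e.g. [(1, 2), (3, 4)].
    Returns None when no template has the same number of cysteines.
    """
    query = csp(query_positions)
    candidates = [(divergence(query, csp(positions)), pattern)
                  for positions, pattern in templates
                  if len(positions) == len(query_positions)]
    if not candidates:
        return None
    return min(candidates, key=lambda item: item[0])[1]


# Hypothetical template set with two four-cysteine proteins.
templates = [
    ([10, 20, 50, 60], [(1, 2), (3, 4)]),
    ([5, 40, 45, 90], [(1, 4), (2, 3)]),
]
print(predict_connectivity([12, 24, 55, 64], templates))
```

The query profile [12, 31, 9] is much closer to the first template's [10, 30, 10] than to the second's [35, 5, 45], so the first bonding pattern is returned.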

6.
MOTIVATION: A large body of evidence suggests that protein structural information is frequently encoded in local sequences; sequence-structure relationships derived from local structure/sequence analyses could significantly enhance the capacities of protein structure prediction methods. In this paper, the prediction capacity of a database (LSBSP2) that organizes local sequence-structure relationships encoded in local structures with two consecutive secondary structure elements is tested with two computational procedures for protein structure prediction. The goal is twofold: to test the folding hypothesis that local structures are determined by local sequences, and to enhance our capacity in predicting protein structures from their amino acid sequences. RESULTS: The LSBSP2 database contains a large set of sequence profiles derived from exhaustive pair-wise structural alignments for local structures with two consecutive secondary structure elements. One computational procedure makes use of the PSI-BLAST alignment program to predict local structures for test sequence fragments by matching them onto the sequence profiles in the LSBSP2 database. The results show that 54% of the test sequence fragments were predicted with local structures that match closely with their native local structures. The other computational procedure is a filter system that is capable of removing as many false positives as possible from a set of PSI-BLAST hits. An assessment with a large set of non-redundant protein structures shows that the PSI-BLAST + filter system improves the prediction specificity by up to two-fold over that of the PSI-BLAST program alone for distantly related protein pairs. Tests with the two computational procedures above demonstrate that local sequence-structure relationships can indeed enhance our capacity in protein structure prediction. The results also indicate that local sequences encoded with strong local structure propensities play an important role in determining the native state folding topology.

7.
Homology detection and protein structure prediction are central themes in bioinformatics. Sequence comparison methods are limited in establishing relationships between protein sequences, or in predicting their structure, when sequence similarity is low. Recent works demonstrate that the use of profiles improves homology detection and protein structure prediction. Profiles can be inferred from protein multiple alignments using different approaches. The "Conservatism-of-Conservatism" is an effective profile analysis method to identify structural features between proteins having the same fold but no detectable sequence similarity. The information obtained from protein multiple alignments varies according to the amino acid classification employed to calculate the profile. In this work, we calculated entropy profiles from PSI-BLAST-derived multiple alignments and used different amino acid classifications summarizing almost 500 different attributes. These entropy profiles were converted into pseudocodes, which were compared using the FASTA program with an ad hoc matrix. We tested the performance of our method in identifying relationships between proteins with similar folds using a nonredundant subset of sequences having less than 40% identity. We then compared our results, using Coverage Versus Error per query curves, to those obtained by methods such as PSI-BLAST, COMPASS and HHSEARCH. Our method, named HIP (Homology Identification with Profiles), showed higher accuracy in detecting relationships between proteins with the same fold. The use of different amino acid classifications reflecting a large number of amino acid attributes improved the recognition of distantly related folds. We propose the use of pseudocodes representing profile information as a fast and powerful tool for homology detection, fold assignment and analysis of evolutionary information enclosed in protein profiles.
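
The entropy profile mentioned in this abstract can be illustrated as the Shannon entropy of each alignment column after residues are mapped to classes. The sketch below is an assumption-laden toy: the three-class grouping and the function names are hypothetical, the abstract does not list the classifications HIP actually uses, and the later conversion to pseudocodes and the FASTA comparison are not shown.

```python
import math
from collections import Counter

# Hypothetical three-class amino acid classification, for illustration only.
CLASSES = {
    "hydrophobic": "AVLIMFWC",
    "polar": "STNQYGP",
    "charged": "DEKRH",
}
AA_TO_CLASS = {aa: name for name, members in CLASSES.items() for aa in members}


def column_entropy(column):
    """Shannon entropy (bits) of the class distribution in one column.

    Gaps and unknown symbols are skipped; an empty column scores 0.0.
    """
    symbols = [AA_TO_CLASS[aa] for aa in column if aa in AA_TO_CLASS]
    if not symbols:
        return 0.0
    total = len(symbols)
    return sum(-(n / total) * math.log2(n / total)
               for n in Counter(symbols).values())


def entropy_profile(alignment):
    """Per-column entropies for a list of equal-length aligned sequences."""
    return [column_entropy(column) for column in zip(*alignment)]


# Toy alignment: columns 1 and 2 are class-conserved, column 3 is mixed.
print(entropy_profile(["AKD", "VKE", "LRS"]))
```

Changing the classification changes the profile: a column of A, V and L has zero entropy under this grouping but would be maximally variable if every amino acid were its own class.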

8.

Background

With the development of sequencing technologies, more and more sequence variants are available for investigation. Different classes of variants in the human genome have been identified, including single nucleotide substitutions, insertions and deletions, and large structural variations such as duplications and deletions. Insertion and deletion (indel) variants comprise a major proportion of human genetic variation. However, little is known about their effects on humans. This lack of understanding is largely due to the lack of both biological data and computational resources.

Results

This paper presents a new indel functional prediction method, HMMvar, based on HMM profiles, which capture the conservation information in sequences. The results demonstrate that a scoring strategy based on HMM profiles can achieve good performance in identifying deleterious or neutral variants for different data sets, and can predict the protein functional effects of both single and multiple mutations.

Conclusions

This paper proposes a quantitative prediction method, HMMvar, to predict the effect of genetic variation using hidden Markov models. The HMM-based pipeline program implementing HMMvar is freely available at https://bioinformatics.cs.vt.edu/zhanglab/hmm.

9.
A new approach to the analysis of regular structures in proteins that is based on the method of molecular mechanics is proposed. The method uses only the information about the amino acid sequence. The alpha-helical conformation was simulated using the ICM program of molecular mechanics. Energy profiles of the sequences in the alpha-helical conformation, spanning the entire polypeptide chain, were plotted for eight proteins from the Protein Data Bank. The regions of each profile that exhibit energy minima were found to correspond to the alpha-helical regions of the real spatial structure of the protein. Twenty-four out of 25 helices were distinctly pronounced, which indicates a rather high accuracy of the prediction. The energy profiles also help reveal the short regions that correspond to 3₁₀-helices and the turns that include local alpha-helical conformations. Unlike the known statistical methods of prediction, this method makes it possible to establish the physical principles of the formation of alpha-helical conformations. The English version of the paper: Russian Journal of Bioorganic Chemistry, 2002, vol. 28, no. 6; see also http://www.maik.ru.

10.
Predicting allergenic proteins using wavelet transform   Total citations: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: With many transgenic proteins introduced today, the ability to predict their potential allergenicity has become an important issue. Previous studies were based on either sequence similarity or the protein motifs identified from known allergen databases. The similarity-based approaches, although able to achieve high recall, usually have low precision. Previous motif-based approaches have been shown to improve precision in cross-validation experiments. In this study, a system that combines the advantages of similarity-based and motif-based prediction is described. RESULTS: The new prediction system uses a clustering algorithm that groups the known allergenic proteins into clusters. Proteins within each cluster are assumed to carry one or more common motifs. After a multiple sequence alignment, proteins in each cluster go through a wavelet analysis program whereby conserved motifs are identified. A hidden Markov model (HMM) profile is then prepared for each identified motif. The allergens that do not appear to carry detectable allergen motifs are saved in a small database. The allergenicity of an unknown protein may be predicted by comparing it against the HMM profiles and, if no matching profiles are found, against the small allergen database by BLASTP. Recall above 70% and precision above 90% were observed in cross-validation experiments. Using the entire Swiss-Prot as the query, we predicted about 2000 potential allergens. AVAILABILITY: The software is available upon request from the authors.

11.
Large-scale parallel measurement of whole-genome RNA expression is now possible with high-density arrays of cDNA or oligonucleotides. Using this technology efficiently will require the integration of other sources of biological information, such as gene identity, biomedical literature and the biochemical pathways of a given gene. Such integration is essential to understand the cellular program of gene expression and the molecular physiology of an organism. Advances in microarray technology, and the expected rapid rise in microarray data, will lead to new insight into fundamental biological problems such as the prediction of gene function from expression profiles and the identification of potential drug targets from biologically active compounds.

12.
In this study we compare commonly used coiled-coil prediction methods against a database derived from proteins of known structure. We find that the two older programs COILS and PairCoil/MultiCoil are significantly outperformed by two recent developments: Marcoil, a program built on hidden Markov models, and PCOILS, a new COILS version that uses profiles as inputs; and to a lesser extent by a PairCoil update, PairCoil2. Overall, Marcoil provides slightly better performance over the reference database than PCOILS and is considerably faster, but it is sensitive to highly charged false positives, whereas the weighting option of PCOILS allows the identification of such sequences.

13.
In recent years, the advent of experimental methods to probe gene expression profiles of cancer on a genome-wide scale has led to widespread use of supervised machine learning algorithms to characterize these profiles. The main applications of these analysis methods range from assigning functional classes to previously uncharacterized genes to classification and prediction of different cancer tissues. This article surveys the application of machine learning algorithms to classification and diagnosis of cancer based on expression profiles. To exemplify the important issues of the classification procedure, the emphasis of this article is on one such method, namely artificial neural networks. In addition, methods to extract genes that are important for the performance of a classifier, as well as the influence of sample selection on prediction results, are discussed.

14.
PMUT allows the fast and accurate prediction (approximately 80% success rate in humans) of the pathological character of single point amino acid mutations based on the use of neural networks. The program also allows the fast scanning of mutational hot spots, which are obtained by three procedures: (1) alanine scanning, (2) massive mutation and (3) genetically accessible mutations. A graphical interface for Protein Data Bank (PDB) structures, when available, and a database containing hot spot profiles for all non-redundant PDB structures are also accessible from the PMUT server.

15.
16.
An algorithm for on-site computation with a hand-held programmable calculator (TI-59, Texas Instruments) of single inert-gas decompression schedules is described. This program is based on Workman's 'M-value' method. It can compute decompression schedules with changes in the oxygen content of the breathing mixture and extension of stay at any decompression stop. The features of the program that enable calculation of atypical dive profiles, along with the portability of small calculators, would make such an algorithm suitable for on-site applications. However, since dive profiles generated by the program have not yet been tested, divers are warned not to generate schedules until their safety has been established by field tests.

17.
The functional annotation of new protein sequences represents a major challenge for genomic science. The best way to suggest the function of a protein from its sequence is by finding a related one for which biological information is available. Current alignment algorithms display a list of protein sequence stretches presenting significant similarity to different protein targets, ordered by their respective mathematical scores. However, statistical and biological significance do not always coincide; therefore, rearranging the program output according to biological characteristics rather than mathematical scores alone would help functional annotation. A new method that predicts the putative function of a protein by integrating the results from the PSI-BLAST program with a fuzzy logic algorithm is described. Several protein sequence characteristics were tested for their ability to rearrange a PSI-BLAST profile so that it better reflects biological function. Four of them (amino acid content, matched segment length, and hydropathic and flexibility profiles) contributed positively, upon being integrated by a fuzzy logic algorithm into a program, BYPASS, to the accurate prediction of the function of a protein from its sequence. Antonio Gómez and Juan Cedano contributed equally to this work.

18.
Internal ribosomal entry sites (IRESs) function as cap-independent translation initiation sites in eukaryotic cells. IRES elements have been applied as useful tools for bi-cistronic expression vectors. Current RNA structure prediction programs are unable to predict potential IRES elements precisely. We have designed a viral IRES prediction system (VIPS) to perform IRES secondary structure prediction. To obtain better results, VIPS can evaluate and predict all four groups of IRESs with higher accuracy. RNA secondary structure prediction, comparison, and pseudoknot prediction programs were implemented to form the three-stage procedure of VIPS. The backbone of VIPS includes: the RNALfold program, which predicts local RNA secondary structures by the minimum free energy method; the RNA Align program, which compares predicted structures; and the pknotsRG program, which calculates the pseudoknot structure. VIPS was evaluated using the UTR database, IRES database and Virus database, and its accuracy was assessed as 98.53%, 90.80%, 82.36% and 80.41% for IRES groups 1, 2, 3, and 4, respectively. This advanced search approach for IRES structures will facilitate IRES-related studies. The VIPS online service is available at http://140.135.61.250/vips/.

19.
Cuff JA, Barton GJ. Proteins 2000, 40(3): 502-511
Training a neural network secondary structure prediction algorithm with different types of multiple sequence alignment profiles derived from the same sequences is shown to give accuracies ranging from 70.5% to 76.4%. The best accuracy of 76.4% (standard deviation 8.4%) is 3.1% (Q(3)) and 4.4% (SOV2) better than the PHD algorithm run on the same set of 406 sequence non-redundant proteins that were not used to train either method. Residues predicted by the new method with a confidence value of 5 or greater have an average Q(3) accuracy of 84% and cover 68% of the residues. Relative solvent accessibility based on a two-state model, for 25, 5, and 0% accessibility, is predicted at 76.2, 79.8, and 86.6% accuracy, respectively. The sources of the improvements obtained from training with different representations of the same alignment data are described in detail. The new Jnet prediction method resulting from this study is available in the Jpred secondary structure prediction server, and as a stand-alone computer program from: http://barton.ebi.ac.uk/. Proteins 2000;40:502-511.
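
The Q(3) figures quoted in this abstract are per-residue three-state accuracies. The sketch below shows how Q3, and the accuracy and coverage at a confidence cutoff, are conventionally computed; the function names, state strings and confidence values are hypothetical examples and are not taken from Jnet.

```python
def q3(predicted, observed):
    """Percentage of residues whose predicted state (H, E or C) is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def q3_at_confidence(predicted, observed, confidence, min_confidence):
    """Q3 restricted to confident residues, plus the coverage of that subset.

    Returns (accuracy_percent, coverage_percent); accuracy is None when no
    residue reaches the cutoff.
    """
    if not (len(predicted) == len(observed) == len(confidence)) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    kept = [(p, o) for p, o, c in zip(predicted, observed, confidence)
            if c >= min_confidence]
    coverage = 100.0 * len(kept) / len(observed)
    if not kept:
        return None, coverage
    accuracy = 100.0 * sum(p == o for p, o in kept) / len(kept)
    return accuracy, coverage


# Hypothetical eight-residue example with one wrong prediction at the end.
pred = "HHHEECCC"
obs = "HHHEECCH"
conf = [9, 9, 9, 5, 5, 3, 3, 1]
print(q3(pred, obs))
print(q3_at_confidence(pred, obs, conf, 5))
```

Raising the cutoff trades coverage for accuracy, which is the relationship behind the reported 84% accuracy over 68% of residues at a confidence of 5 or greater.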

20.
In protein structure prediction, a central problem is defining the structure of a loop connecting 2 secondary structures. This problem frequently occurs in homology modeling, fold recognition, and in several strategies in ab initio structure prediction. In our previous work, we developed a classification database of structural motifs, ArchDB. The database contains 12,665 clustered loops in 451 structural classes with information about phi-psi angles in the loops and 1492 structural subclasses with the relative locations of the bracing secondary structures. Here we evaluate the extent to which sequence information in the loop database can be used to predict loop structure. Two sequence profiles were used, an HMM profile and a PSSM derived from PSI-BLAST. A jack-knife test was performed by removing homologous loops using the SCOP superfamily definition and then predicting against recalculated profiles that take into account only the sequence information. Two scenarios were considered: (1) prediction of structural class with application in comparative modeling and (2) prediction of structural subclass with application in fold recognition and ab initio prediction. For the first scenario, structural class prediction was made directly over loops with X-ray secondary structure assignment, and if we consider the top 20 classes out of 451 possible classes, the best accuracy of prediction is 78.5%. In the second scenario, structural subclass prediction was made over loops using PSI-PRED (Jones, J Mol Biol 1999;292:195-202) secondary structure prediction to define loop boundaries, and if we take into account the top 20 subclasses out of 1492, the best accuracy is 46.7%. Accuracy of loop prediction was also evaluated by means of RMSD calculations.
