Similar literature
1.
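Predicting the structure of membrane proteins is currently rather difficult. This paper uses an established pattern recognition method to predict the secondary structure of three typical membrane proteins, RC, BR and RH. The agreement between the predictions and the experimental data is comparable to that obtained when the method is applied to globular proteins, so the prediction is successful. The paper further refines the pattern recognition method for predicting protein secondary structure and establishes a multi-class method for secondary structure prediction of globular proteins, with a prediction accuracy above 60%. The results show that this is a good structure prediction method. Because few studies in China or abroad have applied pattern recognition to structure prediction, we plan to develop and refine the method further.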

2.
MOTIVATION: With the emerging success of protein secondary structure prediction through the application of various statistical and machine learning techniques, similar techniques have been applied to protein beta-turn prediction. In this study, we perform protein beta-turn prediction using a k-nearest neighbor method, which is combined with a filter that uses predicted protein secondary structure information. Traditional beta-turn prediction by the k-nearest neighbor method is modified to account for the unbalanced ratio of the natural occurrence of beta-turns and non-beta-turns. RESULTS: Our prediction scheme is tested on a set of 426 non-homologous protein sequences. The prediction scheme consists of two stages: a k-nearest neighbor stage and a filtering stage. Variations of the k-nearest neighbor method were used to take the properties of beta-turns into consideration. Our filtering method uses beta-turn/non-beta-turn estimates from the k-nearest neighbor stage and predicted protein secondary structure information from PSI-PRED to produce a new beta-turn/non-beta-turn estimate. Our result is compared with the previously best known beta-turn prediction method on the dataset of 426 non-homologous protein sequences and is shown to give slightly superior performance at significantly lower computational complexity. AVAILABILITY: Contact the author for information on the source code of the programs used.
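
The abstract above does not say how the k-nearest neighbor vote was adjusted for the imbalance between beta-turns and non-beta-turns. One common option, shown here only as a minimal sketch, is to give the rarer class a larger vote. The function name `knn_predict`, the `turn_weight` parameter and the toy feature vectors are assumptions made for illustration and are not taken from the paper.

```python
import math


def knn_predict(train, query, k=3, turn_weight=1.0):
    """Classify `query` as turn (1) or non-turn (0) by a weighted k-NN vote.

    `train` is a list of (feature_vector, label) pairs. `turn_weight` is a
    hypothetical knob: each turn neighbour counts this many votes, which
    compensates for turns being the minority class.
    """
    # Keep the k training points closest to the query (Euclidean distance).
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    turn_votes = sum(turn_weight for _, label in neighbours if label == 1)
    non_turn_votes = sum(1.0 for _, label in neighbours if label == 0)
    return 1 if turn_votes > non_turn_votes else 0


# Toy data: three non-turn points and two turn points.
train = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1), ((5, 5), 1)]
plain = knn_predict(train, (0.9, 0.9), k=3)                      # majority vote
weighted = knn_predict(train, (0.9, 0.9), k=3, turn_weight=3.0)  # boosted turns
```

With the toy data, the plain vote labels the query as non-turn because two of its three neighbours are non-turns, while the weighted vote labels it as a turn. A second filtering stage, such as the one the abstract describes, would then revise these estimates using predicted secondary structure.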

3.
We have constructed a perceptron-type neural network for E. coli promoter prediction and improved its ability to generalize with a new technique for selecting the sequence features shown during training. We have also reconstructed five previous prediction methods and compared the effectiveness of those methods and our neural network. Surprisingly, the simple statistical method of Mulligan et al. performed the best amongst the previous methods. Our neural network was comparable to Mulligan's method when false positives were kept low and better than Mulligan's method when false negatives were kept low. We also showed the correlation between the prediction rates of neural networks achieved by previous researchers and the information content of their data sets.

4.
Improved method for predicting beta-turn using support vector machine   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: Numerous methods for predicting beta-turns in proteins have been developed based on various computational schemes. Here, we introduce a new method of beta-turn prediction that uses the support vector machine (SVM) algorithm together with predicted secondary structure information. Various parameters from the SVM have been adjusted to achieve optimal prediction performance. RESULTS: The SVM method achieved excellent performance as measured by the Matthews correlation coefficient (MCC = 0.45) using a 7-fold cross validation on a database of 426 non-homologous protein chains. To the best of our knowledge, this MCC value is the highest achieved so far for predicting beta-turns. The overall prediction accuracy Qtotal was 77.3%, which is the best among the existing prediction methods. Among its unique attractive features, the present SVM method avoids overtraining, compresses information, and provides a predicted reliability index.
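
For readers unfamiliar with the two scores quoted above, the following sketch computes the Matthews correlation coefficient and the overall accuracy Qtotal from the four cells of a two-class confusion matrix, using their standard definitions. The function names and the example counts are illustrative assumptions; they are not the paper's data.

```python
import math


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional value when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def q_total(tp, tn, fp, fn):
    """Percentage of residues whose turn/non-turn state is predicted correctly."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)


# Made-up counts for a small test set.
mcc = matthews_corrcoef(tp=6, tn=3, fp=1, fn=2)
accuracy = q_total(tp=6, tn=3, fp=1, fn=2)
```

Because beta-turns are much rarer than non-turns, a predictor can reach a high Qtotal by favouring the majority class, which is why MCC is usually reported alongside it.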

5.
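Research progress in predicting bacterial sRNA genes and their targets   Cited by: 1 (self-citations: 0, citations by others: 1)
Abstract: Bacterial sRNAs are a class of non-coding RNAs 40-500 nt in length that exert their biological functions mainly by interacting with the 5′ end of target mRNAs through imperfect base pairing. Because prediction methods can guide the experimental discovery of bacterial sRNAs and their targets, the prediction of bacterial sRNAs and their targets has received wide attention. This article first divides sRNA prediction methods into three categories: methods based on comparative genomics, methods based on transcription units, and methods based on machine learning. It then divides sRNA target prediction methods into two categories: sequence comparison methods and methods based on RNA secondary structure. Finally, it analyzes the principles, core ideas, advantages and limitations of each category and discusses directions for further development.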

6.
Predicting protein functions computationally from massive protein-protein interaction (PPI) data generated by high-throughput technology is one of the challenges and fundamental problems in the post-genomic era. Although there have been many approaches developed for computationally predicting protein functions, the mutual correlations among proteins in terms of protein functions have not been thoroughly investigated and incorporated into existing prediction methods, especially in voting-based prediction methods. In this paper, we propose an innovative method to predict protein functions from PPI data by aggregating the functional correlations among relevant proteins using the Choquet-Integral in fuzzy theory. This functional aggregation measures the real impact of each relevant protein function on the final prediction results, and reduces the impact of repeated functional information on the prediction. Accordingly, a new protein similarity and a new iterative prediction algorithm are proposed in this paper. The experimental evaluations on real PPI datasets demonstrate the effectiveness of our method.

7.
Joo K  Lee SJ  Lee J 《Proteins》2012,80(7):1791-1797
We present a method to predict the solvent accessibility of proteins which is based on a nearest neighbor method applied to the sequence profiles. Using the method, continuous real-value prediction as well as two-state and three-state discrete predictions can be obtained. The method utilizes the z-score value of the distance measure in the feature vector space to estimate the relative contribution among the k-nearest neighbors for prediction of the discrete and continuous solvent accessibility. The solvent accessibility database is constructed from 5717 proteins extracted from the PISCES culling server with a cutoff of 25% sequence identity. Using optimal parameters, prediction accuracies (for discrete predictions) of 78.38% (two-state prediction with the threshold of 25%) and 65.1% (three-state prediction with the thresholds of 9 and 36%), and a Pearson correlation coefficient (between the predicted and true RSAs for continuous prediction) of 0.676 are achieved. An independent benchmark test was performed with the CASP8 targets, where we find that the proposed method outperforms existing methods. The prediction accuracies are 80.89% (for two-state prediction with the threshold of 25%) and 67.58% (three-state prediction), and the Pearson correlation coefficient is 0.727 (for continuous prediction) with a mean absolute error of 0.148. We have also investigated the effect of increasing database sizes on the prediction accuracy, where additional improvement in the accuracy is observed as the database size increases. The SANN web server is available at http://lee.kias.re.kr/~newton/sann/.
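
The three kinds of score reported above (two-state accuracy at a 25% threshold, Pearson correlation, and mean absolute error) can be computed as in the sketch below. This is not the SANN code. The function names and toy values are assumptions, RSA is expressed as a fraction between 0 and 1, and treating a residue as exposed when its RSA is at or above the threshold (rather than strictly above it) is also an assumption.

```python
import math


def two_state_accuracy(pred_rsa, true_rsa, threshold=0.25):
    """Percentage of residues placed on the correct side of the RSA threshold."""
    hits = sum((p >= threshold) == (t >= threshold)
               for p, t in zip(pred_rsa, true_rsa))
    return 100.0 * hits / len(true_rsa)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def mean_absolute_error(x, y):
    """Average absolute difference between predicted and true values."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Toy relative solvent accessibilities for four residues.
predicted = [0.10, 0.30, 0.50, 0.20]
observed = [0.20, 0.40, 0.10, 0.30]
q2 = two_state_accuracy(predicted, observed)
mae = mean_absolute_error(predicted, observed)
```

The discrete and continuous scores answer different questions: the two-state accuracy only checks which side of the threshold a residue falls on, whereas the correlation and the mean absolute error measure how close the real-valued predictions are.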

8.
In the present study, an attempt has been made to develop a method for predicting gamma-turns in proteins. First, we have implemented the commonly used statistical and machine-learning techniques in the field of protein structure prediction, for the prediction of gamma-turns. All the methods have been trained and tested on a set of 320 nonhomologous protein chains by a fivefold cross-validation technique. It has been observed that the performance of all methods is very poor, having a Matthews Correlation Coefficient (MCC)

9.
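Predicting the incidence of pear scab in the middle and lower reaches of the Yangtze River using a BP artificial neural network   Cited by: 6 (self-citations: 3, citations by others: 3)
Sun F (孙凡) 《Journal of Biomathematics (生物数学学报)》2002,17(4):440-443
A new approach to predicting pear scab with artificial neural network technology is proposed. The main factors influencing pear scab incidence, namely the precipitation in July and in August of the previous year, are presented to the network as training patterns, and the network is trained according to the learning rule of the error back-propagation network. After 2844 learning iterations, the network reached the preset convergence criterion and was able to predict the epidemic trend and epidemic intensity of pear scab. Test results show that the method performs well and predicts with high accuracy, and it is expected to become an effective auxiliary tool for forecasting diseases and pests of fruit trees.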

10.
The growing number of protein sequences from genome projects requires theoretical methods to predict transmembrane helical segments (TMHs). So far, several prediction methods have been reported, but these methods have some deficiencies in prediction accuracy and adaptability. In this paper, a method based on the discrete wavelet transform (DWT) has been developed to predict the number and location of TMHs in membrane proteins. The protein with PDB code 1KQG is chosen as an example to describe the prediction process of this method. Eighty proteins with known 3D structure from the Mptopo database are chosen at random as data sets (including 325 TMHs), and the 80 sequences are divided into 13 groups according to their function and type. TMH prediction is carried out for each group of membrane protein sequences and yields satisfactory results. To verify the feasibility of this method, the 80 membrane protein sequences are treated as test sets; 308 TMHs are predicted, and the prediction accuracy is 96.3%. Compared with the main prediction results of seven popular prediction methods, the obtained results indicate that the method proposed in this paper has higher prediction accuracy.

11.
12.
Li X  Jacobson MP  Zhu K  Zhao S  Friesner RA 《Proteins》2007,66(4):824-837
We have developed a new method (Independent Cluster Decomposition Algorithm, ICDA) for creating all-atom models of proteins given the heavy-atom coordinates, provided by X-ray crystallography, and the pH. In our method the ionization states of titratable residues, the crystallographic mis-assignment of amide orientations in Asn/Gln, and the orientations of OH/SH groups are addressed under the unified framework of polar states assignment. To address the large number of combinatorial possibilities for the polar hydrogen states of the protein, we have devised a novel algorithm to decompose the system into independent interacting clusters, based on the observation of the crucial interdependence between the short range hydrogen bonding network and polar residue states, thus significantly reducing the computational complexity of the problem and making our algorithm tractable using relatively modest computational resources. We utilize an all-atom protein force field (OPLS) and a Generalized Born continuum solvation model, in contrast to the various empirical force fields adopted in most previous studies. We have compared our prediction results with a few well-documented methods in the literature (WHATIF, REDUCE). In addition, as a preliminary attempt to couple our polar state assignment method with real structure predictions, we further validate our method using single side chain prediction, which has been demonstrated to be an effective way of validating structure prediction methods without incurring sampling problems. Comparisons of single side chain prediction results after the application of our polar state prediction method with previous results with default polar state assignments indicate a significant improvement in the single side chain predictions for polar residues.

13.
Prediction of disordered regions in proteins based on the meta approach   Cited by: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: Intrinsically disordered regions in proteins have no unique stable structures without their partner molecules, so these regions sometimes prevent high-quality structure determination. Furthermore, proteins with disordered regions are often involved in important biological processes, and the disordered regions are considered to play important roles in molecular interactions. Therefore, identifying disordered regions is important to obtain high-resolution structural information and to understand the functional aspects of these proteins. RESULTS: We developed a new prediction method for disordered regions in proteins based on the meta approach and implemented a web-server for this prediction method named 'metaPrDOS'. The method predicts the disorder tendency of each residue using support vector machines from the prediction results of the seven independent predictors. Evaluation of the meta approach was performed using the CASP7 prediction targets to avoid an overestimation due to the inclusion of proteins used in the training set of some component predictors. As a result, the meta approach achieved higher prediction accuracy than all methods participating in CASP7.

14.
BACKGROUND/AIMS: Skeletal maturation is considered a reliable variable in evaluating the 'tempo' of growth. It is important in the diagnosis of endocrinological diseases, in chronic diseases, in hormonal therapy follow-up and in computing height prediction for prognostic and therapeutic purposes. It is also used when chronological age is not available for minors without known birth dates. There are different methods to evaluate skeletal maturation and height prediction. The Tanner-Whitehouse (TW) method 2 (TW2) has been considered to be the most useful method so far, and has recently been updated with modified height prediction equations (TW2-Mark II). TW3 is the newest method. The aim of this study is to evaluate whether TW3 is more accurate in the assessment of height prediction than TW2-Mark II in a sample of healthy north Italian subjects. METHODS: Anthropometrical data were collected as part of a survey in 1977-1978 in Turin. The sample involved 1,384 healthy children. The children, now adults, have been traced and recalled to measure their final height in order to test height prediction reliability. At present, we have collected 118 adult heights. RESULTS: According to the TW2 method 40% of the males had a height prediction error larger than ± residual SD (4.1 cm), and with TW3 this was 32.9%. The female height prediction error with TW2 was larger than ± residual SD (3.6 cm) in 29.2% of girls, and the same value was found with TW3. CONCLUSION: According to our preliminary data, TW3 does not represent any real progress.

15.

Background  

Although many genomic features have been used in the prediction of protein-protein interactions (PPIs), frequently only one is used in a computational method. After realizing the limited power in the prediction using only one genomic feature, investigators are now moving toward integration. So far, there have been few integration studies for PPI prediction; one failed to yield appreciable improvement of prediction and the others did not conduct performance comparison. It remains unclear whether an integration of multiple genomic features can improve the PPI prediction and, if it can, how to integrate these features.

16.
Protein phosphorylation is a ubiquitous protein post-translational modification, which plays an important role in cellular signaling systems underlying various physiological and pathological processes. Current in silico methods mainly focus on the prediction of phosphorylation sites, but few consider whether a phosphorylation site is functional or not. Since functional phosphorylation sites are more valuable for further experimental research and a proportion of phosphorylation sites have no direct functional effects, the prediction of functional phosphorylation sites is quite necessary for this research area. Previous studies have shown that functional phosphorylation sites are more conserved than non-functional phosphorylation sites in evolution. Thus, we developed a web server that integrates existing phosphorylation site prediction methods with both absolute and relative evolutionary conservation scores to predict the most likely functional phosphorylation sites. Using our method, we predicted the most likely functional sites of the human, rat and mouse proteomes and built a database for the predicted sites. By the analysis of overall prediction results, we demonstrated that protein phosphorylation plays an important role in all the enriched KEGG pathways. By the analysis of protein-specific prediction results, we demonstrated the usefulness of our method for individual protein studies. Our method would help to characterize the most likely functional phosphorylation sites for further studies in this research area.

17.
MiRNAs are a class of small non-coding RNAs that are involved in the development and progression of various complex diseases. Great efforts have been made to discover potential associations between miRNAs and diseases recently. As experimental methods are in general expensive and time-consuming, a large number of computational models have been developed to effectively predict reliable disease-related miRNAs. However, the inherent noise and incompleteness in the existing biological datasets have inevitably limited the prediction accuracy of current computational models. To solve this issue, in this paper, we propose a novel method for miRNA-disease association prediction based on matrix completion and label propagation (MCLPMDA). Specifically, our method first reconstructs a new miRNA/disease similarity matrix by a matrix completion algorithm based on known experimentally verified miRNA-disease associations and then utilizes the label propagation algorithm to reliably predict disease-related miRNAs. As a result, MCLPMDA achieved comparable performance under different evaluation metrics and was capable of discovering a greater number of true miRNA-disease associations. Moreover, a case study conducted on Breast Neoplasms further confirmed the prediction reliability of the proposed method. Taken together, the experimental results clearly demonstrated that MCLPMDA can serve as an effective and reliable tool for miRNA-disease association prediction.
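
The abstract above names label propagation but does not give the update rule MCLPMDA uses. The sketch below shows a generic, widely used form, F ← αSF + (1 − α)Y, for a single disease: S is a row-normalised miRNA similarity matrix and Y marks the known associations. The function names, the value of `alpha` and the toy matrix are all assumptions, and the matrix completion step is omitted.

```python
def row_normalise(matrix):
    """Scale each row to sum to 1 (all-zero rows are left as zeros)."""
    normalised = []
    for row in matrix:
        total = sum(row)
        normalised.append([v / total if total else 0.0 for v in row])
    return normalised


def propagate_labels(similarity, labels, alpha=0.5, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * S @ F + (1 - alpha) * Y until the scores settle."""
    s = row_normalise(similarity)
    scores = list(labels)
    for _ in range(max_iter):
        updated = [
            alpha * sum(w * f for w, f in zip(row, scores)) + (1 - alpha) * y
            for row, y in zip(s, labels)
        ]
        if max(abs(u - f) for u, f in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores


# Toy example: miRNA 0 is known to be associated with the disease,
# miRNA 1 is very similar to miRNA 0, and miRNA 2 is only weakly similar.
similarity = [[0.0, 0.9, 0.1],
              [0.9, 0.0, 0.1],
              [0.1, 0.1, 0.0]]
known = [1.0, 0.0, 0.0]
scores = propagate_labels(similarity, known)
```

In the toy example the unlabelled miRNA that is most similar to the known one ends up with the higher score, which is the behaviour such methods rely on when ranking candidate disease-related miRNAs.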

18.
Chen X  Su Z  Dam P  Palenik B  Xu Y  Jiang T 《Nucleic acids research》2004,32(7):2147-2157
We present a computational method for operon prediction based on a comparative genomics approach. A group of consecutive genes is considered as a candidate operon if both their gene sequences and functions are conserved across several phylogenetically related genomes. In addition, various supporting data for operons are also collected through the application of public domain computer programs, and used in our prediction method. These include the prediction of conserved gene functions, promoter motifs and terminators. An apparent advantage of our approach over other operon prediction methods is that it does not require many experimental data (such as gene expression data and pathway data) as input. This feature makes it applicable to many newly sequenced genomes that do not have extensive experimental information. In order to validate our prediction, we have tested the method on Escherichia coli K12, in which operon structures have been extensively studied, through a comparative analysis against Haemophilus influenzae Rd and Salmonella typhimurium LT2. Our method successfully predicted most of the 237 known operons. After this initial validation, we then applied the method to a newly sequenced and annotated microbial genome, Synechococcus sp. WH8102, through a comparative genome analysis with two other cyanobacterial genomes, Prochlorococcus marinus sp. MED4 and P. marinus sp. MIT9313. Our results are consistent with previously reported results and statistics on operons in the literature.

19.
De novo prediction of protein structures, the prediction of structures from amino acid sequences which are not similar to those of hitherto resolved structures, has been one of the major challenges in molecular biophysics. In this paper, we develop a new method of de novo prediction, which combines the fragment assembly method and the simulation of physical folding process: structures which have consistently assembled fragments are dynamically searched by Langevin molecular dynamics of conformational change. The benchmarking test shows that the prediction is improved when the candidate structures are cross-checked by an empirically derived score function.

20.
Protein backbone angle prediction with machine learning approaches   Cited by: 2 (self-citations: 0, citations by others: 2)
MOTIVATION: Protein backbone torsion angle prediction provides useful local structural information that goes beyond conventional three-state (alpha, beta and coil) secondary structure predictions. Accurate prediction of protein backbone torsion angles will substantially improve modeling procedures for local structures of protein sequence segments, especially in modeling loop conformations that do not form regular structures as in alpha-helices or beta-strands. RESULTS: We have devised two novel automated methods in protein backbone conformational state prediction: one method is based on support vector machines (SVMs); the other method combines a standard feed-forward back-propagation artificial neural network (NN) with a local structure-based sequence profile database (LSBSP1). Extensive benchmark experiments demonstrate that both methods have improved the prediction accuracy rate over the previously published methods for conformation state prediction when using an alphabet of three or four states. AVAILABILITY: LSBSP1 and the NN algorithm have been implemented in PrISM.1, which is available from www.columbia.edu/~ay1/. SUPPLEMENTARY INFORMATION: Supplementary data for the SVM method can be downloaded from the Website www.cs.columbia.edu/compbio/backbone.
