Found 20 similar documents (search time: 15 ms)
1.
On-line prediction of fermentation variables using neural networks  Total citations: 10, self-citations: 0, citations by others: 10
This article presents an introduction to the use of neural network computational algorithms for the dynamic modeling of bioprocesses. The dynamic neural model is used for the prediction of key fermentation variables. This relatively new method is compared with a more traditional prediction technique to judge its predictive performance. Illustrative simulation results for a continuous stirred tank fermentor are used for this comparison. It is shown that neural network models are accurate, with a certain degree of noise immunity. Unlike more traditional methods, they can learn complex relationships naturally without requiring knowledge of the model structure.
2.
Kedarisetti KD Mizianty MJ Dick S Kurgan L 《Journal of bioinformatics and computational biology》2011,9(1):67-89
Accurate identification of strand residues aids prediction and analysis of numerous structural and functional aspects of proteins. We propose a sequence-based predictor, BETArPRED, which improves prediction of strand residues and β-strand segments. BETArPRED uses a novel design that accepts strand residues predicted by SSpro and predicts the remaining positions utilizing a logistic regression classifier with nine custom-designed features. These are derived from the primary sequence, the secondary structure (SS) predicted by SSpro, PSIPRED and SPINE, and residue depth as predicted by RDpred. Our features utilize certain local (window-based) patterns in the predicted SS and combine information about the predicted SS and residue depth. BETArPRED is evaluated on 432 sequences that share low identity with the training chains, and on the CASP8 dataset. We compare BETArPRED with seven modern SS predictors, and with the top-performing automated structure predictor in CASP8, the ZHANG-server. BETArPRED provides statistically significant improvements over each of the SS predictors; it improves prediction of strand residues and β-strands, and it finds β-strands that were missed by the other methods. When compared with the ZHANG-server, we improve predictions of strand segments and predict more actual strand residues, while the other predictor achieves a higher rate of correct strand residue predictions when under-predicting them.
3.
Improved prediction of bacterial transcription start sites  Total citations: 2, self-citations: 0, citations by others: 2
Gordon JJ Towsey MW Hogan JM Mathews SA Timms P 《Bioinformatics (Oxford, England)》2006,22(2):142-148
4.
Improved and automated prediction of effective siRNA  Total citations: 11, self-citations: 0, citations by others: 11
Chalk AM Wahlestedt C Sonnhammer EL 《Biochemical and biophysical research communications》2004,319(1):264-274
Short interfering RNAs are used in functional genomics studies to knock down a single gene in a reversible manner. The results of siRNA experiments are highly dependent on the choice of siRNA sequence. In order to evaluate siRNA design rules, we collected a database of 398 siRNAs of known efficacy from 92 genes. We used this database to evaluate previously proposed rules from smaller datasets, and to find a new set of rules that are optimal for the entire database. We also trained a regression tree with full cross-validation. It was, however, difficult to obtain the same precision as methods previously tested on small datasets from one or two genes. We show that those methods are overfitting, as they work poorly on independent validation datasets from multiple genes. Our new design rules can predict siRNAs with efficacy ≥50% in 91% of cases, and with efficacy ≥90% in 52% of cases, which is more than a twofold improvement over random selection. Software for designing siRNAs is available online via a web server or as a standalone version for high-throughput applications.
5.
6.
Background
A protein structure can be determined by solving a so-called distance geometry problem whenever a set of inter-atomic distances is available and sufficient. However, the problem is intractable in general and has been proved to be NP-hard. An updated geometric build-up algorithm (UGB) has been developed recently that controls numerical errors and is efficient in protein structure determination for cases where only sparse exact distance data is available. In this paper, the UGB method has been improved and revised with the aim of solving distance geometry problems more efficiently and effectively.
Methods
An efficient algorithm, called the revised updated geometric build-up algorithm (RUGB), is presented for building up a protein structure from atomic distance data; it provides an effective way of determining a protein structure with sparse exact distance data. In the algorithm, the condition for determining an unpositioned atom iteratively is relaxed (compared with the UGB algorithm), and data structure techniques are used to make the algorithm more efficient and effective. The algorithm is tested on a set of proteins selected randomly from the Protein Structure Database-PDB.
Results
We test a set of proteins selected randomly from the Protein Structure Database-PDB. We show that the numerical errors produced by the new RUGB algorithm are smaller than those of the UGB algorithm and that the RUGB algorithm has a significantly smaller runtime than the UGB algorithm.
Conclusions
The RUGB algorithm relaxes the condition for updating and incorporates a data structure for accessing the neighbours of an atom. The revisions result in an improvement over the UGB algorithm in two important areas: a reduction in the overall runtime and a decrease in the numerical error.
7.
Tendulkar AV Wangikar PP Sohoni MA Samant VV Mone CY 《Journal of molecular biology》2003,334(1):157-172
We present a scheme for the classification of 3487 non-redundant protein structures into 1207 non-hierarchical clusters by using recurring structural patterns of three to six amino acids as keys of classification. This results in several signature patterns, which seem to decide membership of a protein in a functional category. The patterns provide clues to the key residues involved in functional sites as well as in protein-protein interaction. The discovered patterns include a "glutamate double bridge" of superoxide dismutase, the functional interface of the serine protease and inhibitor, the interface of homo/hetero dimers, and functional sites of several enzyme families. We use geometric invariants to decide superimposability of structural patterns. This allows the parameterization of patterns and the discovery of recurring patterns via clustering. The geometric invariant-based approach eliminates the computationally explosive step of pair-wise comparison of structures. The results provide a vast resource for biologists for experimental validation of the proposed functional sites, and for the design of synthetic enzymes, inhibitors and drugs.
8.
The SLoop database of supersecondary fragments, first described by Donate et al. (Protein Sci., 1996, 5, 2600-2616), contains protein loops classified according to structural similarity. The database has recently been updated and currently contains over 10 000 loops up to 20 residues in length, which cluster into over 560 well-populated classes. The database can be found at http://www-cryst.bioc.cam.ac.uk/~sloop. In this paper, we identify conserved structural features such as main chain conformation and hydrogen bonding. Using the original approach of Rufino and co-workers (1997), the correct structural class is predicted with the highest SLoop score for 35% of loops. This rises to 65% when the three highest scoring class predictions are considered and to 75% for the top five scoring class predictions. Inclusion of residues from the neighbouring secondary structures and use of substitution tables derived using a reduced definition of secondary structure increase these prediction accuracies to 58, 78 and 85%, respectively. This suggests that capping residues can stabilize the loop conformation as well as that of the secondary structure. Further increases are achieved if only well-populated classes are considered in the prediction. These results correspond to an average loop root mean square deviation of between 0.4 and 2.6 Å for loops up to five residues in length.
9.
Protein complex prediction via cost-based clustering  Total citations: 13, self-citations: 0, citations by others: 13
MOTIVATION: Understanding principles of cellular organization and function can be enhanced if we detect known and predict still undiscovered protein complexes within the cell's protein-protein interaction (PPI) network. Such predictions may be used as an inexpensive tool to direct biological experiments. The increasing amount of available PPI data necessitates an accurate and scalable approach to protein complex identification. RESULTS: We have developed the Restricted Neighborhood Search Clustering Algorithm (RNSC) to efficiently partition networks into clusters using a cost function. We applied this cost-based clustering algorithm to PPI networks of Saccharomyces cerevisiae, Drosophila melanogaster and Caenorhabditis elegans to identify and predict protein complexes. We have determined functional and graph-theoretic properties of true protein complexes from the MIPS database. Based on these properties, we defined filters to distinguish between identified network clusters and true protein complexes. CONCLUSIONS: Our application of the cost-based clustering algorithm provides an accurate and scalable method of detecting and predicting protein complexes within a PPI network.
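As a rough illustration of what partitioning a network "using a cost function" can mean, the sketch below scores a clustering by counting bad pairs: edges that cross cluster boundaries plus missing edges inside a cluster. This is an assumed, simplified cost chosen for illustration; the abstract does not specify the RNSC cost functions, and the toy graph and the name `naive_cost` are hypothetical.

```python
def naive_cost(adjacency, cluster_of):
    """Count bad vertex pairs for a clustering of an undirected graph.

    adjacency:  dict mapping each vertex to the set of its neighbours
                (assumed symmetric).
    cluster_of: dict mapping each vertex to a cluster label.
    """
    cost = 0
    for v, neighbours in adjacency.items():
        # Other vertices that share v's cluster.
        same = {u for u in adjacency if u != v and cluster_of[u] == cluster_of[v]}
        cost += len(neighbours - same)  # edges leaving the cluster
        cost += len(same - neighbours)  # missing edges inside the cluster
    # Every bad pair was counted once from each endpoint.
    return cost // 2


# Toy graph (made up): a triangle a-b-c with a pendant vertex d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
split = {"a": 0, "b": 0, "c": 0, "d": 1}   # triangle and pendant separated
merged = {"a": 0, "b": 0, "c": 0, "d": 0}  # everything in one cluster

cost_split = naive_cost(graph, split)
cost_merged = naive_cost(graph, merged)
```

A local search in this spirit would move vertices between clusters and keep the moves that lower the cost; here the split clustering scores lower than the merged one.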
10.
11.
Improved prediction of signal peptides: SignalP 3.0  Total citations: 63, self-citations: 0, citations by others: 63
We describe improvements of the currently most popular method for prediction of classically secreted proteins, SignalP. SignalP consists of two different predictors based on neural network and hidden Markov model algorithms, both of which have been updated. Motivated by the idea that the cleavage site position and the amino acid composition of the signal peptide are correlated, new features have been included as input to the neural network. This addition, combined with a thorough error-correction of a new data set, has improved the performance of the predictor significantly over SignalP version 2. In version 3, correctness of the cleavage site predictions has increased notably for all three organism groups: eukaryotes, Gram-negative and Gram-positive bacteria. The accuracy of cleavage site prediction has increased in the range 6-17% over the previous version, whereas the signal peptide discrimination improvement is mainly due to the elimination of false-positive predictions, as well as the introduction of a new discrimination score for the neural network. The new method has been benchmarked against other available methods. Predictions can be made at the publicly available web server
12.
Chiu JJ Chen CN Lee PL Yang CT Chuang HS Chien S Usami S 《Journal of biomechanics》2003,36(12):1883-1895
The preferential adhesion of monocytes to vascular endothelial cells (ECs) at regions near branches and curvatures of the arterial tree, where flow is disturbed, suggests that hemodynamic conditions play significant roles in monocyte adhesion. The present study aims to elucidate the effects of disturbed flow on monocyte adhesion to ECs and the adhesive properties of ECs. We applied, for the first time, the micron-resolution particle image velocimetry (μPIV) technique to analyze the characteristics of the disturbed flow produced in our vertical-step flow (VSF) chamber. The results demonstrated the existence of a higher near-wall concentration and a longer residence time of the monocytic analog THP-1 cells near the step and the reattachment point. THP-1 cells showed prominent adhesion to ECs pretreated with TNF in the regions near the step and the reattachment point, but they showed virtually no adhesion to unstimulated ECs. Pre-incubation of the TNF-treated ECs with antibodies against intercellular adhesion molecule-1 (ICAM-1), vascular adhesion molecule-1 (VCAM-1), and E-selectin inhibited the THP-1 adhesion; the maximal inhibition was observed with a combination of these antibodies. Pre-exposure of ECs to disturbed flow in VSF for 24 h led to significant increases in their surface expression of ICAM-1 and E-selectin, but not VCAM-1, and in the adhesion of THP-1 cells. Our findings demonstrate the importance of the complex flow environment in modulating the adhesive properties of vascular endothelium and consequently monocyte adhesion in regions where atherosclerotic lesions are prevalent.
13.
14.
15.
Chemical shift frequencies represent a time-average of all the conformational states populated by a protein. Thus, chemical shift prediction programs based on sequence and database analysis yield higher accuracy for rigid rather than flexible protein segments. Here we show that the prediction accuracy can be significantly improved by averaging over an ensemble of structures, predicted solely from amino acid sequence with the Rosetta program. This approach to chemical shift and structure prediction has the potential to be useful for guiding resonance assignments, especially in solid-state NMR structural studies of membrane proteins in proteoliposomes.
16.
Protein palmitoylation is an important and common post-translational lipid modification of proteins and plays a critical role in various cellular processes. Identification of palmitoylation sites is fundamental to deciphering the mechanisms of these biological processes. However, experimental determination of palmitoylation residues without prior knowledge is laborious and costly. Thus, computational approaches for prediction of palmitoylation sites in proteins have become highly desirable. Here, we propose PPWMs, a computational predictor that uses a position weight matrix (PWM) encoding scheme and a support vector machine (SVM) to identify protein palmitoylation sites. PPWMs shows good predictive performance, with an area under the ROC curve (AUC) of 0.9472 for S-palmitoylation site prediction and 0.9964 for N-palmitoylation site prediction on the newly proposed dataset. Comparison results show the superiority of PPWMs over two widely known existing palmitoylation site predictors, CSS-Palm 2.0 and CKSAAP-Palm, in many cases. Moreover, an attempt is made to incorporate structure information such as accessible surface area (ASA) and secondary structure (SS) into the prediction, and the structure characteristics are analyzed roughly. The corresponding software can be freely downloaded from http://math.cau.edu.cn/PPWMs.html.
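The position weight matrix encoding mentioned in this abstract can be sketched as follows. This is a minimal, generic PWM construction (log-odds against a uniform background with a pseudocount), not the PPWMs implementation; the function names, the pseudocount, the uniform background and the toy windows are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(windows, pseudocount=1.0):
    """Build a log-odds PWM from equal-length sequence windows.

    Returns one dict per window position, mapping residue -> weight.
    A uniform background over the 20 amino acids is assumed.
    """
    length = len(windows[0])
    background = 1.0 / len(AMINO_ACIDS)
    total = len(windows) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(w[pos] for w in windows)
        pwm.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def encode(window, pwm):
    """Encode a window as one PWM weight per position (a feature vector)."""
    return [pwm[pos][aa] for pos, aa in enumerate(window)]


# Toy windows centred on a cysteine (made up, not data from the paper).
positives = ["ALCKV", "GLCRV", "ALCKI"]
pwm = build_pwm(positives)
features = encode("ALCKV", pwm)
```

The resulting numeric vectors are the kind of input that could then be passed to a classifier such as an SVM.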
17.
This article describes a novel method for predicting ligand-binding sites of proteins. The method uses only 8 structural properties as the input vector to train 9 random forest classifiers, which are combined to predict binding residues. These predicted binding residues are then clustered into predicted ligand-binding sites. According to our measurement criterion, this method achieved a success rate of 0.914 on the bound state dataset and 0.800 on the unbound state dataset, which is better than three other methods: Q-SiteFinder, SCREEN and Morita's method. This indicates that the proposed method is successful in predicting ligand-binding sites.
18.
When an artificial neural network (ANN) is trained to predict signals p steps ahead, the quality of the prediction typically decreases for large values of p. In this paper, we compare two methods for prediction with ANNs: the classical recursion of one-step ahead predictors and a new kind of chain structure. When applying both techniques to the prediction of the temperature at the end of a blast furnace, we conclude that the chaining approach leads to an improved prediction of the temperature and avoidance of instabilities, since the chained networks gradually take the prediction of their predecessors in the chain as an extra input. It is observed that instabilities might occur in the iterative case, which does not happen with the chaining approach. To select relevant inputs and decrease the number of weights in this approach, Automatic Relevance Determination (ARD) for multilayer perceptrons is applied.
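The difference between the two multi-step strategies described in this abstract can be sketched in a few lines. The predictors below are trivial stand-ins (simple averages) rather than trained ANNs, and the function names and the exact way predecessor outputs are passed along the chain are assumptions made for illustration only.

```python
def recursive_forecast(one_step, window, p):
    """Apply a single one-step-ahead predictor p times, feeding each
    prediction back in as if it were an observation."""
    history = list(window)
    preds = []
    for _ in range(p):
        y = one_step(history[-len(window):])
        preds.append(y)
        history.append(y)
    return preds


def chained_forecast(chain, window):
    """Use one predictor per horizon; each predictor sees the original
    window plus the predictions of its predecessors as extra inputs."""
    preds = []
    for model in chain:
        preds.append(model(list(window), list(preds)))
    return preds


# Hypothetical stand-in predictors; a trained network would go here.
def mean_step(window):
    return sum(window) / len(window)


def chain_link(window, earlier_predictions):
    values = window + earlier_predictions
    return sum(values) / len(values)


observed = [1.0, 2.0, 3.0]
recursive = recursive_forecast(mean_step, observed, p=2)
chained = chained_forecast([chain_link, chain_link], observed)
```

In the recursive case the real observations are gradually pushed out of the input window by earlier predictions, whereas each link of the chain still sees the full observed window.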
19.
The identification and validation of protein allergens have become more important as more and more transgenic proteins are introduced into our food chains. Current allergen prediction algorithms focus on the identification of a single motif or a single allergen peptide for allergen detection. However, an analysis of a dataset of 575 allergens shows that most allergens contain multiple motifs. Here, we present a novel algorithm that detects allergens by making use of combinations of motifs. The proposed algorithm achieved a sensitivity of 0.772 and a specificity of 0.904 in predicting allergens. The specificity of the proposed approach is found to be significantly higher than that of traditional single-motif approaches. The high specificity of the proposed algorithm is useful in filtering out false positives, especially when laboratory resources are limited.
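For reference, sensitivity and specificity as reported in this abstract are the standard ratios computed from a confusion matrix. The sketch below uses made-up counts, not figures from the paper.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives (allergens) that are detected."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of actual negatives (non-allergens) correctly rejected."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts for illustration only.
sens = sensitivity(true_pos=8, false_neg=2)
spec = specificity(true_neg=9, false_pos=1)
```

A high specificity means few non-allergens are flagged, which is why it matters for limiting false positives that would otherwise need laboratory follow-up.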