Similar articles
20 similar articles found.
1.
The development of multi-model ensembles for reliable predictions of inter-annual climate fluctuations and climate change, and their application to health, agronomy and water management, are discussed.

2.
3.
Turn prediction in proteins using a pattern-matching approach (total citations: 16; self-citations: 0; citations by others: 16)
We extend the use of amino acid sequence patterns [Cohen, F.E., Abarbanel, R. M., Kuntz, I. D., & Fletterick, R. J. (1983) Biochemistry 22, 4894-4904] to the identification of turns in globular proteins. The approach uses a conservative strategy, combined with a hierarchical search (strongest patterns first) and length-dependent masking, to achieve high accuracy (95%) on a test set of proteins of known structure. Applying the same procedure to homologous families gives a 90% success rate. Straightforward changes are suggested to improve the predictive power. The computer program, written in Lisp, provides a general pattern-recognition language well suited for a number of investigations of protein and nucleic acid sequences.

4.
5.
Membrane proteins, which constitute approximately 20% of most genomes, form two main classes: alpha helical and beta barrel transmembrane proteins. Using methods based on Bayesian Networks, a powerful approach for statistical inference, we have sought to address beta-barrel topology prediction. The beta-barrel topology predictor reports individual strand accuracies of 88.6%. The method outlined here represents a potentially important advance in the computational determination of membrane protein topology.

6.
Membrane proteins, which constitute approximately 20% of most genomes, are poorly tractable targets for experimental structure determination; analysis by prediction and modelling therefore makes an important contribution to their ongoing study. Membrane proteins form two main classes: alpha helical and beta barrel transmembrane proteins. By using a method based on Bayesian Networks, which provides a flexible and powerful framework for statistical inference, we addressed alpha-helical topology prediction. This method has accuracies of 77.4% for prokaryotic proteins and 61.4% for eukaryotic proteins. The method described here represents an important advance in the computational determination of membrane protein topology and offers a useful and complementary tool for the analysis of membrane proteins for a range of applications.

7.
8.
Since protein complexes play a crucial role in biological cells, one of the major goals in bioinformatics is the elucidation of protein complexes. A general approach is to build a prediction rule based on multiple data sources, e.g. gene expression data and protein interaction data, to assess the likelihood of two proteins having complex association. We critically revisit the step of predictor construction, i.e. the determination of a proper training set, an optimal classifier, and, most importantly, an optimal feature set. We use an exhaustive set of features, which includes the 2hop-feature as introduced by Wong et al. for predicting synthetic sick or lethal interactions. Post-processing of the likelihoods of protein interaction is then required to extract protein complexes. We propose a new protocol for combining these likelihood estimates. The protocol interprets the probabilities of complex association as output by the prediction rule as distances and employs hierarchical clustering to find groups of interacting proteins. In contrast to the computationally expensive search-and-score approach of Sharan et al., this protocol is very fast and can be applied to fully connected graphs. The protocol identifies trusted protein complexes with high confidence. We show that the 2hop-feature is relevant for predicting protein complexes. Furthermore, several interesting hypotheses about new protein complexes have been generated. For example, our approach linked the protein FYV4 to the mitochondrial ribosomal subunit. Interestingly, it is known that this protein is located in the mitochondrion, but its biological role is unknown. Vid22 and YGR071C were also linked, which corresponds to the new TAP data of Krogan et al.
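
The clustering step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' protocol: the abstract does not state how probabilities are converted to distances, which linkage is used, or where the tree is cut, so the conversion `1 - probability`, the average linkage, the cut-off `max_distance`, and the protein names `P1` to `P4` are all assumptions made for the example.

```python
def cluster_complexes(proteins, prob, max_distance=0.5):
    """Group proteins by agglomerative clustering on distance = 1 - probability.

    prob maps unordered protein pairs (a, b) to a predicted probability of
    complex association; missing pairs are treated as probability 0.
    Average linkage and the max_distance cut-off are assumptions.
    """
    def dist(a, b):
        p = prob.get((a, b), prob.get((b, a), 0.0))
        return 1.0 - p

    def linkage(c1, c2):
        # Average pairwise distance between two clusters.
        return sum(dist(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))

    clusters = [[p] for p in proteins]
    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, i, j = min(
            ((linkage(clusters[i], clusters[j]), i, j)
             for i in range(len(clusters))
             for j in range(i + 1, len(clusters))),
            key=lambda t: t[0],
        )
        if d > max_distance:
            break  # remaining clusters are too far apart to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)


# Hypothetical probabilities for four placeholder proteins.
example_prob = {
    ("P1", "P2"): 0.9, ("P3", "P4"): 0.8,
    ("P1", "P3"): 0.1, ("P1", "P4"): 0.1,
    ("P2", "P3"): 0.2, ("P2", "P4"): 0.1,
}
example_complexes = cluster_complexes(["P1", "P2", "P3", "P4"], example_prob)
```

Because every pair has a distance, the procedure works on a fully connected graph, which is the property the abstract highlights.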

9.
MOTIVATION: Structural genomics projects are beginning to produce protein structures with unknown function; therefore, accurate, automated predictors of protein function are required if all these structures are to be properly annotated in reasonable time. Identifying the interface between two interacting proteins provides important clues to the function of a protein and can reduce the search space required by docking algorithms to predict the structures of complexes. RESULTS: We have combined a support vector machine (SVM) approach with surface patch analysis to predict protein-protein binding sites. Using a leave-one-out cross-validation procedure, we were able to successfully predict the location of the binding site on 76% of our dataset made up of proteins with both transient and obligate interfaces. With heterogeneous cross-validation, where we trained the SVM on transient complexes to predict on obligate complexes (and vice versa), we still achieved comparable success rates to the leave-one-out cross-validation, suggesting that sufficient properties are shared between transient and obligate interfaces. AVAILABILITY: A web application based on the method can be found at http://www.bioinformatics.leeds.ac.uk/ppi_pred. The dataset of 180 proteins used in this study is also available via the same web site. CONTACT: westhead@bmb.leeds.ac.uk SUPPLEMENTARY INFORMATION: http://www.bioinformatics.leeds.ac.uk/ppi-pred/supp-material.

10.
11.
12.
We investigated the validity of employing a fuzzy piecewise prediction equation (PW) [Gonzalez et al. J Appl Physiol 107: 379-388, 2009] defined by sweat rate (m(sw), g·m(-2)·h(-1)) = 147 + 1.527·(E(req)) - 0.87·(E(max)), which integrates evaporation required (E(req)) and the maximum evaporative capacity of the environment (E(max)). Heat exchange and physiological responses were determined throughout the trials. Environmental conditions were ambient temperature (T(a)) = 16-26°C, relative humidity (RH) = 51-55%, and wind speed (V) = 0.5-1.5 m/s. Volunteers wore military fatigues [clothing evaporative potential (i(m)/clo) = 0.33] and carried loads (15-31 kg) while marching 14-37 km over variable terrains either at night (N = 77, trials 1-5) or night with increasing daylight (N = 33, trials 6 and 7). PW was modified (Pw,sol) for transient solar radiation (R(sol), W) determined from measured solar loads and verified in trials 6 and 7. PW provided a valid m(sw) prediction during night trials (1-5) matching previous laboratory values and verified by bootstrap correlation (r(bs) of 0.81, SE ± 0.014, SEE = ± 69.2 g·m(-2)·h(-1)). For trials 6 and 7, E(req) and E(max) components included R(sol) applying a modified equation Pw,sol, in which m(sw) = 147 + 1.527·(E(req,sol)) - 0.87·(E(max)). Linear prediction of m(sw) = 0.72·Pw,sol + 135 (N = 33) was validated (R(2) = 0.92; SEE = ±33.8 g·m(-2)·h(-1)) with PW β-coefficients unaltered during field marches between 16°C and 26°C T(a) for m(sw) ≤ 700 g·m(-2)·h(-1). PW was additionally derived for cool laboratory/night conditions (T(a) < 20°C) in which E(req) is low but E(max) is high, as: PW,cool (g·m(-2)·h(-1)) = 350 + 1.527·E(req) - 0.87·E(max). These sweat prediction equations provide valid tools for civilian, sports, and military medicine communities to predict water needs during a variety of heat stress/exercise conditions.
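
The three equations quoted in this abstract can be written down directly. The sketch below only restates them; the function names are hypothetical, and the units of E(req) and E(max) are assumed to be W·m(-2), which the abstract does not state.

```python
def sweat_rate_pw(e_req, e_max, intercept=147.0):
    """PW equation from the abstract: m(sw) in g·m(-2)·h(-1).

    e_req: evaporation required; e_max: maximum evaporative capacity.
    Input units are an assumption (W·m(-2)); the abstract gives none.
    For Pw,sol, pass the solar-adjusted E(req,sol) as e_req.
    """
    return intercept + 1.527 * e_req - 0.87 * e_max


def sweat_rate_pw_cool(e_req, e_max):
    """PW,cool variant for T(a) < 20°C: same coefficients, intercept 350."""
    return sweat_rate_pw(e_req, e_max, intercept=350.0)


def sweat_rate_field_sol(pw_sol):
    """Linear field prediction reported for trials 6 and 7 (N = 33)."""
    return 0.72 * pw_sol + 135.0


# Illustrative inputs only, not values from the study.
example_pw = sweat_rate_pw(200.0, 100.0)        # 147 + 305.4 - 87
example_cool = sweat_rate_pw_cool(200.0, 100.0)  # 350 + 305.4 - 87
```

The abstract validates the field equation only for m(sw) ≤ 700 g·m(-2)·h(-1) and T(a) between 16°C and 26°C, so results outside that range should not be relied on.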

13.
The secondary structures of the human membrane-associated folate binding protein (FBP) and bovine soluble FBP are assessed by a joint prediction approach that combines neural network models, information theory, homology modeling and the Chou-Fasman methods. Two new profile maps are used to characterize the non-regular secondary structure and to assist in assigning buried and exposed parts of secondary structure: (i) the loop potential profile and (ii) the long range contact profile. Approximately half of human FBP is predicted to form regular secondary structure (alpha-helices, 35%, or beta-sheets, 12%, excluding the transmembrane helices) and the rest is predicted to form coil, turns or loops. The bovine milk soluble FBP is predicted to have a similar secondary structure, as expected because of the high degree of homology between the FBPs. Discriminant analysis predicts two transmembrane segments for the human FBP sequence, one at the amino terminus (a leader sequence) and the other at the carboxy terminus. These predicted transmembrane domains are absent in the bovine milk soluble FBP, further supporting these predictions. The present set of secondary structural predictions for human FBP is obtained by 'consensus' to aid in modeling the super-secondary structure of the protein.

14.

Background  

Eukaryotic promoter prediction using computational analysis techniques is one of the most difficult tasks in computational genomics and is essential for constructing and understanding genetic regulatory networks. The increased availability of sequence data for various eukaryotic organisms in recent years has necessitated better tools and techniques for the prediction and analysis of promoters in eukaryotic sequences. Many promoter prediction methods and tools have been developed to date, but they have yet to provide acceptable predictive performance. One obvious way to improve on current methods is to devise a better system for selecting appropriate features of promoters that distinguish them from non-promoters. Secondly, improved performance can be achieved by enhancing the predictive ability of the machine learning algorithms used.

15.
Invasion ecology urgently requires predictive methodologies that can forecast the ecological impacts of existing, emerging and potential invasive species. We argue that many ecologically damaging invaders are characterised by their more efficient use of resources. Consequently, comparison of the classical ‘functional response’ (relationship between resource use and availability) between invasive and trophically analogous native species may allow prediction of invader ecological impact. We review the utility of species trait comparisons and the history and context of the use of functional responses in invasion ecology, then present our framework for the use of comparative functional responses. We show that functional response analyses, by describing the resource use of species over a range of resource availabilities, avoid many pitfalls of ‘snapshot’ assessments of resource use. Our framework demonstrates how comparisons of invader and native functional responses, within and between Type II and III functional responses, allow testing of the likely population-level outcomes of invasions for affected species. Furthermore, we describe how recent studies support the predictive capacity of this method; for example, the invasive ‘bloody red shrimp’ Hemimysis anomala shows higher Type II functional responses than native mysids and this corroborates, and could have predicted, actual invader impacts in the field. The comparative functional response method can also be used to examine differences in the impact of two or more invaders, two or more populations of the same invader, and the abiotic (e.g. temperature) and biotic (e.g. parasitism) context-dependencies of invader impacts. Our framework may also address the previous lack of rigour in testing major hypotheses in invasion ecology, such as the ‘enemy release’ and ‘biotic resistance’ hypotheses, as our approach explicitly considers demographic consequences for impacted resources, such as native and invasive prey species.
We also identify potential challenges in the application of comparative functional responses in invasion ecology. These include incorporation of numerical responses, multiple predator effects and trait-mediated indirect interactions, replacement versus non-replacement study designs and the inclusion of functional responses in risk assessment frameworks. In future, the generation of sufficient case studies for a meta-analysis could test the overall hypothesis that comparative functional responses can indeed predict invasive species impacts.

16.
An approach of encoding for prediction of splice sites using SVM (total citations: 1; self-citations: 0; citations by others: 1)
Huang J, Li T, Chen K, Wu J. Biochimie, 2006, 88(7): 923-929
In splice site prediction, the accuracy is lower than 90% even though the sequences adjacent to the splice sites are highly conserved. To improve prediction accuracy, much attention has been paid to the performance of the algorithms used, and little to the fundamental issue of nucleotide encoding. In this paper, a predictor based on support vector machines (SVM) is constructed to distinguish true from false splice sites in higher eukaryotes. Four types of encoding were applied to generate the input for the SVM: mono-nucleotide (MN) encoding, MN with frequency difference between the true sites and false sites (FDTF) encoding, pair-wise nucleotides (PN) encoding, and PN with FDTF encoding. The results showed that PN with FDTF encoding as input to the SVM led to the most reliable recognition of splice sites: the accuracies for the prediction of true donor sites and false sites were 96.3% and 93.7%, respectively, and the accuracies for the prediction of true acceptor sites and false sites were 94.0% and 93.2%, respectively.
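
To make the difference between the MN and PN encodings concrete, the sketch below builds both feature vectors for a short sequence window. It is an illustration under assumptions: the abstract does not give the exact encoding, so one-hot vectors over single nucleotides (MN) and over adjacent nucleotide pairs (PN) are assumed, the function names are hypothetical, and the FDTF weighting is omitted because the abstract does not define it.

```python
from itertools import product

NUCLEOTIDES = "ACGT"
# All 16 ordered pairs of adjacent nucleotides: AA, AC, ..., TT.
PAIRS = ["".join(p) for p in product(NUCLEOTIDES, repeat=2)]


def encode_mono(seq):
    """Assumed MN encoding: 4 one-hot features per position."""
    vec = []
    for base in seq.upper():
        vec.extend(1 if base == n else 0 for n in NUCLEOTIDES)
    return vec


def encode_pairwise(seq):
    """Assumed PN encoding: 16 one-hot features per adjacent pair."""
    seq = seq.upper()
    vec = []
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        vec.extend(1 if pair == p else 0 for p in PAIRS)
    return vec


# A made-up five-base window; real inputs would flank a candidate site.
window = "GTAAG"
mono_features = encode_mono(window)      # 5 positions x 4 features
pair_features = encode_pairwise(window)  # 4 pairs x 16 features
```

Either vector could then be passed to an SVM implementation; the pairwise version keeps information about which nucleotides occur next to each other, which the single-nucleotide version discards.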

17.
MOTIVATION: Prediction of which peptides will bind a specific major histocompatibility complex (MHC) constitutes an important step in identifying potential T-cell epitopes suitable as vaccine candidates. MHC class II binding peptides have a broad length distribution complicating such predictions. Thus, identifying the correct alignment is a crucial part of identifying the core of an MHC class II binding motif. In this context, we wish to describe a novel Gibbs motif sampler method ideally suited for recognizing such weak sequence motifs. The method is based on the Gibbs sampling method, and it incorporates novel features optimized for the task of recognizing the binding motif of MHC classes I and II. The method locates the binding motif in a set of sequences and characterizes the motif in terms of a weight-matrix. Subsequently, the weight-matrix can be applied to effectively identify potential MHC binding peptides and to guide the process of rational vaccine design. RESULTS: We apply the motif sampler method to the complex problem of MHC class II binding. The input to the method is amino acid peptide sequences extracted from the public databases of SYFPEITHI and MHCPEP and known to bind to the MHC class II complex HLA-DR4(B1*0401). Prior identification of information-rich (anchor) positions in the binding motif is shown to improve the predictive performance of the Gibbs sampler. Similarly, a consensus solution obtained from an ensemble average over suboptimal solutions is shown to outperform the use of a single optimal solution. In a large-scale benchmark calculation, the performance is quantified using relative operating characteristic (ROC) curve plots and we make a detailed comparison of the performance with that of both the TEPITOPE method and a weight-matrix derived using the conventional alignment algorithm of ClustalW. The calculation demonstrates that the predictive performance of the Gibbs sampler is higher than that of ClustalW and in most cases also higher than that of the TEPITOPE method.

18.
19.
MOTIVATION: The Majority Vote approach has demonstrated that protein-protein interactions can be used to predict the structure or function of a protein. In this article we propose a novel method for the prediction of such protein characteristics based on frequencies of pairwise interactions. In addition, we study a second new approach using the pattern frequencies of triplets of proteins, thus for the first time taking network structure explicitly into account. Both these methods are extended to jointly consider multiple organisms and multiple characteristics. RESULTS: Compared to the standard non-network-based method, namely the Majority Vote method, in large networks our predictions tend to be more accurate. For structure prediction, the Frequency-based method reaches up to 71% accuracy, and the Triplet-based method reaches up to 72% accuracy, whereas for function prediction, both the Triplet-based method and the Frequency-based method reach up to 90% accuracy. Function prediction on proteins without homologues showed slightly less but comparable accuracies. Including partially annotated proteins substantially increases the number of proteins for which our methods predict their characteristics with reasonable accuracy. We find that the enhanced Triplet-based method does not currently yield significantly better results than the enhanced Frequency-based method, suggesting that triplets of interactions do not contain substantially more information about protein characteristics than interaction pairs. Our methods offer two main improvements over current approaches: first, multiple protein characteristics are considered simultaneously, and second, data is integrated from multiple species. In addition, the Triplet-based method includes network structure more explicitly than the Majority Vote and the Frequency-based method. AVAILABILITY: The program is available upon request. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
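
The Majority Vote baseline that this abstract compares against can be sketched in a few lines. This is a generic illustration of the idea (assign a protein the most common annotation among its interaction partners), not the implementation evaluated in the paper; the function name, the data layout, and the protein and category labels are hypothetical.

```python
from collections import Counter


def majority_vote(protein, interactions, annotations):
    """Predict a characteristic from the annotations of interaction partners.

    interactions: iterable of undirected (a, b) protein pairs.
    annotations: dict mapping annotated proteins to one category each.
    Returns None when no annotated partner exists. Tie-breaking between
    equally common categories is arbitrary in this sketch.
    """
    neighbours = set()
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        if a == protein:
            neighbours.add(b)
        elif b == protein:
            neighbours.add(a)
    votes = Counter(annotations[n] for n in neighbours if n in annotations)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Placeholder network: X is unannotated and has three annotated partners.
example_interactions = [("X", "A"), ("X", "B"), ("C", "X"), ("A", "B")]
example_annotations = {"A": "kinase", "B": "kinase", "C": "transport"}
example_prediction = majority_vote("X", example_interactions, example_annotations)
```

The sketch uses only direct partners and a single characteristic per protein; the Frequency-based and Triplet-based methods described in the abstract go beyond this, and their details are not given here.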

20.
Ho WH, Lee KT, Chen HY, Ho TW, Chiu HC. PLoS ONE, 2012, 7(1): e29179

Background

A database for hepatocellular carcinoma (HCC) patients who had received hepatic resection was used to develop prediction models for 1-, 3- and 5-year disease-free survival based on a set of clinical parameters for this patient group.

Methods

The three prediction models included an artificial neural network (ANN) model, a logistic regression (LR) model, and a decision tree (DT) model. Data for 427, 354 and 297 HCC patients with histories of 1-, 3- and 5-year disease-free survival after hepatic resection, respectively, were extracted from the HCC patient database. From each of the three groups, 80% of the cases (342, 283 and 238 cases of 1-, 3- and 5-year disease-free survival, respectively) were selected to provide training data for the prediction models. The remaining 20% of cases in each group (85, 71 and 59 cases in the three respective groups) were assigned to validation groups for performance comparisons of the three models. The area under the receiver operating characteristic curve (AUROC) was used as the performance index for evaluating the three models.
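
The split sizes and the performance index described above can be reproduced with a short sketch. The code is an illustration only: the function names are hypothetical, rounding the 80% share to the nearest whole case is an assumption that happens to match the reported counts, and the AUROC is computed with the standard pairwise-ranking definition rather than with whatever software the study used.

```python
def split_counts(n_cases):
    """Return (training, validation) sizes for an 80/20 split.

    Rounding to the nearest case is an assumption; integer arithmetic
    avoids floating-point surprises.
    """
    n_train = (n_cases * 8 + 5) // 10
    return n_train, n_cases - n_train


def auroc(labels, scores):
    """AUROC as the probability that a positive case outranks a negative one.

    labels: 1 for the positive class, 0 for the negative class.
    Tied scores count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one case of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Group sizes from the abstract; the scores below are made up.
example_splits = [split_counts(n) for n in (427, 354, 297)]
example_auc = auroc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why it is a convenient single index for comparing the ANN, LR and DT models on the same validation cases.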

Conclusions

The ANN model outperformed the LR and DT models in terms of prediction accuracy. This study demonstrated the feasibility of using ANNs in medical decision support systems for predicting disease-free survival based on clinical databases in HCC patients who have received hepatic resection.
