Similar literature
20 similar articles found
1.

Background  

The ability to predict antibody binding sites (also known as antigenic determinants or B-cell epitopes) for a given protein is a precursor to new vaccine design and diagnostics. Among the various methods of B-cell epitope identification, X-ray crystallography is one of the most reliable. Computational methods for B-cell epitope prediction exist that build on these experimental data. As the number of structures of antibody-protein complexes grows, further interest in prediction methods using 3D structure is anticipated. This work aims to establish a benchmark for 3D structure-based epitope prediction methods.

2.

Background  

Local structures of target mRNAs play a significant role in determining the efficacies of antisense oligonucleotides (ODNs), but some structure-based target site selection methods are limited by uncertainties in RNA secondary structure prediction. If all the predicted structures of a given mRNA within a certain energy limit could be used simultaneously, target site selection would obviously be improved in both reliability and efficiency. In this study, some key problems in ODN target selection on the basis of multiple predicted target mRNA structures are systematically discussed.

3.

Background  

Prediction of antigenic epitopes on protein surfaces is important for vaccine design. Most existing epitope prediction methods focus on protein sequences to predict continuous epitopes that are linear in sequence. Only a few structure-based epitope prediction algorithms are available, and they have not yet shown satisfactory performance.

4.

Background  

Identifying pockets on protein surfaces is of great importance for many structure-based drug design applications and protein-ligand docking algorithms. Over the last ten years, many geometric methods for the prediction of ligand-binding sites have been developed.

5.

Background  

Reduced representations of proteins have played a key role in the study of protein folding. Many such models are available, with different levels of representation detail. Although the usefulness of many such models for structural bioinformatics applications has been demonstrated in recent years, there are few intermediate-resolution models endowed with an energy model capable, for instance, of detecting native or native-like structures among decoy sets. The aim of the present work is to provide a discrete empirical potential for a reduced protein model, termed here PC2CA because it employs a PseudoCovalent structure with only 2 Centers of interaction per Amino acid, suitable for protein model quality assessment.

6.
We have investigated some of the basic principles that influence generation of protein structures using a fragment-based, random insertion method. We tested buildup methods and fragment library quality for accuracy in constructing a set of known structures. The parameters most influential in the construction procedure are bond and torsion angles, with minor inaccuracies in bond angles alone causing >6 Å Cα RMSD for a 150-residue protein. Idealization to a standard set of values corrects this problem, but changes the torsion angles and does not work for every structure. Alternatively, we found that using Cartesian coordinates instead of torsion angles did not reduce performance and can potentially increase speed and accuracy. Under conditions simulating ab initio structure prediction, fragment library quality can be suboptimal and still produce near-native structures. Using various clustering criteria, we created a number of libraries and used them to predict a set of native structures based on nonnative fragments. Local Cα RMSD fit of fragments, library size, and takeoff/landing angle criteria weakly influence the accuracy of the models. Based on a fragment's minimal perturbation upon insertion into a known structure, a seminative fragment library was created that produced more accurate structures with fragments that were less similar to native fragments than the other sets. These results suggest that fragments need only contain native-like subsections, which, when correctly overlapped, can recreate a native-like model. For fragment-based, random insertion methods used in protein structure prediction and design, our findings help to define the parameters this method needs to generate near-native structures.
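
The Cα RMSD values quoted in this abstract measure how far a model's alpha-carbon positions lie from the native ones. The sketch below is only an illustration of that metric, not the authors' code: the function name `ca_rmsd` and the toy coordinates are assumptions, and it presumes the two structures have already been superposed (no optimal-rotation step such as the Kabsch algorithm is performed).

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha positions.

    Assumes the structures are already superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Accumulate the squared distance between paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical three-residue example (coordinates in angstroms).
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
model = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 3.0, 0.0)]
print(round(ca_rmsd(native, model), 3))
```

Because only one of the three atoms is displaced (by 3 Å), the mean squared deviation is 9/3 and the RMSD is the square root of 3.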

7.

Background  

The selection of the most accurate protein model from a set of alternatives is a crucial step in protein structure prediction, in both template-based and ab initio approaches. Scoring functions have been developed which can either return a quality estimate for a single model or derive a score from the information contained in the ensemble of models for a given sequence. Local structural features occurring more frequently in the ensemble have a greater probability of being correct. Within the context of the CASP experiment, these so-called consensus methods have been shown to perform considerably better in selecting good candidate models, but tend to fail if the best models are far from the dominant structural cluster. In this paper we show that model selection can be improved if both approaches are combined by pre-filtering the models used during the calculation of the structural consensus.

8.

Background  

Profile hidden Markov model (HMM) techniques are among the most powerful methods for protein homology detection. Yet, the critical features for successful modelling are not fully known. In the present work we approached this by using two of the most popular HMM packages: SAM and HMMER. The programs' abilities to build models and score sequences were compared on a SCOP/Pfam based test set. The comparison was done separately for local and global HMM scoring.

9.

Background  

Since many of the new protein structures delivered by high-throughput processes do not have any known function, there is a need for structure-based prediction of protein function. Protein 3D structures can be clustered according to their fold or secondary structures to produce classes of some functional significance. A recent alternative has been to detect specific 3D motifs, which are often associated with active sites. Unfortunately, compared to the number of sequence motifs already known, there are very few known 3D motifs, and these are usually the result of a manual process. In this paper, we report a method to automatically generate 3D motifs of protein structure binding sites based on consensus atom positions and evaluate it on a set of adenine-based ligands.

10.
WGCNA: an R package for weighted correlation network analysis   Total citations: 12, self-citations: 0, citations by others: 12

Background

Modelling the time-related behaviour of biological systems is essential for understanding their dynamic responses to perturbations. In metabolic profiling studies, the sampling rate and number of sampling points are often restricted due to experimental and biological constraints.

Results

A supervised multivariate modelling approach with the objective to model the time-related variation in the data for short and sparsely sampled time-series is described. A set of piecewise Orthogonal Projections to Latent Structures (OPLS) models are estimated, describing changes between successive time points. The individual OPLS models are linear, but the piecewise combination of several models accommodates modelling and prediction of changes which are non-linear with respect to the time course. We demonstrate the method on both simulated and metabolic profiling data, illustrating how time related changes are successfully modelled and predicted.

Conclusion

The proposed method is effective for modelling and prediction of short and multivariate time series data. A key advantage of the method is model transparency, allowing easy interpretation of time-related variation in the data. The method provides a competitive complement to commonly applied multivariate methods such as OPLS and Principal Component Analysis (PCA) for modelling and analysis of short time-series data.

11.
12.

Background

Several gene sets for prediction of breast cancer survival have been derived from whole-genome mRNA expression profiles. Here, we develop a statistical framework to explore whether combination of the information from such sets may improve prediction of recurrence and breast cancer specific death in early-stage breast cancers. Microarray data from two clinically similar cohorts of breast cancer patients are used as training (n = 123) and test set (n = 81), respectively. Gene sets from eleven previously published gene signatures are included in the study.

Principal Findings

To investigate the relationship between breast cancer survival and gene expression on a particular gene set, a Cox proportional hazards model is applied using partial likelihood regression with an L2 penalty to avoid overfitting and using cross-validation to determine the penalty weight. The fitted models are applied to an independent test set to obtain a predicted risk for each individual and each gene set. Hierarchical clustering of the test individuals on the basis of the vector of predicted risks results in two clusters with distinct clinical characteristics in terms of the distribution of molecular subtypes, ER, PR status, TP53 mutation status and histological grade category, and associated with significantly different survival probabilities (recurrence: p = 0.005; breast cancer death: p = 0.014). Finally, principal components analysis of the gene signatures is used to derive combined predictors used to fit a new Cox model. This model classifies test individuals into two risk groups with distinct survival characteristics (recurrence: p = 0.003; breast cancer death: p = 0.001). The latter classifier outperforms all the individual gene signatures, as well as Cox models based on traditional clinical parameters and the Adjuvant! Online for survival prediction.

Conclusion

Combining the predictive strength of multiple gene signatures improves prediction of breast cancer survival. The presented methodology is broadly applicable to breast cancer risk assessment using any newly identified gene set.

13.
A combined ligand- and structure-based drug design approach provides a synergistic advantage over either method performed individually. The present work combines ligand- and structure-based pharmacophore generation. The ligand-oriented study employed the HypoGen module of Catalyst, in which we translated the experimental findings into 3-D pharmacophore models by identifying the key features (a four-point pharmacophore) necessary for interaction of the inhibitors with the active site of the HIV-1 protease enzyme, using a training set of 33 compounds belonging to the cyclic cyanoguanidine and cyclic urea derivatives. The most predictive pharmacophore model (hypothesis 1), consisting of four features, namely two hydrogen bond acceptors and two hydrophobic features, showed a correlation (r) of 0.90, a root mean square of 0.71, and a cost difference of 56.59 bits between null cost and fixed cost. The model was validated using the CatScramble technique and internal and external test set prediction. In the second phase of our study, a structure-based five-feature pharmacophore hypothesis was generated, which signifies the importance of hydrogen bond donor, hydrogen bond acceptor and hydrophobic interactions between the HIV-1 protease enzyme and its inhibitors. This work takes a significant step towards the full integration of ligand- and structure-based drug design methodologies, as the pharmacophoric features retrieved from the structure-based strategy complemented the features from the ligand-based study, thereby supporting the accuracy of the developed models. The ligand-based pharmacophore model was used in virtual screening of the Maybridge and NCI compound databases, resulting in the identification of four structurally diverse druggable compounds with nM activities.

14.

Background  

The rapid development of structural genomics has resulted in many "unknown function" proteins being deposited in the Protein Data Bank (PDB); thus, the functional prediction of these proteins has become a challenge for structural bioinformatics. Several sequence-based and structure-based methods have been developed to predict protein function, but these methods need further improvement in accuracy, sensitivity, and computational speed. Here, an accurate algorithm, CMASA (Contact MAtrix based local Structural Alignment algorithm), has been developed to predict unknown functions of proteins based on local protein structural similarity. This algorithm has been evaluated by building a test set including 164 enzyme families, and has also been compared to other methods.

15.

Background  

Secondary structure prediction is a useful first step toward 3D structure prediction. A number of successful secondary structure prediction methods use neural networks, but unfortunately, neural networks are not intuitively interpretable. By contrast, hidden Markov models are graphical, interpretable models. Moreover, they have been successfully used in many bioinformatic applications. Because they offer a strong statistical background and allow model interpretation, we propose a method based on hidden Markov models.

16.

Introduction

External validation studies are essential to study the generalizability of prediction models. Recently a permutation test, focusing on discrimination as quantified by the c-statistic, was proposed to judge whether a prediction model is transportable to a new setting. We aimed to evaluate this test and compare it to previously proposed procedures to judge any changes in c-statistic from development to external validation setting.

Methods

We compared the use of the permutation test to the use of benchmark values of the c-statistic following from a previously proposed framework to judge transportability of a prediction model. In a simulation study we developed a prediction model with logistic regression on a development set and validated it in the validation set. We concentrated on two scenarios: 1) the case-mix was more heterogeneous and predictor effects were weaker in the validation set compared to the development set, and 2) the case-mix was less heterogeneous in the validation set and predictor effects were identical in the validation and development set. Furthermore, we illustrated the methods in a case study using 15 datasets of patients suffering from traumatic brain injury.

Results

The permutation test indicated that the validation and development set were homogeneous in scenario 1 (in almost all simulated samples) and heterogeneous in scenario 2 (in 17%-39% of simulated samples). Previously proposed benchmark values of the c-statistic and the standard deviation of the linear predictors correctly pointed at the more heterogeneous case-mix in scenario 1 and the less heterogeneous case-mix in scenario 2.

Conclusion

The recently proposed permutation test may provide misleading results when externally validating prediction models in the presence of case-mix differences between the development and validation population. To correctly interpret the c-statistic found at external validation it is crucial to disentangle case-mix differences from incorrect regression coefficients.
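
The c-statistic discussed in this abstract is the probability that a randomly chosen patient with the event receives a higher predicted risk than a randomly chosen patient without it. The following is a minimal sketch of that definition for a binary outcome, not the procedure used in the study; the function name `c_statistic` and the four-patient data are hypothetical.

```python
def c_statistic(predicted, outcomes):
    """Concordance (c-statistic) for binary outcomes coded 1 (event) or 0.

    Counts, over all event/non-event pairs, how often the event case has
    the higher predicted risk; tied predictions count as half.
    """
    events = [p for p, y in zip(predicted, outcomes) if y == 1]
    non_events = [p for p, y in zip(predicted, outcomes) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                concordant += 1.0
            elif p_event == p_non:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Hypothetical predictions for four patients.
predicted = [0.9, 0.8, 0.3, 0.2]
outcomes = [1, 0, 1, 0]
print(c_statistic(predicted, outcomes))
```

In this toy case three of the four event/non-event pairs are ordered correctly, giving 0.75. The pairwise loop is quadratic in sample size, which is acceptable for illustration but slow for large cohorts.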

17.
Danielson ML  Lill MA 《Proteins》2012,80(1):246-260
Flexible loop regions play a critical role in the biological function of many proteins and have been shown to be involved in ligand binding. In the context of structure-based drug design, using or predicting an incorrect loop configuration can be detrimental to the study if the loop is capable of interacting with the ligand. Three protein systems, each with at least one flexible loop region in close proximity to the known binding site, were selected for loop prediction using the CorLps program: a six-residue loop region from phosphoribosylglycinamide formyltransferase (GART), two nine-residue loop regions from cytochrome P450 (CYP) 119, and an 11-residue loop region from enolase. The results of this study indicate that the statistically based DFIRE scoring function implemented in the CorLps program did not accurately rank native-like predicted loop configurations in any protein system. In an attempt to improve the ranking of the native-like predicted loop configurations, the MM/GBSA and the optimized MM/GBSA-dsr scoring functions were used to re-rank the predicted loops with and without bound ligand. In general, single snapshot MM/GBSA scoring provided the best ranking of native-like loop configurations. Based on the scoring function analyses presented, the optimal ranking of native-like loop configurations is still a difficult challenge and the choice of the "best" scoring function appears to be system dependent.

18.

Background

The interest in prognostic reviews is increasing, but to properly review existing evidence an accurate search filter for finding prediction research is needed. The aim of this paper was to validate and update two previously introduced search filters for finding prediction research in Medline: the Ingui filter and the Haynes Broad filter.

Methodology/Principal Findings

Based on a hand search of 6 general journals in 2008 we constructed two sets of papers. Set 1 consisted of prediction research papers (n = 71), and set 2 consisted of the remaining papers (n = 1133). Both search filters were validated in two ways, using diagnostic accuracy measures as performance measures. First, we compared studies in set 1 (reference) with studies retrieved by the search strategies as applied in Medline. Second, we compared studies from 4 published systematic reviews (reference) with studies retrieved by the search filter as applied in Medline. Next – using word frequency methods – we constructed an additional search string for finding prediction research. Both search filters were good at identifying clinical prediction models: sensitivity ranged from 0.94 to 1.0 using our hand search as reference, and 0.78 to 0.89 using the systematic reviews as reference. This latter performance measure even increased to around 0.95 (range 0.90 to 0.97) when either search filter was combined with the additional string that we developed. Retrieval rate of explorative prediction research was poor, whether using our hand search or the systematic reviews as reference, and even when combined with our additional search string: sensitivity ranged from 0.44 to 0.85.

Conclusions/Significance

Explorative prediction research is difficult to find in Medline using any of the currently available search filters. Yet, application of either the Ingui filter or the Haynes Broad filter results in a very low number of missed clinical prediction model studies.
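
The sensitivity figures in this abstract treat a search filter like a diagnostic test: papers in the reference set that the filter retrieves are true positives, and those it misses are false negatives. A minimal sketch of that bookkeeping follows, under the assumption that papers are represented by identifiers; the function name `filter_performance` and the ten-paper example are hypothetical and are not data from the study.

```python
def filter_performance(relevant_ids, retrieved_ids, all_ids):
    """Sensitivity and specificity of a search filter against a reference set.

    relevant_ids:  papers the reference standard marks as prediction research
    retrieved_ids: papers the search filter returned
    all_ids:       every paper that was screened
    """
    everything = set(all_ids)
    relevant = set(relevant_ids) & everything
    retrieved = set(retrieved_ids) & everything
    true_pos = len(relevant & retrieved)
    false_neg = len(relevant - retrieved)
    false_pos = len(retrieved - relevant)
    true_neg = len(everything - relevant - retrieved)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one relevant and one non-relevant paper")
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
    }


# Hypothetical screening of ten papers, four of them relevant.
result = filter_performance(
    relevant_ids=[0, 1, 2, 3],
    retrieved_ids=[0, 1, 2, 7],
    all_ids=range(10),
)
print(result)
```

Here the filter finds three of the four relevant papers (sensitivity 0.75) and wrongly retrieves one of the six others (specificity 5/6).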

19.

Background  

The number of protein targets with a known or predicted three-dimensional structure and of drug-like chemical compounds is growing rapidly, and so is the need for new therapeutic compounds or chemical probes. Performing flexible structure-based virtual screening computations on thousands of targets with millions of molecules is intractable for most laboratories, and is not necessarily desirable. Since shape complementarity is of primary importance for most protein-ligand interactions, we have developed a tool/protocol based on rigid-body docking to select compounds that fit well into binding sites.

20.

Background

Mortality prediction models generally require clinical data or are derived from information coded at discharge, limiting adjustment for presenting severity of illness in observational studies using administrative data.

Objectives

To develop and validate a mortality prediction model using administrative data available in the first 2 hospital days.

Research Design

After dividing the dataset into derivation and validation sets, we created a hierarchical generalized linear mortality model that included patient demographics, comorbidities, medications, therapies, and diagnostic tests administered in the first 2 hospital days. We then applied the model to the validation set.

Subjects

Patients aged ≥18 years admitted with pneumonia between July 2007 and June 2010 to 347 hospitals in Premier, Inc.’s Perspective database.

Measures

In-hospital mortality.

Results

The derivation cohort included 200,870 patients and the validation cohort had 50,037. Mortality was 7.2%. In the multivariable model, 3 demographic factors, 25 comorbidities, 41 medications, 7 diagnostic tests, and 9 treatments were associated with mortality. Factors that were most strongly associated with mortality included receipt of vasopressors, non-invasive ventilation, and bicarbonate. The model had a c-statistic of 0.85 in both cohorts. In the validation cohort, deciles of predicted risk ranged from 0.3% to 34.3% with observed risk over the same deciles from 0.1% to 33.7%.

Conclusions

A mortality model based on detailed administrative data available in the first 2 hospital days had good discrimination and calibration. The model compares favorably to clinically based prediction models and may be useful in observational studies when clinical data are not available.
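
Calibration, as reported in this abstract, compares predicted with observed risk across deciles of predicted risk. The sketch below shows one way such a table could be built; it is an illustration under assumptions, not the study's method. The function name `calibration_table` and the toy data are hypothetical, and patients are simply sorted by predicted risk and cut into equal-sized groups.

```python
def calibration_table(predicted, outcomes, groups=10):
    """Mean predicted risk and observed event rate per risk-ordered group.

    Returns a list of (mean_predicted, observed_rate) tuples, lowest risk first.
    """
    if len(predicted) != len(outcomes) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = sorted(zip(predicted, outcomes))
    n = len(pairs)
    table = []
    for g in range(groups):
        # Integer arithmetic keeps group sizes within one of each other.
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue
        mean_predicted = sum(p for p, _ in chunk) / len(chunk)
        observed_rate = sum(y for _, y in chunk) / len(chunk)
        table.append((mean_predicted, observed_rate))
    return table


# Hypothetical four-patient example split into two groups instead of ten.
rows = calibration_table([0.9, 0.1, 0.1, 0.9], [1, 0, 0, 1], groups=2)
for mean_predicted, observed_rate in rows:
    print(mean_predicted, observed_rate)
```

A well-calibrated model gives rows in which the two numbers are close, as in the reported ranges of 0.3% to 34.3% predicted and 0.1% to 33.7% observed.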
