Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
Selection of protein targets for study is central to structural biology and may be influenced by numerous factors. A key aim is to maximise returns for effort invested by identifying proteins with the balance of biophysical properties that are conducive to success at all stages (e.g. solubility, crystallisation) in the route towards a high resolution structural model. Selected targets can be optimised through construct design (e.g. to minimise protein disorder), switching to a homologous protein, and selection of experimental methodology (e.g. choice of expression system) to prime for efficient progress through the structural proteomics pipeline. Here we discuss computational techniques in target selection and optimisation, with more detailed focus on tools developed within the Scottish Structural Proteomics Facility (SSPF); namely XANNpred, ParCrys, OB-Score (target selection) and TarO (target optimisation). TarO runs a large number of algorithms, searching for homologues and annotating the pool of possible alternative targets. This pool of putative homologues is presented in a ranked, tabulated format and results are also visualised as an automatically generated and annotated multiple sequence alignment. The target selection algorithms each predict the propensity of a selected protein target to progress through the experimental stages leading to diffracting crystals. This single predictor approach has advantages for target selection, when compared with an approach using two or more predictors that each predict for success at a single experimental stage. The tools described here helped SSPF achieve a high (21%) success rate in progressing cloned targets to diffraction-quality crystals.

2.
X-ray crystallography is the primary approach to solve the three-dimensional structure of a protein. However, a major bottleneck of this method is the failure of multi-step experimental procedures to yield diffraction-quality crystals, including sequence cloning, protein material production, purification, crystallization and ultimately, structural determination. Accordingly, prediction of the propensity of a protein to successfully undergo these experimental procedures based on the protein sequence may help narrow down laborious experimental efforts and facilitate target selection. A number of bioinformatics methods based on protein sequence information have been developed for this purpose. However, our knowledge of the important determinants of propensity for a protein sequence to produce high diffraction-quality crystals remains largely incomplete. In practice, most of the existing methods display poorer performance when evaluated on larger and updated datasets. To address this problem, we constructed an up-to-date dataset as the benchmark, and subsequently developed a new approach termed ‘PredPPCrys’ using the support vector machine (SVM). Using a comprehensive set of multifaceted sequence-derived features in combination with a novel multi-step feature selection strategy, we identified and characterized the relative importance and contribution of each feature type to the prediction performance of five individual experimental steps required for successful crystallization. The resulting optimal candidate features were used as inputs to build the first-level SVM predictor (PredPPCrys I). Next, prediction outputs of PredPPCrys I were used as the input to build second-level SVM classifiers (PredPPCrys II), which led to significantly enhanced prediction performance. Benchmarking experiments indicated that our PredPPCrys method outperforms most existing procedures on both up-to-date and previous datasets. In addition, the predicted crystallization targets of currently non-crystallizable proteins were provided as compendium data, which are anticipated to facilitate target selection and design for the worldwide structural genomics consortium. PredPPCrys is freely available at http://www.structbioinfor.org/PredPPCrys.

3.

Background  

Current protocols yield crystals for <30% of known proteins, indicating that automatically identifying crystallizable proteins may improve high-throughput structural genomics efforts. We introduce CRYSTALP2, a kernel-based method that predicts the propensity of a given protein sequence to produce diffraction-quality crystals. This method utilizes the composition and collocation of amino acids, isoelectric point, and hydrophobicity, as estimated from the primary sequence, to generate predictions. CRYSTALP2 extends its predecessor, CRYSTALP, by enabling predictions for sequences of unrestricted size and provides improved prediction quality.
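
The two sequence-derived feature families named in this abstract, amino acid composition and collocation, can be illustrated with a minimal Python sketch. The function names, the `gap` parameter, and the normalisation are assumptions made for illustration; they are not taken from CRYSTALP2, whose actual feature definitions are not given here.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Fraction of each standard amino acid in the sequence (hypothetical helper)."""
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: counts.get(aa, 0) / len(seq) for aa in AMINO_ACIDS}


def collocation(seq, gap=0):
    """Frequency of ordered residue pairs separated by `gap` positions.

    gap=0 counts adjacent pairs (dipeptides); gap=1 skips one residue, etc.
    This is an assumed, simplified reading of "collocation".
    """
    pairs = Counter(
        (seq[i], seq[i + gap + 1]) for i in range(len(seq) - gap - 1)
    )
    total = sum(pairs.values())
    if total == 0:
        raise ValueError("sequence too short for the requested gap")
    return {
        a + b: pairs.get((a, b), 0) / total
        for a, b in product(AMINO_ACIDS, repeat=2)
    }


# Toy usage: a made-up four-residue sequence.
comp = composition("ACAC")
coll = collocation("ACAC", gap=0)
```

Feature vectors of this kind (20 composition values plus 400 pair frequencies per gap) would typically be passed to a classifier alongside isoelectric point and hydrophobicity estimates.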

4.
Overton IM  Barton GJ 《FEBS letters》2006,580(16):4005-4009
Target selection and ranking is fundamental to structural genomics. We present a Z-score scale, the "OB-Score", to rank potential targets by their predicted propensity to produce diffraction-quality crystals. The OB-Score is derived from a matrix of predicted isoelectric point and hydrophobicity values for nonredundant PDB entries solved to [...] ≥1 member with a high OB-Score, presenting favourable candidates for structural studies.

5.
Determining the structure of biological macromolecules by X-ray crystallography involves a series of steps: selection of the target molecule; cloning, expression, purification and crystallization; collection of diffraction data and determination of atomic positions. However, even when pure soluble protein is available, producing high-quality crystals remains a major bottleneck in structure determination. Here we present a guide for the non-expert to screen for appropriate crystallization conditions and optimize diffraction-quality crystal growth.

6.
Advances in genomics have yielded entire genetic sequences for a variety of prokaryotic and eukaryotic organisms. This accumulating information has escalated the demands for three-dimensional protein structure determinations. As a result, high-throughput structural genomics has become a major international research focus. This effort has already led to several significant improvements in X-ray crystallographic and nuclear magnetic resonance methodologies. Crystallography is currently the major contributor to three-dimensional protein structure information. However, the production of soluble, purified protein and diffraction-quality crystals are clearly the major roadblocks preventing the realization of high-throughput structure determination.

This paper discusses a novel approach that may improve the efficiency and success rate for protein crystallization. An automated nanodispensing system is used to rapidly prepare crystallization conditions using minimal sample. Proteins are subjected to an incomplete factorial screen (balanced parameter screen), thereby efficiently searching the entire “crystallization space” for suitable conditions. The screen conditions and scored experimental results are subsequently analyzed using a neural network algorithm to predict new conditions likely to yield improved crystals. Results based on a small number of proteins suggest that the combination of a balanced incomplete factorial screen and neural network analysis may provide an efficient method for producing diffraction-quality protein crystals.
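
The idea of a balanced incomplete factorial screen can be sketched in Python as follows. This is a simplified illustration, not the authors' procedure: it balances only single-factor level counts, and the factor names, levels, and function name are hypothetical.

```python
import random


def balanced_incomplete_factorial(factors, n_conditions, seed=0):
    """Draw n_conditions from the full factorial so that every level of
    every factor appears equally often (first-order balance only).

    factors: dict mapping factor name -> list of levels.
    n_conditions must be a multiple of each factor's number of levels.
    Note: this simple sketch does not balance pairs of factors and may
    occasionally emit the same combination twice.
    """
    rng = random.Random(seed)
    columns = {}
    for name, levels in factors.items():
        if n_conditions % len(levels) != 0:
            raise ValueError(
                f"n_conditions must be a multiple of {len(levels)} for {name!r}"
            )
        column = list(levels) * (n_conditions // len(levels))
        rng.shuffle(column)  # randomise which levels are combined
        columns[name] = column
    return [
        {name: columns[name][i] for name in factors}
        for i in range(n_conditions)
    ]


# Hypothetical factors: the full factorial has 3 * 4 * 2 = 24 combinations,
# but only 12 balanced conditions are prepared.
example_factors = {
    "precipitant": ["PEG 4000", "ammonium sulfate", "MPD"],
    "pH": [5.0, 6.5, 8.0, 9.0],
    "temperature_C": [4, 20],
}
screen = balanced_incomplete_factorial(example_factors, 12)
```

The scored outcomes of such a screen would then serve as training data for a model (a neural network in the abstract) that proposes new conditions.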


7.
8.
The prediction of functional sites in newly solved protein structures is a challenge for computational structural biology. Most methods for approaching this problem use evolutionary conservation as the primary indicator of the location of functional sites. However, sequence conservation reflects not only evolutionary selection at functional sites to maintain protein function, but also selection throughout the protein to maintain the stability of the folded state. To disentangle sequence conservation due to protein functional constraints from sequence conservation due to protein structural constraints, we use all-atom computational protein design methodology to predict sequence profiles expected under solely structural constraints, and to compute the free energy difference between the naturally occurring amino acid and the lowest free energy amino acid at each position. We show that functional sites are more likely than non-functional sites to have computed sequence profiles which differ significantly from the naturally occurring sequence profiles and to have residues with sub-optimal free energies, and that incorporation of these two measures improves sequence-based prediction of protein functional sites. The combined sequence- and structure-based functional site prediction method has been implemented in a publicly available web server.

9.
The limiting step in macromolecular crystallography is the preparation of protein crystals suitable for X-ray diffraction studies. A strong prerequisite for the success of crystallization experiments is the ability to produce monodisperse and properly folded protein samples. Since most proteins are now produced using recombinant methods, it has become possible to engineer target proteins with increased propensities to form well-diffracting crystals. Recent advances in bioinformatics, which take advantage of the enhanced information available in protein databases, are of enormous help for the design of modified proteins. Based on bioinformatics analyses, reducing the structural complexity of proteins or subjecting them to site-specific mutagenesis has proven to have a dramatic impact on both the yield of heterologous protein expression and crystallizability. Therefore, protein engineering represents a valid tool which supports classical crystallization screening with a more rational approach. This review describes key methods of protein engineering and provides a number of examples of their successful use in crystallization. Scope of proposed topic: This Topic is focused on state-of-the-art protein engineering techniques to increase the propensity of proteins to form crystals with suitable X-ray diffraction properties. Protein engineering methods have proven to be of great help for the crystallization of difficult targets. We herein review molecular biology and chemical methods to help protein crystallization.

11.
《MABS-AUSTIN》2013,5(6):1540-1550
Therapeutic antibodies must have attributes suitable for a drug product to be commercially marketed. An undesirable antibody characteristic is the propensity to aggregate. Although there are computational algorithms that predict the propensity of a protein to aggregate from sequence information alone, few consider the relevance of the native structure. The Spatial Aggregation Propensity (SAP) algorithm developed by Chennamsetty et al. incorporates structural and sequence information to identify motifs that contribute to protein aggregation. We have utilized the algorithm to design variants of a highly aggregation-prone IgG2. All variants were tested in a variety of high-throughput, small-scale assays to assess the utility of the method described herein. Many variants exhibited improved aggregation stability whether induced by agitation or thermal stress while still retaining bioactivity.

12.
Relatively low success rates of X-ray crystallography, which is the most popular method for solving protein structures, motivate development of novel methods that support selection of tractable protein targets. This aspect is particularly important in the context of the current structural genomics efforts that allow for a certain degree of flexibility in the target selection. We propose CRYSpred, a novel in-silico crystallization propensity predictor that uses a set of 15 novel features which utilize a broad range of inputs including charge, hydrophobicity, and amino acid composition derived from the protein chain, and the solvent accessibility and disorder predicted from the protein sequence. Our method outperforms seven modern crystallization propensity predictors on three benchmark test datasets that are independent of the training dataset. The strong predictive performance offered by CRYSpred is attributed to the careful design of the features, utilization of the comprehensive set of inputs, and the usage of the Support Vector Machine classifier. The inputs utilized by CRYSpred are well-aligned with the existing rules-of-thumb that are used in the structural genomics studies.

13.
The production of diffraction-quality crystals remains a difficult obstacle on the road to high-resolution structural characterization of proteins. This is primarily a result of the empirical nature of the process. Although crystallization is not predictable, factors inhibiting it are well established. First, crystal formation is always entropically unfavorable. Reducing the entropic cost of crystallizing a given protein is thus desirable. It is common practice to map boundaries and remove unstructured regions surrounding the folded protein domain. However, a problem arises when flexible regions are not at the boundaries but within a domain. Such regions cannot be deleted without adding new restraints to the domain. We encountered this problem during an attempt to crystallize the beta subunit of the eukaryotic signal recognition particle (SRbeta), bearing a long and flexible internal loop. Native SRbeta did not crystallize. However, after circularly permuting the protein by connecting the spatially close N and C termini with a short heptapeptide linker GGGSGGG and removing 26 highly flexible loop residues within the domain, we obtained diffraction-quality crystals. This protein-engineering method is simple and should be applicable to other proteins, especially because N and C termini of protein domains are often close in space. The success of this method profits from prior knowledge of the domain fold, which is becoming increasingly common in today's postgenomic era.
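
At the sequence level, the construct design described above can be sketched in a few lines of Python. The sketch assumes that the new termini are placed where the internal loop is removed, which is the usual meaning of circular permutation but is not spelled out in the abstract; the function name, index convention, and toy sequence are hypothetical, and only the GGGSGGG linker comes from the text.

```python
def circular_permutant(seq, loop_start, loop_end, linker="GGGSGGG"):
    """Build a circularly permuted sequence with an internal loop removed.

    loop_start, loop_end: 0-based, half-open indices of the loop to delete.
    The segment after the loop becomes the new N-terminal part, the linker
    joins the original C terminus to the original N terminus, and the
    segment before the loop becomes the new C-terminal part.
    """
    if not 0 < loop_start < loop_end < len(seq):
        raise ValueError("loop must lie strictly inside the sequence")
    return seq[loop_end:] + linker + seq[:loop_start]


# Toy sequence (not SRbeta): 'L' marks a made-up four-residue flexible loop.
toy = "AAAA" + "LLLL" + "BBBB"
permuted = circular_permutant(toy, loop_start=4, loop_end=8)
```

The design choice is that no residues of the folded domain are lost: deleting the loop opens the chain there, and the linker supplies the restraint that the loop used to provide.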

14.
The high-throughput structure determination pipelines developed by structural genomics programs offer a unique opportunity for data mining. One important question is how protein properties derived from a primary sequence correlate with the protein’s propensity to yield X-ray quality crystals (crystallizability) and 3D X-ray structures. A set of protein properties was computed for over 1,300 proteins that expressed well but were insoluble, and for ~720 unique proteins that resulted in X-ray structures. The correlation of the protein’s iso-electric point and grand average hydropathy (GRAVY) with crystallizability was analyzed for full length and domain constructs of protein targets. In a second step, several additional properties that can be calculated from the protein sequence were added and evaluated. Using statistical analyses we have identified a set of the attributes correlating with a protein’s propensity to crystallize and implemented a Support Vector Machine (SVM) classifier based on these. We have created applications to analyze and provide optimal boundary information for query sequences and to visualize the data. These tools are available via the web site .
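
GRAVY, one of the two properties analysed in this abstract, is the mean Kyte-Doolittle hydropathy over all residues of a sequence. A minimal Python sketch is given below; the function name is an illustrative choice, and the isoelectric point is left out because its value depends on which pKa set is assumed.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathy: sum of residue hydropathies / length."""
    if not seq:
        raise ValueError("sequence must not be empty")
    try:
        return sum(KYTE_DOOLITTLE[aa] for aa in seq.upper()) / len(seq)
    except KeyError as err:
        raise ValueError(f"non-standard residue: {err.args[0]!r}") from None


# Toy usage: positive values indicate a more hydrophobic sequence overall.
score = gravy("MKTAYIAKQR")
```

A value like this, computed for full-length and domain constructs, is the kind of sequence-derived attribute that could be fed to the SVM classifier mentioned in the abstract.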

15.
Local modeling of global interactome networks   Cited by: 3 (self-citations: 0; citations by others: 3)
MOTIVATION: Systems biology requires accurate models of protein complexes, including physical interactions that assemble and regulate these molecular machines. Yeast two-hybrid (Y2H) and affinity-purification/mass-spectrometry (AP-MS) technologies measure different protein-protein relationships, and issues of completeness, sensitivity and specificity fuel debate over which is best for high-throughput 'interactome' data collection. Static graphs currently used to model Y2H and AP-MS data neglect dynamic and spatial aspects of macromolecular complexes and pleiotropic protein function. RESULTS: We apply the local modeling methodology proposed by Scholtens and Gentleman (2004) to two publicly available datasets and demonstrate its uses, interpretation and limitations. Specifically, we use this technology to address four major issues pertaining to protein-protein networks. (1) We motivate the need to move from static global interactome graphs to local protein complex models. (2) We formally show that accurate local interactome models require both Y2H and AP-MS data, even in idealized situations. (3) We briefly discuss experimental design issues and how bait selection affects interpretability of results. (4) We point to the implications of local modeling for systems biology including functional annotation, new complex prediction, pathway interactivity and coordination with gene-expression data. AVAILABILITY: The local modeling algorithm and all protein complex estimates reported here can be found in the R package apComplex, available at http://www.bioconductor.org CONTACT: dscholtens@northwestern.edu SUPPLEMENTARY INFORMATION: http://daisy.prevmed.northwestern.edu/~denise/pubs/LocalModeling

16.
Application of molecular biology techniques to the production of new vaccines against different strains of the Newcastle disease virus (NDV) has been the subject of recent research reports. Development of improved techniques for genome sequencing has led to the recognition of protective mechanisms and the identification of possible candidate antigens. Such procedures could generate meaningful results regarding the virulence determinants of NDV. This study proposed an in silico approach by assembling potential and conserved epitopic regions of hemagglutinin–neuraminidase (HN) and fusion (F) glycoproteins of NDV to induce multiepitopic responses against the virus. Epitope predictions showed that the hypothetical synthetic construct could induce immature B and T cell epitopes that expect a high immune response because of their location in four distinct parts of the construct, namely the head, stalk and the heptad repeated regions known as the HRA and HRB domains. Most regions of the chimeric construct were found to have high antigenic propensity and surface accessibility, which further confirmed the strategy for selection of precise continuous and discontinuous epitopes of HN and F antigens. Thermodynamic folding of mRNA structures revealed correct folding of the RNA construct, indicating good stability of the mRNA to increase the efficiency of translation in the target host. The three-dimensional structure of the native HN-F chimeric protein was successfully generated and validated as a proper model which may define reliability, structural quality and conformation.

17.
Graph layout is extensively used in the fields of mathematics and computer science; however, these ideas and methods have not been extended in a general fashion to the construction of graphs for biological data. To this end, we have implemented a version of the Fruchterman-Reingold graph layout algorithm, extensively modified for the purpose of similarity analysis in biology. This algorithm rapidly and effectively generates clear two-dimensional (2D) or three-dimensional (3D) graphs representing similarity relationships such as protein sequence similarity. The implementation of the algorithm is general and applicable to most types of similarity information for biological data. AVAILABILITY: BioLayout is available for most UNIX platforms at the following web-site: http://www.ebi.ac.uk/research/cgg/services/layout.
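
For orientation, the unmodified Fruchterman-Reingold force-directed layout can be sketched in Python as below. This is the textbook 2D algorithm, not BioLayout's extensively modified version; the function signature, frame size, cooling schedule, and toy graph are illustrative assumptions.

```python
import math
import random


def fruchterman_reingold(nodes, edges, width=1.0, height=1.0,
                         iterations=50, seed=0):
    """Plain 2D Fruchterman-Reingold layout.

    nodes: list of hashable node ids; edges: list of (u, v) pairs.
    Returns a dict mapping node -> [x, y] inside the width x height frame.
    """
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in nodes}
    k = math.sqrt(width * height / len(nodes))  # ideal edge length
    temperature = width / 10                    # maximum step per iteration
    cooling = temperature / (iterations + 1)

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in nodes}

        # Repulsive force k^2 / d between every pair of nodes.
        for i, v in enumerate(nodes):
            for u in nodes[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[v][0] += dx / d * f
                disp[v][1] += dy / d * f
                disp[u][0] -= dx / d * f
                disp[u][1] -= dy / d * f

        # Attractive force d^2 / k along every edge.
        for v, u in edges:
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[v][0] -= dx / d * f
            disp[v][1] -= dy / d * f
            disp[u][0] += dx / d * f
            disp[u][1] += dy / d * f

        # Move each node, capped by the temperature, and keep it in the frame.
        for v in nodes:
            dx, dy = disp[v]
            d = math.hypot(dx, dy)
            if d > 0:
                step = min(d, temperature)
                pos[v][0] += dx / d * step
                pos[v][1] += dy / d * step
            pos[v][0] = min(width, max(0.0, pos[v][0]))
            pos[v][1] = min(height, max(0.0, pos[v][1]))

        temperature -= cooling

    return pos


# Toy similarity graph: edges link sequences deemed similar (made-up data).
toy_nodes = ["p1", "p2", "p3", "p4", "p5"]
toy_edges = [("p1", "p2"), ("p2", "p3"), ("p1", "p3"), ("p4", "p5")]
layout = fruchterman_reingold(toy_nodes, toy_edges)
```

In a similarity setting, edges would usually carry weights (for example, alignment scores) that scale the attractive force; that extension is omitted here.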

18.
Protein aggregation is the phenomenon of protein self-association potentially leading to detrimental effects on physiology, which is closely related to numerous human diseases such as Alzheimer's and Parkinson's disease. Despite progress in understanding the mechanism of protein aggregation, how natural selection against protein aggregation acts on subunits of protein complexes and on proteins with different contributions to organism fitness remains largely unknown. Here, we perform a proteome-wide analysis by using an experimentally validated algorithm, TANGO, and utilizing sequence, interactomic and phenotype-based functional genomic data from yeast, fly, and nematode. We find that proteins that are capable of forming homooligomeric complexes have lower aggregation propensity compared with proteins that do not function as homooligomers. Further, proteins that are essential to the fitness of an organism have lower aggregation propensity compared with nonessential ones. Our findings suggest that the selection force against protein aggregation acts across different hierarchies of biological systems.

19.
Studying similarities in protein molecules has become a fundamental activity in much of biology and biomedical research, for which methods such as multiple sequence alignments are widely used. Most methods available for such comparisons cater to studying proteins which have clearly recognizable evolutionary relationships but not to proteins that recognize the same or similar ligands but do not share similarities in their sequence or structural folds. In many cases, proteins in the latter class share structural similarities only in their binding sites. While several algorithms are available for comparing binding sites, there are none for deriving structural motifs of the binding sites, independent of the whole proteins. We report the development of SiteMotif, a new algorithm that compares binding sites from multiple proteins and derives sequence-order independent structural site motifs. We have tested the algorithm at multiple levels of complexity and demonstrate its performance in different scenarios. We have benchmarked against 3 current methods available for binding site comparison and demonstrate superior performance of our algorithm. We show that SiteMotif identifies new structural motifs of spatially conserved residues in proteins, even when there is no sequence or fold-level similarity. We expect SiteMotif to be useful for deriving key mechanistic insights into the mode of ligand interaction, predicting the ligand type that a protein can bind, and improving the sensitivity of functional annotation.

20.
Understanding the conformational propensities of proteins is key to solving many problems in structural biology and biophysics. The co-variation of pairs of mutations contained in multiple sequence alignments of protein families can be used to build a Potts Hamiltonian model of the sequence patterns which accurately predicts structural contacts. This observation paves the way to develop deeper connections between evolutionary fitness landscapes of entire protein families and the corresponding free energy landscapes which determine the conformational propensities of individual proteins. Using statistical energies determined from the Potts model and an alignment of 2896 PDB structures, we predict the propensity for particular kinase family proteins to assume a “DFG-out” conformation implicated in the susceptibility of some kinases to type-II inhibitors, and validate the predictions by comparison with the observed structural propensities of the corresponding proteins and experimental binding affinity data. We decompose the statistical energies to investigate which interactions contribute the most to the conformational preference for particular sequences and the corresponding proteins. We find that interactions involving the activation loop and the C-helix and HRD motif are primarily responsible for stabilizing the DFG-in state. This work illustrates how structural free energy landscapes and fitness landscapes of proteins can be used in an integrated way, and in the context of kinase family proteins, can potentially impact therapeutic design strategies.
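
The statistical energy of a sequence under a Potts model is a sum of single-site fields and pairwise couplings, which also makes it straightforward to decompose into per-pair contributions. The Python sketch below assumes the convention that lower energy means higher probability, P(S) proportional to exp(-E(S)); sign conventions differ between papers, and the data layout, function name, and toy parameters are hypothetical.

```python
def potts_energy(seq, fields, couplings):
    """Statistical energy E(S) = sum_i h_i(S_i) + sum_{i<j} J_ij(S_i, S_j).

    fields:    list of dicts, fields[i][residue] -> h_i value.
    couplings: dict keyed by (i, j) with i < j, each mapping
               (residue_i, residue_j) -> J_ij value; missing entries
               are treated as zero.
    """
    energy = sum(fields[i][res] for i, res in enumerate(seq))
    for i in range(len(seq)):
        for j in range(i + 1, len(seq)):
            energy += couplings.get((i, j), {}).get((seq[i], seq[j]), 0.0)
    return energy


# Toy two-position model with a two-letter alphabet (made-up parameters).
toy_fields = [{"A": -1.0, "G": 0.5}, {"A": 0.2, "G": -0.3}]
toy_couplings = {(0, 1): {("A", "G"): -0.5}}
e_ag = potts_energy("AG", toy_fields, toy_couplings)
e_ga = potts_energy("GA", toy_fields, toy_couplings)
```

In an analysis like the one described above, energies restricted to pairs that are in contact in one conformation but not the other could be compared to estimate a sequence's conformational preference; how those pair sets are chosen is beyond this sketch.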
