Similar articles
 20 similar articles found (search time: 109 ms)
1.
Currently, 119 high-resolution structures of Thermotoga maritima proteins have been determined by the Joint Center for Structural Genomics (JCSG, www.jcsg.org). Sixty-seven of these were solved using the first implementation of the multi-tiered crystallization strategy at the JCSG for the efficient crystallization of large numbers of protein targets. Previously, we reported the analysis of all proteins crystallized using this multi-tiered strategy [Lesley, S.A. et al. (2002) Proc. Natl. Acad. Sci. USA 99, 11664–11669; Page, R. et al. (2003) Acta Crystallogr. D Biol. Crystallogr. 59, 1028–1037]. Here, we extend the analysis and describe the crystallization characteristics of those proteins that produced diffraction-quality crystals, ultimately resulting in high-resolution structures. First, we found that over 77% (52) of the crystals used for structure determination were produced directly from high-throughput coarse screens, indicating that less than one quarter of the crystals (15) required fine screening. In addition, as observed for the proteome screen [Page, R. et al. (2003) Acta Crystallogr. D Biol. Crystallogr. 59, 1028–1037], the majority of conditions that produced crystals for natively expressed proteins whose structures have been determined were distinct from those of their more extensively purified and selenomethionine-labeled counterparts. Finally, 99% of the proteins whose structures were solved crystallized in conditions contained in the JCSG Minimal Core Screen [Page, R. et al. (2003) Acta Crystallogr. D Biol. Crystallogr. 59, 1028–1037; Page, R. and Stevens, R.C. (2004) Methods 34, 373–389], a set of 67 conditions previously identified as those most likely to produce crystals of a diverse set of proteins, confirming its success for rapid identification of proteins with a natural propensity to crystallize.

2.
MOTIVATION: Modeling of protein interactions is often possible from known structures of related complexes, but finding the most appropriate template is often time-consuming. Hypothesized biological units (BUs) often differ from the asymmetric units, and it is usually preferable to model from the BUs. RESULTS: ProtBuD is a database of BUs for all structures in the Protein Data Bank (PDB). We use both the PDB's BUs and those from the Protein Quaternary Server (PQS). ProtBuD is searchable by PDB entry, by Structural Classification of Proteins (SCOP) designation or by pairs of SCOP designations. The database provides the asymmetric unit and BU contents of related proteins in the PDB as identified in SCOP and by Position-Specific Iterated BLAST (PSI-BLAST). The asymmetric unit differs from the PDB and/or PQS BUs for 52% of X-ray structures, and the PDB and PQS BUs disagree on 18% of entries. AVAILABILITY: The database is provided as a standalone program and a web server from http://dunbrack.fccc.edu/ProtBuD.php.

3.
The J-UNIO (JCSG protocol using the software UNIO) procedure for automated protein structure determination by NMR in solution is introduced. In the present implementation, J-UNIO makes use of APSY-NMR spectroscopy, 3D heteronuclear-resolved [1H,1H]-NOESY experiments, and the software UNIO. Applications with proteins from the JCSG target list with sizes up to 150 residues showed that the procedure is highly robust and efficient. In all instances the correct polypeptide fold was obtained in the first round of automated data analysis and structure calculation. After interactive validation of the data obtained from the automated routine, the quality of the final structures was comparable to results from interactive structure determination. Special advantages are that the NMR data have been recorded with 6-10 days of instrument time per protein, that there is only a single step of chemical shift adjustments to relate the backbone signals in the APSY-NMR spectra with the corresponding backbone signals in the NOESY spectra, and that the NOE-based amino acid side chain chemical shift assignments are automatically focused on those residues that are heavily weighted in the structure calculation. The individual working steps of J-UNIO are illustrated with the structure determination of the protein YP_926445.1 from Shewanella amazonensis, and the results obtained with 17 JCSG targets are critically evaluated.

4.
MOTIVATION: Experimental techniques alone cannot keep up with the production rate of protein sequences, while computational techniques for protein structure prediction have matured to a level at which they provide reliable structural characterization of proteins at large scale. Integration of multiple computational tools for protein structure prediction can complement experimental techniques. RESULTS: We present an automated pipeline for protein structure prediction. The centerpiece of the pipeline is our threading-based protein structure prediction system PROSPECT. The pipeline consists of a dozen tools for identification of protein domains and signal peptides, protein triage to determine the protein type (membrane or globular), protein fold recognition, generation of atomic structural models, prediction result validation, etc. Different processing and prediction branches are determined automatically by a prediction pipeline manager based on identified characteristics of the protein. The pipeline has been implemented to run in a heterogeneous computational environment as a client/server system with a web interface. Genome-scale applications on Caenorhabditis elegans, Pyrococcus furiosus and three cyanobacterial genomes are presented. AVAILABILITY: The pipeline is available at http://compbio.ornl.gov/proteinpipeline/

5.
6.
The POLYVIEW visualization server can be used to generate protein sequence annotations, including secondary structures, relative solvent accessibilities, functional motifs and polymorphic sites. Two-dimensional graphical representations in a customizable format may be generated for both known protein structures and predictions obtained using protein structure prediction servers. POLYVIEW may be used for automated generation of pictures with structural and functional annotations for publications and proteomic on-line resources. AVAILABILITY: http://polyview.cchmc.org.

7.
Membrane proteins constitute ~30% of prokaryotic and eukaryotic genomes but comprise a small fraction of the entries in protein structural databases. A number of features of membrane proteins render them challenging targets for the structural biologist, among which the most important is the difficulty in obtaining sufficient quantities of purified protein. We are exploring procedures to express and purify large numbers of prokaryotic membrane proteins. A set of 280 membrane proteins from Escherichia coli and Thermotoga maritima, a thermophile, was cloned and tested for expression in Escherichia coli. Under a set of standard conditions, expression could be detected in the membrane fraction for approximately 30% of the cloned targets. About 22 of the highest expressing membrane proteins were purified, typically in just two chromatographic steps. There was a clear correlation between the number of predicted transmembrane domains in a given target and its propensity to express and purify. Accordingly, the vast majority of successfully expressed and purified proteins had six or fewer transmembrane domains. We did not observe any clear advantage to the use of thermophilic targets. Two of the purified membrane proteins formed crystals. By comparison with protein production efforts for soluble proteins, where ~70% of cloned targets express and ~25% can be readily purified for structural studies [Christendat et al. (2000) Nat. Struct. Biol., 7, 903], our results demonstrate that a similar approach will succeed for membrane proteins, albeit with an expected higher attrition rate.

8.
9.
Introduction: Protein glycosylation is recognized as an important post-translational modification, with specific substructures having significant effects on protein folding, conformation, distribution, stability and activity. However, due to the structural complexity of glycans, elucidating glycan structure-function relationships is demanding. The fine detail of glycan structures attached to proteins (including sequence, branching, linkage and anomericity) is still best analysed after the glycans are released from the purified or mixture of glycoproteins (glycomics). The technologies currently available for glycomics are becoming streamlined and standardized and many features of protein glycosylation can now be determined using instruments available in most protein analytical laboratories.

Areas covered: This review focuses on the current glycomics technologies being commonly used for the analysis of the microheterogeneity of monosaccharide composition, sequence, branching and linkage of released N- and O-linked glycans that enable the determination of precise glycan structural determinants presented on secreted proteins and on the surface of all cells.

Expert commentary: Several emerging advances in these technologies enabling glycomics analysis are discussed. The technological and bioinformatics requirements to be able to accurately assign these precise glycan features at biological levels in a disease context are assessed.


10.
A standard set of three APSY-NMR experiments has been used in daily practice to obtain polypeptide backbone NMR assignments in globular proteins with sizes up to about 150 residues, which had been identified as targets for structure determination by the Joint Center for Structural Genomics (JCSG) under the auspices of the Protein Structure Initiative (PSI). In a representative sample of 30 proteins, initial fully automated data analysis with the software UNIO-MATCH-2014 yielded complete or partial assignments for over 90% of the residues. For most proteins the APSY data acquisition was completed in less than 30 h. The results of the automated procedure provided a basis for efficient interactive validation and extension to near-completion of the assignments by reference to the same 3D heteronuclear-resolved [1H,1H]-NOESY spectra that were subsequently used for the collection of conformational constraints. High-quality structures were obtained for all 30 proteins, using the J-UNIO protocol, which includes extensive automation of NMR structure determination.

11.

Background

Knottins are small, diverse and stable proteins with important drug design potential. They can be classified into 30 families which cover a wide range of sequences (1621 sequenced), three-dimensional structures (155 solved) and functions (> 10). Inter-knottin similarity lies mainly between 15% and 40% sequence identity and 1.5 to 4.5 Å backbone deviation, although all knottins share a tightly knotted disulfide core. This large variability is likely to arise from the highly diverse loops which connect the successive knotted cysteines. The prediction of structural models for all knottin sequences would open new directions for the analysis of interaction sites and provide a better understanding of the structural and functional organization of proteins sharing this scaffold.

Results

We have designed an automated modeling procedure for predicting the three-dimensional structure of knottins. The different steps of the homology modeling pipeline were carefully optimized relative to a test set of knottins with known structures: template selection and alignment, extraction of structural constraints and model building, model evaluation and refinement. After optimization, the accuracy of predicted models was shown to lie between 1.50 and 1.96 Å from native structures at 50% and 10% maximum sequence identity levels, respectively. These average model deviations represent an improvement varying between 0.74 and 1.17 Å over basic homology modeling derived from a single template. A database of 1621 structural models for all known knottin sequences was generated and is freely accessible from our web server at http://knottin.cbs.cnrs.fr. Models can also be interactively constructed from any knottin sequence using the structure prediction module Knoter1D3D available from our protein analysis toolkit PAT at http://pat.cbs.cnrs.fr.

Conclusions

This work explores different directions for a systematic homology modeling of a diverse family of protein sequences. In particular, we have shown that the accuracy of the models constructed at a low level of sequence identity can be improved by 1) a careful optimization of the modeling procedure, 2) the combination of multiple structural templates and 3) the use of conserved structural features as modeling restraints.

12.
The identification of protein–protein interactions is vital for understanding protein function, elucidating interaction mechanisms, and for practical applications in drug discovery. With the exponentially growing protein sequence data, fully automated computational methods that predict interactions between proteins are becoming essential components of system‐level function inference. A thorough analysis of protein complex structures demonstrated that binding site locations as well as the interfacial geometry are highly conserved across evolutionarily related proteins. Because the conformational space of protein–protein interactions is highly covered by experimental structures, sensitive protein threading techniques can be used to identify suitable templates for the accurate prediction of interfacial residues. Toward this goal, we developed eFindSitePPI, an algorithm that uses the three‐dimensional structure of a target protein, evolutionarily remotely related templates and machine learning techniques to predict binding residues. Using crystal structures, the average sensitivity (specificity) of eFindSitePPI in interfacial residue prediction is 0.46 (0.92). For weakly homologous protein models, these values only slightly decrease to 0.40–0.43 (0.91–0.92) demonstrating that eFindSitePPI performs well not only using experimental data but also tolerates structural imperfections in computer‐generated structures. In addition, eFindSitePPI detects specific molecular interactions at the interface; for instance, it correctly predicts approximately one half of hydrogen bonds and aromatic interactions, as well as one third of salt bridges and hydrophobic contacts. Comparative benchmarks against several dimer datasets show that eFindSitePPI outperforms other methods for protein‐binding residue prediction. It also features a carefully tuned confidence estimation system, which is particularly useful in large‐scale applications using raw genomic data. 
eFindSitePPI is freely available to the academic community at http://www.brylinski.org/efindsiteppi. Copyright © 2014 John Wiley & Sons, Ltd.
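
As a reading aid for the sensitivity (specificity) figures quoted in this abstract, the following is a minimal sketch of how the two metrics are conventionally computed from per-residue predictions. It is not taken from eFindSitePPI; the function name and the boolean per-residue encoding are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    predicted: Sequence[bool], actual: Sequence[bool]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for per-residue interface labels.

    predicted[i] / actual[i] are True when residue i is predicted / known
    to be an interfacial residue. (Hypothetical encoding for illustration.)
    """
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")

    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fn = sum(1 for p, a in zip(predicted, actual) if not p and a)
    tn = sum(1 for p, a in zip(predicted, actual) if not p and not a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)

    # Sensitivity: fraction of true interface residues that were recovered.
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    # Specificity: fraction of non-interface residues correctly left out.
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity
```

Under these definitions, a sensitivity of 0.46 with a specificity of 0.92 would mean that just under half of the true interfacial residues are recovered while about 8% of non-interfacial residues are wrongly flagged.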

13.
We describe a fully automated algorithm for finding functional sites on protein structures. Our method finds surface patches of unusual physicochemical properties on protein structures, and estimates the patches' probability of overlapping functional sites. Other methods for predicting the locations of specific types of functional sites exist, but in previous analyses, it has been difficult to compare methods when they are applied to different types of sites. Thus, we introduce a new statistical framework that enables rigorous comparisons of the usefulness of different physicochemical properties for predicting virtually any kind of functional site. The program's statistical models were trained for 11 individual properties (electrostatics, concavity, hydrophobicity, etc.) and for 15 neural network combination properties, all optimized and tested on 15 diverse protein functions. To simulate what to expect if the program were run on proteins of unknown function, as might arise from structural genomics, we tested it on 618 proteins of diverse mixed functions. In the higher-scoring top half of all predictions, a functional residue could typically be found within the first 1.7 residues chosen at random. The program may or may not use partial information about the protein's function type as an input, depending on which statistical model the user chooses to employ. If function type is used as an additional constraint, prediction accuracy usually increases, and is particularly good for enzymes, DNA-interacting sites, and oligomeric interfaces. The program can be accessed online (at http://hotpatch.mbi.ucla.edu).

14.
15.
Journal of Genetics and Genomics (《遗传学报》), 2022, 49(1): 20-29
G-quadruplexes in viral genomes can serve as targets of antiviral therapies, which has attracted wide interest. However, it is still not clear whether the pervasiveness of such elements in the viral world is the result of natural selection for functionality. In this study, we identified putative quadruplex-forming sequences (PQSs) across the known viral genomes and analyzed the abundance, structural stability, and conservation of viral PQSs. A Viral Putative G-quadruplex Database (http://jsjds.hzau.edu.cn/MBPC/ViPGD/index.php/home/index) was constructed to collect the details of each viral PQS, which provides guidance for selecting the desirable PQS. PQSs with two putative G-tetrads (G2-PQSs) were significantly enriched in both eukaryotic viruses and prokaryotic viruses, whereas PQSs with three putative G-tetrads (G3-PQSs) were enriched only in eukaryotic viruses and depleted in prokaryotic viruses. The structural stability of PQSs in prokaryotic viruses was significantly lower than that in eukaryotic viruses. Conservation analysis showed that G2-PQSs, rather than G3-PQSs, were highly conserved within the genus. This suggests that the G2-quadruplex might play an important role in viral biology, and that the difference in the occurrence of G-quadruplexes between eukaryotic viruses and prokaryotic viruses may result from the different selection pressures from hosts.
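
The abstract does not state how PQSs were identified. A minimal sketch of one commonly used motif-based search is given below, assuming the pattern "four G-runs of exactly n guanines separated by loops of 1 to 7 nucleotides", with n = 2 for G2-PQS and n = 3 for G3-PQS. The pattern, the loop limit and the function names are illustrative assumptions, not the authors' method.

```python
import re
from typing import List, Tuple


def pqs_pattern(tetrads: int, max_loop: int = 7) -> "re.Pattern[str]":
    """Build a regex for four G-runs of length `tetrads` joined by short loops.

    Assumed motif (not from the source): G{n} N{1,max_loop} G{n} ... G{n}.
    """
    run = "G{%d}" % tetrads
    loop = "[ACGTU]{1,%d}" % max_loop
    return re.compile("(?:%s%s){3}%s" % (run, loop, run))


def find_pqs(sequence: str, tetrads: int, max_loop: int = 7) -> List[Tuple[int, str]]:
    """Return (start, matched_sequence) for non-overlapping PQS hits.

    Only the given strand is scanned; a full search would also scan the
    reverse complement (or look for C-runs).
    """
    pattern = pqs_pattern(tetrads, max_loop)
    return [(m.start(), m.group(0)) for m in pattern.finditer(sequence.upper())]
```

For example, `find_pqs("AAGGGTTAGGGTTAGGGTTAGGGAA", tetrads=3)` reports the telomere-like motif starting at position 2, while `find_pqs("GGAGGAGGAGG", tetrads=3)` reports nothing because no run of three guanines is present.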

16.
17.
The wealth of genomic data available for many organisms has set the stage for the next phase of structure-function analysis. High-throughput structural genomics is currently the method of choice for rapid analysis of protein structure-function relationships on a proteome-wide basis. The Joint Center for Structural Genomics (JCSG), established in 2000 under the NIH/NIGMS Protein Structure Initiative, has developed and implemented an integrated high-throughput structure pipeline and applied it in a two-tiered approach to mining the proteome of the thermophilic bacterium Thermotoga maritima. In the first tier, the successful application of this integrated pipeline has resulted in the cloning and expression of 73% of the T. maritima proteome (1376 out of 1877 predicted genes), and has identified 465 proteins which produced crystal hits. These 465 proteins were compared with existing structural information, and a subset of 269 targets was selected to process towards structure determination in a second-tier effort. To date, the JCSG pipeline applied to the Thermotoga maritima proteome has resulted in 55 new structures, has identified 6 novel folds, and continues to identify structures with novel features.
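
The counts quoted in this abstract can be read as a pipeline funnel. The sketch below recomputes stage-to-stage and cumulative yields from those counts; the stage labels and the function name are assumptions for illustration, and the 55 structures are a "to date" figure rather than a final yield of the 269 second-tier targets.

```python
from typing import List, Tuple

# Counts as quoted in the abstract; stage labels are paraphrased.
JCSG_TM_STAGES: List[Tuple[str, int]] = [
    ("predicted genes", 1877),
    ("cloned and expressed", 1376),
    ("crystal hits", 465),
    ("selected for second tier", 269),
    ("new structures (to date)", 55),
]


def stage_yields(stages: List[Tuple[str, int]]) -> List[Tuple[str, int, float, float]]:
    """Return (name, count, fraction_of_previous_stage, fraction_of_first_stage)."""
    first = stages[0][1]
    previous = first
    rows = []
    for name, count in stages:
        rows.append((name, count, count / previous, count / first))
        previous = count
    return rows


if __name__ == "__main__":
    for name, count, step, overall in stage_yields(JCSG_TM_STAGES):
        print(f"{name:28s} {count:5d}  step {step:6.1%}  overall {overall:6.1%}")
```

The second row reproduces the 73% cloning and expression rate stated in the abstract (1376 / 1877).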

18.
G protein-coupled receptors (GPCRs), encoded by about 5% of human genes, comprise the largest family of integral membrane proteins and act as cell surface receptors responsible for the transduction of endogenous signals into a cellular response. Although tertiary structural information is crucial for function annotation and drug design, there are few experimentally determined GPCR structures. To address this issue, we employ the recently developed threading assembly refinement (TASSER) method to generate structure predictions for all 907 putative GPCRs in the human genome. Unlike traditional homology modeling approaches, TASSER modeling does not require solved homologous template structures; moreover, it often refines the structures closer to native. These features are essential for the comprehensive modeling of all human GPCRs when close homologous templates are absent. Based on a benchmarked confidence score, approximately 820 predicted models should have the correct folds. The majority of GPCR models share the characteristic seven-transmembrane helix topology, but 45 ORFs are predicted to have different structures. This is due to GPCR fragments that are predominantly from extracellular or intracellular domains as well as database annotation errors. Our preliminary validation includes the automated modeling of bovine rhodopsin, the only solved GPCR in the Protein Data Bank. With homologous templates excluded, the final model built by TASSER has a global Cα root-mean-squared deviation from native of 4.6 Å, with a root-mean-squared deviation in the transmembrane helix region of 2.1 Å. Models of several representative GPCRs are compared with mutagenesis and affinity labeling data, and consistent agreement is demonstrated. Structure clustering of the predicted models shows that GPCRs with similar structures tend to belong to a similar functional class even when their sequences are diverse.
These results demonstrate the usefulness and robustness of the in silico models for GPCR functional analysis. All predicted GPCR models are freely available for noncommercial users on our Web site (http://www.bioinformatics.buffalo.edu/GPCR).
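
For readers unfamiliar with the root-mean-squared deviation figures quoted above, here is a minimal sketch of the calculation over paired C-alpha coordinates. It assumes the model and native structures have already been optimally superimposed and that residues are paired one-to-one; the superposition step itself (for example the Kabsch algorithm) is omitted, and the function name is hypothetical.

```python
import math
from typing import Sequence, Tuple

Coord = Tuple[float, float, float]


def ca_rmsd(model: Sequence[Coord], native: Sequence[Coord]) -> float:
    """RMSD in the input units (e.g. angstroms) over paired C-alpha atoms.

    Assumes both coordinate sets are already superimposed; no fitting is done.
    """
    if len(model) != len(native) or not model:
        raise ValueError("coordinate lists must be non-empty and equal in length")

    squared_sum = 0.0
    for (mx, my, mz), (nx, ny, nz) in zip(model, native):
        squared_sum += (mx - nx) ** 2 + (my - ny) ** 2 + (mz - nz) ** 2

    # Mean of squared per-atom distances, then the square root.
    return math.sqrt(squared_sum / len(model))
```

Restricting the two lists to residues in the transmembrane helices before calling the function would give a region-specific value analogous to the one reported for the helix region.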

19.
FliG and FliM are switch proteins that regulate the rotation and switching of the flagellar motor. Several assembly models for FliG and FliM have recently been proposed; however, it remains unclear whether the assembly of the switch proteins is conserved among different bacterial species. We applied a combination of pull‐down, thermodynamic and structural analyses to characterize the FliM–FliG association from the mesophilic bacterium Helicobacter pylori. FliM binds to FliG with micromolar binding affinity, and their interaction is mediated through the middle domain of FliG (FliGM), which contains the EHPQR motif. Crystal structures of the middle domain of H. pylori FliM (FliMM) and of the FliGM–FliMM complex revealed that FliG binding triggered a conformational change of the FliM α3‐α1′ loop, especially Asp130 and Arg144. We furthermore showed that various highly conserved residues in this region are required for FliM–FliG complex formation. Although the FliM–FliG complex structure displayed a conserved binding mode when compared with that of Thermotoga maritima, variable residues were identified that may contribute to differential binding affinities across bacterial species. Comparison of the thermodynamic parameters of FliG–FliM interactions between H. pylori and Escherichia coli suggests that the molecular basis and binding properties of the FliM–FliG interaction are likely different between these two species.

20.
Circularly permuted green fluorescent protein (cGFP) was inserted into the hyperthermophilic maltose binding protein from Thermotoga maritima at two different locations: cGFP was either inserted between amino acid residues 206 and 207 or fused to the N-terminus of the maltose binding protein. The cloned DNA constructs were expressed in Escherichia coli cells, and the proteins were purified by metal chelate affinity chromatography. Conformational change upon ligand binding was monitored by the increase in fluorescence intensity. Both fusion proteins showed a significant fluorescence change at 0.5 mM maltose concentration, whereas their maltose binding affinities and optimum incubation times were different. Fluorescent biosensors based on mesophilic maltose binding proteins have been described in the literature, but there is a growing interest in biosensors based on thermostable proteins. Therefore, the developed protein constructs could be models for thermophilic protein-based fluorescent biosensors.
