Currently, 119 high resolution structures of Thermotoga maritima proteins have been determined by the Joint Center for Structural Genomics (JCSG, www.jcsg.org). Sixty-seven of these were solved using the first implementation of the multi-tiered crystallization strategy at the JCSG for the efficient crystallization of large numbers of protein targets. Previously, we reported the analysis of all proteins crystallized using this multi-tiered strategy [Lesley, S.A. et al. (2002) Proc. Natl. Acad. Sci. USA 99, 11664–11669; Page, R. et al. (2003) Acta Crystallogr. D Biol. Crystallogr. 59, 1028–1037]. Here, we extend the analysis and describe the crystallization characteristics of those proteins that produced diffraction quality crystals, ultimately resulting in high resolution structures. First, we found that over 77% (52) of the crystals used for structure determination were produced directly from high-throughput coarse screens, indicating that less than one quarter of the crystals (15) required fine screening. In addition, as observed for the proteome screen [Page, R. et al. (2003) Acta Crystallogr. D Biol. Crystallogr. 59, 1028–1037], the majority of conditions that produced crystals for natively expressed proteins, whose structures have been determined, were distinct from those of their more extensively purified and selenomethionine-labeled counterparts. Finally, 99% of the proteins whose structures were solved crystallized in conditions contained in the JCSG Minimal Core Screen [Page, R. et al. (2003) Acta Crystallogr. D Biol. Crystallogr. 59, 1028–1037; Page, R. and Stevens, R.C. (2004) Methods 34, 373–389], a set of 67 conditions previously identified as those most likely to produce crystals of a diverse set of proteins, confirming its success for rapid identification of proteins with a natural propensity to crystallize.  相似文献   

Protein crystallization is a major bottleneck in protein X-ray crystallography, the workhorse of most structural proteomics projects. Because the principles that govern protein crystallization are too poorly understood to allow them to be used in a strongly predictive sense, the most common crystallization strategy entails screening a wide variety of solution conditions to identify the small subset that will support crystal nucleation and growth. We tested the hypothesis that more efficient crystallization strategies could be formulated by extracting useful patterns and correlations from the large data sets of crystallization trials created in structural proteomics projects. A database of crystallization conditions was constructed for 755 different proteins purified and crystallized under uniform conditions. Forty-five percent of the proteins formed crystals. Data mining identified the conditions that crystallize the most proteins, revealed that many conditions are highly correlated in their behavior, and showed that the crystallization success rate is markedly dependent on the organism from which proteins derive. Of the proteins that crystallized in a 48-condition experiment, 60% could be crystallized in as few as 6 conditions and 94% in 24 conditions. Consideration of the full range of information coming from crystal screening trials allows one to design screens that are maximally productive while consuming minimal resources, and also suggests further useful conditions for extending existing screens.  相似文献   
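The idea of shrinking a screen to the few conditions that cover most crystallizable proteins can be sketched as a greedy set-cover selection. This is a minimal illustration and not necessarily the data-mining method the authors used; the function names and the toy hit table below are hypothetical.

```python
from typing import Dict, List, Set


def greedy_minimal_screen(hits: Dict[str, Set[str]], max_conditions: int) -> List[str]:
    """Greedy set cover: repeatedly pick the condition that crystallizes
    the most proteins not yet covered by earlier picks."""
    covered: Set[str] = set()
    chosen: List[str] = []
    if not hits:
        return chosen
    for _ in range(max_conditions):
        # Iterate over sorted names so ties are broken deterministically.
        best = max(sorted(hits), key=lambda c: len(hits[c] - covered))
        if not hits[best] - covered:
            break  # no remaining condition adds a new protein
        chosen.append(best)
        covered |= hits[best]
    return chosen


def coverage(hits: Dict[str, Set[str]], chosen: List[str]) -> float:
    """Fraction of all crystallized proteins covered by the chosen conditions."""
    all_proteins: Set[str] = set()
    for proteins in hits.values():
        all_proteins |= proteins
    if not all_proteins:
        return 0.0
    covered: Set[str] = set()
    for condition in chosen:
        covered |= hits[condition]
    return len(covered) / len(all_proteins)


# Hypothetical toy data: condition name -> proteins that crystallized in it.
TOY_HITS: Dict[str, Set[str]] = {
    "cond_A": {"p1", "p2", "p3"},
    "cond_B": {"p3", "p4"},
    "cond_C": {"p4", "p5"},
    "cond_D": {"p1"},
}

if __name__ == "__main__":
    screen = greedy_minimal_screen(TOY_HITS, max_conditions=2)
    print(screen, coverage(TOY_HITS, screen))
```

Greedy selection does not guarantee the smallest possible screen, but it is simple and makes the trade-off between screen size and coverage easy to inspect.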

By definition, structural genomics centers must be able to address a large number of diverse protein targets. The methods developed should permit parallel and cost-effective processing while allowing for the diverse nature of proteins. Our approach to this problem is a multi-tiered effort where targets are characterized and categorized by behavior and processed in parallel by appropriate methods. The Joint Center for Structural Genomics (JCSG) has applied this tactic to create a fully integrated and scalable structure determination pipeline. Highlights of the development of the current pipeline for protein production and crystallization are presented here.

Structural genomics (SG) initiatives are currently attempting to achieve the high-throughput determination of protein structures on a genome-wide scale. Here we analyze the SG target data that have been publicly released over a period of 16 months to assess the potential of the SG initiatives. We use statistical techniques most commonly applied in epidemiology to describe the dynamics of targets through the experimental SG pipeline. There is no clear bottleneck among the key stages of cloning, expression, purification and crystallization. An SG target will progress through each of these steps with a probability of approximately 45%. Around 80% of targets with diffraction data will yield a crystal structure, and 20% of targets with HSQC spectra will yield an NMR structure. We also find the overlaps among SG targets: 61% of SG protein sequences share at least 30% sequence identity with one or more other SG targets. There is no significant difference in average structure quality among SG structures and other structures in the PDB determined by "traditional" methods, but on average SG structures are deposited to the PDB twice as quickly after X-ray data collection.  相似文献   
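As a rough worked illustration of the per-stage figure above: if one assumes (the abstract does not state this) that the approximately 45% probability applies independently to each of the four named stages, the fraction of targets surviving through crystallization is about 0.45^4, or roughly 4%. The stage list and the independence assumption are simplifications for illustration only.

```python
from typing import List, Tuple

# Stage names taken from the abstract; the uniform, independent 45% rate
# per stage is an illustrative assumption, not a reported result.
STAGES: List[str] = ["cloning", "expression", "purification", "crystallization"]
P_STEP: float = 0.45


def cumulative_survival(stages: List[str], p_step: float) -> List[Tuple[str, float]]:
    """Return (stage, probability of having passed this and all earlier stages)."""
    results: List[Tuple[str, float]] = []
    p = 1.0
    for stage in stages:
        p *= p_step
        results.append((stage, p))
    return results


if __name__ == "__main__":
    for stage, p in cumulative_survival(STAGES, P_STEP):
        print(f"{stage:16s} {p:.4f}")
```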

Conceptually, protein crystallization can be divided into two phases: search and optimization. Robotic protein crystallization screening can speed up the search phase, and has the potential to increase process quality. Automated image classification helps to increase throughput and consistently generate objective results. Although the classification accuracy can always be improved, our image analysis system can classify images from 1536-well plates with high classification accuracy (85%) and ROC score (0.87), as evaluated on 127 human-classified protein screens containing 5600 crystal images and 189472 non-crystal images. Data mining can integrate results from high-throughput screens with information about crystallizing conditions, intrinsic protein properties, and results from crystallization optimization. We apply association mining, a data mining approach that identifies frequently occurring patterns among variables and their values. This approach segregates proteins into groups based on how they react in a broad range of conditions, and clusters cocktails to reflect their potential to achieve crystallization. These results may lead to crystallization screen optimization, and reveal associations between protein properties and crystallization conditions. We also postulate that past experience may lead us to the identification of initial conditions favorable to crystallization for novel proteins.



Structural studies of integral membrane proteins (IMPs) are often hampered by difficulties in producing stable homogenous samples for crystallization. To overcome this hurdle it has become common practice to screen large numbers of target proteins to find suitable candidates for crystallization. For such an approach to be effective, an efficient screening strategy is imperative. To this end, strategies have been developed that involve the use of green fluorescent protein (GFP) fusion constructs. However, these approaches suffer from two drawbacks; proteins with a translocated C-terminus cannot be tested and scale-up from analytical to preparative purification is often non-trivial and may require re-cloning.


Here we present a screening approach that prioritizes IMP targets based on three criteria: expression level, detergent solubilization yield and homogeneity as determined by high-throughput small-scale immobilized metal affinity chromatography (IMAC) and automated size-exclusion chromatography (SEC).


To validate the strategy, we screened 48 prokaryotic IMPs in two different vectors and two Escherichia coli strains. A set of 11 proteins passed all preset quality control checkpoints and was subjected to crystallization trials. Four of these crystallized directly in initial sparse matrix screens, highlighting the robustness of the strategy.


We have developed a rapid and cost efficient screening strategy that can be used for all IMPs regardless of topology. The analytical steps have been designed to be a good mimic of preparative purification, which greatly facilitates scale-up.

General significance

The screening approach presented here is intended and expected to help drive forward structural biology of membrane proteins.

As part of the Joint Center for Structural Genomics (JCSG) biological targets, the structures of soluble domains of membrane proteins from Thermotoga maritima were pursued. Here, we report the crystal structure of the soluble domain of TM1634, a putative membrane protein of 128 residues (15.1 kDa) and unknown function. The soluble domain of TM1634 is an alpha-helical dimer that contains a single tetratricopeptide repeat (TPR) motif in each monomer, where each motif is similar to that found in Tom20. The overall fold, however, is unique and a DALI search does not identify similar folds beyond the 38-residue TPR motif. Two different putative ligand binding sites, in which PEG200 and Co²⁺ were located, were identified using crystallography and NMR, respectively.

Using a high degree of automation, the Southeast Collaboratory for Structural Genomics (SECSG) has developed high throughput pipelines for protein production, and crystallization using a two-tiered approach. Primary, or tier-1, protein production focuses on producing proteins for members of large Pfam families that lack a representative structure in the Protein Data Bank. Target genomes are Pyrococcus furiosus and Caenorhabditis elegans. Selected human proteins are also under study. Tier-2 protein production, or target rescue, focuses on those tier-1 proteins, which either fail to crystallize or give poorly diffracting crystals. This two tier approach is more efficient since it allows the primary protein production groups to focus on the production of new targets while the tier-2 efforts focus on providing additional sample for further studies and modified protein for structure determination. Both efforts feed the SECSG high throughput crystallization pipeline, which is capable of screening over 40 proteins per week. Details of the various pipelines in use by the SECSG for protein production and crystallization, as well as some examples of target rescue are described.  相似文献   

Protein crystallization is one of the major bottlenecks in protein structure elucidation, with new strategies being constantly developed to improve the chances of crystallization. Generally, well‐ordered epitopes possessing complementary surfaces and capable of producing stable inter‐protein interactions generate a regular three‐dimensional arrangement of protein molecules which eventually results in a crystal lattice. Metals, when used for crystallization, with their various coordination numbers and geometries, can generate such epitopes, mediating protein oligomerization and/or establishing crystal contacts. Some examples of metal‐mediated oligomerization and crystallization, together with our experience on metal‐mediated crystallization of a putative rRNA methyltransferase from Sinorhizobium meliloti, are presented. Analysis of crystal structures from the Protein Data Bank (PDB) using a non‐redundant data set with a 90% identity cutoff reveals that around 67% of proteins contain at least one metal ion, with ~14% containing a combination of metal ions. Interestingly, metal-containing conditions in most commercially available and popular crystallization kits generally contain only a single metal ion, with combinations of metals in only a very few conditions. Based on the results presented in this review, it appears that crystallization screens need expansion with systematic screening of metal ions that could be crucial for stabilizing the protein structure or for establishing crystal contacts, thereby aiding protein crystallization.

In structural genomics centers, nuclear magnetic resonance (NMR) screening is in increasing use as a tool to identify folded proteins that are promising targets for three-dimensional structure determination by X-ray crystallography or NMR spectroscopy. The use of 1D 1H NMR spectra or 2D [1H,15N]-correlation spectroscopy (COSY) typically requires milligram quantities of unlabeled or isotope-labeled protein, respectively. Here, we outline ways towards miniaturization of a structural genomics pipeline with NMR screening for folded globular proteins, using a high-density micro-fermentation device and a microcoil NMR probe. The proteins are micro-expressed in unlabeled or isotope-labeled media, purified, and then subjected to 1D 1H NMR and/or 2D [1H,15N]-COSY screening. To demonstrate that the miniaturization is functioning effectively, we processed nine mouse homologue protein targets and compared the results with those from the “macro-scale” Joint Center of Structural Genomics (JCSG) high-throughput pipeline. The results from the two pipelines were comparable, illustrating that the data were not compromised in the miniaturized approach.  相似文献   

Structural Genomics has been successful in determining the structures of many unique proteins in a high throughput manner. Still, the number of known protein sequences is much larger than the number of experimentally solved protein structures. Homology (or comparative) modeling methods make use of experimental protein structures to build models for evolutionary related proteins. Thereby, experimental structure determination efforts and homology modeling complement each other in the exploration of the protein structure space. One of the challenges in using model information effectively has been to access all models available for a specific protein in heterogeneous formats at different sites using various incompatible accession code systems. Often, structure models for hundreds of proteins can be derived from a given experimentally determined structure, using a variety of established methods. This has been done by all of the PSI centers, and by various independent modeling groups. The goal of the Protein Model Portal (PMP) is to provide a single portal which gives access to the various models that can be leveraged from PSI targets and other experimental protein structures. A single interface allows all existing pre-computed models across these various sites to be queried simultaneously, and provides links to interactive services for template selection, target-template alignment, model building, and quality assessment. The current release of the portal consists of 7.6 million model structures provided by different partner resources (CSMP, JCSG, MCSG, NESG, NYSGXRC, JCMM, ModBase, SWISS-MODEL Repository). The PMP is available at and from the PSI Structural Genomics Knowledgebase.  相似文献   

The crystallization facility of the TB Structural Genomics Consortium, one of nine NIH-sponsored structural genomics pilot projects, employs a combinatorial random sampling technique in high-throughput crystallization screening. Although data are still sparse and a comprehensive analysis cannot be performed at this stage, preliminary results appear to validate the random-screening concept. A discussion of statistical crystallization data analysis aims to draw attention to the need for comprehensive and valid sampling protocols. In view of limited overlap in techniques and sampling parameters between the publicly funded high-throughput crystallography initiatives, exchange of information should be encouraged, aiming to effectively integrate data mining efforts into a comprehensive predictive framework for protein crystallization.  相似文献   

Structural characterization of the protein universe is the main mission of Structural Genomics (SG) programs. However, progress in gene sequencing technology, set in motion in the 1990s, has resulted in rapid expansion of protein sequence space: a twelvefold increase in the past seven years. For the SG field, this creates new challenges and necessitates a re-assessment of its strategies. Nevertheless, despite the growth of sequence space, at present nearly half of the content of the Swiss-Prot database and over 40% of Pfam protein families can be structurally modeled based on structures determined so far, with SG projects making an increasingly significant contribution. The SG contribution of new Pfam structures nearly doubled from 27.2% in 2003 to 51.6% in 2006.

The Joint Center for Structural Genomics (JCSG) has emphasized automation and parallel processing approaches. Here, we describe automated methods used across the cloning process with results from JCSG projects. The protocols for PCR, restriction digests and ligations, as well as for gel electrophoresis and microtiter plate assays have all been automated. The system has the capacity to routinely process 384 clones a week. This throughput can adequately supply our expression and purification pipeline with expression-ready clones, including novel targets and truncations. The utility of our system is demonstrated by our results from three diverse projects. In summary, 94% of the PCR amplicons generated to date have been successfully cloned and verified by sequencing (83% of the total attempted targets). Our results demonstrate the capabilities of this robotic platform to provide an avenue to high-throughput cloning which requires little manpower and is rapid and cost-effective while providing insights for method optimization.  相似文献   

Structural genomics (SG) has significantly increased the number of novel protein structures of targets with medical relevance. In the protein kinase area, SG has contributed >50% of all novel kinase structures during the past three years and determined more than 30 novel catalytic domain structures. Many of the released structures are inhibitor complexes and a number of them have identified new inhibitor binding modes and scaffolds. In addition, generated reagents, assays, and inhibitor screening data provide a diversity of chemogenomic data that can be utilized for early drug development. Here we discuss the currently available structural data for the kinase family, considering novel structures as well as inhibitor complexes. Our analysis revealed that the structural coverage of many kinase families is still rather poor, and inhibitor complexes with diverse inhibitors are only available for a few kinases. However, we anticipate that with the current rate of structure determination and the high-throughput technologies developed by SG programs these gaps will be closed soon. In addition, the generated reagents will put SG initiatives in a unique position, providing data beyond protein structure determination by identifying chemical probes and determining their binding modes and target specificity.

High-throughput molecular biology and crystallography advances have placed an increasing demand on crystallization, the one remaining bottleneck in macromolecular crystallography. This paper describes three experimental approaches, an incomplete factorial crystallization screen, a high-throughput nanoliter crystallization system, and the use of a neural net to predict crystallization conditions via a small sample (approximately 0.1%) of screening results. The use of these technologies has the potential to reduce time and sample requirements. Initial experimental results indicate that the incomplete factorial design detects initial crystallization conditions not previously discovered using commercial screens. This may be due to the ability of the incomplete factorial screen to sample a broader portion of "crystallization space," using a multidimensional set of components, concentrations, and physical conditions. The incomplete factorial screen is complemented by a neural network program used to model crystallization. This capability is used to help predict new crystallization conditions. An automated, nanoliter crystallization system, with a throughput of up to 400 conditions/h in 40-nl droplets (total volume), accommodates microbatch or traditional "sitting-drop" vapor diffusion experiments. The goal of this research is to develop a fully-automated high-throughput crystallization system that integrates incomplete factorial screen and neural net capabilities.  相似文献   

Proteins derived from the coding regions of Pyrococcus furiosus are targets for three-dimensional X-ray and NMR structure determination by the Southeast Collaboratory for Structural Genomics (SECSG). Of the 2200 open reading frames (ORFs) in this organism, 220 protein targets were cloned and expressed in a high-throughput (HT) recombinant system for crystallographic studies. However, only 96 of the expressed proteins could be crystallized and, of these, only 15 have led to structures. To address this issue, SECSG has recently developed a two-tier approach to protein production and crystallization. In this approach, tier-1 efforts are focused on producing protein for new Pfu targets using a high-throughput approach. Tier-2 protein production efforts support tier-1 activities by (1) producing additional protein for further crystallization trials, (2) producing modified protein (further purification, methylation, tag removal, selenium labeling, etc.) as required and (3) serving as a salvaging pathway for failed tier-1 proteins. In a recent study using this two-tiered approach, nine structures were determined from a set of 50 Pfu proteins that had failed to produce crystals suitable for X-ray diffraction analysis. These results validate this approach and suggest that it has application to other HT crystal structure determination applications.

X-ray crystallography is the predominant method for obtaining atomic-scale information about biological macromolecules. Despite the success of the technique, obtaining well diffracting crystals still critically limits going from protein to structure. In practice, the crystallization process proceeds through knowledge-informed empiricism. Better physico-chemical understanding remains elusive because of the large number of variables involved, hence little guidance is available to systematically identify solution conditions that promote crystallization. To help determine relationships between macromolecular properties and their crystallization propensity, we have trained statistical models on samples for 182 proteins supplied by the Northeast Structural Genomics consortium. Gaussian processes, which capture trends beyond the reach of linear statistical models, distinguish between two main physico-chemical mechanisms driving crystallization. One is characterized by low levels of side chain entropy and has been extensively reported in the literature. The other identifies specific electrostatic interactions not previously described in the crystallization context. Because evidence for two distinct mechanisms can be gleaned both from crystal contacts and from solution conditions leading to successful crystallization, the model offers future avenues for optimizing crystallization screens based on partial structural information. The availability of crystallization data coupled with structural outcomes analyzed through state-of-the-art statistical models may thus guide macromolecular crystallization toward a more rational basis.  相似文献   

The flood of new genomic sequence information together with technological innovations in protein structure determination have led to worldwide structural genomics (SG) initiatives. The goals of SG initiatives are to accelerate the process of protein structure determination, to fill in protein fold space and to provide information about the function of uncharacterized proteins. In the long-term, these outcomes are likely to impact on medical biotechnology and drug discovery, leading to a better understanding of disease as well as the development of new therapeutics. Here we describe the high throughput pipeline established at the University of Queensland in Australia. In this focused pipeline, the targets for structure determination are proteins that are expressed in mouse macrophage cells and that are inferred to have a role in innate immunity. The aim is to characterize the molecular structure and the biochemical and cellular function of these targets by using a parallel processing pipeline. The pipeline is designed to work with tens to hundreds of target gene products and comprises target selection, cloning, expression, purification, crystallization and structure determination. The structures from this pipeline will provide insights into the function of previously uncharacterized macrophage proteins and could lead to the validation of new drug targets for chronic obstructive pulmonary disease and arthritis.  相似文献   

Integral membrane proteins carry out some of the most important functions of living cells, yet relatively few details are known about their structures. This is due, in large part, to the difficulties associated with preparing membrane protein crystals suitable for X-ray diffraction analysis. Mechanistic studies of membrane protein crystallization may provide insights that will aid in determining future membrane protein structures. Accordingly, the solution behavior of the bacterial outer membrane protein OmpF porin was studied by static light scattering under conditions favorable for crystal growth. The second osmotic virial coefficient (B22) was found to be a predictor of the crystallization behavior of porin, as has previously been found for soluble proteins. Both tetragonal and trigonal porin crystals were found to form only within a narrow window of B22 values located at approximately −0.5 × 10⁻⁴ to −2 × 10⁻⁴ mol mL g⁻², which is similar to the "crystallization slot" observed for soluble proteins. The B22 behavior of protein-free detergent micelles proved very similar to that of porin-detergent complexes, suggesting that the detergent's contribution dominates the behavior of protein-detergent complexes under crystallizing conditions. This observation implies that, for any given detergent, it may be possible to construct membrane protein crystallization screens of general utility by manipulating the solution properties so as to drive detergent B22 values into the crystallization slot. Such screens would limit the screening effort to the detergent systems most likely to yield crystals, thereby minimizing protein requirements and improving productivity.
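The screening idea in the abstract above, keeping only solution conditions whose measured B22 falls inside the crystallization slot, can be sketched as a simple range check. The bounds are the approximate window quoted in the abstract; the function and constant names are hypothetical, and a real screen would also have to account for measurement uncertainty.

```python
# Approximate crystallization-slot bounds quoted in the abstract,
# in mol mL g^-2. Both values are negative (net attractive interactions).
SLOT_UPPER = -0.5e-4  # least attractive end of the window
SLOT_LOWER = -2.0e-4  # most attractive end of the window


def in_crystallization_slot(b22: float) -> bool:
    """Return True if a measured B22 (mol mL g^-2) lies inside the window."""
    return SLOT_LOWER <= b22 <= SLOT_UPPER


if __name__ == "__main__":
    # Hypothetical B22 measurements for three solution conditions.
    for value in (-1.0e-4, 0.5e-4, -5.0e-4):
        print(value, in_crystallization_slot(value))
```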
