Similar articles
 20 similar articles found (search time: 15 ms)
1.
2.
The implementation of efficient technologies for the production of recombinant mammalian proteins remains an outstanding challenge in many structural and functional genomics programs. We have developed a new method for rapid identification of soluble protein expression in E. coli, based on a separation of soluble protein from inclusion bodies by a filtration step at the colony level. The colony filtration (CoFi) blot is very well suited to screen libraries, and in the present work we used it to screen a deletion mutagenesis library.

3.
A key challenge for the academic and biopharmaceutical communities is the rapid and scalable production of recombinant proteins for supporting downstream applications ranging from therapeutic trials to structural genomics efforts. Here, we describe Daedalus, a novel system for the production of recombinant mammalian proteins, including immune receptors, cytokines and antibodies, in a human cell line culture system, often requiring <3 weeks to achieve stable, high-level expression. The inclusion of minimized ubiquitous chromatin opening elements in the transduction vectors is key for preventing genomic silencing and maintaining the stability of decigram levels of expression. This system can bypass the tedious and time-consuming steps of conventional protein production methods by employing the secretion pathway of serum-free adapted human suspension cell lines, such as 293 Freestyle. Using optimized lentiviral vectors, yields of 20-100 mg/l of correctly folded and post-translationally modified, endotoxin-free protein of up to ~70 kDa in size can be achieved in conventional, small-scale (100 ml) culture. At these yields, most proteins can be purified using a single size-exclusion chromatography step, immediately appropriate for use in structural, biophysical or therapeutic applications.

4.
Structural genomics programs are distributed worldwide and funded by large institutions such as the NIH in the United States, RIKEN in Japan, or the European Commission through the SPINE network in Europe. Such initiatives, essentially managed by large consortia, have led to technology and method developments at the different steps required to produce biological samples compatible with structural studies. Besides specific applications, method developments have centered mainly on miniaturization and parallelization. The challenge that academic laboratories face in pursuing structural genomics programs is to produce protein samples at a higher rate. The Structural Biology and Genomics Department (IGBMC, Illkirch, France) is involved in a structural genomics program on higher eukaryotes whose goal is to solve crystal structures of proteins and their complexes (including large complexes) related to human health and biotechnology. To achieve such a challenging goal, the Department has established a medium-throughput pipeline for producing protein samples suitable for structural biology studies. Here, we describe the setting up of our initiative from cloning to crystallization, and we demonstrate that structural genomics may be manageable by academic laboratories through strategic investments in robotics and by adapting classical bench protocols and new developments, in particular in the field of protein expression, to parallelization.

5.
Canaves JM. Proteins. 2004;56(1):19-27
Recently, the structures of two proteins belonging to the archease family, TM1083 from Thermotoga maritima and MTH1598 from Methanobacterium thermoautotrophicum, have been solved independently by two Protein Structure Initiative structural genomics pilot centers using X-ray crystallography and NMR, respectively. The archease protein family is a good example of one of the paradoxes of structural genomics: Approximately one third of protein structures produced by structural genomics centers have no known function and are still annotated as "hypothetical proteins" in the Protein Data Bank. In the case of archeases, despite the existence of two protein structures and abundant sequence information, there is still no function assigned to this protein family. Here, our group predicts, based on structural similarity, sequence conservation, and gene context analyses, that members of this protein family might function as chaperones or modulators of proteins involved in DNA/RNA processing. The conservation of genomic context for this protein family is constant from Archaea and Bacteria to humans, and suggests that unannotated open reading frames contiguous to them could be novel RNA/DNA binding proteins.

6.

Background  

Protein expression in E. coli is the most commonly used system to produce protein for structural studies, because it is fast and inexpensive and can produce large quantities of protein. However, when proteins from other species, such as mammals, are produced in this system, problems of protein expression and solubility arise [1]. Structural genomics projects are currently investigating proteomics pipelines that would produce sufficient quantities of recombinant proteins for structural studies of protein complexes. To investigate how the E. coli protein expression system could be used for this purpose, we purified apoptotic binary protein complexes formed between members of the Caspase Associated Recruitment Domain (CARD) family.

7.
We developed a new bacterial expression system that utilizes a combination of attributes (low temperature, induction of an mRNA-specific endoribonuclease causing host cell growth arrest, and culture condensation) to facilitate stable, high level protein expression, almost 30% of total cellular protein, without background protein synthesis. With the use of an optimized vector, exponentially growing cultures could be condensed 40-fold without affecting protein yields, which lowered sample labeling costs to a few percent of the cost of a typical labeling experiment. Because the host cells were completely growth-arrested, toxic amino acids such as selenomethionine and fluorophenylalanine were efficiently incorporated into recombinant proteins in the absence of cytotoxicity. Therefore, this expression system using Escherichia coli as a bioreactor is especially well suited to structural genomics, large-scale protein expressions, and the production of cytotoxic proteins.

8.
Structural genomics projects are providing large quantities of new 3D structural data for proteins. To monitor the quality of these data, we have developed the protein structure validation software suite (PSVS), for assessment of protein structures generated by NMR or X-ray crystallographic methods. PSVS is broadly applicable for structure quality assessment in structural biology projects. The software integrates under a single interface analyses from several widely-used structure quality evaluation tools, including PROCHECK (Laskowski et al., J Appl Crystallog 1993;26:283-291), MolProbity (Lovell et al., Proteins 2003;50:437-450), Verify3D (Luthy et al., Nature 1992;356:83-85), ProsaII (Sippl, Proteins 1993;17: 355-362), the PDB validation software, and various structure-validation tools developed in our own laboratory. PSVS provides standard constraint analyses, statistics on goodness-of-fit between structures and experimental data, and knowledge-based structure quality scores in standardized format suitable for database integration. The analysis provides both global and site-specific measures of protein structure quality. Global quality measures are reported as Z scores, based on calibration with a set of high-resolution X-ray crystal structures. PSVS is particularly useful in assessing protein structures determined by NMR methods, but is also valuable for assessing X-ray crystal structures or homology models. Using these tools, we assessed protein structures generated by the Northeast Structural Genomics Consortium and other international structural genomics projects, over a 5-year period. Protein structures produced from structural genomics projects exhibit quality score distributions similar to those of structures produced in traditional structural biology projects during the same time period. 
However, while some NMR structures have structure quality scores similar to those seen in higher-resolution X-ray crystal structures, the majority of NMR structures have lower scores. Potential reasons for this "structure quality score gap" between NMR and X-ray crystal structures are discussed.
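
The Z-score reporting mentioned in this abstract can be illustrated with a minimal sketch. It is not the PSVS implementation: the function name `z_score`, the reference values, and the use of the sample standard deviation are all assumptions made for illustration. The idea is only that a raw quality score is expressed in standard deviations from the mean of a calibration set.

```python
from statistics import mean, stdev


def z_score(raw_score, reference_scores):
    """Express raw_score in standard deviations from the reference-set mean."""
    mu = mean(reference_scores)
    # Sample standard deviation; the abstract does not say which estimator is used.
    sigma = stdev(reference_scores)
    return (raw_score - mu) / sigma


# Hypothetical raw scores for a calibration set of high-resolution structures.
reference = [10, 12, 14, 16, 18]

# A structure scoring at the calibration mean has Z = 0; lower raw scores give negative Z.
for raw in (14, 11):
    print(raw, round(z_score(raw, reference), 2))
```

A negative Z score in this sketch simply means the structure scores below the calibration-set average on that measure.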

9.
Protein disorder prediction: implications for structural proteomics   Total citations: 26, self-citations: 0, citations by others: 26
A great challenge in the proteomics and structural genomics era is to predict protein structure and function, including identification of those proteins that are partially or wholly unstructured. Disordered regions in proteins often contain short linear peptide motifs (e.g., SH3 ligands and targeting signals) that are important for protein function. We present here DisEMBL, a computational tool for prediction of disordered/unstructured regions within a protein sequence. As no clear definition of disorder exists, we have developed parameters based on several alternative definitions and introduced a new one based on the concept of "hot loops," i.e., coils with high temperature factors. Avoiding potentially disordered segments in protein expression constructs can increase expression, foldability, and stability of the expressed protein. DisEMBL is thus useful for target selection and the design of constructs as needed for many biochemical studies, particularly structural biology and structural genomics projects. The tool is freely available via a web interface (http://dis.embl.de) and can be downloaded for use in large-scale studies.

10.
For future structural and functional genomics programs, new tools will be required. The implementation of high-throughput (HTP) methods for protein production will be an essential element. Present HTP protein production developments in structural genomics are aimed at obtaining well-expressing and highly soluble proteins, which are preferred candidates for structure-function studies. Here, we describe a cheap and efficient procedure to identify well-expressing soluble proteins in Escherichia coli in a compact 96-well format. Reproducible lysis on filter plates, followed by a filtration step on 96-well filter plates, allows the efficient separation of inclusion bodies from the soluble fraction. In the following step, a dot blot procedure using anti-RGS-His(4) antibody (Qiagen) to detect expression of recombinant His-tagged protein is applied, allowing direct detection of the target protein in the soluble fraction. The method is well suited for automation and should be applicable to expression screening of most proteins and fusion domains for which specific antibodies are available.

11.
The dramatically increasing number of new protein sequences arising from genomics and proteomics calls for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins, and to infer molecular functions of the proteins based on the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at the Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information complements sequence information in deducing functional inferences for proteins with unknown molecular functions.
Two of the major premises of structural genomics are to discover a complete repertoire of protein folds in nature and to find molecular functions of the proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html).
Although infrastructure building and technology development are still the main focus of structural genomics programs [1–6], a considerable number of protein structures have already been produced, some of them coming directly out of semi-automated structure determination pipelines [6–10]. The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.

12.
Structural genomics efforts have led to increasing numbers of novel, uncharacterized protein structures with low sequence identity to known proteins, resulting in a growing need for structure-based function recognition tools. Our method, SeqFEATURE, robustly models protein functions described by sequence motifs using a structural representation. We built a library of models that shows good performance compared to other methods. In particular, SeqFEATURE demonstrates significant improvement over other methods when sequence and structural similarity are low.

13.
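Structural genomics research and nuclear magnetic resonance   Total citations: 4, self-citations: 0, citations by others: 4
The completion of genomic DNA sequencing projects for various organisms has brought structural biology into the era of structural genomics. Structural genomics is the systematic determination of the structures of all genome products. It uses high-throughput selection, expression, purification, structure determination, and computational analysis to provide an experimentally determined structure or a good theoretical model for every protein product of a genome, which will accelerate research in all areas of the life sciences. Advances in bioinformatics, genetic engineering, and structure determination techniques have made structural genomics research feasible. Recent methodological progress in nuclear magnetic resonance has made it a key method for high-throughput structure analysis in structural genomics.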

14.
Producing recombinant proteins in Escherichia coli (E. coli) is generally performed using a trial-and-error approach in which the different expression variables are tested independently of each other. As a consequence, interactions between variables are missed, which makes the trial-and-error approach quite time-consuming. In this paper, we report how switching from a trial-and-error to a fractional factorial approach allows four expression variables (E. coli strains, culture media, expression temperatures and N-terminal fusion tags) to be tested in a single experiment in less than 2 weeks. The method, called "Fusion-InFFact", was validated using four test proteins. In all cases, Fusion-InFFact allowed finding conditions for expressing high yields of soluble proteins. The method was originally set up for high-throughput structural genomics programs, but can be used in any recombinant protein expression project.
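
To make the fractional factorial idea concrete, here is a minimal sketch of a two-level half-fraction design for four factors. It is not the Fusion-InFFact design, which this abstract does not specify: the restriction to two levels per factor, the level labels, and the function name `half_fraction_design` are assumptions for illustration. The fourth factor's level is set by the product of the other three (the standard defining relation D = ABC), so 8 runs are used instead of the 16 of a full factorial.

```python
from itertools import product


def half_fraction_design(factors):
    """Build a 2^(k-1) design: the last factor's sign is the product of the others."""
    names = list(factors)
    *base, last = names
    runs = []
    for signs in product((-1, 1), repeat=len(base)):
        last_sign = 1
        for sign in signs:
            last_sign *= sign
        coded = dict(zip(base, signs))
        coded[last] = last_sign
        # Map coded levels (-1 / +1) to the low / high setting of each factor.
        runs.append({name: factors[name][0 if sign < 0 else 1]
                     for name, sign in coded.items()})
    return runs


# Hypothetical two-level settings for the four variables named in the abstract.
factors = {
    "strain": ("strain A", "strain B"),
    "medium": ("medium A", "medium B"),
    "temperature": ("low", "high"),
    "fusion_tag": ("tag A", "tag B"),
}

runs = half_fraction_design(factors)
for run in runs:
    print(run)
```

Each level of each factor appears in exactly half of the runs, which is what lets main effects be estimated from the reduced set of experiments.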

15.
Finding small molecules that modulate protein function is of primary importance in drug development and in the emerging field of chemical genomics. To facilitate the identification of such molecules, we developed a novel strategy making use of structural conservatism found in protein domain architecture and natural product-inspired compound library design. Domains and proteins identified as being structurally similar in their ligand-sensing cores are grouped in a protein structure similarity cluster (PSSC). Natural products can be considered as evolutionarily pre-validated ligands for multiple proteins, and therefore natural products that are known to interact with one of the PSSC member proteins are selected as guiding structures for compound library synthesis. Application of this novel strategy for compound library design provided enhanced hit rates in small compound libraries for structurally similar proteins.

16.
Recent progress in structure determination techniques has led to a significant growth in the number of known membrane protein structures, and the first structural genomics projects focusing on membrane proteins have been initiated, warranting an investigation of appropriate bioinformatics strategies for optimal structural target selection for these molecules. What determines a membrane protein fold? How many membrane structures need to be solved to provide sufficient structural coverage of the membrane protein sequence space? We present the CAMPS database (Computational Analysis of the Membrane Protein Space) containing almost 45,000 proteins with three or more predicted transmembrane helices (TMH) from 120 bacterial species. This large set of membrane proteins was subjected to single‐linkage clustering using only sequence alignments covering at least 40% of the TMH present in a given family. This process yielded 266 sequence clusters with at least 15 members, roughly corresponding to membrane structural folds, sufficiently structurally homogeneous in terms of the variation of TMH number between individual sequences. These clusters were further subdivided into functionally homogeneous subclusters according to the COG (Clusters of Orthologous Groups) system as well as more stringently defined families sharing at least 30% identity. The CAMPS sequence clusters are thus designed to reflect three main levels of interest for structural genomics: fold, function, and modeling distance. We present a library of Hidden Markov Models (HMM) derived from sequence alignments of TMH at these three levels of sequence similarity. Given that 24 out of 266 clusters corresponding to membrane folds already have associated known structures, we estimate that 242 additional new structures, one for each remaining cluster, would provide structural coverage at the fold level of roughly 70% of prokaryotic membrane proteins belonging to the currently most populated families. Proteins 2006. 
© 2006 Wiley‐Liss, Inc.
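
The single-linkage clustering step described in this abstract can be sketched as finding connected components: two proteins end up in the same cluster whenever a chain of qualifying alignments links them. The sketch below is not the CAMPS pipeline. The function name `single_linkage_clusters`, the toy protein identifiers, and the assumption that qualifying pairs have already been computed are all hypothetical. The final lines reproduce the abstract's own count of clusters still lacking a structure.

```python
from collections import defaultdict


def single_linkage_clusters(items, links):
    """Group items into connected components of the link graph (union-find)."""
    parent = {item: item for item in items}

    def find(x):
        # Follow parent pointers to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = defaultdict(set)
    for item in items:
        groups[find(item)].add(item)
    return list(groups.values())


# Toy data: pairs assumed to have passed the alignment-coverage criterion.
proteins = ["p1", "p2", "p3", "p4", "p5"]
links = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]

clusters = single_linkage_clusters(proteins, links)
large_clusters = [c for c in clusters if len(c) >= 3]  # size cutoff, as in the abstract's 15

# Figures from the abstract: 266 clusters, 24 with a known structure.
clusters_without_structure = 266 - 24
print(sorted(sorted(c) for c in clusters), clusters_without_structure)
```

Note that p1 and p3 are clustered together without a direct link between them, which is the defining property of single linkage.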

17.
For structural and functional genomics programs, new high-throughput methods to characterize well-expressing and highly soluble proteins are essential. A faster and more convenient approach to screen expression conditions of recombinant proteins compared to classical in vivo systems is the Escherichia coli cell-free expression system. Here, we describe a rapid procedure to screen for expression and solubility of recombinant proteins using an E. coli cell-free extract. The results presented cover 24 open reading frames of unknown function from different micro-organisms. In order to screen different variables that may interfere with solubility, we expressed the recombinant proteins with a hexahistidine (His6) tag, either N-terminal or C-terminal, at two temperatures (25 °C and 30 °C). The identification of recombinant proteins is performed by the dot blot procedure using an anti-histidine tag antibody. We designed a rapid method that allows the characterization of soluble candidates from a large number of genes or from a large number of variants and that is highly compatible with structural genomics expectations.

18.
The production of complex multidomain (membrane) proteins is a major hurdle in structural genomics and a generic approach for optimizing membrane protein expression is still lacking. We have devised a selection method to isolate mutant strains with improved functional expression of recombinant membrane proteins. By fusing green fluorescent protein and an erythromycin resistance marker (ErmC) to the C-terminus of a target protein, one simultaneously selects for variants with enhanced expression (increased erythromycin resistance) and correct folding (green fluorescent protein fluorescence). Three evolved hosts, displaying 2- to 8-fold increased expression of a plethora of proteins, were fully sequenced and shown to carry single-site mutations in the nisK gene. NisK is the sensor protein of a two-component regulatory system that directs nisin-A-mediated expression. The levels of recombinant membrane proteins were increased in the evolved strains, and in some cases their folding states were improved. The generality and simplicity of our approach allow rapid improvements of protein production yields by directed evolution in a high-throughput way.

19.
Mirkovic N, Li Z, Parnassa A, Murray D. Proteins. 2007;66(4):766-777
The technological breakthroughs in structural genomics were designed to facilitate the solution of a sufficient number of structures, so that as many protein sequences as possible can be structurally characterized with the aid of comparative modeling. The leverage of a solved structure is the number and quality of the models that can be produced using the structure as a template for modeling and may be viewed as the "currency" with which the success of a structural genomics endeavor can be measured. Moreover, the models obtained in this way should be valuable to all biologists. To this end, at the Northeast Structural Genomics Consortium (NESG), a modular computational pipeline for automated high-throughput leverage analysis was devised and used to assess the leverage of the 186 unique NESG structures solved during the first phase of the Protein Structure Initiative (January 2000 to July 2005). Here, the results of this analysis are presented. The number of sequences in the nonredundant protein sequence database covered by quality models produced by the pipeline is approximately 39,000, so that the average leverage is approximately 210 models per structure. Interestingly, only 7900 of these models fulfill the stringent modeling criterion of being at least 30% sequence-identical to the corresponding NESG structures. This study shows how high-throughput modeling increases the efficiency of structure determination efforts by providing enhanced coverage of protein structure space. In addition, the approach is useful in refining the boundaries of structural domains within larger protein sequences, subclassifying sequence diverse protein families, and defining structure-based strategies specific to a particular family.
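
The leverage figures quoted in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated above. The function and variable names are illustrative, and the fraction of models meeting the 30% identity criterion is a derived figure that the abstract itself does not state.

```python
def average_leverage(covered_sequences, solved_structures):
    """Average number of quality models obtained per solved structure."""
    return covered_sequences / solved_structures


nesg_structures = 186       # unique NESG structures (from the abstract)
covered_sequences = 39_000  # sequences covered by quality models (approximate)
stringent_models = 7_900    # models with at least 30% identity to the template

leverage = average_leverage(covered_sequences, nesg_structures)
stringent_fraction = stringent_models / covered_sequences

print(f"average leverage: {leverage:.0f} models per structure")
print(f"models meeting the 30% identity criterion: {stringent_fraction:.0%}")
```

The first figure rounds to the abstract's "approximately 210"; the second shows that about one model in five meets the stringent identity criterion.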

20.
Protein-fusion constructs have been used with great success for enhancing expression of soluble recombinant protein and as tags for affinity purification. Unfortunately the most popular tags, such as GST and MBP, are large, which hinders direct NMR studies of the fusion proteins. Cleavage of the fusion proteins often re-introduces problems with solubility and stability. Here we describe the use of N-terminally fused protein G (B1 domain) as a non-cleavable solubility-enhancement tag (SET) for structure determination of a dimeric protein complex. The SET enhances the solubility and stability of the fusion product dramatically while not interacting directly with the protein of interest. This approach can be used for structural characterization of poorly behaving protein systems, and would be especially useful for structural genomics studies.
