Similar literature
1.
Overton IM, Barton GJ. FEBS Letters, 2006, 580(16): 4005-4009
Target selection and ranking is fundamental to structural genomics. We present a Z-score scale, the "OB-Score", to rank potential targets by their predicted propensity to produce diffraction-quality crystals. The OB-Score is derived from a matrix of predicted isoelectric point and hydrophobicity values for nonredundant PDB entries solved to [...] ≥1 member with a high OB-Score, presenting favourable candidates for structural studies.

2.
Solution NMR structure determination of proteins revisited   (cited by: 2 total; 2 self-citations; 0 by others)
This 'Perspective' bears on the present state of protein structure determination by NMR in solution. The focus is on comparing the infrastructure available for NMR structure determination with that available for protein crystal structure determination by X-ray diffraction. The main conclusion is that the unique potential of NMR to also generate high-resolution data on dynamics, interactions and conformational equilibria has contributed to a lack of standard procedures for structure determination that would be readily amenable to improved efficiency by automation. To spark renewed discussion on the topic of NMR structure determination of proteins, procedural steps with high potential for improvement are identified.

3.
Pache RA, Aloy P. Proteomics, 2008, 8(10): 1959-1964
Recent years have seen the emergence of many large-scale proteomics initiatives that have identified thousands of new protein interactions and macromolecular assemblies. However, only a few of the discovered complexes meet the high quality standards required for prompt use in structural studies. This has created an increasing gap between the number of known protein interactions and complexes and those for which a high-resolution 3-D structure is available. Here, we present and validate a computational strategy to distinguish those complexes found in high-throughput affinity purification experiments that stand the best chance of being successfully expressed, purified and crystallized with little further intervention. Our method suggests that some 50 complexes recently discovered in yeast could readily enter structural biology pipelines.

4.
To study the substrate specificity of enzymes, we use the amidohydrolase and enolase superfamilies as model systems; members of these superfamilies share a common TIM barrel fold and catalyze a wide range of chemical reactions. Here, we describe a collaboration between the Enzyme Specificity Consortium (ENSPEC) and the New York SGX Research Center for Structural Genomics (NYSGXRC) that aims to maximize the structural coverage of the amidohydrolase and enolase superfamilies. Using sequence- and structure-based protein comparisons, we first selected 535 target proteins from a variety of genomes for high-throughput structure determination by X-ray crystallography; 63 of these targets were not previously annotated as superfamily members. To date, 20 unique amidohydrolase and 41 unique enolase structures have been determined, increasing the fraction of sequences in the two superfamilies that can be modeled based on at least 30% sequence identity from 45% to 73%. We present case studies of proteins related to uronate isomerase (an amidohydrolase superfamily member) and mandelate racemase (an enolase superfamily member) to illustrate how this structure-focused approach can be used to generate hypotheses about sequence–structure–function relationships.
Andrej Sali (corresponding author), URL: http://salilab.org

5.
The frequency and distribution of microsatellites (simple sequence repeats, SSRs) were analyzed in the mitogenomes of 19 phytopathogenic fungi covering five phyla. Our analysis revealed that the frequency and relative abundance of SSRs varied across the mitogenomes studied and were influenced neither by genome size nor by GC content. SSRs were found to be differentially distributed in genic and intergenic regions. An average of 5.14 (23.6%) SSRs were present in genic sequences and 21.7 (76.4%) SSRs were located in intergenic sequences. The relative abundance of SSRs was highest in the mitogenome of Aspergillus tubingensis and lowest in that of Phaeosphaeria nodorum, the average being 0.45. Trinucleotide repeats were the most abundant motifs in both the genic and intergenic regions of the mitogenomes. Among the genes, cox1 harbors the most SSRs, whereas cox3 and nad7 contain the fewest. Based on the presence of SSRs in particular genes, genetic relationships among individual organisms were also established.
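The SSR counts and relative abundance reported above can be illustrated with a minimal sketch. It assumes perfect repeats only, minimum repeat counts chosen here for illustration (the abstract does not state the thresholds used), and relative abundance defined as SSRs per kilobase; the function names `find_ssrs` and `relative_abundance` are hypothetical and this is not the authors' tool.

```python
import re

# Minimum number of repeats per motif length. These thresholds are an
# assumption for illustration; the abstract does not give the ones used.
MIN_REPEATS = {1: 10, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT").
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def relative_abundance(seq, min_repeats=MIN_REPEATS):
    """Number of SSRs per kilobase of sequence (assumed definition)."""
    return len(find_ssrs(seq, min_repeats)) / (len(seq) / 1000.0)


if __name__ == "__main__":
    toy = "GC" + "AT" * 6 + "GGC" + "CAG" * 4 + "TT"
    for start, motif, count in find_ssrs(toy):
        print(start, motif, count)
```

Splitting the hits into genic and intergenic sets would additionally require gene coordinates from the mitogenome annotation, which the sketch leaves out.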

6.
Persistent hurdles impede the successful determination of high-resolution crystal structures of eukaryotic integral membrane proteins (IMPs). We designed a high-throughput, structural-genomics-oriented pipeline that seeks to minimize the effort needed to uncover high-quality, responsive, non-redundant targets for crystallization. This "discovery-oriented" pipeline sidesteps two significant bottlenecks in IMP structure determination: expression and membrane extraction with detergent. In addition, proteins that enter the pipeline are rapidly vetted by their presence in the included volume on a size-exclusion column, a hallmark of well-behaved IMP targets. A screen of 384 rationally selected eukaryotic IMPs in baker's yeast, Saccharomyces cerevisiae, is outlined to demonstrate the results expected when applying this discovery-oriented pipeline to whole-organism membrane proteomes. Franklin A. Hays and Zygy Roe-Zurz have contributed equally to this work.

7.
Although pedigree selection is the most commonly used method for developing inbred lines of maize, there are no studies of its effect on the heterozygosity of the lines. The objective of this work was to study that effect. Thirteen F5 or F6 maize inbred lines developed by pedigree selection in four breeding programs, together with their F1 and F2-F4 ancestors, were genotyped with simple sequence repeat markers distributed along the genome. Simulations assuming different models of selection were also conducted to investigate the selective forces needed to explain the data. In the F2, F3 and F4, 40%, 66% and 86%, respectively, of the markers segregating in the F1 were fixed; that is, fixation was lower than the neutral expectation in the F2 and F3, but higher in the F4. Because of these opposite apparent directions of selection, the heterozygosity of the lines in the F5 or F6 generations did not differ significantly from neutral expectations. The time to fixation differed from that expected under neutrality for most chromosomes, indicating that selection is distributed across the genome, but apparent overdominant effects were higher in chromosome 7 than in other chromosomes. In conclusion, the relationship between heterozygosity and vigour may reduce the effectiveness of pedigree selection in its goal of selecting the more vigorous, homozygous individuals. A more effective procedure is proposed in which molecular markers are used to identify the more homozygous individuals, and the most vigorous of those individuals are then selected.
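One way to read the fixation percentages against the neutral expectation is sketched below. It assumes strict selfing in every generation, so that half of the loci still segregating are expected to become fixed at each step, and it converts the cumulative percentages quoted in the abstract into per-generation rates. Whether the authors made the comparison in exactly this way is an assumption; the function names are hypothetical.

```python
def neutral_cumulative_fixation(generation):
    """Expected fraction of F1-heterozygous loci fixed by generation F_n,
    assuming neutrality and selfing (heterozygosity halves each generation)."""
    return 1.0 - 0.5 ** (generation - 1)


def per_generation_fixation(cumulative):
    """Turn cumulative fixed fractions into the share of still-segregating
    loci that became fixed at each successive generation."""
    rates = []
    previous = 0.0
    for fixed in cumulative:
        rates.append((fixed - previous) / (1.0 - previous))
        previous = fixed
    return rates


if __name__ == "__main__":
    observed = [0.40, 0.66, 0.86]  # F2, F3, F4 values quoted in the abstract
    rates = per_generation_fixation(observed)
    for name, rate in zip(("F2", "F3", "F4"), rates):
        side = "below" if rate < 0.5 else "above"
        print(f"{name}: {rate:.3f} of segregating loci fixed "
              f"({side} the neutral 0.5)")
```

Under these assumptions the per-generation rates fall below 0.5 in the F2 and F3 and above 0.5 in the F4, which matches the direction of the deviations described in the abstract.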

8.
Automated protein structure calculation from NMR data   (cited by: 3 total; 1 self-citation; 2 by others)
Current software is almost at the stage of permitting completely automatic structure determination of small proteins of <15 kDa, from NMR spectra to structure validation, with minimal user interaction. This goal is welcome, as it makes structure calculation more objective and therefore more easily validated, without any loss in the quality of the structures generated. Moreover, it releases expert spectroscopists to carry out research that cannot be automated. It should not take much further effort to extend automation to ca. 20 kDa. However, there are technological barriers to further automation, of which the biggest are identified as: routines for peak picking; adoption and sharing of a common framework for structure calculation, including the assembly of an automated and trusted package for structure validation; and sample preparation, particularly for larger proteins. These barriers should be the main target for development of methodology for protein structure determination, particularly by structural genomics consortia.
Mike P. Williamson

9.
Structural genomic projects envision almost routine protein structure determinations, which are currently imaginable only for small proteins with molecular weights below 25,000 Da. For larger proteins, structural insight can be obtained by breaking them into small segments of amino acid sequences that can fold into native structures, even when isolated from the rest of the protein. Such segments are autonomously folding units (AFU) and have sizes suitable for fast structural analyses. Here, we propose to expand an intuitive procedure often employed for identifying biologically important domains to an automatic method for detecting putative folded protein fragments. The procedure is based on the recognition that large proteins can be regarded as a combination of independent domains conserved among diverse organisms. We thus have developed a program that reorganizes the output of BLAST searches and detects regions with a large number of similar sequences. To automate the detection process, it is reduced to a simple geometrical problem of recognizing rectangular shaped elevations in a graph that plots the number of similar sequences at each residue of a query sequence. We used our program to quantitatively corroborate the premise that segments with conserved sequences correspond to domains that fold into native structures. We applied our program to a test data set composed of 99 amino acid sequences containing 150 segments with structures listed in the Protein Data Bank, and thus known to fold into native structures. Overall, the fragments identified by our program have an almost 50% probability of forming a native structure, and comparable results are observed with sequences containing domain linkers classified in SCOP. Furthermore, we verified that our program identifies AFU in libraries from various organisms, and we found a significant number of AFU candidates for structural analysis, covering an estimated 5 to 20% of the genomic databases. 
Altogether, these results argue that methods based on sequence similarity can be useful for dissecting large proteins into small autonomously folding domains, and such methods may provide an efficient support to structural genomics projects.
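The geometric step described above, recognizing rectangular elevations in a plot of the number of similar sequences at each residue, can be sketched roughly as follows. This is a simplified illustration and not the authors' program: it assumes that BLAST hits have already been reduced to (start, end) intervals on the query, and it uses a fixed height threshold and minimum length, both hypothetical parameters.

```python
def coverage_profile(query_length, hits):
    """Count, for each query residue, how many hit intervals cover it.

    `hits` holds 1-based, inclusive (start, end) positions on the query."""
    diff = [0] * (query_length + 2)
    for start, end in hits:
        diff[start] += 1
        diff[end + 1] -= 1
    profile = []
    running = 0
    for i in range(1, query_length + 1):
        running += diff[i]
        profile.append(running)
    return profile


def find_elevations(profile, min_height, min_length):
    """Return 1-based (start, end) segments where coverage stays at or
    above min_height for at least min_length consecutive residues."""
    segments = []
    start = None
    for i, value in enumerate(profile, start=1):
        if value >= min_height and start is None:
            start = i
        elif value < min_height and start is not None:
            if i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    if start is not None and len(profile) - start + 1 >= min_length:
        segments.append((start, len(profile)))
    return segments


if __name__ == "__main__":
    hits = [(5, 20), (6, 21), (4, 19), (30, 32)]  # toy intervals
    profile = coverage_profile(40, hits)
    print(find_elevations(profile, min_height=2, min_length=10))
```

A single global threshold is the simplest possible choice; a profile with several plateaus of different heights would need a more careful edge-detection rule.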

10.
Intrinsically disordered proteins (IDPs) and proteins with long disordered regions are highly abundant in various proteomes. Despite their lack of well-defined ordered structure, these proteins and regions are frequently involved in crucial biological processes. Although these proteins have attracted the attention of many researchers in recent years, IDPs represent a significant challenge for structural characterization because they can affect many of the steps in the structure determination pipeline. Here we investigate the effects of IDPs on the structure determination process and the utility of disorder prediction in selecting and improving proteins for structural characterization. Examination of the extent of intrinsic disorder in existing crystal structures found that relatively few protein crystal structures contain extensive regions of intrinsic disorder. Although intrinsic disorder is not the only cause of crystallization failures and many structured proteins cannot be crystallized, filtering out highly disordered proteins from structure-determination target lists is still likely to be cost effective. It is therefore desirable to exclude highly disordered proteins from structure-determination target lists, and we show that disorder prediction can be applied effectively to enrich structure determination pipelines with proteins more likely to yield crystal structures. For structural investigation of specific proteins, disorder prediction can be used to improve targets for structure determination. Finally, a framework for considering intrinsic disorder in the structure determination pipeline is proposed.

11.
The experimental determination of protein structure is marred by a high failure rate at many stages. With the availability of large quantities of data from high-throughput structure determination in structural genomics centers, we can now learn to recognize protein features correlated with failures; thus, we can recognize proteins more likely to succeed and eventually learn how to modify those that are less likely to succeed. Here, we identify several protein features that correlate strongly with successful protein production and crystallization and combine them into a single score that assesses "crystallization feasibility." The formula derived here was tested with a jackknife procedure and validated on independent benchmark sets. The "crystallization feasibility" score described here is being applied to target selection in the Joint Center for Structural Genomics, and is now contributing to increasing the success rate, lowering the costs, and shortening the time for protein structure determination. Analyses of PDB depositions suggest that very similar features also play a role in non-high-throughput structure determination, suggesting that this crystallization feasibility score would also be of significant interest to structural biology, as well as to molecular biology and biochemistry laboratories.

12.
13.
DeWeese-Scott C, Moult J. Proteins, 2004, 55(4): 942-961
Experimental protein structures often provide extensive insight into the mode and specificity of small molecule binding, and this information is useful for understanding protein function and for the design of drugs. We have performed an analysis of the reliability with which ligand-binding information can be deduced from computer model structures, as opposed to experimentally derived ones. Models produced as part of the CASP experiments are used. The accuracy of contacts between protein model atoms and experimentally determined ligand atom positions is the main criterion. Only comparative models are included (i.e., models based on a sequence relationship between the protein of interest and a known structure). We find that, as expected, contact errors increase with decreasing sequence identity used as a basis for modeling. Analysis of the causes of errors shows that sequence alignment errors between model and experimental template have the most deleterious effect. In general, good, but not perfect, insight into ligand binding can be obtained from models based on a sequence relationship, provided there are no alignment errors in the model. The results support a structural genomics strategy based on experimental sampling of structure space so that all protein domains can be modeled on the basis of 30% or higher sequence identity.

14.
15.
Targeting of proteins for structure determination in structural genomic programs often includes the use of threading and fold recognition methods to exclude proteins belonging to well-populated fold families, but such methods can still fail to recognize preexisting folds. The authors illustrate here a method in which limited amounts of structural data are used to improve an initial homology search and the data are subsequently used to produce a structure by data-constrained refinement of an identified structural template. The data used are primarily NMR-based residual dipolar couplings, but they also include additional chemical shift and backbone-nuclear Overhauser effect data. Using this methodology, a backbone structure was efficiently produced for a 10 kDa protein (PF1455) from Pyrococcus furiosus. Its relationship to existing structures and its probable function are discussed.

16.
This paper describes the efforts of the structural genomics project in the nuclear magnetic resonance (NMR) laboratory at the University of Science and Technology of China. This structural genomics project is driven by biological function. Targets are mainly selected from two systems: proteins related to the regulation of gene expression in humans and other eukaryotes, and proteins present in cell junctions in humans. The majority of proteins selected from these two systems are related to human health and disease, and some are potential drug targets. Twenty-five protein structures from Homo sapiens and other eukaryotes have been determined in this laboratory during the last 5 years. NMR spectroscopy is highly suited to investigating molecular interactions under near-physiological conditions and is particularly suited to the study of low-affinity, transient complexes. It can provide information on protein surface interactions, the structures of protein complexes, and their dynamic properties during protein recognition. Several examples are given in this paper.

17.
18.
A spin-diffusion-suppressed NOE buildup series has been measured for E. coli thioredoxin. The extensive 13C and 15N relaxation data previously reported for this protein allow for direct interpretation of dynamical contributions to the 1H-1H cross-relaxation rates for a large proportion of the NOE cross peaks. Estimates of the average accuracy for these derived NOE distances are bounded by 4% and 10%, based on a comparison to the corresponding X-ray distances. An independent fluctuation model is proposed for prediction of the dynamical corrections to 1H-1H cross-relaxation rates, based solely on experimental structural and heteronuclear relaxation data. This analysis is aided by the demonstration that heteronuclear order parameters greater than 0.6 depend only on the variance of the H-X bond orientation, independent of the motional model in either one- or two-dimensional diffusion (i.e., 1 − S² = 3/4 sin² 2[...]). The combination of spin-diffusion-suppressed NOE data and analysis of dynamical corrections to 1H-1H cross-relaxation rates based on heteronuclear relaxation data has allowed for a detailed interpretation of various discrepancies between the reported solution and crystal structures.

19.
The first structure for a member of the DUF3349 (PF11829) family of proteins, Rv0543c from Mycobacterium tuberculosis, has been determined using NMR-based methods and some of its biophysical properties characterized. Rv0543c is a 100 residue, 11.3 kDa protein that both size exclusion chromatography and NMR spectroscopy show to be a monomer in solution. The structure of the protein consists of a bundle of five α-helices, α1 (M1 – Y16), α2 (P21 – C33), α3 (S37 – G52), α4 (G58 – H65) and α5 (S72 – G87), held together by a largely conserved group of hydrophobic amino acid side chains. Heteronuclear steady-state {1H}–15N NOE, T1, and T2 values are similar throughout the sequence, indicating that the backbones of the five helices are in a single motional regime. The thermal stability of Rv0543c, characterized by circular dichroism spectroscopy, indicates that Rv0543c irreversibly unfolds upon heating with an estimated melting temperature of 62.5 °C. While the biological function of Rv0543c is still unknown, the presence of DUF3349 proteins predominantly in Mycobacterium and Rhodococcus bacterial species suggests that Rv0543c may have a biological function unique to these bacteria and, consequently, may prove to be an attractive drug target to combat tuberculosis.

20.
Protein functional sites control most biological processes and are important targets for drug design and protein engineering. To characterize them, the evolutionary trace (ET) ranks the relative importance of residues according to their evolutionary variations. Generally, top-ranked residues cluster spatially to define evolutionary hotspots that predict functional sites in structures. Here, various functions that measure the physical continuity of ET ranks among neighboring residues in the structure, or in the sequence, are shown to inform sequence selection and to improve functional site resolution. This is shown first in 110 proteins, for which the overlap between top-ranked residues and actual functional sites rose by 8% in significance. Then, on a structural proteomic scale, optimized ET led to better 3D structure-function motifs (3D templates) and, in turn, to enzyme function prediction by the Evolutionary Trace Annotation (ETA) method with better sensitivity (40% to 53%) and positive predictive value (93% to 94%). This suggests that the similarity of evolutionary importance among neighboring residues in the sequence and in the structure is a universal feature of protein evolution. In practice, this yields a tool for optimizing sequence selections for comparative analysis and, via ET, for better predictions of functional sites and function. This should prove useful for the efficient mutational redesign of protein function and for pharmaceutical targeting.
