Similar literature
 20 similar articles found (search time: 15 ms)
1.
We have been analyzing the extent to which protein secondary structure determines protein tertiary structure in simple protein folds. An earlier paper demonstrated that three-dimensional structure can be obtained successfully using only highly approximate backbone torsion angles for every residue. Here, the initial information is further diluted by introducing a realistic degree of experimental uncertainty into this process. In particular, we tackle the practical problem of determining three-dimensional structure solely from backbone chemical shifts, which can be measured directly by NMR and are known to be correlated with a protein's backbone torsion angles. Extending our previous algorithm to incorporate these experimentally determined data, clusters of structures compatible with the experimentally determined chemical shifts were generated by fragment assembly Monte Carlo. The cluster that corresponds to the native conformation was then identified based on four energy terms: steric clash, solvent-squeezing, hydrogen-bonding, and hydrophobic contact. Currently, the method has been applied successfully to five small proteins with simple topology. Although still under development, this approach offers promise for high-throughput NMR structure determination.

2.
Production of sufficient amounts of human proteins is a frequent bottleneck in structural biology. Here we describe an Escherichia coli-based cell-free system which yields mg-quantities of human proteins in N-terminal fusion constructs with the GB1 domain, which show significantly increased translation efficiency. A newly generated E. coli BL21 (DE3) RIPL-Star strain was used, which contains a variant RNase E with reduced activity and an excess of rare-codon tRNAs, and is devoid of lon and ompT protease activity. In the implementation of the expression system we used cell extract freshly prepared in-house. Batch-mode cell-free expression with this setup was up to twofold more economical than continuous-exchange expression, with yields of 0.2-0.9 mg of purified protein per mL of reaction mixture. Native folding of the proteins thus obtained is documented with 2D [15N,1H]-HSQC NMR.

3.
Obtaining NMR assignments for slowly tumbling molecules such as detergent-solubilized membrane proteins is often compromised by low sensitivity as well as spectral overlap. Both problems can be addressed by amino-acid specific isotope labeling in conjunction with 15N–1H correlation experiments. In this work an extended combinatorial selective in vitro labeling scheme is proposed that seeks to reduce the number of samples required for assignment. Including three different species of amino acids in each sample, 15N, 1-13C, and fully 13C/15N labeled, permits identification of more amino acid types and sequential pairs than would be possible with previously published combinatorial methods. The new protocol involves recording of up to five 2D triple-resonance experiments to distinguish the various isotopomeric dipeptide species. The pattern of backbone NH cross peaks in this series of spectra adds a new dimension to the combinatorial grid, which otherwise mostly relies on comparison of [15N, 1H]–HSQC and possibly 2D HN(CO) spectra of samples with different labeled amino acid compositions. Application to two α-helical membrane proteins shows that, using no more than three samples, information can be accumulated such that backbone assignments can be completed solely based on 3D HNCA/HN(CO)CA experiments. Alternatively, in the case of severe signal overlap in certain regions of the standard suite of triple-resonance spectra acquired on uniformly labeled protein, or missing signals due to a lack of efficiency of 3D experiments, the remaining gaps can be filled.

4.
Lin HN, Wu KP, Chang JM, Sung TY, Hsu WL. Nucleic Acids Research, 2005, 33(14): 4593-4601
NMR data from different experiments often contain errors; thus, automated backbone resonance assignment is a very challenging issue. In this paper, we present a method called GANA that uses a genetic algorithm to automatically perform backbone resonance assignment with a high degree of precision and recall. Precision is the number of correctly assigned residues divided by the number of assigned residues, and recall is the number of correctly assigned residues divided by the number of residues with known human curated answers. GANA takes spin systems as input data and uses two data structures, candidate lists and adjacency lists, to assign the spin systems to each amino acid of a target protein. Using GANA, almost all spin systems can be mapped correctly onto a target protein, even if the data are noisy. We use the BioMagResBank (BMRB) dataset (901 proteins) to test the performance of GANA. To evaluate the robustness of GANA, we generate four additional datasets from the BMRB dataset to simulate data errors of false positives, false negatives and linking errors. We also use a combination of these three error types to examine the fault tolerance of our method. The average precision rates of GANA on BMRB and the four simulated test cases are 99.61, 99.55, 99.34, 99.35 and 98.60%, respectively. The average recall rates of GANA on BMRB and the four simulated test cases are 99.26, 99.19, 98.85, 98.87 and 97.78%, respectively. We also test GANA on two real wet-lab datasets, hbSBD and hbLBD. The precision and recall rates of GANA on hbSBD are 95.12 and 92.86%, respectively, and those of hbLBD are 100 and 97.40%, respectively.
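
The precision and recall definitions given in this abstract can be written down in a few lines. The sketch below is only an illustration of those two ratios; the function name `precision_recall`, the dictionary layout, and the toy spin-system labels are assumptions made here and are not taken from GANA.

```python
def precision_recall(assigned, reference):
    """Compute precision and recall as defined in the abstract.

    assigned:  dict mapping a spin-system id to the residue it was assigned to
               (spin systems left unassigned are simply absent).
    reference: dict mapping a spin-system id to its human-curated residue.
    Both layouts are hypothetical and chosen only for this illustration.
    """
    correct = sum(1 for key, residue in assigned.items()
                  if reference.get(key) == residue)
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Toy data (invented): five curated answers, four assignments, two of them right.
reference = {"s1": 1, "s2": 2, "s3": 3, "s4": 4, "s5": 5}
assigned = {"s1": 1, "s2": 2, "s3": 4, "s4": 3}
p, r = precision_recall(assigned, reference)
print(f"precision={p:.2f} recall={r:.2f}")
```

With the toy data, two of four assignments are correct and five residues have curated answers, so precision is 2/4 and recall is 2/5.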

5.
We introduce an efficient approach for sequential protein backbone assignment based on two complementary proton-detected 4D solid-state NMR experiments that correlate \( \mathrm{H}^{\mathrm{N}}_{i}/\mathrm{N}_{i} \) with \( \mathrm{CA}_{i}/\mathrm{CO}_{i} \) or \( \mathrm{CA}_{i-1}/\mathrm{CO}_{i-1} \). The resulting 4D spectra exhibit excellent sensitivity and resolution and are amenable to (semi-)automatic assignment approaches. This strategy allows sequential connections to be obtained with high confidence, as problems related to peak overlap and multiple assignment possibilities are avoided. Non-uniform sampling schemes were implemented to allow for the acquisition of 4D spectra within a few days. Rather moderate hardware requirements enable the successful demonstration of the method on deuterated type III secretion needles using a 600 MHz spectrometer at a spinning rate of 25 kHz.

6.
A new strategy of backbone resonance assignment is proposed based on a combination of the most sensitive TROSY-type triple resonance experiments such as TROSY-HNCA and TROSY-HNCO with a new 3D multiple-quantum HACACO experiment. The favourable relaxation properties of the multiple-quantum coherences and signal detection using the 13C antiphase coherences optimize the performance of the proposed experiment for application to larger proteins. In addition to the 1HN, 15N, 13Cα and 13C′ chemical shifts, the 3D multiple-quantum HACACO experiment provides assignment for the 1Hα resonances, in contrast to previously proposed experiments for large proteins. The strategy is demonstrated with the 44 kDa uniformly 15N,13C-labeled and fractionally 35% deuterated trimeric B. subtilis Chorismate Mutase measured at 20°C and 9°C. Measurements at the lower temperature indicate that the new strategy can be applied to even larger proteins with molecular weights up to 80 kDa.

7.
MOTIVATION: Backbone resonance assignment is a critical bottleneck in studies of protein structure, dynamics and interactions by nuclear magnetic resonance (NMR) spectroscopy. A minimalist approach to assignment, which we call 'contact-based', seeks to dramatically reduce experimental time and expense by replacing the standard suite of through-bond experiments with the through-space (nuclear Overhauser enhancement spectroscopy, NOESY) experiment. In the contact-based approach, spectral data are represented in a graph with vertices for putative residues (of unknown relation to the primary sequence) and edges for hypothesized NOESY interactions, such that observed spectral peaks could be explained if the residues were 'close enough'. Due to experimental ambiguity, several incorrect edges can be hypothesized for each spectral peak. An assignment is derived by identifying consistent patterns of edges (e.g. for alpha-helices and beta-sheets) within a graph and by mapping the vertices to the primary sequence. The key algorithmic challenge is to be able to uncover these patterns even when they are obscured by significant noise. RESULTS: This paper develops, analyzes and applies a novel algorithm for the identification of polytopes representing consistent patterns of edges in a corrupted NOESY graph. Our randomized algorithm aggregates simplices into polytopes and fixes inconsistencies with simple local modifications, called rotations, that maintain most of the structure already uncovered. In characterizing the effects of experimental noise, we employ an NMR-specific random graph model in proving that our algorithm gives optimal performance in expected polynomial time, even when the input graph is significantly corrupted. We confirm this analysis in simulation studies with graphs corrupted by up to 500% noise. Finally, we demonstrate the practical application of the algorithm on several experimental beta-sheet datasets. Our approach is able to eliminate a large majority of noise edges and to uncover large consistent sets of interactions. AVAILABILITY: Our algorithm has been implemented in platform-independent Python code. The software can be freely obtained for academic use by request from the authors.

8.
Error-tolerant backbone resonance assignment is the cornerstone of the NMR structure determination process. Although a variety of assignment approaches have been developed, none works sufficiently well on noisy fully automatically picked peaks to enable the subsequent automatic structure determination steps. We have designed an integer linear programming (ILP) based assignment system (IPASS) that has enabled fully automatic protein structure determination for four test proteins. IPASS employs probabilistic spin system typing based on chemical shifts and secondary structure predictions. Furthermore, IPASS extracts connectivity information from the inter-residue information and the (automatically picked) 15N-edited NOESY peaks which are then used to fix reliable fragments. When applied to automatically picked peaks for real proteins, IPASS achieves an average precision and recall of 82% and 63%, respectively. In contrast, the next best method, MARS, achieves an average precision and recall of 77% and 36%, respectively. The assignments generated by IPASS are then fed into our protein structure calculation system, FALCON-NMR, to determine the 3D structures without human intervention. The final models have backbone RMSDs of 1.25 Å, 0.88 Å, 1.49 Å, and 0.67 Å to the reference native structures for proteins TM1112, CASKIN, VRAR, and HACS1, respectively. The web server is publicly available at http://monod.uwaterloo.ca/nmr/ipass.

9.
A standard set of three APSY-NMR experiments has been used in daily practice to obtain polypeptide backbone NMR assignments in globular proteins with sizes up to about 150 residues, which had been identified as targets for structure determination by the Joint Center for Structural Genomics (JCSG) under the auspices of the Protein Structure Initiative (PSI). In a representative sample of 30 proteins, initial fully automated data analysis with the software UNIO-MATCH-2014 yielded complete or partial assignments for over 90% of the residues. For most proteins the APSY data acquisition was completed in less than 30 h. The results of the automated procedure provided a basis for efficient interactive validation and extension to near-completion of the assignments by reference to the same 3D heteronuclear-resolved [1H,1H]-NOESY spectra that were subsequently used for the collection of conformational constraints. High-quality structures were obtained for all 30 proteins, using the J-UNIO protocol, which includes extensive automation of NMR structure determination.

10.
11.
The complete sequence-specific assignment of the 15N and 1H backbone resonances of the NMR spectrum of recombinant human interleukin 1 beta (153 residues, Mr = 17,400) has been obtained by using primarily 15N-1H heteronuclear three-dimensional (3D) NMR techniques in combination with 15N-1H heteronuclear and 1H homonuclear two-dimensional NMR. The fingerprint region of the spectrum was analyzed by using a combination of 3D heteronuclear 1H Hartmann-Hahn 15N-1H multiple quantum coherence (3D HOHAHA-HMQC) and 3D heteronuclear 1H nuclear Overhauser 15N-1H multiple quantum coherence (3D NOESY-HMQC) spectroscopies. We show that the problems of amide NH and CαH chemical shift degeneracy that are prevalent for proteins of this size are readily overcome by using the 3D heteronuclear NMR technique. A doubling of some peaks in the spectrum was found to be due to N-terminal heterogeneity of the 15N-labeled protein, corresponding to a mixture of wild-type and des-Ala-1-interleukin 1 beta. The complete list of 15N and 1H assignments is given for all the amide NH and CαH resonances of all non-proline residues, as well as the 1H assignments for some of the amino acid side chains. This first example of the sequence-specific assignment of a protein using heteronuclear 3D NMR provides a basis for further conformational and dynamic studies of interleukin 1 beta.

12.
Here we present a novel suite of projected 4D triple-resonance NMR experiments for efficient sequential assignment of polypeptide backbone chemical shifts in 13C/15N doubly labeled proteins. In the 3D HNN[CAHA] and 3D HNN(CO)[CAHA] experiments, the 13Cα and 1Hα chemical shifts evolve in a common dimension and are simultaneously detected in quadrature. These experiments are particularly useful for the assignment of glycine-rich polypeptide segments. Appropriate setting of the 1H radiofrequency carrier allows one to place cross peaks correlating either backbone 15N/1HN/13Cα or 15N/1HN/1Hα chemical shifts in separate spectral regions. Hence, peak overlap is not increased when compared with the conventional 3D HNNCA and HNN(CA)HA. 3D HNN[CAHA] and 3D HNN(CO)[CAHA] are complemented by 3D reduced-dimensionality (RD) HNN COCA and HNN CACO, where 13Cα and 13C′ chemical shifts evolve in a common dimension. The 13C shift is detected in quadrature, which yields peak pairs encoding the 13C chemical shift in an in-phase splitting. This suite of four experiments promises to be of value for automated high-throughput NMR structure determination in structural genomics, where the requirement to independently sample many indirect dimensions in a large number of NMR experiments may prevent one from accurately adjusting NMR measurement times to spectrometer sensitivity.

13.
The paper describes an alternative approach to the fragment assembly problem. The key idea is to train a recurrent neural network to track the sequence of bases constituting a given fragment and to assign to the same cluster all the sequences which are well tracked by this network. We make use of a 3-layer Recurrent Perceptron and examine both edited sequences from an FTP site and artificial fragments from a common simulation software: the clusters we obtain exhibit interesting properties in terms of error filtering, stability and self-consistency; we also define, with a certain degree of approximation, a metric on the fragment set. The proposed assembly algorithm has the potential to become an alternative method with the following properties: (i) high quality of the rebuilt genomic sequences, (ii) high parallelizability of the computing process with a consequent drastic reduction of the running time.

14.
The sequential assignment of backbone resonances is the first step in the structure determination of proteins by heteronuclear NMR. For larger proteins, an assignment strategy based on proton side-chain information is no longer suitable for use in an automated procedure. Our program PASTA (Protein ASsignment by Threshold Accepting) is therefore designed to partially or fully automate the sequential assignment of proteins, based on the analysis of NMR backbone resonances plus C information. In order to overcome the problems caused by peak overlap and missing signals in an automated assignment process, PASTA uses threshold accepting, a combinatorial optimization strategy, which is superior to simulated annealing due to generally faster convergence and better solutions. The reliability of this algorithm is shown by reproducing the complete sequential backbone assignment of several proteins from published NMR data. The robustness of the algorithm against misassigned signals, noise, spectral overlap and missing peaks is shown by repeating the assignment with reduced sequential information and increased chemical shift tolerances. The performance of the program on real data is finally demonstrated with automatically picked peak lists of human nonpancreatic synovial phospholipase A2, a protein with 124 residues.
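
Threshold accepting, the optimization strategy named in this abstract, differs from simulated annealing in that a move is accepted deterministically whenever it worsens the cost by less than the current threshold. The following is a minimal generic sketch of that rule on an invented toy ordering problem; the cost function, the swap move, the threshold schedule, and all shift values are assumptions for illustration and do not reproduce PASTA.

```python
import random


def threshold_accepting(cost, start, thresholds, steps_per_level, rng):
    """Generic threshold accepting over permutations, using pairwise swaps."""
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    for threshold in thresholds:              # thresholds shrink toward zero
        for _ in range(steps_per_level):
            i, j = rng.sample(range(len(current)), 2)
            candidate = list(current)
            candidate[i], candidate[j] = candidate[j], candidate[i]
            candidate_cost = cost(candidate)
            # Deterministic rule: accept any move whose cost increase is
            # smaller than the current threshold (improvements always pass).
            if candidate_cost - current_cost < threshold:
                current, current_cost = candidate, candidate_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
    return best, best_cost


# Invented toy data: each pseudo spin system carries its own shift and the
# shift it observes for its predecessor; a good ordering makes them match.
own = [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
prev = [48.0, 50.0, 52.0, 54.0, 56.0, 58.0]


def chain_cost(order):
    return sum(abs(prev[order[k + 1]] - own[order[k]])
               for k in range(len(order) - 1))


start = [3, 0, 5, 1, 4, 2]
best, best_cost = threshold_accepting(
    chain_cost, start, thresholds=[4.0, 2.0, 1.0, 0.0],
    steps_per_level=200, rng=random.Random(0))
print(best, best_cost)
```

Because the best ordering seen so far is tracked separately, the returned cost can never exceed the cost of the starting ordering, even though intermediate moves may go uphill.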

15.
NMR resonance assignment is one of the key steps in solving an NMR protein structure. The assignment process links resonance peaks to individual residues of the target protein sequence, providing the prerequisite for establishing intra- and inter-residue spatial relationships between atoms. The assignment process is tedious and time-consuming, and can take many weeks. Although a number of computer programs exist to assist the assignment process, many NMR labs still do the assignments manually to ensure quality. This paper presents a new computational method based on the combination of a suite of algorithms for automating the assignment process, particularly the process of backbone resonance peak assignment. We formulate the assignment problem as a constrained weighted bipartite matching problem. While the problem, in the most general situation, is NP-hard, we present an efficient solution based on a branch-and-bound algorithm with effective bounding techniques using two recently introduced approximation algorithms. We also devise a greedy filtering algorithm for reducing the search space. Our experimental results on 70 instances of (pseudo) real NMR data derived from 14 proteins demonstrate that the new solution runs much faster than a recently introduced (exhaustive) two-layer algorithm and recovers more correct peak assignments than the two-layer algorithm. Our results demonstrate that integrating different algorithms can achieve a good tradeoff between backbone assignment accuracy and computation time.
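
To make the "constrained weighted bipartite matching" formulation concrete, here is a small sketch that enumerates every one-to-one mapping of spin systems to residues and keeps the highest-scoring one that respects adjacency constraints. It is deliberately an exhaustive search on a tiny instance, not the branch-and-bound method of the paper; the function name `best_assignment`, the weight matrix, and the constraint encoding are all assumptions made for illustration.

```python
from itertools import permutations


def best_assignment(weights, adjacent_pairs):
    """Exhaustive constrained matching on a tiny instance.

    weights[s][r]  : score for placing spin system s on residue r (invented).
    adjacent_pairs : (s, t) means spin system t must sit on the residue
                     immediately after the one assigned to spin system s.
    """
    n = len(weights)
    best, best_score = None, float("-inf")
    for perm in permutations(range(n)):       # perm[s] = residue of system s
        if any(perm[t] != perm[s] + 1 for s, t in adjacent_pairs):
            continue                          # violates an adjacency constraint
        score = sum(weights[s][perm[s]] for s in range(n))
        if score > best_score:
            best, best_score = perm, score
    return best, best_score


weights = [
    [0.9, 0.2, 0.1],
    [0.8, 0.7, 0.1],
    [0.1, 0.3, 0.6],
]
free, free_score = best_assignment(weights, adjacent_pairs=[])
tied, tied_score = best_assignment(weights, adjacent_pairs=[(1, 0)])
print(free, free_score)
print(tied, tied_score)
```

In the toy instance, requiring spin system 0 to follow spin system 1 rules out the unconstrained optimum and forces a lower-scoring mapping, which is the effect the adjacency constraints have on the matching problem. The factorial cost of the enumeration is why larger instances need bounding or filtering.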

16.
A TROSY-based triple-resonance pulse scheme is described which correlates backbone 1H and 15N chemical shifts of an amino acid residue with the 15N chemical shifts of both the sequentially preceding and following residues. The sequence employs 1JNC and 2JNC couplings in two sequential magnetization transfer steps in an 'out-and-back' manner. As a result, N,N connectivities are obtained irrespective of whether the neighbouring amide nitrogens are protonated or not, which makes the experiment suitable for the assignment of proline resonances. Two different three-dimensional variants of the pulse sequence are presented which differ in sensitivity and resolution to be achieved in one of the nitrogen dimensions. The new method is demonstrated with two uniformly 2H/13C/15N-labelled proteins in the 30-kDa range.

17.
In NMR protein structure determination, after the resonance peaks have been identified and chemical shifts from peaks across multiple spectra have been grouped into spin systems, associating these spin systems with their host residues is the key to successful extraction of structural information and thus to the success of the structure calculation. To achieve a sufficiently accurate structure calculation, a nearly complete and accurate assignment is a prerequisite. Two pieces of information can be used in the assignment: the adjacency information among the spin systems and the signature information of the spin systems. The signature information reflects the fact that, generally speaking, for one type of amino acid residing in a specific local structural environment, the chemical shifts for the atoms inside the amino acid fall into some very narrow distinct ranges. In most of the existing work, normal distributions are assumed with means and standard deviations statistically collected from the available data. In this paper, we followed a simple yet effective histogram-based way to estimate for every spin system the probability that its host is a certain type of amino acid residing in a certain type of secondary structure. We used two combinations of chemical shifts to demonstrate the effectiveness of this type of histogram-based scoring scheme.
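
The histogram-based idea in this abstract can be illustrated with a one-dimensional sketch: bin the training shifts per (amino acid, secondary structure) class, then read off how often a query shift's bin occurs in each class. Everything specific below is an assumption for illustration: a single chemical shift instead of the paper's combinations, a 0.5 ppm bin width, equal class priors, and invented training values.

```python
from collections import Counter, defaultdict

BIN_WIDTH = 0.5  # ppm; an assumed value, not taken from the paper


def bin_index(shift):
    return int(shift // BIN_WIDTH)


def build_histograms(training):
    """training: iterable of (label, shift); label = (amino acid, structure)."""
    hists = defaultdict(Counter)
    for label, shift in training:
        hists[label][bin_index(shift)] += 1
    return hists


def class_probabilities(hists, shift):
    """Normalized per-class bin frequencies, assuming equal class priors."""
    b = bin_index(shift)
    likelihoods = {label: hist[b] / sum(hist.values())
                   for label, hist in hists.items()}
    norm = sum(likelihoods.values())
    if norm == 0:
        return {label: 0.0 for label in hists}   # bin never seen in training
    return {label: value / norm for label, value in likelihoods.items()}


# Invented training shifts for two hypothetical classes.
training = [
    (("ALA", "helix"), 55.1), (("ALA", "helix"), 55.3), (("ALA", "helix"), 54.8),
    (("GLY", "coil"), 45.2), (("GLY", "coil"), 45.4), (("GLY", "coil"), 55.2),
]
hists = build_histograms(training)
probs = class_probabilities(hists, 55.2)
print(probs)
```

Unlike a fitted normal distribution, the histogram makes no assumption about the shape of the shift distribution, at the price of needing enough training data to populate the bins.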

18.
This paper develops an approach to protein backbone NMR assignment that effectively assigns large proteins while using limited sets of triple-resonance experiments. Our approach handles proteins with large fractions of missing data and many ambiguous pairs of pseudoresidues, and provides a statistical assessment of confidence in global and position-specific assignments. The approach is tested on an extensive set of experimental and synthetic data of up to 723 residues, with match tolerances of up to 0.5 ppm for and resonance types. The tests show that the approach is particularly helpful when data contain experimental noise and require large match tolerances. The keys to the approach are an empirical Bayesian probability model that rigorously accounts for uncertainty in the data at all stages in the analysis, and a hybrid stochastic tree-based search algorithm that effectively explores the large space of possible assignments.

19.
Four novel 5D (HACA(N)CONH, HNCOCACB, (HACA)CON(CA)CONH, (H)NCO(NCA)CONH), and one 6D ((H)NCO(N)CACONH) NMR pulse sequences are proposed. The new experiments employ non-uniform sampling, which enables high resolution to be achieved in the indirectly detected dimensions. The experiments facilitate resonance assignment of intrinsically disordered proteins. The novel pulse sequences were successfully tested using the δ subunit (20 kDa) of Bacillus subtilis RNA polymerase, which has an 81-amino acid disordered part containing various repetitive sequences.

20.
We revisit the problem of protein structure determination from geometrical restraints from NMR, using convex optimization. It is well-known that the NP-hard distance geometry problem of determining atomic positions from pairwise distance restraints can be relaxed into a convex semidefinite program (SDP). However, often the NOE distance restraints are too imprecise and sparse for accurate structure determination. Residual dipolar coupling (RDC) measurements provide additional geometric information on the angles between atom-pair directions and axes of the principal-axis-frame. The optimization problem involving RDC is highly non-convex and requires a good initialization even within the simulated annealing framework. In this paper, we model the protein backbone as an articulated structure composed of rigid units. Determining the rotation of each rigid unit gives the full protein structure. We propose solving the non-convex optimization problems using the sum-of-squares (SOS) hierarchy, a hierarchy of convex relaxations with increasing complexity and approximation power. Unlike classical global optimization approaches, SOS optimization returns a certificate of optimality if the global optimum is found. Based on the SOS method, we propose two algorithms, RDC-SOS and RDC–NOE-SOS, that have polynomial time complexity in the number of amino-acid residues and run efficiently on a standard desktop. In many instances, the proposed methods exactly recover the solution to the original non-convex optimization problem. To the best of our knowledge this is the first time SOS relaxation is introduced to solve non-convex optimization problems in structural biology. We further introduce a statistical tool, the Cramér–Rao bound (CRB), to provide an information theoretic bound on the highest resolution one can hope to achieve when determining protein structure from noisy measurements using any unbiased estimator. Our simulation results show that when the RDC measurements are corrupted by Gaussian noise of realistic variance, both SOS based algorithms attain the CRB. We successfully apply our method in a divide-and-conquer fashion to determine the structure of ubiquitin from experimental NOE and RDC measurements obtained in two alignment media, achieving more accurate and faster reconstructions compared to the current state of the art.
