Found 20 similar articles (search time: 9 ms)
1.
Christopher James Langmead, Anthony Yan, Ryan Lilien, Lincong Wang, Bruce Randall Donald. Journal of computational biology, 2004, 11(2-3):277-298
High-throughput NMR structural biology can play an important role in structural genomics. We report an automated procedure for high-throughput NMR resonance assignment for a protein of known structure, or of a homologous structure. These assignments are a prerequisite for probing protein-protein interactions, protein-ligand binding, and dynamics by NMR. Assignments are also the starting point for structure determination and refinement. A new algorithm, called Nuclear Vector Replacement (NVR), is introduced to compute assignments that optimally correlate experimentally measured NH residual dipolar couplings (RDCs) to a given a priori whole-protein 3D structural model. The algorithm requires only uniform (15)N-labeling of the protein and processes unassigned H(N)-(15)N HSQC spectra, H(N)-(15)N RDCs, and sparse H(N)-H(N) NOEs (d(NN)s), all of which can be acquired in a fraction of the time needed to record the traditional suite of experiments used to perform resonance assignments. NVR runs in minutes and efficiently assigns the (H(N),(15)N) backbone resonances as well as the d(NN)s of the 3D (15)N-NOESY spectrum, in O(n^3) time. The algorithm is demonstrated on NMR data from a 76-residue protein, human ubiquitin, matched to four structures, including one mutant (homolog), determined either by X-ray crystallography or by different NMR experiments (without RDCs). NVR achieves an assignment accuracy of 92-100%. We further demonstrate the feasibility of our algorithm for different and larger proteins, using NMR data for hen lysozyme (129 residues, 97-100% accuracy) and streptococcal protein G (56 residues, 100% accuracy), matched to a variety of 3D structural models. Finally, we extend NVR to a second application, 3D structural homology detection, and demonstrate that NVR is able to identify structural homologies between proteins with remote amino acid sequences using a database of structural models.
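The idea of correlating measured RDCs with a structural model can be illustrated with a small sketch. This is not the NVR algorithm itself. It only uses the standard relation D = D_max · vᵀSv between an RDC, a unit bond vector v, and a Saupe alignment tensor S. The function names, the tensor, the value of `d_max`, and the "observed" couplings are all made-up assumptions for illustration.

```python
def predicted_rdc(bond_vector, saupe, d_max):
    """Back-compute D = d_max * v^T S v for an NH bond vector (normalised here)."""
    norm = sum(c * c for c in bond_vector) ** 0.5
    v = [c / norm for c in bond_vector]
    return d_max * sum(v[i] * saupe[i][j] * v[j] for i in range(3) for j in range(3))


def cost_matrix(observed_rdcs, bond_vectors, saupe, d_max):
    """cost[i][j] = |observed RDC i - RDC back-computed for NH vector j|."""
    predicted = [predicted_rdc(v, saupe, d_max) for v in bond_vectors]
    return [[abs(obs - pred) for pred in predicted] for obs in observed_rdcs]


# Illustrative, made-up values: a diagonal traceless tensor and three NH vectors.
saupe = [[2e-3, 0.0, 0.0], [0.0, -1e-3, 0.0], [0.0, 0.0, -1e-3]]
vectors = [(1.0, 0.0, 0.0), (0.0, 0.0, 2.0), (1.0, 1.0, 0.0)]
observed = [-1.0, 2.1, 0.4]  # hypothetical unassigned couplings, in Hz

for row in cost_matrix(observed, vectors, saupe, d_max=1000.0):
    print([round(c, 2) for c in row])
```

Each row belongs to one unassigned coupling and each column to one NH vector of the model. A low entry marks a plausible peak-to-residue pairing. A matching step (for example, a minimum-cost bipartite matching) could then select one vector per peak, but that step is omitted here.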
2.
A novel automated approach for the sequence specific NMR assignments of 1HN, 13Cα, 13Cβ, 13C′/1Hα and 15N spins in proteins, using triple resonance experimental data, is presented. The algorithm, TATAPRO (Tracked AuTomated Assignments in Proteins), utilizes the protein primary sequence and peak lists from a set of triple resonance spectra which correlate 1HN and 15N chemical shifts with those of 13Cα, 13Cβ and 13C′/1Hα. The information derived from such correlations is used to create a 'master_list' consisting of all possible sets of 1HN(i), 15N(i), 13Cα(i), 13Cβ(i), 13C′(i)/1Hα(i), 13Cα(i–1), 13Cβ(i–1) and 13C′(i–1)/1Hα(i–1) chemical shifts. On the basis of an extensive statistical analysis of 13Cα and 13Cβ chemical shift data of proteins derived from the BioMagResBank (BMRB), it is shown that the 20 amino acid residues can be grouped into eight distinct categories, each of which is assigned a unique two-digit code. Such a code is used to tag individual sets of chemical shifts in the master_list and also to translate the protein primary sequence into an array called pps_array. The program then uses the master_list to search for neighbouring partners of a given amino acid residue along the polypeptide chain and sequentially assigns a maximum possible stretch of residues on either side. While doing so, each assigned residue is tracked in an array called assig_array, with the two-digit code assigned earlier. The assig_array is then mapped onto the pps_array for sequence specific resonance assignment. The program has been tested using experimental data on a calcium binding protein from Entamoeba histolytica (Eh-CaBP, 15 kDa) having substantial internal sequence homology and using published data on four other proteins in the molecular weight range of 18–42 kDa. In all the cases, nearly complete sequence specific resonance assignments (> 95%) are obtained. Furthermore, the reliability of the program has been tested by deleting sets of chemical shifts randomly from the master_list created for the test proteins.
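The final mapping step described above, in which a coded stretch (assig_array) is placed onto the coded sequence (pps_array), can be sketched as follows. This is a minimal illustration and not the TATAPRO implementation. The residue-to-code table is hypothetical, because the abstract does not list the eight categories, and the function names `encode_sequence` and `map_stretch` are invented for the example.

```python
# Hypothetical two-digit codes; the real TATAPRO categories are not given in the abstract.
HYPOTHETICAL_CODES = {
    "G": "10",
    "A": "20",
    "S": "30",
    "T": "40",
}
DEFAULT_CODE = "50"  # placeholder code for every other residue type


def encode_sequence(sequence):
    """Translate a primary sequence into a list of two-digit codes (the pps_array)."""
    return [HYPOTHETICAL_CODES.get(res, DEFAULT_CODE) for res in sequence]


def map_stretch(assig_array, pps_array):
    """Return every 0-based start index where the coded stretch fits the coded sequence."""
    n, m = len(pps_array), len(assig_array)
    if m == 0:
        return []
    return [
        start
        for start in range(n - m + 1)
        if pps_array[start:start + m] == assig_array
    ]


pps_array = encode_sequence("MGASTKLGAT")
stretch = ["10", "20", "30"]  # codes of a sequentially linked stretch of spin systems
print(map_stretch(stretch, pps_array))
```

A single hit places the stretch unambiguously in the sequence. Several hits mean the stretch is too short or too degenerate, and a longer stretch is needed before it can be assigned.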
3.
We report an automated procedure for high-throughput NMR resonance assignment for a protein of known structure, or of a homologous structure. Our algorithm performs Nuclear Vector Replacement (NVR) by Expectation/Maximization (EM) to compute assignments. NVR correlates experimentally measured NH residual dipolar couplings (RDCs) and chemical shifts to a given a priori whole-protein 3D structural model. The algorithm requires only uniform (15)N-labelling of the protein, and processes unassigned H(N)-(15)N HSQC spectra, H(N)-(15)N RDCs, and sparse H(N)-H(N) NOEs (d(NN)s). NVR runs in minutes and efficiently assigns the (H(N),(15)N) backbone resonances as well as the sparse d(NN)s from the 3D (15)N-NOESY spectrum, in O(n^3) time. The algorithm is demonstrated on NMR data from a 76-residue protein, human ubiquitin, matched to four structures, including one mutant (homolog), determined either by X-ray crystallography or by different NMR experiments (without RDCs). NVR achieves an average assignment accuracy of over 99%. We further demonstrate the feasibility of our algorithm for different and larger proteins, using different combinations of real and simulated NMR data for hen lysozyme (129 residues) and streptococcal protein G (56 residues), matched to a variety of 3D structural models.
4.
5.
Background
Next-generation sequencing (NGS) has yielded an unprecedented amount of data for genetics research. It is a daunting task to process the data from raw sequence reads to variant calls, and manually processing this data can significantly delay downstream analysis and increase the possibility of human error. The research community has produced tools to properly prepare sequence data for analysis and established guidelines on how to apply those tools to achieve the best results; however, existing pipeline programs that automate the entire process are either inaccessible to investigators, or web-based and require a certain amount of administrative expertise to set up.
Findings
Advanced Sequence Automated Pipeline (ASAP) was developed to provide a framework for automating the translation of sequencing data into annotated variant calls with the goal of minimizing user involvement without the need for dedicated hardware or administrative rights. ASAP works both on computer clusters and on standalone machines with minimal human involvement and maintains high data integrity, while allowing complete control over the configuration of its component programs. It offers an easy-to-use interface for submitting and tracking jobs as well as resuming failed jobs. It also provides tools for quality checking and for dividing jobs into pieces for maximum throughput.
Conclusions
ASAP provides an environment for building an automated pipeline for NGS data preprocessing. This environment is flexible for use and future development. It is freely available at http://biostat.mc.vanderbilt.edu/ASAP.
6.
Reliable automated NOE assignment and structure calculation on the basis of a largely complete, assigned input chemical shift list and a list of unassigned NOESY cross peaks has recently become feasible for routine NMR protein structure calculation and has been shown to yield results that are equivalent to those of the conventional, manual approach. However, these algorithms rely on the availability of a virtually complete list of the chemical shifts. This paper investigates the influence of incomplete chemical shift assignments on the reliability of NMR structures obtained with automated NOESY cross peak assignment. The program CYANA was used for combined automated NOESY assignment with the CANDID algorithm and structure calculations with torsion angle dynamics at various degrees of completeness of the chemical shift assignment, which was simulated by random omission of entries in the experimental 1H chemical shift lists that had been used for the earlier, conventional structure determinations of two proteins. Sets of structure calculations were performed choosing the omitted chemical shifts randomly among all assigned hydrogen atoms, or among aromatic hydrogen atoms. For comparison, automated NOESY assignment and structure calculations were performed with the complete experimental chemical shift list but under random omission of NOESY cross peaks. When heteronuclear-resolved three-dimensional NOESY spectra are available, the current CANDID algorithm yields, in the absence of up to about 10% of the experimental 1H chemical shifts, reliable NOE assignments and three-dimensional structures that deviate by less than 2 Å from the reference structure obtained using all experimental chemical shift assignments. In contrast, the algorithm can accommodate the omission of up to 50% of the cross peaks in heteronuclear-resolved NOESY spectra without producing structures with an RMSD of more than 2 Å to the reference structure. When only homonuclear NOESY spectra are available, the algorithm is slightly more susceptible to missing data and can tolerate the absence of up to about 7% of the experimental 1H chemical shifts or of up to 30% of the NOESY peaks.
Abbreviations: BmPBPA – Bombyx mori pheromone binding protein form A; CYANA – combined assignment and dynamics algorithm for NMR applications; NMR – nuclear magnetic resonance; NOE – nuclear Overhauser effect; NOESY – NOE spectroscopy; RMSD – root-mean-square deviation; WmKT – Williopsis mrakii killer toxin
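The random-omission protocol used to simulate incomplete assignments can be sketched in a few lines. This is only an assumed illustration of the idea, not code from CYANA or CANDID. The function name `omit_random_entries`, the `(atom, ppm)` tuple layout, and the shift values are hypothetical.

```python
import random


def omit_random_entries(shift_list, fraction, seed=None):
    """Return a copy of shift_list with round(fraction * len) entries removed at random."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # a fixed seed makes a given omission reproducible
    n_omit = round(len(shift_list) * fraction)
    omitted = set(rng.sample(range(len(shift_list)), n_omit))
    return [entry for i, entry in enumerate(shift_list) if i not in omitted]


# Hypothetical 1H chemical shift list: (atom label, shift in ppm).
shifts = [(f"H{i}", 7.0 + 0.1 * i) for i in range(20)]
reduced = omit_random_entries(shifts, fraction=0.10, seed=1)
print(len(shifts), "->", len(reduced))
```

Repeating the omission with different seeds and re-running the structure calculation for each reduced list gives the kind of series the abstract describes, with one set of calculations per degree of completeness.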
7.
Tau protein is the longest disordered protein for which nearly complete backbone NMR resonance assignments have been reported. Full-length tau protein was initially assigned using a laborious combination of bootstrapping assignments from shorter tau fragments and conventional triple resonance NMR experiments. Subsequently it was reported that assignments of comparable quality could be obtained in a fully automated fashion from data obtained using reduced dimensionality NMR (RDNMR) experiments employing a large number of indirect dimensions. Although the latter strategy offers many advantages, it presents some difficulties if manual intervention, confirmation, or correction of the assignments is desirable, as may often be the case for long disordered and degenerate polypeptide sequences. Here we demonstrate that nearly complete backbone resonance assignments for full-length tau isoforms can be obtained without resorting either to bootstrapping from smaller fragments or to very high dimensionality experiments and automation. Instead, a set of RDNMR triple resonance experiments of modest dimensionality lend themselves readily to efficient and unambiguous manual assignments. An analysis of the backbone chemical shifts obtained in this fashion indicates several regions in full-length tau with a notable propensity for helical or strand-like structure that are in good agreement with previous observations.
8.
9.
10.
11.
Shivanand M. Pudakalakatti, Abhinav Dubey, Garima Jaipuria, U. Shubhashree, Satish Kumar Adiga, Detlef Moskau, Hanudatta S. Atreya. Journal of biomolecular NMR, 2014, 58(3):165-173
We present a new method for rapid NMR data acquisition and assignments applicable to unlabeled (12C) or 13C-labeled biomolecules/organic molecules in general and metabolomics in particular. The method involves the acquisition of three two-dimensional (2D) NMR spectra simultaneously using a dual receiver system. The three spectra, namely: (1) G-matrix Fourier transform (GFT) (3,2)D [13C, 1H] HSQC–TOCSY, (2) 2D 1H–1H TOCSY and (3) 2D 13C–1H HETCOR, are acquired in a single experiment and provide mutually complementary information to completely assign individual metabolites in a mixture. The GFT (3,2)D [13C, 1H] HSQC–TOCSY provides 3D correlations in a reduced dimensionality manner, facilitating high resolution and unambiguous assignments. The experiments were applied for complete 1H and 13C assignments of a mixture of 21 unlabeled metabolites corresponding to a medium used in assisted reproductive technology. Taken together, the experiments provide a time gain of orders of magnitude compared to the conventional data acquisition methods and can be combined with other fast NMR techniques such as non-uniform sampling and covariance spectroscopy. This provides new avenues for using multiple receivers and projection NMR techniques for high-throughput approaches in metabolomics.
12.
R Le Goas, S R LaPlante, A Mikou, M A Delsuc, E Guittet, M Robin, I Charpentier, J Y Lallemand. Biochemistry, 1992, 31(20):4867-4875
The solution structure of alpha-cobratoxin, a neurotoxin purified from the venom of the snake Naja naja siamensis, at pH 3.2 is reported. Sequence-specific assignment of the NMR resonances was attained by a combination of a generalized main-chain-directed strategy and of the sequential method. The NMR data show the presence of a triple-stranded beta-sheet (residues 19-25, 36-41, and 52-57), a short helix, and turns. A large number of NOE cross peaks were identified in the NOESY NMR maps. These were applied as distance constraints in a molecular modeling protocol which includes distance geometry and dynamical simulated annealing calculations. A single family of structures is observed, which fold in such a way that three major loops emerge from a globular head. The solution and crystal structures of alpha-cobratoxin are very similar. This is in clear contrast to results reported for alpha-bungarotoxin, where significant differences exist.
13.
We have developed a graphics-based algorithm for semi-automated protein NMR assignments. Using the basic sequential triple resonance assignment strategy, the method is inspired by the Boolean operators, as it applies "AND"-, "OR"- and "NOT"-like operations on planes pulled out of the classical three-dimensional spectra to obtain its functionality. The method's strength lies in the continuous graphical presentation of the spectra, allowing both a semi-automatic peaklist construction and sequential assignment. We demonstrate here its general use for the case of a folded protein with a well-dispersed spectrum, but equally for a natively unfolded protein where spectral resolution is minimal.
14.
Complete sequence-specific 1H NMR assignments for human insulin (total citations: 3; self-citations: 0; citations by others: 3)
Solvent conditions where human insulin could be studied by high-resolution NMR were determined. Both low pH and addition of acetonitrile were required to overcome the protein's self-association and to obtain useful spectra. Two hundred eighty-six 1H resonances were located and assigned to specific sites on the protein by using two-dimensional NMR methods. The presence and position of numerous dNN sequential NOE's indicate that the insulin conformation seen in crystallographic studies is largely retained under these solution conditions. Slowly exchanging protons were observed for seven backbone amide protons and were assigned to positions A15 and A16 and to positions B15-B19. These amides all occur within helical regions of the protein [Chawdhury, S.A., Dodson, E.J., Dodson, G.G., Reynolds, C.D., Tolley, S.P., Blundell, T.L., Cleasby, A., Pitts, J.E., Tickle, I.J., & Wood, S.P. (1983) Diabetologia 25, 460-464].
15.
Detailed structural and functional characterization of proteins by solution NMR requires sequence-specific resonance assignment. We present a set of transverse relaxation optimization (TROSY) based four-dimensional automated projection spectroscopy (APSY) experiments which are designed for resonance assignments of proteins with a size up to 40 kDa, namely HNCACO, HNCOCA, HNCACB and HN(CO)CACB. These higher-dimensional experiments include several sensitivity-optimizing features such as multiple quantum parallel evolution in a ‘just-in-time’ manner, aliased off-resonance evolution, evolution-time optimized APSY acquisition, selective water-handling and TROSY. The experiments were acquired within the concept of APSY, but they can also be used within the framework of sparsely sampled experiments. The multidimensional peak lists derived with APSY provided chemical shifts with an approximately 20 times higher precision than conventional methods usually do, and allowed the assignment of 90 % of the backbone resonances of the perdeuterated primase-polymerase ORF904, which contains 331 amino acid residues and has a molecular weight of 38.4 kDa.
16.
A two-dimensional 1H NMR study on Megasphaera elsdenii flavodoxin in the reduced state. Sequential assignments (total citations: 1; self-citations: 0; citations by others: 1)
Assignments for the 137 amino acid residues of Megasphaera elsdenii flavodoxin in the reduced state have been made using the sequential resonance assignment procedure. Several hydroxyl and sulfhydryl protons were observed at 41 degrees C at pH 8.3. Spin systems were sequentially assigned using phase-sensitive two-dimensional-correlated spectroscopy and phase-sensitive nuclear Overhauser enhancement spectroscopy. Spectra of the protein in H2O and of protein preparations either completely or partly exchanged against 2H2O were obtained. Use of the fast electron shuttle between the paramagnetic semiquinone and the diamagnetic hydroquinone state greatly simplified the NMR spectra, making it possible to assign easily the 1H resonances of amino acid residues located in the immediate neighbourhood of the isoalloxazine ring. The majority of the nuclear Overhauser effect contacts between the flavin and the apoprotein correspond to the crystal structure of the flavin domain of Clostridium MP flavodoxin, but differences are also observed. The assignments provide the basis for the structure determination of M. elsdenii flavodoxin in the reduced state as well as for assigning the resonances of the oxidized flavodoxin.
17.
Sheftic SR, Garcia PP, Robinson VL, Gage DJ, Alexandrescu AT. Biomolecular NMR assignments, 2011, 5(1):55-58
Response regulators are terminal ends of bacterial two-component systems that undergo extensive structural reorganization
in response to phosphoryl transfer from their cognate histidine kinases. The response regulator encoded by the gene sma0114 of Sinorhizobium meliloti is a part of a unique class of two-component systems that employ HWE histidine kinases. The distinct features of Sma0114
include a PFxFATGY motif that houses the conserved threonine in the “Y–T coupling” conformational switch which mediates output
response through downstream protein–protein interactions, and the replacement of the conserved phenylalanine/tyrosine in Y–T
coupling by a leucine. Here we present 1H, 15N, and 13C NMR assignments for Sma0114. We identify the secondary structure of the protein based on TALOS chemical shift analysis,
3JHNHα coupling constants and hydrogen–deuterium exchange. The secondary structure determined by NMR is in good agreement with that
predicted from the sequence. Both methods suggest that Sma0114 differs from standard CheY-like folds by missing the fourth
α-helix. Our initial NMR characterization of Sma0114 paves the way to a full investigation of the structure and dynamics of
this response regulator.
18.
Wolfram Gronwald, Leigh Willard, Timothy Jellard, Robert F. Boyko, Krishna Rajarathnam, David S. Wishart, Frank D. Sönnichsen, Brian D. Sykes. Journal of biomolecular NMR, 1998, 12(3):395-405
A suite of programs called CAMRA (Computer Aided Magnetic Resonance Assignment) has been developed for computer-assisted residue-specific assignments of proteins. CAMRA consists of three units: ORB, CAPTURE and PROCESS. ORB predicts NMR chemical shifts for unassigned proteins using a chemical shift database of previously assigned homologous proteins supplemented by a statistically derived chemical shift database in which the shifts are categorized according to their residue, atom and secondary structure type. CAPTURE generates a list of valid peaks from NMR spectra by filtering out noise peaks and other artifacts and then separating the derived peak list into distinct spin systems. PROCESS combines the chemical shift predictions from ORB with the spin systems identified by CAPTURE to obtain residue-specific assignments. PROCESS ranks the top choices for an assignment along with scores and confidence values. In contrast to other auto-assignment programs, CAMRA does not use any connectivity information but instead is based solely on matching predicted shifts with observed spin systems. As such, CAMRA represents a new and unique approach for the assignment of protein NMR spectra. CAMRA will be particularly useful in conjunction with other assignment methods and under special circumstances, such as the assignment of flexible regions in proteins where sufficient NOE information is generally not available. CAMRA was tested on two medium-sized proteins belonging to the chemokine family. It was found to be effective in predicting the assignment, provided that a database of previously assigned proteins with at least 30% sequence identity is available. CAMRA is versatile and can be used to include and evaluate heteronuclear and three-dimensional experiments.
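The matching step attributed to PROCESS, which compares predicted shifts with an observed spin system and ranks the candidates, can be sketched as below. This is a sketch under stated assumptions and not CAMRA's actual scoring. A plain root-mean-square difference stands in for whatever score and confidence measure the program uses, and the residue labels, atom names, and shift values are invented.

```python
import math


def score(predicted, observed):
    """Root-mean-square difference (ppm) over the atoms both dictionaries share."""
    shared = predicted.keys() & observed.keys()
    if not shared:
        return math.inf  # nothing to compare: worst possible score
    return math.sqrt(sum((predicted[a] - observed[a]) ** 2 for a in shared) / len(shared))


def rank_candidates(spin_system, predictions, top=3):
    """Return the `top` (score, residue) pairs, best (lowest score) first."""
    scored = sorted((score(pred, spin_system), residue) for residue, pred in predictions.items())
    return scored[:top]


# Hypothetical predicted shifts per residue and one observed spin system.
predictions = {
    "A12": {"HN": 8.20, "HA": 4.30},
    "G13": {"HN": 8.45, "HA": 3.95},
    "V14": {"HN": 7.90, "HA": 4.10},
}
spin_system = {"HN": 8.40, "HA": 3.98}

for rms, residue in rank_candidates(spin_system, predictions):
    print(f"{residue}: {rms:.3f} ppm")
```

Because no connectivity information is used, every spin system is ranked independently. This is why such a scheme depends heavily on the quality of the predicted shifts.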
19.
Arezue F. B. Boroujerdi, Bulent Binbuga, John K. Young. Biomolecular NMR assignments, 2007, 1(1):139-141
A better understanding of how salt affects enzyme activity can be gained via NMR studies of the binary hvDHFR1:folate complex. Chemical shift assignments of the 17.9 kDa enzyme with bound substrate prepare the way for ongoing research
of the effects of salt on enzyme flexibility through relaxation studies.
20.
It has been estimated that more than 20% of the proteins in the BMRB are improperly referenced and that about 1% of all chemical
shift assignments are mis-assigned. These statistics also reflect the likelihood that any newly assigned protein will have
shift assignment or shift referencing errors. The relatively high frequency of these errors continues to be a concern for
the biomolecular NMR community. While several programs do exist to detect and/or correct chemical shift mis-referencing or
chemical shift mis-assignments, most can only do one, or the other. The one program (SHIFTCOR) that is capable of handling
both chemical shift mis-referencing and mis-assignments requires the 3D structure coordinates of the target protein. Given
that chemical shift mis-assignments and chemical shift re-referencing issues should ideally be addressed prior to 3D structure
determination, there is a clear need to develop a structure-independent approach. Here, we present a new structure-independent
protocol, which is based on using residue-specific and secondary structure-specific chemical shift distributions calculated
over small (3–6 residue) fragments to identify mis-assigned resonances. The method is also able to identify and re-reference
mis-referenced chemical shift assignments. Comparisons against existing re-referencing or mis-assignment detection programs
show that the method is as good as or superior to existing approaches. The protocol described here has been implemented in
a freely available Java program called “Probabilistic Approach for protein Nmr Assignment Validation (PANAV)” and as a web
server () which can be used to validate and/or correct as well as re-reference assigned protein chemical shifts.
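The difference between a referencing error (a constant offset on every shift) and a mis-assignment (an isolated outlier) can be illustrated with a deliberately simplified sketch. This is not the PANAV protocol, which relies on residue-specific and secondary structure-specific distributions over short fragments. Here a plain median offset against hypothetical expected shifts is swapped in, and all names, values, and the tolerance are assumptions.

```python
from statistics import median


def estimate_offset(observed, expected):
    """Median of (observed - expected) in ppm; a few outliers barely move it."""
    return median(o - e for o, e in zip(observed, expected))


def rereference(observed, offset):
    """Subtract the estimated referencing offset from every observed shift."""
    return [round(o - offset, 3) for o in observed]


def flag_outliers(observed, expected, offset, tol=3.0):
    """Indices whose residual after re-referencing exceeds `tol` ppm."""
    return [
        i
        for i, (o, e) in enumerate(zip(observed, expected))
        if abs(o - e - offset) > tol
    ]


# Hypothetical 13C shifts: every value is 2 ppm too high, and entry 3 is also mis-assigned.
expected = [58.0, 56.5, 62.1, 54.8, 60.3]
observed = [60.0, 58.5, 64.1, 70.0, 62.3]

offset = estimate_offset(observed, expected)
print(round(offset, 2), rereference(observed, offset), flag_outliers(observed, expected, offset))
```

The median is used rather than the mean so that the single mis-assigned entry does not distort the offset estimate. That entry is then flagged once the offset has been removed.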