Similar literature
20 similar articles found; search took 31 ms.
1.
Novel algorithms are presented for automated NOESY peak picking and NOE signal identification in homonuclear 2D and heteronuclear-resolved 3D [1H,1H]-NOESY spectra during de novo protein structure determination by NMR, which have been implemented in the new software ATNOS (automated NOESY peak picking). The input for ATNOS consists of the amino acid sequence of the protein, chemical shift lists from the sequence-specific resonance assignment, and one or several 2D or 3D NOESY spectra. In the present implementation, ATNOS performs multiple cycles of NOE peak identification in concert with automated NOE assignment with the software CANDID and protein structure calculation with the program DYANA. In the second and subsequent cycles, the intermediate protein structures are used as an additional guide for the interpretation of the NOESY spectra. By incorporating the analysis of the raw NMR data into the process of automated de novo protein NMR structure determination, ATNOS enables direct feedback between the protein structure, the NOE assignments and the experimental NOESY spectra. The main elements of the algorithms for NOESY spectral analysis are techniques for local baseline correction and evaluation of local noise level amplitudes, automated determination of spectrum-specific threshold parameters, the use of symmetry relations, and the inclusion of the chemical shift information and the intermediate protein structures in the process of distinguishing between NOE peaks and artifacts. The ATNOS procedure has been validated with experimental NMR data sets of three proteins, for which high-quality NMR structures had previously been obtained by interactive interpretation of the NOESY spectra. The ATNOS-based structures coincide closely with those obtained with interactive peak picking. Overall, we present the algorithms used in this paper as a further important step towards objective and efficient de novo protein structure determination by NMR.
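
The ingredients named in this abstract (a noise-level estimate, a spectrum-specific threshold, and symmetry relations) can be illustrated with a minimal sketch. It is not the ATNOS implementation: the median-absolute-deviation noise estimate, the factor `k`, the 3x3 local-maximum test and the function names are all assumptions made for illustration, and the noise is estimated globally rather than locally.

```python
from statistics import median


def estimate_noise(spectrum):
    """Robust noise estimate: median absolute deviation scaled to a Gaussian sigma.

    Assumption: a single global estimate stands in for the local noise
    amplitudes mentioned in the abstract.
    """
    values = [v for row in spectrum for v in row]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    return 1.4826 * mad


def pick_peaks(spectrum, k=5.0):
    """Return (row, col) indices of interior local maxima above k times the noise."""
    threshold = k * estimate_noise(spectrum)
    n_rows, n_cols = len(spectrum), len(spectrum[0])
    peaks = []
    for i in range(1, n_rows - 1):
        for j in range(1, n_cols - 1):
            v = spectrum[i][j]
            if v <= threshold:
                continue
            neighbours = [spectrum[i + di][j + dj]
                          for di in (-1, 0, 1) for dj in (-1, 0, 1)
                          if (di, dj) != (0, 0)]
            # Keep the point only if it is strictly higher than all 8 neighbours.
            if all(v > n for n in neighbours):
                peaks.append((i, j))
    return peaks


def symmetric_peaks(peaks):
    """Keep only peaks that have a partner mirrored across the diagonal."""
    found = set(peaks)
    return [(i, j) for (i, j) in peaks if (j, i) in found]
```

In a homonuclear 2D spectrum a genuine cross peak at (i, j) is expected to have a counterpart at (j, i), so the symmetry filter discards candidates that appear on one side of the diagonal only.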

2.
A major time-consuming step of protein NMR structure determination is the generation of reliable NOESY cross peak lists, which usually requires a significant amount of manual interaction. Here we present a new algorithm for automated peak picking involving wavelet de-noised NOESY spectra in a process where the identification of peaks is coupled to automated structure determination. The core of this method is the generation of incremental peak lists by applying different wavelet de-noising procedures which yield peak lists of a different noise content. In combination with additional filters which probe the consistency of the peak lists, good convergence of the NOESY-based automated structure determination could be achieved. These algorithms were implemented in the context of the ARIA software for automated NOE assignment and structure determination and were validated for a polysulfide-sulfur transferase protein of known structure. The procedures presented here should be commonly applicable for efficient protein NMR structure determination and automated NMR peak picking. Electronic supplementary material is available for this article and accessible for authorised users.

3.
Peak overlap is one of the major factors complicating the analysis of biomolecular NMR spectra. We present a general method for predicting the extent of peak overlap in multidimensional NMR spectra and its validation using both experimental data sets and Monte Carlo simulation. The method is based on knowledge of the magnetization transfer pathways of the NMR experiments and chemical shift statistics from the Biological Magnetic Resonance Data Bank. Assuming a normal distribution with characteristic mean value and standard deviation for the chemical shift of each observable atom, an analytic expression was derived for the expected overlap probability of the cross peaks. The analytical approach was verified to agree with the average peak overlap in a large number of individual peak lists simulated using the same chemical shift statistics. The method was applied to eight proteins, including an intrinsically disordered one, for which the prediction results could be compared with the actual overlap based on the experimentally measured chemical shifts. The extent of overlap predicted using only statistical chemical shift information was in good agreement with the overlap that was observed when the measured shifts were used in the virtual spectrum, except for the intrinsically disordered protein. Since the spectral complexity of a protein NMR spectrum is a crucial factor for protein structure determination, analytical overlap prediction can be used to identify potentially difficult proteins before conducting NMR experiments. Overlap predictions can be tailored to particular classes of proteins by preparing statistics from corresponding protein databases. The method is also suitable for optimizing recording parameters and labeling schemes for NMR experiments and improving the reliability of automated spectra analysis and protein structure determination.
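
The abstract does not reproduce the analytic expression itself. As a hedged illustration of the underlying idea, the sketch below uses the textbook result for two independent, normally distributed chemical shifts: their difference is normal with mean mu1 - mu2 and variance sigma1^2 + sigma2^2, so the probability that they fall within a tolerance of each other follows from the normal CDF. Treating the dimensions of a cross peak as independent, and the function names and `tol` parameter, are assumptions of this sketch.

```python
import math


def _normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def overlap_probability(mu1, sigma1, mu2, sigma2, tol):
    """P(|x1 - x2| < tol) for independent x1 ~ N(mu1, sigma1^2), x2 ~ N(mu2, sigma2^2)."""
    s = math.sqrt(sigma1 ** 2 + sigma2 ** 2)  # std. dev. of the difference x1 - x2
    d = mu1 - mu2                             # mean of the difference
    return _normal_cdf((tol - d) / s) - _normal_cdf((-tol - d) / s)


def peak_overlap_probability(dims, tols):
    """Overlap probability of two cross peaks, one factor per spectral dimension.

    `dims` holds one (mu1, sigma1, mu2, sigma2) tuple per dimension.
    Assumption: the dimensions are statistically independent.
    """
    p = 1.0
    for (mu1, sigma1, mu2, sigma2), tol in zip(dims, tols):
        p *= overlap_probability(mu1, sigma1, mu2, sigma2, tol)
    return p
```

Summing such pairwise probabilities over all pairs of expected cross peaks would give an expected number of overlapping pairs for a given experiment, which is the kind of quantity the abstract describes comparing against simulated peak lists.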

4.
NMR spectroscopy is a widely used technique for characterizing the structure and dynamics of macromolecules. Often large amounts of NMR data are required to characterize the structure of proteins. To save valuable time and resources on data acquisition, simulated data is useful in the developmental phase, for data analysis, and for comparison with experimental data. However, existing tools for this purpose can be difficult to use, are sometimes specialized for certain types of molecules or spectra, or produce overly idealized data. Here we present a fast, flexible and robust tool, VirtualSpectrum, for generating peak lists for most multi-dimensional NMR experiments for both liquid and solid state NMR. It is possible to tune the quality of the generated peak lists to include sources of artifacts from peak overlap, noise and missing signals. VirtualSpectrum uses an analytic expression to represent the spectrum and derive the peak positions, seamlessly handling overlap between signals. We demonstrate our tool by comparing simulated and experimental spectra for different multi-dimensional NMR spectra and by systematically analyzing three cases where overlap between peaks is particularly relevant: solid state NMR data, liquid state NMR homonuclear 1H and 15N-edited spectra, and 2D/3D heteronuclear correlation spectra of unstructured proteins. We analyze the impact of protein size and secondary structure on peak overlap and on the accuracy of structure determination based on data of different qualities simulated by VirtualSpectrum.

5.
Clean absorption mode NMR data acquisition is presented based on mirrored time domain sampling and the widely used time-proportional phase incrementation (TPPI) for quadrature detection. The resulting NMR spectra are devoid of dispersive frequency domain peak components. Those peak components complicate peak identification and shift peak maxima, and thus impede automated spectral analysis. The new approach is also of unique value for obtaining clean absorption mode reduced-dimensionality projection NMR spectra, which can rapidly provide high-dimensional spectral information for high-throughput NMR structure determination.

6.
Protein structure determination is a very important topic in structural genomics, which helps people to understand a variety of biological functions such as protein-protein interactions, protein–DNA interactions and so on. Nowadays, nuclear magnetic resonance (NMR) has often been used to determine the three-dimensional structures of proteins in vivo. This study aims to automate the peak picking step, the most important and tricky step in NMR structure determination. We propose to model the NMR spectrum by a mixture of bivariate Gaussian densities and use the stochastic approximation Monte Carlo algorithm as the computational tool to solve the problem. Under the Bayesian framework, the peak picking problem is cast as a variable selection problem. The proposed method can automatically distinguish true peaks from false ones without preprocessing the data. To the best of our knowledge, this is the first effort in the literature that tackles the peak picking problem for NMR spectrum data using a Bayesian method.

7.
MOTIVATION: High-throughput NMR structure determination is a goal that will require progress on many fronts, one of which is rapid resonance assignment. An important rate-limiting step in the resonance assignment process is accurate identification of resonance peaks in the NMR spectra. Peak-picking schemes range from incomplete (which lose essential assignment connectivities) to noisy (which obscure true connectivities with many false ones). We introduce an automated preassignment process that removes false peaks from noisy peak lists by requiring consensus between multiple NMR experiments and exploiting a priori information about NMR spectra. This process is designed to accept multiple input formats and generate multiple output formats, in an effort to be compatible with a variety of user preferences. RESULTS: Automated preprocessing with APART rapidly identifies and removes false peaks from initial peak lists, reduces the burden of manual data entry, and documents and standardizes the peak filtering process. Successful preprocessing is demonstrated by the increased number of correct assignments obtained when data are submitted to an automated assignment program. AVAILABILITY: APART is available from http://sir.lanl.gov/NMR/APART.htm CONTACT: npawley@lanl.gov; rmichalczyk@lanl.gov SUPPLEMENTARY INFORMATION: Manual pages with installation instructions, procedures and screen shots can also be found at http://sir.lanl.gov/NMR/APART_Manual1.pdf.

8.
ADAPT-NMR (Assignment-directed Data collection Algorithm utilizing a Probabilistic Toolkit in NMR) represents a groundbreaking prototype for automated protein structure determination by nuclear magnetic resonance (NMR) spectroscopy. With a [13C,15N]-labeled protein sample loaded into the NMR spectrometer, ADAPT-NMR delivers complete backbone resonance assignments and secondary structure in an optimal fashion without human intervention. ADAPT-NMR achieves this by implementing a strategy in which the goal of optimal assignment in each step determines the subsequent step by analyzing the current sum of available data. ADAPT-NMR is the first iterative and fully automated approach designed specifically for the optimal assignment of proteins with fast data collection as a byproduct of this goal. ADAPT-NMR evaluates the current spectral information, uses a goal-directed objective function to select the optimal next data collection step(s), and then directs the NMR spectrometer to collect the selected data set. ADAPT-NMR extracts peak positions from the newly collected data and uses this information in updating the analysis of resonance assignments and secondary structure. The goal-directed objective function then defines the next data collection step. The procedure continues until the collected data support comprehensive peak identification, resonance assignments at the desired level of completeness, and protein secondary structure. We present test cases in which ADAPT-NMR achieved results in two days or less that would have taken two months or more by manual approaches.

9.
A double quantum filter is inserted into a two-dimensional correlated (COSY) 1H NMR experiment to obtain phase-sensitive spectra in which both cross peak and diagonal peak multiplets have anti-phase fine structure, and in which the cross peaks and the major contribution to the diagonal peaks have absorption lineshapes in both dimensions. The elimination of the dispersive character of the diagonal peaks in phase-sensitive, double quantum-filtered COSY spectra allows identification of cross peaks lying immediately adjacent to the diagonal, which represents a significant improvement over the conventional COSY experiment.

10.
Summary: Determining the volumes of peaks in 2D NMR spectra can be prohibitively difficult in cases of overlapping, broad lines. Deconvolution and parameter estimation can be attempted on either the time-domain or the frequency-domain data. We present a method of estimating spectral parameters from frequency-domain data, using a combination of Lorentzian and Gaussian lineshapes for reference lines. This approach combines a previously published method of projecting the data on a linear space spanned by reference lines with a nonlinear least-squares fitting algorithm. Comparison of this method with other published methods of frequency-domain deconvolution shows that it is both more precise and more accurate when estimating 2D volumes.
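
The linear part of such a scheme, projecting the data onto a space spanned by reference lines, can be sketched in one dimension as follows. This is an illustrative assumption rather than the published method: it uses one Lorentzian and one Gaussian reference line with known positions and widths, solves the 2x2 normal equations for the two amplitudes, and omits the nonlinear least-squares refinement of positions and widths. All function names are hypothetical.

```python
import math


def lorentzian(x, x0, width):
    """Unit-height Lorentzian centred at x0 with full width at half maximum `width`."""
    half = width / 2.0
    return half ** 2 / ((x - x0) ** 2 + half ** 2)


def gaussian(x, x0, width):
    """Unit-height Gaussian centred at x0 with full width at half maximum `width`."""
    sigma = width / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    return math.exp(-0.5 * ((x - x0) / sigma) ** 2)


def fit_two_amplitudes(y, ref_a, ref_b):
    """Least-squares amplitudes (a, b) minimising sum((y - a*ref_a - b*ref_b)^2).

    Solves the 2x2 normal equations directly; the reference lines must not
    be proportional to each other, otherwise the determinant is zero.
    """
    aa = sum(p * p for p in ref_a)
    bb = sum(q * q for q in ref_b)
    ab = sum(p * q for p, q in zip(ref_a, ref_b))
    ay = sum(p * v for p, v in zip(ref_a, y))
    by = sum(q * v for q, v in zip(ref_b, y))
    det = aa * bb - ab * ab
    a = (ay * bb - ab * by) / det
    b = (aa * by - ab * ay) / det
    return a, b
```

Because the amplitudes enter linearly, they can be recovered in closed form even when the two lines overlap; only the positions and widths would need the iterative nonlinear step.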

11.
Efficient analysis of protein 2D NMR spectra using the software package EASY   Total citations: 10; self-citations: 0; citations by others: 10
Summary: The program EASY supports the spectral analysis of biomacromolecular two-dimensional (2D) nuclear magnetic resonance (NMR) data. It provides a user-friendly, window-based environment in which to view spectra for interactive interpretation. In addition, it includes a number of automated routines for peak picking, spin-system identification, sequential resonance assignment in polypeptide chains, and cross peak integration. In this uniform environment, all resulting parameter lists can be recorded on disk, so that the paper plots and handwritten notes which normally accompany manual assignment of spectra can be largely eliminated. For example, in a protein structure determination by 2D 1H NMR, EASY accepts the frequency domain datasets as input, and after combined use of the automated and interactive routines it can yield a listing of conformational constraints in the format required as input for the calculation of the 3D structure. The program was extensively tested with current protein structure determinations in our laboratory. In this paper, its main features are illustrated with data on the protein basic pancreatic trypsin inhibitor.

12.
Software for fitting of NMR spectra in MATLAB is presented. Spectra are fitted in the frequency domain, using Fourier transformed lineshapes, which are derived using the experimental acquisition and processing parameters. This yields more accurate fits compared to common fitting methods that use Lorentzian or Gaussian functions. Furthermore, a very time-efficient algorithm for calculating and fitting spectra has been developed. The software also performs initial peak picking, followed by subsequent fitting and refinement of the peak list, by iteratively adding and removing peaks to improve the overall fit. Estimation of error on fitting parameters is performed using a Monte-Carlo approach. Many fitting options allow the software to be flexible enough for a wide array of applications, while still being straightforward to set up with minimal user input.

13.
Peak lists derived from nuclear magnetic resonance (NMR) spectra are commonly used as input data for a variety of computer assisted and automated analyses. These include automated protein resonance assignment and protein structure calculation software tools. Prior to these analyses, peak lists must be aligned to each other and sets of related peaks must be grouped based on common chemical shift dimensions. Even when programs can perform peak grouping, they require the user to provide uniform match tolerances or use default values. However, peak grouping is further complicated by multiple sources of variance in peak position limiting the effectiveness of grouping methods that utilize uniform match tolerances. In addition, no method currently exists for deriving peak positional variances from single peak lists for grouping peaks into spin systems, i.e. spin system grouping within a single peak list. Therefore, we developed a complementary pair of peak list registration analysis and spin system grouping algorithms designed to overcome these limitations. We have implemented these algorithms into an approach that can identify multiple dimension-specific positional variances that exist in a single peak list and group peaks from a single peak list into spin systems. The resulting software tools generate a variety of useful statistics on both a single peak list and pairwise peak list alignment, especially for quality assessment of peak list datasets. We used a range of low and high quality experimental solution NMR and solid-state NMR peak lists to assess performance of our registration analysis and grouping algorithms. Analyses show that an algorithm using a single iteration and uniform match tolerances approach is only able to recover from 50 to 80% of the spin systems due to the presence of multiple sources of variance. Our algorithm recovers additional spin systems by reevaluating match tolerances in multiple iterations. 
To facilitate evaluation of the algorithms, we developed a peak list simulator within our nmrstarlib package that generates user-defined assigned peak lists from a given BMRB entry or database of entries. In addition, over 100,000 simulated peak lists with one or two sources of variance were generated to evaluate the performance and robustness of these new registration analysis and peak grouping algorithms.
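
For orientation, the baseline that this abstract contrasts itself with (a single iteration with uniform match tolerances) can be sketched as below. This is not the authors' registration or grouping algorithm and it does not use the nmrstarlib API. The greedy seeding strategy, the function name and the tolerance values in the usage note are assumptions chosen for illustration.

```python
def group_peaks(peaks, tolerances):
    """Greedy single-pass grouping with uniform per-dimension match tolerances.

    `peaks` is a list of tuples holding the shared (root) chemical shifts of
    each peak, e.g. (1H, 15N). A peak joins the first existing group whose
    seed (first member) lies within the tolerance in every dimension;
    otherwise it starts a new group.
    """
    groups = []
    for peak in peaks:
        for group in groups:
            seed = group[0]
            if all(abs(p - s) <= t for p, s, t in zip(peak, seed, tolerances)):
                group.append(peak)
                break
        else:
            # No existing group matched, so this peak seeds a new spin system.
            groups.append([peak])
    return groups
```

For example, `group_peaks([(8.20, 120.1), (8.21, 120.0), (7.50, 115.3)], (0.03, 0.3))` puts the first two peaks in one group and the third in another. A fixed tolerance of this kind cannot adapt to several sources of positional variance, which is the limitation the abstract says its iterative re-evaluation of match tolerances addresses.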

14.
High-throughput, data-directed computational protocols for Structural Genomics (or Proteomics) are required in order to evaluate the protein products of genes for structure and function at rates comparable to current gene-sequencing technology. This paper presents the JIGSAW algorithm, a novel high-throughput, automated approach to protein structure characterization with nuclear magnetic resonance (NMR). JIGSAW applies graph algorithms and probabilistic reasoning techniques, enforcing first-principles consistency rules in order to overcome a 5-10% signal-to-noise ratio. It consists of two main components: (1) graph-based secondary structure pattern identification in unassigned heteronuclear NMR data, and (2) assignment of spectral peaks by probabilistic alignment of identified secondary structure elements against the primary sequence. Deferring assignment eliminates the bottleneck faced by traditional approaches, which begin by correlating peaks among dozens of experiments. JIGSAW utilizes only four experiments, none of which requires 13C-labeled protein, thus dramatically reducing both the amount and expense of wet lab molecular biology and the total spectrometer time. Results for three test proteins demonstrate that JIGSAW correctly identifies 79-100% of alpha-helical and 46-65% of beta-sheet NOE connectivities and correctly aligns 33-100% of secondary structure elements. JIGSAW is very fast, running in minutes on a Pentium-class Linux workstation. This approach yields quick and reasonably accurate (as opposed to the traditional slow and extremely accurate) structure calculations. It could be useful for quick structural assays to speed data to the biologist early in an investigation and could in principle be applied in an automation-like fashion to a large fraction of the proteome.

15.
Desulforedoxin is a simple dimeric protein isolated from Desulfovibrio gigas containing a distorted rubredoxin-like center with one iron coordinated by four cysteinyl residues (7.9 kDa with a 36-amino-acid monomer). 1H NMR spectra of the oxidized Dx(Fe3+) and reduced Dx(Fe2+) forms were analyzed. The spectra show substantial line broadening due to the paramagnetism of iron. However, very low-field-shifted resonances, assigned to Hβ protons, were observed in the reduced state and their temperature dependence analyzed. The active site of Dx was reconstituted with zinc, and its solution structure was determined using 2D NMR methods. This diamagnetic form gave high-resolution NMR data enabling the identification of all the amino acid spin systems. Sequential assignment and the determination of secondary structural elements were attempted using 2D NOESY experiments. However, because of the symmetrical dimer nature of the protein, standard NMR sequential assignment methods could not resolve all cross peaks due to inter- and intra-chain effects. The X-ray structure enabled the spatial relationship between the monomers to be obtained, and resolved the assignment problems. Secondary structural features could be identified from the NMR data: an antiparallel β-sheet running from D5 to V18 with a well-defined β-turn around cysteines C9 and C12. The section G22 to T25 is poorly defined by the NMR data and is followed by a turn around V27-C29. The C-terminus ends up near residues V6 and Y7. Distance geometry (DG) calculations allowed families of structures to be generated from the NMR data. A family of structures with a low target function violation for the Dx monomer and dimer was found to have secondary structural elements identical to those seen in the X-ray structure. 
The G4, D5, G13 and L11 NH and Q14 NHε amide protons, which are H-bonded in the X-ray structure, were not seen by NMR as slowly exchanging, while structural disorder at the N-terminus, for the backbone at E10 and for the section G22–T25, was observed. Comparison between the Fe and Zn forms of Dx suggests that metal substitution does not have an effect on the structure of the protein.

16.
Peptide mass fingerprinting (PMF), despite having become complementary to tandem mass spectrometry for protein identification, is still the subject of in-depth study because of its higher sample throughput, higher level of specificity for single peptides and lower level of sensitivity to unexpected post-translational modifications compared with tandem mass spectrometry. In this study, we propose, implement and evaluate a uniform approach using support vector machines to incorporate individual concepts and conclusions for accurate PMF. We focus on the inherent attributes and critical issues of the theoretical spectrum (peptides), the experimental spectrum (peaks) and spectrum (masses) alignment. Eighty-one feature-matching patterns derived from cleavage type, uniqueness and variable masses of theoretical peptides together with the intensity rank of experimental peaks were proposed to characterize the matching profile of the peptide mass fingerprinting procedure. We developed a new strategy including the participation of matched peak intensity redistribution to handle shared peak intensities, and 440 parameters were generated to digitalize each feature-matching pattern. A high performance for an evaluation data set of 137 items was finally achieved by the optimal multi-criteria support vector machines approach, with 491 final features out of a feature vector of 35,640 normalized features through cross training and validating a publicly available "gold standard" peptide mass fingerprinting data set of 1733 items. Compared with the Mascot, MS-Fit, ProFound and Aldente algorithms commonly used for MS-based protein identification, the feature-matching patterns algorithm has a greater ability to clearly separate correct identifications and random matches with the highest values for sensitivity (82%), precision (97%) and F1-measure (89%) of protein identification. Several conclusions reached via this research make general contributions to MS-based protein identification. 
Firstly, inherent attributes showed comparable or even greater robustness than other explicit attributes. As an inherent attribute of an experimental spectrum, peak intensity should receive considerable attention during protein identification. Secondly, alignment between intense experimental peaks and properly digested, unique or non-modified theoretical peptides is very likely to occur in positive peptide mass fingerprinting. Finally, normalization by several types of harmonic factors, including missed cleavages and mass modification, can make important contributions to the performance of the procedure.

17.
Elucidation of high-resolution protein structures by NMR spectroscopy requires a large number of distance constraints that are derived from nuclear Overhauser effects between protons (NOEs). Due to the high level of spectral overlap encountered in 2D NMR spectra of proteins, the measurement of high quality distance constraints requires higher dimensional NMR experiments. Although four-dimensional Fourier transform (FT) NMR experiments can provide the necessary kind of spectral information, the associated measurement times are often prohibitively long. Covariance NMR spectroscopy yields 2D spectra that exhibit along the indirect frequency dimension the same high resolution as along the direct dimension using minimal measurement time. The generalization of covariance NMR to 4D NMR spectroscopy presented here exploits the inherent symmetry of certain 4D NMR experiments and utilizes the trace metric between donor planes for the construction of a high-resolution spectral covariance matrix. The approach is demonstrated for a 4D 13C-edited NOESY experiment of ubiquitin. The 4D covariance spectrum narrows the line-widths of peaks strongly broadened in the FT spectrum due to the necessarily small number of increments collected, and it resolves otherwise overlapped cross peaks, allowing for an increase in the number of NOE assignments to be made from a given dataset. At the same time there is no significant decrease in the positive predictive value of observing a peak as compared to the corresponding 4D Fourier transform spectrum. These properties make the 4D covariance method a potentially valuable tool for the structure determination of larger proteins and for high-throughput applications in structural biology.

18.
A novel formalism for estimating the complex motions of proteins and other flexible macromolecules from NMR relaxation measurements is applied to 13C NMR relaxation data on the Bovine Pancreatic Trypsin Inhibitor (M. W. 6,500). Six experimental parameters measured at two field strengths are accounted for by a minimum of three motions at each carbon group. Low frequency components make a small but finite contribution to the relaxation of all resonances, suggesting a general low frequency distortion of the backbone. Rotational diffusion of the protein makes a relatively minor contribution to the relaxation process. For aliphatic groups, rotation of side chains dominates relaxation.

19.
The structure of a ubiquitin-like protein, small ubiquitin-related modifier-1 (SUMO-1), was earlier determined using homonuclear nuclear magnetic resonance (NMR) spectroscopy, since the spectral quality of the protein was not suitable for heteronuclear NMR data collection. In this study, a slightly different construct of the SUMO-1 gene was used for protein over-expression. The protein purified from this construct showed high spectral quality; therefore, multi-dimensional heteronuclear NMR data for a dynamic study and structural determination were acquired. The structure of SUMO-1 obtained in this study differs in several respects from the structure obtained from homonuclear NMR data. Furthermore, structural differences were observed between the new SUMO-1 and ubiquitin structures. These differences may be important for SUMO-1-specific recognition in cells. Additionally, relaxation parameters indicate that SUMO-1 undergoes highly anisotropic tumbling in solution and that the long amino (N)-terminal sequence of SUMO-1 is highly dynamic with increasing flexibility towards the end.

20.
We develop an iterative relaxation algorithm called RIBRA for NMR protein backbone assignment. RIBRA applies nearest neighbor and weighted maximum independent set algorithms to solve the problem. To deal with noisy NMR spectral data, RIBRA is executed in an iterative fashion based on the quality of spectral peaks. We first produce spin system pairs using the spectral data without missing peaks, then the data group with one missing peak, and finally, the data group with two missing peaks. We test RIBRA on two real NMR datasets, hbSBD and hbLBD, on perfect BMRB data (with 902 proteins), and on four synthetic BMRB datasets that simulate four kinds of errors. The accuracies of RIBRA on hbSBD and hbLBD are 91.4% and 83.6%, respectively. The average accuracy of RIBRA on perfect BMRB datasets is 98.28%, and 98.28%, 95.61%, 98.16%, and 96.28% on four kinds of synthetic datasets, respectively.
