Similar literature
Found 20 similar articles.
1.
Protein classification and characterization often rely on the information contained in the protein secondary structure. Protein class assignment is usually based on X-ray diffraction measurements, which need the protein in a crystallized form, or on NMR spectra, to obtain the structure of a protein in solution. Simple spectroscopic techniques, such as circular dichroism (CD) and infrared (IR) spectroscopies, are also known to be related to protein secondary structure, but they have seldom been used for protein classification. To see the potential of CD, IR, and combined CD/IR measurements for protein classification, unsupervised pattern recognition methods, Principal Component Analysis (PCA) and cluster analysis, are proposed first to check for natural grouping tendencies of proteins according to their measured spectra. Partial Least Squares Discriminant Analysis (PLS-DA), a supervised pattern recognition method, is used afterwards to test whether each protein class can be modeled explicitly and whether these models can assign unknown proteins to a class. Determination of the protein secondary structure, understood as the prediction of the abundance of the different secondary structure motifs in the biomolecule, was carried out with the local regression method interval Partial Least Squares (iPLS). CD, IR, and CD/IR measurements were correlated to the fraction of the motif to be predicted, determined from X-ray measurements. iPLS builds models extracting the spectral information most correlated to a specific secondary motif and avoids the use of irrelevant spectral regions. Spectral intervals chosen by iPLS models provide structural information which can be used to confirm previous biochemical assignments or identify new motif-related spectral features. The predictive ability of the models built with the selected spectral regions has a quality similar to previous classical approaches.

2.

Background  

Circular Dichroism (CD) spectroscopy is a widely used method for studying protein structures in solution. Modern synchrotron radiation CD (SRCD) instruments have considerably higher photon fluxes than do conventional lab-based CD instruments, and hence have the ability to routinely measure CD data to much lower wavelengths. Recently a new reference dataset of SRCD spectra of proteins of known structure, designed to cover secondary structure and fold space, has been produced which includes low wavelength (vacuum ultraviolet, VUV) data. However, the existing algorithms used to calculate protein secondary structures from CD data have not been designed to take optimal advantage of the additional information in these low wavelength data.

3.
Matsuo K, Watanabe H, Gekko K. Proteins 2008, 73(1):104-112
Synchrotron-radiation vacuum-ultraviolet circular dichroism (VUVCD) spectroscopy can significantly improve the predictive accuracy of the contents and segment numbers of protein secondary structures by extending the short-wavelength limit of the spectra. In the present study, we combined VUVCD spectra down to 160 nm with a neural-network (NN) method to improve the sequence-based prediction of protein secondary structures. The secondary structures of 30 target proteins (test set) were assigned into alpha-helices, beta-strands, and others by the DSSP program based on their X-ray crystal structures. Combining the alpha-helix and beta-strand contents estimated from the VUVCD spectra of the target proteins improved the overall sequence-based predictive accuracy Q3 for three secondary-structure components from 59.5 to 60.7%. Incorporating the position-specific scoring matrix in the NN method improved the predictive accuracy from 70.9 to 72.1% when combining the secondary-structure contents, to 72.5% when combining the numbers of segments, and finally to 74.9% when filtering the VUVCD data. Improvement in the sequence-based prediction of secondary structures was also apparent in two other indices of the overall performance: the correlation coefficient (C) and the segment overlap value (SOV). These results suggest that VUVCD data could enhance the predictive accuracy to over 80% when combined with the currently best sequence-prediction algorithms, greatly expanding the applicability of VUVCD spectroscopy to protein structural biology.
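As a rough illustration of the three-state accuracy reported above, the sketch below computes a Q3-style score as the percentage of residues whose predicted state matches the assigned state. The one-letter labels (H for helix, E for strand, C for other), the function name, and the example strings are assumptions made for illustration; the abstract does not describe the authors' own implementation.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Hypothetical 10-residue example (not data from the study).
observed = "HHHHCCEEEC"   # e.g. a DSSP-style assignment reduced to three states
predicted = "HHHCCCEEEE"  # e.g. the output of a sequence-based predictor
print(q3_accuracy(predicted, observed))  # 8 of 10 residues agree
```

A per-residue score like this says nothing about whether whole segments are recovered, which is presumably why the abstract also reports the segment overlap value.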

4.
The Protein Circular Dichroism Data Bank (PCDDB) is a newly released resource for structural biology. It is a web-accessible (http://pcddb.cryst.bbk.ac.uk) data bank for circular dichroism (CD) and synchrotron radiation circular dichroism (SRCD) spectra and their associated experimental and secondary metadata, with links to protein sequence and structure data banks. It is designed to provide a public repository for CD spectroscopic data on macromolecules, to parallel the Protein Data Bank (PDB) for crystallographic, electron microscopic, and nuclear magnetic resonance spectroscopic data. Similarly to the PDB, it includes validation checking procedures to ensure good practice and the integrity of the deposited data. This paper reports on the first public release of the PCDDB, which provides access to spectral data that comprise standard reference datasets.

5.
Selected regions of infrared (IR) and circular dichroism (CD) spectral data from 10 proteins were combined and analyzed by a factor analysis method. The regions consisted of the area-normalized amide I region from 1700 to 1600 cm⁻¹ for the IR spectra and from 178 to 240 nm for the CD spectra. Each CD spectrum was scaled by a factor of 0.5 before appending the data to the IR spectral data. The scaling factor was deemed necessary to account for relative intensity differences between the IR and CD data and provided nearly optimum agreement between the secondary structure estimated by the combined approach and that determined by X-ray crystallography. The IR/CD combined approach to estimation of helix, beta-sheet, beta-turn, and other or undefined secondary structure agreed with the structure determined by X-ray crystallography better than estimation using data from either method alone. Correlation coefficients between X-ray and IR/CD combined secondary structure determinations were 0.99 for helix, 0.90 for beta-sheet, 0.70 for beta-turn, and 0.78 for other structure. The four most significant eigenvectors or basis spectra from eigenanalysis of the IR/CD data are presented as well as generalized inverse spectra for four secondary structures.
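The data preparation described above can be sketched as follows: area-normalize the IR amide I band, scale the CD spectrum by 0.5, and append one to the other to form a single vector per protein. This is a minimal sketch, not the authors' code. It assumes uniformly spaced points so that area normalization reduces to dividing by the sum; the function names and numeric values are made up for illustration.

```python
CD_SCALE = 0.5  # scaling factor stated in the abstract


def area_normalize(values):
    """Scale a spectrum so its points sum to 1 (uniform spacing assumed)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize a spectrum with zero area")
    return [v / total for v in values]


def combine_ir_cd(ir_amide_i, cd, cd_scale=CD_SCALE):
    """Append the scaled CD spectrum to the area-normalized IR amide I band."""
    return area_normalize(ir_amide_i) + [cd_scale * v for v in cd]


# Made-up values standing in for one protein's spectra.
ir = [1.0, 3.0, 4.0, 2.0]   # absorbance across the amide I region
cd = [-4.0, 2.0, 6.0]       # CD signal across the far-UV region
combined = combine_ir_cd(ir, cd)
print(combined)
```

Each protein then contributes one combined vector, and the stacked vectors form the data matrix passed to the factor analysis.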

6.
Circular dichroism (CD) spectroscopy is a widely-used method for characterizing the secondary structures of proteins. The well-established and highly used analysis website, DichroWeb (located at: http://dichroweb.cryst.bbk.ac.uk/html/home.shtml) enables the facile quantitative determination of helix, sheet, and other secondary structure contents of proteins based on their CD spectra. DichroWeb includes a range of reference datasets and algorithms, plus graphical and quantitative methods for determining the quality of the analyses produced. This article describes the current website content, usage and accessibility, as well as the many upgraded features now present in this highly popular tool that was originally created nearly two decades ago.

7.
Circular dichroism (CD) is an excellent tool for rapid determination of the secondary structure and folding properties of proteins that have been obtained using recombinant techniques or purified from tissues. The most widely used applications of protein CD are to determine whether an expressed, purified protein is folded, or if a mutation affects its conformation or stability. In addition, it can be used to study protein interactions. This protocol details the basic steps of obtaining and interpreting CD data, and methods for analyzing spectra to estimate the secondary structural composition of proteins. CD has the advantage that measurements may be made on multiple samples containing ≤20 μg of protein in physiological buffers in a few hours. However, it does not give the residue-specific information that can be obtained by X-ray crystallography or NMR.

8.
The effects of spectral magnitude on the calculated secondary structures derived from circular dichroism (CD) spectra were examined for a number of the most commonly used algorithms and reference databases. Proteins with different secondary structures, ranging from mostly helical to mostly beta-sheet, but which were not components of existing reference databases, were used as test systems. These proteins had known crystal structures, so it was possible to ascertain the effects of magnitude on both the accuracy of determining the secondary structure and the goodness-of-fit of the calculated structures to the experimental data. It was found that most algorithms are highly sensitive to spectral magnitude, and that the goodness-of-fit parameter may be a useful tool in assessing the correct scaling of the data. This means that parameters that affect magnitude, including calibration of the instrument, the spectral cell pathlength, and the protein concentration, must be accurately determined to obtain correct secondary structural analyses of proteins from CD data using empirical methods.
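To make the dependence on pathlength and concentration concrete, the sketch below uses the standard conversion from observed ellipticity (millidegrees) to mean residue ellipticity. The conversion itself is textbook CD practice rather than something stated in the abstract, and the sample values are invented for illustration.

```python
def mean_residue_ellipticity(theta_mdeg, mean_residue_weight,
                             pathlength_cm, conc_mg_per_ml):
    """Return [theta] in deg cm^2 dmol^-1 from observed ellipticity in mdeg."""
    if pathlength_cm <= 0 or conc_mg_per_ml <= 0:
        raise ValueError("pathlength and concentration must be positive")
    return theta_mdeg * mean_residue_weight / (
        10.0 * pathlength_cm * conc_mg_per_ml
    )


# Invented measurement: -20 mdeg at 222 nm, 0.1 cm cell, MRW of 110 Da.
nominal = mean_residue_ellipticity(-20.0, 110.0, 0.1, 0.10)
# Same reading, but the true concentration was 10% higher than assumed.
actual = mean_residue_ellipticity(-20.0, 110.0, 0.1, 0.11)
print(nominal, actual, nominal / actual)
```

Because the concentration sits in the denominator, a 10% concentration error rescales the whole spectrum by the same ratio, which is the kind of magnitude error the abstract reports most algorithms to be sensitive to.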

9.
We have used the circular dichroism and infrared spectra of a specially designed 50 protein database [Oberg, K.A., Ruysschaert, J.M. & Goormaghtigh, E. (2003) Protein Sci. 12, 2015-2031] in order to optimize the accuracy of spectroscopic protein secondary structure determination using multivariate statistical analysis methods. The results demonstrate that when the proteins are carefully selected for the diversity in their structure, no smaller subset of the database contains the necessary information to describe the entire set. One conclusion of the paper is therefore that large protein databases, observing stringent selection criteria, are necessary for the prediction of unknown proteins. A second important conclusion is that only the comparison of analyses run on circular dichroism and infrared spectra independently is able to identify failed solutions in the absence of known structure. Interestingly, it was also found in the course of this study that the amide II band has high information content and could be used alone for secondary structure prediction in place of amide I.

10.
Secondary structures of proteins have been predicted using neural networks from their Fourier transform infrared spectra. To improve the generalization ability of the neural networks, the training data set has been artificially increased by linear interpolation. The leave-one-out approach has been used to demonstrate the applicability of the method. Bayesian regularization has been used to train the neural networks and the predictions have been further improved by the maximum-likelihood estimation method. The networks have been tested, and standard errors of prediction (SEP) of 4.19% for alpha helix, 3.49% for beta sheet, and 3.15% for turns have been achieved. The results indicate that there is a significant decrease in the SEP for each type of structure parameter compared to previous works.
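The abstract does not say how the interpolation was carried out. One plausible reading, sketched here purely as an assumption, is to create a synthetic training example between every pair of measured proteins by interpolating both the spectrum and the secondary-structure fractions with the same weight. The function name, the weight `t`, and the toy data are hypothetical.

```python
from itertools import combinations


def interpolate_pairs(samples, t=0.5):
    """Create synthetic (spectrum, fractions) pairs between every pair of samples.

    samples: list of (spectrum, fractions) tuples of equal-length lists.
    t: interpolation weight; 0.5 gives the midpoint (an assumed default).
    """
    synthetic = []
    for (spec_a, frac_a), (spec_b, frac_b) in combinations(samples, 2):
        spec = [(1 - t) * a + t * b for a, b in zip(spec_a, spec_b)]
        frac = [(1 - t) * a + t * b for a, b in zip(frac_a, frac_b)]
        synthetic.append((spec, frac))
    return synthetic


# Toy data: two-point "spectra" with (helix, sheet) fractions.
samples = [
    ([0.0, 2.0], [1.0, 0.0]),
    ([2.0, 4.0], [0.0, 1.0]),
    ([4.0, 0.0], [0.5, 0.5]),
]
augmented = samples + interpolate_pairs(samples)
print(len(augmented))  # 3 measured + 3 interpolated examples
```

In a leave-one-out test, interpolated examples derived from the held-out protein would need to be excluded from training to avoid leaking information about it.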

11.
A new procedure based on the statistical method of "variable selection" is used to predict the secondary structure of proteins from circular dichroism spectra. Variable selection adds the flexibility found in the Provencher and Glöckner method (S. W. Provencher and J. Glöckner, 1981, Biochemistry 20, 33-37) to the method of Hennessey and Johnson (J. P. Hennessey and W. C. Johnson, 1981, Biochemistry 20, 1085-1094). Two analytical methods are presented for choosing a solution from the series generated by the Provencher and Glöckner method, and this improves the technique. All three methods are compared and it is shown that both the variable selection method and the improved Provencher and Glöckner methods have equivalent reliability superior to the original Hennessey and Johnson method. For the new variable selection method, correlation coefficients calculated between X-ray structure and predicted secondary structures for data measured to 178 nm are: 0.97 for alpha-helix, 0.75 for beta-sheet, 0.50 for beta-turn, and 0.89 for other structures. Although the variable selection method improves the analysis of circular dichroism data truncated at 190 nm, data measured to 178 nm gives superior results. It is shown that improving the fit to the measured CD beyond the accuracy of the data can result in poorer analyses.

12.
Three different approaches (propensity curve shifting, hydropathy index evaluation, and iterative attribution/cancellation of secondary structure) to the use of secondary structure percentages derived from circular dichroism measurements to improve the success rate of a protein secondary structure prediction method, without using decision constants, are described and compared. Propensity-curve shifting appears to be the best-performing approach, bearing an increase of 5.3% in the success rate of single-residue structural prediction when exact information on the secondary structure, obtained by X-ray crystallography, is employed; with information of an accuracy comparable to that obtainable by circular dichroism, the improvement stays between 3.5 and 4.9%, for a three-state prediction. Although developed with circular dichroism in mind, the method can use percentages of secondary structure obtained by any other experimental methodology from which they can be inferred, for instance Raman spectroscopy and infrared spectroscopy.

13.
Estimation of a protein's secondary structure from its circular dichroism spectrum usually requires accurate knowledge of the concentration and pathlength of the sample. Two recently described methods avoid this problem by analysis of g-factor spectra (McPhie, Anal. Biochem. 293, 109-119) or scaling of relative intensities (Raussens et al., Anal. Biochem. 319, 114-121). Application of the two methods to the same samples shows that they can have similar efficacies. Calculation with the latter method is more rapid, but the performance of the former is maintained over reduced wavelength ranges.
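The reason a g-factor spectrum sidesteps concentration and pathlength can be shown with a short sketch. The g-factor is the ratio of differential absorbance to ordinary absorbance at each wavelength, and both scale linearly with concentration and pathlength, so the product cancels. The conversion of ellipticity to differential absorbance (32980 mdeg per absorbance unit) is the standard relation; the function name and spectra below are invented and are not taken from the cited method.

```python
MDEG_PER_DELTA_A = 32980.0  # standard relation: theta (mdeg) = 32980 * delta-A


def g_factor(theta_mdeg, absorbance):
    """Return the g-factor (delta-A / A) at each wavelength."""
    if len(theta_mdeg) != len(absorbance):
        raise ValueError("spectra must have the same number of points")
    return [(t / MDEG_PER_DELTA_A) / a for t, a in zip(theta_mdeg, absorbance)]


# Invented CD and absorbance readings at three wavelengths.
theta = [-15.0, -12.0, 30.0]
absorb = [0.40, 0.55, 0.90]

# Doubling the concentration doubles both signals (Beer-Lambert regime).
theta_2x = [2 * t for t in theta]
absorb_2x = [2 * a for a in absorb]

print(g_factor(theta, absorb))
print(g_factor(theta_2x, absorb_2x))  # same values as the line above
```

The trade-off is that the absorbance spectrum must be recorded on the same sample and must be free of contributions from buffer components that absorb in the far UV.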

14.
Circular dichroism (CD) is a spectroscopic technique widely used for estimating protein secondary structures in aqueous solution, but its accuracy has been doubted in recent work. In the present paper, the secondary-structure contents of nine globular proteins with known structures were determined by CD spectroscopy and Fourier transform infrared spectroscopy (FTIR) in aqueous solution. A large deviation was found between the CD estimates and the X-ray data, even when the experimental conditions were optimized. The content determined by FTIR was in good agreement with the X-ray crystallography data. Therefore, CD spectra are not recommended for directly calculating the content of a protein’s secondary structure.

15.
The various methods used to study the secondary and tertiary structure of myelin P2 protein, and one of its peptides (CN1), in aqueous solution, indicate that the native protein contains a significant fraction of alpha-helix, suggesting that the current prediction of an all-beta tertiary structure requires revision.

16.
17.
Circular dichroism spectra of proteins are extremely sensitive to secondary structure. Nevertheless, circular dichroism spectra should not be analyzed for protein secondary structure unless they are measured to at least 184 nm. Even if all the various types of β-turns are lumped together, there are at least 5 different types of secondary structure in a protein (α-helix, antiparallel β-sheet, parallel β-sheet, β-turn, and other structures not included in the first 4 categories). It is not possible to solve for these 5 parameters unless there are 5 equations. Singular value decomposition can be used to show that circular dichroism spectra of proteins measured to 200 nm contain only 2 pieces of information, while spectra measured to 190 nm contain about 4. Adding the constraint that the sum of secondary structures must equal 1 provides another piece of information, but even with this constraint, spectra measured to 190 nm simply do not analyze well for the 5 unknowns in secondary structure. Spectra measured to 184 nm do contain 5 pieces of information and we have used such spectra successfully to analyze a variety of proteins for their component secondary structures.

18.
Heat shock proteins are rapidly synthesized when cells are exposed to stressful agents that cause protein damage. The 70-kDa heat shock induced proteins and their closely related constitutively expressed cognate proteins bind to unfolded and aberrant polypeptides and to hydrophilic peptides. The structural features of the 70-kDa heat shock proteins that confer the ability to associate with diverse polypeptides are unknown. In this study, we have used circular dichroism (CD) spectroscopy and secondary structure prediction to analyze the secondary structure of the mammalian 70-kDa heat shock cognate protein (hsc 70). The far-ultraviolet CD spectrum of hsc 70 indicates a large fraction of alpha-helix in the protein and resembles the spectra one obtains from proteins of the alpha/beta structural class. Analysis of the CD spectra with deconvolution methods yielded estimates of secondary structure content. The results indicate about 40% alpha-helix and 20% aperiodic structure within hsc 70 and between 16-41% beta-sheet and 21-0% beta-turn. The Garnier-Osguthorpe-Robson method of secondary structure prediction was applied to the rat hsc 70 amino acid sequence. The predicted estimates of alpha-helix and aperiodic structure closely matched the values derived from the CD analysis, whereas the predicted estimates of beta-sheet and beta-turn were midway between the CD-derived values. Present evidence suggests that the polypeptide ligand binding domain of the 70-kDa heat shock protein resides within the C-terminal 160 amino acids [Milarski, K. L., & Morimoto, R. I. (1989) J. Cell Biol. 109, 1947-1962]. (ABSTRACT TRUNCATED AT 250 WORDS)

19.
The estimation of protein secondary structure from circular dichroism spectra is described by a multivariate linear model with noise (Gauss-Markoff model). With this formalism the adequacy of the linear model is investigated, paying special attention to the estimation of the error in the secondary structure estimates. It is shown that the linear model is only adequate for the alpha-helix class. Since the failure of the linear model is most likely due to nonlinear effects, a locally linearized model is introduced. This model is combined with the selection of the estimate whose fractions of secondary structure sum to approximately one. Comparing the estimation from the CD spectra with the X-ray data (by using the data set of W.C. Johnson Jr., 1988, Annu. Rev. Biophys. Chem. 17, 145-166) the root mean square residuals are 0.09 (alpha-helix), 0.12 (anti-parallel beta-sheet), 0.08 (parallel beta-sheet), 0.07 (beta-turn), and 0.09 (other). These residuals are somewhat larger than the errors estimated from the locally linearized model. In addition to alpha-helix, in this model the beta-turn and "other" class are estimated adequately. But the estimation of the antiparallel and parallel beta-sheet class remains unsatisfactory. We compared the linear model and the locally linearized model with two other methods (S. W. Provencher and J. Glöckner, 1981, Biochemistry 20, 33-37; P. Manavalan and W. C. Johnson Jr., 1988, Anal. Biochem. 167, 76-85). The locally linearized model and the Provencher and Glöckner method provided the smallest residuals. However, an advantage of the locally linearized model is the estimation of the error in the secondary structure estimates.
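For orientation, one common way to write a linear model with noise for this problem is given below. The abstract does not state the paper's exact parameterization, so every symbol here is an assumption: c is the measured spectrum sampled at wavelengths λ_i, B_k is a basis spectrum for secondary-structure class k, f_k is the fraction of that class, and ε_i is zero-mean noise with constant variance σ².

```latex
\begin{align}
c(\lambda_i) &= \sum_{k=1}^{K} f_k \, B_k(\lambda_i) + \varepsilon_i,
  \qquad \mathrm{E}[\varepsilon_i] = 0, \quad \mathrm{Var}[\varepsilon_i] = \sigma^2 \\
\hat{\mathbf{f}} &= \left(\mathbf{B}^{\mathsf{T}}\mathbf{B}\right)^{-1}\mathbf{B}^{\mathsf{T}}\mathbf{c},
  \qquad \mathrm{Cov}\bigl[\hat{\mathbf{f}}\bigr] = \sigma^2 \left(\mathbf{B}^{\mathsf{T}}\mathbf{B}\right)^{-1} \\
\sum_{k=1}^{K} \hat{f}_k &\approx 1
\end{align}
```

Under these assumptions the covariance expression is what provides an error estimate for each fraction, and the last line corresponds to the selection criterion that the estimated fractions sum to approximately one.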

20.
Circular dichroism (CD) spectroscopy is a widely used technique for the evaluation of protein secondary structures and has had a significant impact on the understanding of molecular biology. However, the quantitative analysis of protein secondary structures based on CD spectra remains difficult because the spectra corresponding to different structural motifs overlap severely. Here, the Tchebichef image moment (TM) approach is introduced for the first time; it can effectively extract the chemical features in CD spectra for the quantitative analysis of protein secondary structures. The proposed approach was applied to a reference set, and the results were evaluated with strict statistical parameters such as the correlation coefficient, the cross-validation correlation coefficient, and the root mean squared error. Compared with several specialized prediction methods, the TM approach provided satisfactory results, especially for turns and unordered structures. Our study indicates that the TM approach can be regarded as a feasible tool for the analysis of the secondary structures of proteins based on CD spectra. A TMs package is provided and can be used directly for secondary structure prediction.
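Two of the evaluation statistics named above, the correlation coefficient and the root mean squared error, can be computed as in the sketch below. It compares predicted secondary-structure fractions against reference fractions across a set of proteins. It is independent of the TM method itself, and the function names and fraction values are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def rmse(x, y):
    """Root mean squared error between two equal-length sequences."""
    if len(x) != len(y) or not x:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Made-up helix fractions for four proteins: reference vs. predicted.
reference = [0.10, 0.30, 0.50, 0.70]
predicted = [0.20, 0.30, 0.60, 0.70]
print(pearson_r(reference, predicted), rmse(reference, predicted))
```

A cross-validated correlation coefficient would apply the same formula to predictions made for proteins left out of the fitting step.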
