Similar articles
20 similar articles found
1.
Here we report the development of a new neural-network-based approach for rapid quantification of protein secondary structure from Fourier transform infrared (FTIR) spectra of proteins. A technique for efficiently reducing the amount of spectral data by almost 90% is suggested to facilitate faster neural network analysis. Additionally, an automatic procedure is introduced for selecting only those regions within the amide I band of protein FTIR spectra that can be best related to secondary structure contents by subsequent neural network analysis. Based on a given reference set of FTIR spectra from proteins with known secondary structure, a subset of merely 29 out of 101 amide I absorbance values could be identified, which led to an improved prediction accuracy. The average prediction accuracy achieved for helix, sheet, turn, bend, and other is 4.96%, which is better than that achieved by previously reported alternative methods, indicating the significant potential of this approach. Our suggested automatic amide I frequency selection procedure may be easily extended to identify promising regions from spectral data recorded by other spectroscopic techniques, such as circular dichroism spectroscopy.

2.
An infrared (IR) method to determine the secondary structure of proteins in solution using the amide I region of the spectrum has been devised. The method is based on the circular dichroism (CD) matrix method for secondary structure analysis given by Compton and Johnson (L. A. Compton and W. C. Johnson, 1986, Anal. Biochem. 155, 155-167). The infrared data matrix was constructed from the normalized Fourier transform infrared spectra from 1700 to 1600 cm⁻¹ of 17 commercially available proteins. The secondary structure matrix was constructed from the X-ray data of the 17 proteins with secondary structure elements of helix, beta-sheet, beta-turn, and other (random). The CD and IR methods were compared by analyzing the proteins of the CD and IR databases as unknowns. Both methods produce similar results compared to structures obtained by X-ray crystallographic means, with the CD slightly better for helix conformation and the IR slightly better for beta-sheet. The relatively good IR analysis for concanavalin A and alpha-chymotrypsin indicates that the IR method is less affected by the presence of aromatic groups. The concentration of the protein and the cell path length need not be known for the IR analysis, since the spectra can be normalized to the total IR intensity in the amide I region. The IR spectra for helix, beta-sheet, beta-turn, and other, as extracted from the database, agree with the literature band assignments. The IR data matrix and the inverse matrix necessary to analyze unknown proteins are presented.
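The normalization step mentioned in this abstract can be illustrated with a minimal sketch. It assumes a spectrum is available as a plain list of absorbance values sampled across the amide I region; the function name `normalize_to_total_intensity` and the toy numbers are illustrative and are not taken from the paper.

```python
def normalize_to_total_intensity(absorbances):
    """Scale a spectrum so that its absorbances sum to 1 over the amide I region."""
    total = sum(absorbances)
    if total == 0:
        raise ValueError("spectrum has zero total intensity")
    return [a / total for a in absorbances]


# Hypothetical amide I absorbances for one protein (arbitrary units).
spectrum = [0.02, 0.10, 0.35, 0.60, 0.30, 0.08]

# By the Beer-Lambert law, changing the concentration or the path length
# multiplies every absorbance by the same factor; normalization cancels it.
scaled = [2.5 * a for a in spectrum]

reference = normalize_to_total_intensity(spectrum)
rescaled = normalize_to_total_intensity(scaled)
```

Both normalized spectra agree to within floating-point rounding, which is why the analysis does not need the concentration or the cell path length.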

3.
Cai S, Singh BR. Biochemistry, 2004, 43(9): 2541-2549
Fourier transform infrared spectroscopy is becoming an increasingly important method to study protein secondary structure. The amide I region of the protein infrared spectrum is the most widely used region, whereas the amide III region has been comparatively neglected due to its low signal. Since there is no water interference in the amide III region and, more importantly, the different secondary structures of proteins have more resolved differences in their amide III spectra, it is quite promising to use the amide III region to determine protein secondary structure. In our current study, a partial least squares (PLS) method was used to predict protein secondary structures from the protein IR spectra. The IR spectra of aqueous solutions of 16 different proteins of known crystal structure have been recorded, and the amide I, the amide III, and the amide I combined with the amide III region of these proteins were used to set up the calibration set for the PLS algorithm. Our results correlate quite well with the data from X-ray studies, and the prediction from the amide III region is better than that from amide I or combined amide I and amide III regions.

4.
The secondary structures of two recombinant human growth factors, platelet-derived growth factor and the basic fibroblast growth factor, have been quantitatively examined by using Fourier transform infrared spectroscopy. These studies, carried out in D2O, focus on the conformation-sensitive amide I region. Resolution enhancement techniques, including Fourier self-deconvolution and derivative spectroscopy, were combined with band fitting techniques to quantitate the spectral information from the broad, overlapped amide I band. The results presented here indicate that both proteins are rich in beta-structures. The remainder of the platelet-derived growth factor exists largely as irregular or disordered conformations with a moderate amount of alpha-helix and a small portion of reverse turns. By contrast, the basic fibroblast growth factor is much richer in reverse turn structures and contains a lesser portion of irregularly folded or disordered structures. Based on circular dichroism studies which indicate no alpha-helix in bFGF, components near 1655 cm⁻¹ in the bFGF spectra are tentatively assigned to loops. The results of this study emphasize the need for using a combination of circular dichroism and infrared studies for spectroscopic characterization of protein secondary structure.

5.
Fourier-transform infrared spectroscopy is a method of choice for the experimental determination of protein secondary structure. Numerous approaches have been developed during the past 15 years. A critical parameter that has not been taken into account systematically is the selection of the wavenumbers used for building the mathematical models used for structure prediction. The high quality of the current Fourier-transform infrared spectrometers makes the absorbance at every single wavenumber a valid and almost noiseless type of information. We address here the question of the amount of independent information present in the infrared spectra of proteins for the prediction of the different secondary structure contents. It appears that, at most, the absorbance at three distinct frequencies of the spectra contains all the nonredundant information that can be related to one secondary structure content. The ascending stepwise method proposed here identifies the relevance of each wavenumber of the infrared spectrum for the prediction of a given secondary structure and yields a particularly simple method for computing the secondary structure content. Using the 50-protein database built beforehand to contain as little fold redundancy as possible, the standard error of prediction in cross-validation is 5.5% for the alpha-helix, 6.6% for the beta-sheet, and 3.4% for the beta-turn.
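An ascending stepwise (forward) selection of wavenumbers, capped at three terms as the abstract suggests, might look roughly like the sketch below. This is not the authors' procedure: it ranks candidate wavenumbers by the residual sum of squares of an ordinary least-squares fit on the training data, whereas the abstract reports errors from cross-validation. All function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def residual_ss(spectra, y, cols):
    """Residual sum of squares of the fit y ~ intercept + absorbances at cols."""
    rows = [[1.0] + [s[c] for c in cols] for s in spectra]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((t - sum(c * v for c, v in zip(coef, r))) ** 2
               for r, t in zip(rows, y))


def forward_select(spectra, y, max_terms=3):
    """Greedily add the wavenumber index that most reduces the residual."""
    chosen = []
    for _ in range(max_terms):
        candidates = [c for c in range(len(spectra[0])) if c not in chosen]
        best = min(candidates, key=lambda c: residual_ss(spectra, y, chosen + [c]))
        chosen.append(best)
    return chosen


# Hypothetical data: 6 proteins x 4 wavenumbers; the helix fraction is
# constructed to depend only on the absorbance at index 1.
spectra = [
    [0.1, 0.9, 0.3, 0.5],
    [0.4, 0.2, 0.8, 0.1],
    [0.7, 0.5, 0.2, 0.9],
    [0.2, 0.7, 0.6, 0.3],
    [0.9, 0.1, 0.4, 0.7],
    [0.5, 0.4, 0.9, 0.2],
]
helix = [0.5 * s[1] + 0.1 for s in spectra]
first_pick = forward_select(spectra, helix, max_terms=1)
```

In this toy set the first selected index is the informative one; further terms add nothing, which mirrors the abstract's point that only a few wavenumbers carry nonredundant information. The solver assumes the selected columns are not collinear.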

6.
Three different approaches (propensity curve shifting, hydropathy index evaluation, and iterative attribution/cancellation of secondary structure) to the use of secondary structure percentages derived from circular dichroism measurements to improve the success rate of a protein secondary structure prediction method, without using decision constants, are described and compared. Propensity-curve shifting appears to be the best-performing approach, yielding an increase of 5.3% in the success rate of single-residue structural prediction when exact information on the secondary structure, obtained by X-ray crystallography, is employed; with information of an accuracy comparable to that obtainable by circular dichroism, the improvement stays between 3.5 and 4.9% for a three-state prediction. Although developed with circular dichroism in mind, the method can use percentages of secondary structure obtained by any other experimental methodology from which they can be inferred, for instance Raman spectroscopy and infrared spectroscopy.

7.
A simple approach to estimate the number of alpha-helical and beta-strand segments from protein circular dichroism spectra is described. The alpha-helix and beta-sheet conformations in globular protein structures, assigned by DSSP and STRIDE algorithms, were divided into regular and distorted fractions by considering a certain number of terminal residues in a given alpha-helix or beta-strand segment to be distorted. The resulting secondary structure fractions for 29 reference proteins were used in the analyses of circular dichroism spectra by the SELCON method. From the performance indices of the analyses, we determined that, on average, four residues per alpha-helix and two residues per beta-strand may be considered distorted in proteins. The number of alpha-helical and beta-strand segments and their average length in a given protein were estimated from the fraction of distorted alpha-helix and beta-strand conformations determined from the analysis of circular dichroism spectra. The statistical test for the reference protein set shows the high reliability of such a classification of protein secondary structure. The method was used to analyze the circular dichroism spectra of four additional proteins and the predicted structural characteristics agree with the crystal structure data.
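The arithmetic implied by this abstract can be sketched as follows, assuming four distorted residues per alpha-helix and two per beta-strand as stated. The function name, the argument names, and the example fractions are hypothetical; the sketch only shows how segment counts and average lengths follow from the distorted fractions.

```python
def segment_estimates(n_residues, helix_regular, helix_distorted,
                      strand_regular, strand_distorted,
                      distorted_per_helix=4, distorted_per_strand=2):
    """Estimate segment counts and average lengths from structure fractions.

    Fractions are in the range 0..1 and refer to the whole chain.
    """
    n_helices = helix_distorted * n_residues / distorted_per_helix
    n_strands = strand_distorted * n_residues / distorted_per_strand
    helix_len = ((helix_regular + helix_distorted) * n_residues / n_helices
                 if n_helices else 0.0)
    strand_len = ((strand_regular + strand_distorted) * n_residues / n_strands
                  if n_strands else 0.0)
    return n_helices, helix_len, n_strands, strand_len


# Hypothetical 150-residue protein with made-up fractions from a CD analysis.
n_helices, helix_len, n_strands, strand_len = segment_estimates(
    150, helix_regular=0.24, helix_distorted=0.16,
    strand_regular=0.12, strand_distorted=0.08)
```

For these made-up fractions the estimate is about six helices averaging ten residues and about six strands averaging five residues.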

8.
Circular dichroism spectra of proteins are extremely sensitive to secondary structure. Nevertheless, circular dichroism spectra should not be analyzed for protein secondary structure unless they are measured to at least 184 nm. Even if all the various types of β-turns are lumped together, there are at least 5 different types of secondary structure in a protein (α-helix, antiparallel β-sheet, parallel β-sheet, β-turn, and other structures not included in the first 4 categories). It is not possible to solve for these 5 parameters unless there are 5 equations. Singular value decomposition can be used to show that circular dichroism spectra of proteins measured to 200 nm contain only 2 pieces of information, while spectra measured to 190 nm contain about 4. Adding the constraint that the sum of secondary structures must equal 1 provides another piece of information, but even with this constraint, spectra measured to 190 nm simply do not analyze well for the 5 unknowns in secondary structure. Spectra measured to 184 nm do contain 5 pieces of information and we have used such spectra successfully to analyze a variety of proteins for their component secondary structures.

9.
Fourier transform infrared spectroscopy has become well known as a sensitive and informative tool for studying secondary structure in proteins. Present analysis of the conformation-sensitive amide I region in protein infrared spectra, when combined with band narrowing techniques, provides more information concerning protein secondary structure than can be meaningfully interpreted. This is due in part to limited models for secondary structure. Using the algorithm described in the previous paper of this series, we have generated a library of substructures for several trypsin-like serine proteases. This library was used as a basis for spectra-structure correlations with infrared spectra in the amide I' region, for five homologous proteins for which spectra were collected. Use of the substructure library has allowed correlations not previously possible with template-based methods of protein conformational analysis.

10.
J M Chandonia, M Karplus. Proteins, 1999, 35(3): 293-306
A primary and a secondary neural network are applied to secondary structure and structural class prediction for a database of 681 non-homologous protein chains. A new method of decoding the outputs of the secondary structure prediction network is used to produce an estimate of the probability of finding each type of secondary structure at every position in the sequence. In addition to providing a reliable estimate of the accuracy of the predictions, this method gives a more accurate Q3 (74.6%) than the cutoff method which is commonly used. Use of these predictions in jury methods improves the Q3 to 74.8%, the best available at present. On a database of 126 proteins commonly used for comparison of prediction methods, the jury predictions are 76.6% accurate. An estimate of the overall Q3 for a given sequence is made by averaging the estimated accuracy of the prediction over all residues in the sequence. As an example, the analysis is applied to the target beta-cryptogein, which was a difficult target for ab initio predictions in the CASP2 study; it shows that the prediction made with the present method (62% of residues correct) is close to the expected accuracy (66%) for this protein. The larger database and use of a new network training protocol also improve structural class prediction accuracy to 86%, relative to 80% obtained previously. Secondary structure content is predicted with accuracy comparable to that obtained with spectroscopic methods, such as vibrational or electronic circular dichroism and Fourier transform infrared spectroscopy.
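The two accuracy figures discussed in this abstract can be illustrated with a short sketch: the observed Q3 is the percentage of residues whose three-state label is predicted correctly, and the expected Q3 is the average of the per-residue estimated accuracies. The function names, the H/E/C strings, and the confidence values are hypothetical; how the network derives its per-residue estimates is not reproduced here.

```python
def q3(predicted, observed):
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def expected_q3(per_residue_accuracy):
    """Average the estimated probability of a correct call over all residues."""
    return 100.0 * sum(per_residue_accuracy) / len(per_residue_accuracy)


# Hypothetical eight-residue example with one mismatch at the sixth position.
observed_q3 = q3("HHHEECCC", "HHHEEECC")

# Hypothetical per-residue estimates of the chance that each call is right.
estimated_q3 = expected_q3([0.9, 0.8, 0.7, 0.6])
```

Comparing the two numbers for a single protein is the kind of check the abstract reports for beta-cryptogein (62% observed against 66% expected).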

11.
Vibrational circular dichroism (VCD) spectra have been measured for 23 globular proteins dissolved in H2O/phosphate buffer over the 1400 to 1100 cm⁻¹ region which encompasses the amide III mode. Spectral responses characteristic of the dominant secondary structure type were found as broad features at ~1300 cm⁻¹, with the extreme forms having positive VCD for highly helical proteins and negative VCD for highly sheet-containing proteins. Quantitative correlation with secondary structure was carried out using previously developed factor analysis and restricted multiple regression (FA/RMR) techniques. Since the absorbance intensity of the amide III mode is difficult to determine due to overlap with other transitions, an alternative, absolute intensity-independent, simple structural analysis method was used. A linear regression was developed between the fractional components of secondary structure for the protein set and the overlap integrals of the normalized spectra from the set with that of a selected protein. The results of this simple method are quite comparable to those of the FA/RMR approach for analysis with amide III VCD. On the other hand, test calculations with the new method when used with electronic CD spectra are not as good as FA/RMR due to its more intensity-dependent relationship with secondary structure.
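A rough sketch of the intensity-independent idea in this abstract follows. It assumes spectra are lists of VCD values on a common wavenumber grid and that "normalized" means scaled to unit Euclidean norm, which is an assumption rather than a detail given in the abstract. The function names and numbers are hypothetical.

```python
import math


def unit_normalize(spectrum):
    """Scale a spectrum to unit Euclidean norm so absolute intensity drops out."""
    norm = math.sqrt(sum(v * v for v in spectrum))
    return [v / norm for v in spectrum]


def overlap(spectrum_a, spectrum_b):
    """Discrete overlap integral of two normalized spectra (between -1 and 1)."""
    a = unit_normalize(spectrum_a)
    b = unit_normalize(spectrum_b)
    return sum(x * y for x, y in zip(a, b))


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical VCD spectra; the first one plays the role of the selected protein.
selected = [0.1, 0.4, 0.9, 0.3, -0.2]
same_shape_weaker = [0.5 * v for v in selected]
mirror_image = [-v for v in selected]
```

With a set of reference proteins, `fit_line` would be applied to the overlap values and the known structure fractions, and the fitted line would then convert the overlap of an unknown protein into a fraction estimate.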

12.

Background  

Circular dichroism spectroscopy is a widely used technique to analyze the secondary structure of proteins in solution. Predictive methods use the circular dichroism spectra from proteins of known tertiary structure to assess the secondary structure contents of a protein with unknown structure given its circular dichroism spectrum.

13.
The infrared amide bands are sensitive to the conformation of the polypeptide backbone of proteins. Since the backbone of proteins folds in complex spatial arrangements, the amide bands of these proteins result from the superimposition of vibration modes corresponding to the different types of structural motifs (alpha helices, beta sheets, etc.). Initially, band deconvolution techniques were applied to determine the secondary structure of proteins, i.e., the abundance of each structural motif in the polypeptide chain was directly related to the area of the suitable deconvolved vibration modes under the amide I band (1700-1600 cm⁻¹). Recently, several multivariate regression methods have been used to predict the secondary structure of proteins as an alternative to the previous methods. They are based on establishing a relationship between a matrix of infrared protein spectra and another that includes their secondary structure, expressed as the fractions of the different structural motifs, determined from X-ray analysis. In this study, we investigated the use of the local regression method interval partial least-squares (iPLS) to seek improvements to the full-spectrum PLS and other regression methods. The local character of iPLS avoids the use of spectral regions that can introduce noise or that can be irrelevant for prediction and focuses on finding specific spectral ranges related to each secondary structure motif in all the proteins. This study has been applied to a representative protein data set with infrared spectra covering a large wavenumber range, including amides I-III bands (1700-1200 cm⁻¹). iPLS has revealed new structural mode assignments related to less explored amide bands and has offered a satisfactory predictive ability using a small amount of selected specific spectral information.

14.
Hering JA, Innocent PR, Haris PI. Proteomics, 2003, 3(8): 1464-1475
Fourier transform infrared (FTIR) spectroscopy is a very flexible technique for characterization of protein secondary structure. Measurements can be carried out rapidly in a number of different environments based on only small quantities of proteins. For this technique to become more widely used for protein secondary structure characterization, however, further developments in methods to accurately quantify protein secondary structure are necessary. Here we propose a structural classification of proteins (SCOP) class specialized neural networks architecture combining an adaptive neuro-fuzzy inference system (ANFIS) with SCOP class specialized backpropagation neural networks for improved protein secondary structure prediction. Our study shows that proteins can be accurately classified into two main classes "all alpha proteins" and "all beta proteins" merely based on the amide I band maximum position of their FTIR spectra. ANFIS is employed to perform the classification task to demonstrate the potential of this architecture with moderately complex problems. Based on studies using a reference set of 17 proteins and an evaluation set of 4 proteins, improved predictions were achieved compared to a conventional neural network approach, where structure specialized neural networks are trained based on protein spectra of both "all alpha" and "all beta" proteins. The standard errors of prediction (SEPs) in % structure were improved by 4.05% for helix structure, by 5.91% for sheet structure, by 2.68% for turn structure, and by 2.15% for bend structure. For other structure, an increase of SEP by 2.43% was observed. Those results were confirmed by a "leave-one-out" run with the combined set of 21 FTIR spectra of proteins.

15.
Protein classification and characterization often rely on the information contained in the protein secondary structure. Protein class assignment is usually based on X-ray diffraction measurements, which need the protein in a crystallized form, or on NMR spectra, to obtain the structure of a protein in solution. Simple spectroscopic techniques, such as circular dichroism (CD) and infrared (IR) spectroscopies, are also known to be related to protein secondary structure, but they have seldom been used for protein classification. To see the potential of CD, IR, and combined CD/IR measurements for protein classification, unsupervised pattern recognition methods, Principal Component Analysis (PCA) and cluster analysis, are proposed first to check for natural grouping tendencies of proteins according to their measured spectra. Partial Least Squares Discriminant Analysis (PLS-DA), a supervised pattern recognition method, is used afterwards to test whether each protein class can be modeled explicitly and to test these models in class assignment of unknown proteins. Determination of the protein secondary structure, understood as the prediction of the abundance of the different secondary structure motifs in the biomolecule, was carried out with the local regression method interval Partial Least Squares (iPLS). CD, IR, and CD/IR measurements were correlated to the fraction of the motif to be predicted, determined from X-ray measurements. iPLS builds models extracting the spectral information most correlated to a specific secondary motif and avoids the use of irrelevant spectral regions. Spectral intervals chosen by iPLS models provide structural information which can be used to confirm previous biochemical assignments or identify new motif-related spectral features. The predictive ability of the models built with the selected spectral regions has a quality similar to previous classical approaches.

16.
Comparative studies of the secondary structures of six model proteins, adsorbed onto aluminum hydroxide gel (Alhydrogel) or in aqueous solution, were carried out by Fourier transform infrared (FTIR) spectroscopy. The analysis of high-quality spectra of all six model proteins, with a broad range of secondary structure compositions, obtained at 15 mg/ml by the conventional method and at 0.5 and 1.0 mg/ml adsorbed to Alhydrogel revealed that adsorption onto hydrophilic surfaces of aluminum hydroxide particles did not alter the secondary structures of the proteins. The results of this study suggest that adsorbing proteins to Alhydrogel provides a means of obtaining FTIR spectra to study secondary structure and conformational changes of proteins in aqueous solution at very low concentrations. The new procedure effectively lowers the concentration requirement for FTIR studies of proteins in aqueous solutions by at least 40-fold, as compared with the conventional FTIR method. It permits FTIR study of proteins to be carried out in the same concentration range as is used for circular dichroism and fluorescence, thereby making it possible to compare structural information obtained by three commonly used techniques in protein biophysical characterization.

17.
The IR absorption frequencies as derived from second derivatives of the Fourier transform IR spectra of the amide I' bands of globular proteins in D2O are compared to those obtained from band fitting of the vibrational circular dichroism (VCD) spectra. The two sets of frequencies are in very good agreement, yielding consistent ranges where amide I' VCD and IR features occur. Use of VCD to complement the IR allows one to add sign information to the frequency information so that features occurring in the overlapping frequency ranges that might arise from different secondary structures can be better discriminated. From this comparison, it is clear that correlation just of the frequency of a given IR transition to secondary structure can lead to a nonunique solution. Different sign patterns were identified for correlated groups of globular proteins in restricted frequency ranges that have been previously assigned to defined secondary structural elements. Hence, different secondary structural elements must contribute band components to a given frequency range.

18.
The effectiveness of infrared spectroscopy in the amide I region and of UV circular dichroism for analyzing protein secondary structure is compared, using the linker histone H1 and bovine serum albumin (BSA) as examples. It has been shown that the use of a diamond ATR cell gives quantitative estimates of the fractions of α-helices and β-structures that are in good agreement with UV circular dichroism spectroscopy. It has also been shown that histone H1 is able to aggregate, which results in considerable changes in its secondary structure.

19.
This article presents SOMCD, an improved method for the evaluation of protein secondary structure from circular dichroism spectra, based on Kohonen's self-organizing maps (SOM). Protein circular dichroism (CD) spectra are used to train a SOM, which arranges the spectra on a two-dimensional map. Location in the map reflects the secondary structure composition of a protein. With SOMCD, the prediction of beta-turn has been included. The number of spectra in the training set has been increased, and it now includes 39 protein spectra and 6 reference spectra. Finally, SOM parameters have been chosen to minimize distortion and make the network produce clusters with known properties. Estimation results show improvements compared with the previous version, K2D, which, in addition, estimated only three secondary structure components; the accuracy of the method is more uniform over the different secondary structures.

20.
This paper demonstrates that secondary structure information beyond purely protein secondary structure content can be predicted from FTIR (Fourier transform infrared spectroscopy) spectra of proteins with a high degree of accuracy. Both neural networks and adaptive neuro-fuzzy inference systems (ANFISs) were employed to predict helix/sheet segment information. The best results were achieved using ANFISs with fuzzy subtractive clustering based on normalised, compressed amide I data with an average SEP (standard error of prediction, root mean of squared errors) of 1.51. Predictions for average helix/sheet length based merely on the amide I band maximum position in combination with the full-width at half-height resulted in a comparable average SEP of 1.62. This suggests the importance of information on the position and width of the amide I band maximum for the prediction of helix/sheet segment information. Finally, the most promising pattern recognition approaches found in this study were applied to a protein with an as yet unknown X-ray structure: native α1-antichymotrypsin (α1-ACT).
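The SEP quoted in this abstract is described as the root mean of squared errors, which can be written as a short sketch. The function name `sep` and the sample values are illustrative only; the abstract does not say whether any bias correction or degrees-of-freedom adjustment was applied, so none is assumed here.

```python
import math


def sep(predicted, reference):
    """Standard error of prediction as the root mean of squared errors."""
    if len(predicted) != len(reference):
        raise ValueError("inputs must have the same length")
    squared = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical predicted and reference average helix lengths for four proteins.
predicted_lengths = [10.0, 12.5, 8.0, 14.0]
reference_lengths = [11.0, 12.0, 9.5, 13.0]
error = sep(predicted_lengths, reference_lengths)
```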
