Similar articles
Found 20 similar articles (search time: 46 ms)
1.
A major problem with existing algorithms for predicting protein structural classes is low accuracy for proteins from the α/β and α+β classes. In this study, three novel features were rationally designed to model the differences between proteins from these two classes. In combination with other rationally designed features, an 11-dimensional vector prediction method was proposed. With this method, the overall prediction accuracy on the 25PDB dataset was 1.5% higher than that of the previous best-performing method, MODAS. Furthermore, the prediction accuracy for proteins from the α+β class on the 25PDB dataset was 5% higher than that of the previous best-performing method, SCPRED. The prediction accuracies obtained with the D675 and FC699 datasets were also improved.

2.
3.
We probe the stability and near-native energy landscape of protein fold space using powerful conformational sampling methods together with simple reduced models and statistical potentials. Fold space is represented by a set of 280 protein domains spanning all topological classes and having a wide range of lengths (33-300 residues), amino acid compositions and numbers of secondary structural elements. The degrees of freedom are taken as the loop torsion angles. This choice preserves the native secondary structure but allows the tertiary structure to change. The proteins are represented by three-point-per-residue, three-dimensional models with statistical potentials derived from a knowledge-based study of known protein structures. When this space is sampled by a combination of parallel tempering and equi-energy Monte Carlo, we find that the three-point model captures the known stability of protein native structures with stable energy basins that are near-native (all α: 4.77 Å, all β: 2.93 Å, α/β: 3.09 Å, α+β: 4.89 Å on average, and within 6 Å for 71.41%, 92.85%, 94.29% and 64.28% of the all-α, all-β, α/β and α+β classes, respectively). Denatured structures also occur, and these have interesting structural properties that shed light on the different landscape characteristics of α and β folds. We find that α/β proteins with alternating α and β segments (such as the β-barrel) are more stable than proteins in other fold classes.
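
The abstract above names parallel tempering but does not state its exchange rule. As a minimal sketch, assuming the textbook replica-exchange criterion (not necessarily the authors' exact implementation), two replicas at temperatures `temp_i` and `temp_j` swap configurations with probability min(1, exp[(β_i − β_j)(E_i − E_j)]). The function name, the argument names and the unit Boltzmann constant are illustrative assumptions.

```python
import math


def swap_probability(energy_i, energy_j, temp_i, temp_j, k_b=1.0):
    """Acceptance probability for exchanging two replicas (textbook rule).

    Assumption: k_b = 1.0, i.e. energies are expressed in units of k_B * T.
    """
    beta_i = 1.0 / (k_b * temp_i)
    beta_j = 1.0 / (k_b * temp_j)
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    # A non-negative delta means the swap moves the lower energy to the
    # colder replica, so it is always accepted.
    return 1.0 if delta >= 0 else math.exp(delta)
```

A swap that would move the higher-energy configuration to the colder replica is accepted only with the exponentially small probability returned by the second branch.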

4.
Ding S  Zhang S  Li Y  Wang T 《Biochimie》2012,94(5):1166-1171
Knowledge of structural classes plays an important role in understanding protein folding patterns. In this paper, features based on the predicted secondary structure sequence and the corresponding E–H sequence are extracted. Then, an 11-dimensional feature vector is selected based on a wrapper feature selection algorithm and a support vector machine (SVM). Among the 11 selected features, 4 novel features are newly designed to model the differences between the α/β class and the α + β class, and the other 7 rational features were proposed by previous researchers. A total of 5 datasets are used to design and test the proposed method. The results show that the proposed method achieves prediction accuracies competitive with existing methods (SCPRED, RKS-PPSC and MODAS), and the 4 new features are shown to be essential for differentiating the α/β and α + β classes. A standalone version of the proposed method is written in Java and can be downloaded from http://web.xidian.edu.cn/slzhang/paper.html.
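
The abstract above does not define its features, so the following is only an illustrative sketch of the kind of quantity that can be read off a predicted secondary structure string. It assumes a three-letter alphabet (H = helix, E = strand, C = coil) and builds a reduced E–H segment sequence by collapsing each run to one letter and dropping coil runs. The function name, the `switch_rate` feature and the reduction rule are hypothetical, not taken from the paper.

```python
from itertools import groupby


def ss_features(ss):
    """Toy features from a predicted secondary structure string (H/E/C)."""
    if not ss:
        raise ValueError("empty secondary structure string")
    n = len(ss)
    # One letter per helix or strand run; coil runs are dropped (assumption).
    seg_seq = "".join(state for state, _ in groupby(ss) if state in "HE")
    # Count how often consecutive segments change type (E->H or H->E).
    switches = sum(1 for a, b in zip(seg_seq, seg_seq[1:]) if a != b)
    return {
        "helix_content": ss.count("H") / n,
        "strand_content": ss.count("E") / n,
        "segment_sequence": seg_seq,
        "switch_rate": switches / max(len(seg_seq) - 1, 1),
    }
```

A string with interleaved helices and strands gives a high `switch_rate`, while one with segregated helix and strand regions gives a low value, which is the sort of contrast a feature separating α/β from α + β would need to capture.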

5.
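β-turns, as a type of protein secondary structure, play important roles in protein folding, protein stability and molecular recognition. Existing β-turn prediction methods do not map the structural information of homologous sequences already present in structure databases such as the PDB onto the protein sequence to be predicted. The PDB now holds more than 70,000 structures, so for a newly determined sequence there is a good chance of finding homologous sequences in the PDB. This paper combines homologous structural information extracted from the PDB (for each query sequence, using only homologous information deposited in the PDB before that sequence) with NetTurnP predictions, and proposes a new β-turn prediction method, BTMapping. On the classic BT426 dataset and on the EVA937 dataset constructed in this work, the prediction accuracy measured by the Matthews correlation coefficient is 0.56 and 0.52, respectively, compared with 0.50 and 0.46 for NetTurnP alone; measured by Qtotal, it is 81.4% and 80.4%, respectively, compared with 78.2% and 77.3% for NetTurnP alone. The results confirm that combining homologous structural information with an advanced β-turn predictor such as NetTurnP helps improve β-turn identification. The BTMapping program and related datasets are available from http://www.bio530.weebly.com.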
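
The two scores reported above can be computed from per-residue labels as sketched below. The encoding ("T" for a β-turn residue, any other character for a non-turn residue) and the function names are assumptions made for illustration; the formulas are the standard definitions of the Matthews correlation coefficient and of overall accuracy (Qtotal).

```python
import math


def confusion_counts(observed, predicted, positive="T"):
    """Return (tp, tn, fp, fn) for two equal-length label strings."""
    if len(observed) != len(predicted):
        raise ValueError("label strings must have equal length")
    tp = tn = fp = fn = 0
    for obs, pred in zip(observed, predicted):
        if obs == positive and pred == positive:
            tp += 1
        elif obs != positive and pred != positive:
            tn += 1
        elif pred == positive:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def q_total(tp, tn, fp, fn):
    """Fraction of residues predicted correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

Because β-turn residues are a minority class, Qtotal can look high even for a weak predictor, which is why the abstract reports the Matthews correlation coefficient alongside it.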

6.
Li ZC  Zhou XB  Lin YR  Zou XY 《Amino acids》2008,35(3):581-590
Structural class characterizes the overall folding type of a protein or its domain. Most existing methods for determining the structural class of a protein are based on a group of features that carries only one kind of discriminative information. Other types of discriminative information associated with the primary sequence are therefore missed, which reduces the success rate of prediction. We present a novel method for the prediction of protein structural class that couples an improved genetic algorithm (GA) with a support vector machine (SVM). The improved GA was applied to the selection of an optimized feature subset and to the optimization of the SVM parameters. Jackknife tests on the working datasets indicated that the prediction accuracies for the different classes were in the range of 97.8–100%, with an overall accuracy of 99.5%. The results indicate that the approach has high potential to become a useful tool in bioinformatics.
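
The jackknife test mentioned above is a leave-one-out procedure: each protein is held out in turn, the classifier is built from the remaining proteins, and the held-out protein is then predicted. A minimal sketch follows. The nearest-centroid classifier is a deliberately simple stand-in chosen for illustration, not the GA-tuned SVM of the abstract, and all names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, query):
    """Predict the label whose class centroid is closest to the query."""
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        acc = sums.setdefault(y, [0.0] * len(x))
        for k, value in enumerate(x):
            acc[k] += value
        counts[y] = counts.get(y, 0) + 1

    def squared_distance(label):
        return sum((s / counts[label] - q) ** 2
                   for s, q in zip(sums[label], query))

    return min(sums, key=squared_distance)


def jackknife_accuracy(features, labels, predict=nearest_centroid_predict):
    """Leave-one-out accuracy; features and labels are parallel lists."""
    hits = 0
    for i in range(len(features)):
        # Train on everything except sample i, then predict sample i.
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            hits += 1
    return hits / len(features)
```

Any function with the same `(train_x, train_y, query)` signature can be passed as `predict`, so the same loop would work with a different classifier.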

7.
Astacus leptodactylus is a decapod crustacean fully adapted to freshwater where it spends its entire life cycle after hatching under huge osmoconcentration differences between the hemolymph and surrounding freshwater. We investigated the expression of mRNA encoding one ion transport-related protein, Na+/K+-ATPase α-subunit, and one putative housekeeping gene, β-actin, during crayfish ontogenesis using quantitative real-time PCR. A 216-amino acid part of the open reading frame region of the cDNA coding for the Na+/K+-ATPase α-subunit was sequenced from total embryo, juvenile and adult gill tissues. The predicted amino acid sequence showed a high percentage similarity to those of other invertebrates (up to 95%) and vertebrates (up to 69%). β-actin expression exhibited modest changes through embryonic development and early post-embryonic stage. The Na+/K+-ATPase α-subunit gene was expressed in all studied stages from metanauplius to juvenile. Two peaks of expression were observed: one in young embryos at 25% of embryonic development (EI = 100 μm), and one in embryos just before hatching (at EI = 420 μm), continuing in the freshly hatched juveniles. The Na+/K+-ATPase expression profile during embryonic development is time-correlated with the occurrence of other features, including ontogenesis of excretory antennal glands and differentiation of gill ionocytes linked to hyperosmoregulation processes and therefore involved in freshwater adaptation.

8.
In this paper, we aim to predict protein structural classes (α, β, α+β, or α/β) for low-homology data sets. Two widely used data sets were employed, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence homology of 40% and 25%, respectively. We propose to decompose the chaos game representation of proteins into two kinds of time series. Then, a novel and powerful nonlinear analysis technique, recurrence quantification analysis (RQA), is applied to analyze these time series. For a given protein sequence, a total of 16 characteristic parameters can be calculated with RQA, and these are treated as the feature representation of the sequence. Based on this feature representation, the structural class of each protein is predicted with Fisher's linear discriminant algorithm. The jackknife test is used to test our method and compare it with other existing methods. The overall accuracies with the step-by-step procedure are 65.8% and 64.2% for the 1189 and 25PDB data sets, respectively. With the widely used one-against-others procedure, we compare our method with five other existing methods; the overall accuracies of our method are 6.3% and 4.1% higher for the two data sets, respectively. Furthermore, only 16 parameters are used in our method, fewer than in other methods. This suggests that the current method may play a complementary role to existing methods and is promising for the prediction of protein structural classes.
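
The abstract above does not list its 16 RQA parameters or say how the time series are embedded. As a hedged illustration of what a recurrence-based measure looks like, the sketch below computes the recurrence rate of a scalar time series: the fraction of point pairs (i, j) whose values lie within a threshold `epsilon` of each other. The lack of embedding and the inclusion of the main diagonal (i = j) are simplifying assumptions, and the function name is hypothetical.

```python
def recurrence_rate(series, epsilon):
    """Fraction of index pairs (i, j) with |x_i - x_j| <= epsilon.

    Assumptions: no time-delay embedding, and the trivial pairs i == j
    are counted; some RQA conventions exclude them.
    """
    n = len(series)
    if n == 0:
        raise ValueError("empty time series")
    recurrent = sum(
        1
        for i in range(n)
        for j in range(n)
        if abs(series[i] - series[j]) <= epsilon
    )
    return recurrent / (n * n)
```

This double loop is quadratic in the series length, which is acceptable for single protein sequences of a few hundred residues.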

9.
Knowledge of structural class plays an important role in understanding protein folding patterns. In this study, a simple and powerful computational method, which combines a support vector machine with PSI-BLAST profiles, is proposed to predict protein structural class for low-similarity sequences. The evolutionary information encoded in the PSI-BLAST profiles is converted into a series of fixed-length feature vectors by extracting amino acid composition and dipeptide composition from the profiles. The resulting vectors are then fed to a support vector machine classifier for the prediction of protein structural class. To evaluate the performance of the proposed method, jackknife cross-validation tests are performed on two widely used benchmark datasets, 1189 (containing 1092 proteins) and 25PDB (containing 1673 proteins), with sequence similarity lower than 40% and 25%, respectively. The overall accuracies reach 70.7% and 72.9% for the 1189 and 25PDB datasets, respectively. Comparison with other methods shows that our method is very promising for predicting protein structural class, particularly for low-similarity datasets, and may at least play an important complementary role to existing methods.
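
To show how a variable-length protein becomes a fixed-length vector, the sketch below computes amino acid composition (20 values) and dipeptide composition (400 values). Note the simplifying assumption: it works on the raw sequence, whereas the abstract extracts the analogous quantities from PSI-BLAST profiles, a step not reproduced here. The function and constant names are illustrative.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_features(sequence):
    """Return 20 amino acid frequencies followed by 400 dipeptide frequencies."""
    # Non-standard residue codes are skipped (assumption), which makes their
    # neighbours adjacent for the purpose of dipeptide counting.
    seq = [c for c in sequence.upper() if c in AMINO_ACIDS]
    n = len(seq)
    if n < 2:
        raise ValueError("need at least two standard residues")
    aac = [seq.count(a) / n for a in AMINO_ACIDS]
    pair_counts = {}
    for a, b in zip(seq, seq[1:]):
        pair_counts[a + b] = pair_counts.get(a + b, 0) + 1
    dpc = [pair_counts.get(a + b, 0) / (n - 1)
           for a, b in product(AMINO_ACIDS, repeat=2)]
    return aac + dpc
```

Every sequence maps to the same 420 dimensions regardless of its length, which is what allows a classifier such as an SVM to be trained on it.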

10.
A mushroom lectin has been purified from the ascomycete Cordyceps militaris, which is one of the most popular mushrooms in eastern Asia used as a nutraceutical and in traditional Chinese medicine. This lectin, designated CML, exhibited hemagglutination activity in mouse and rat erythrocytes, but not in human ABO erythrocytes. SDS-PAGE of CML revealed a single band with a molecular mass of 31.0 kDa under both nonreducing and reducing conditions that was stained by silver nitrate, and a 31.4 kDa peak in a Superdex-200 HR gel-filtration column. The hemagglutination activity was inhibited by sialoglycoproteins, but not by mono- or disaccharides, asialoglycoproteins, or de-O-acetylated glycoprotein. The activity was maximal at pH 6.0–9.1 and at temperatures below 50 °C. Circular dichroism spectrum analysis revealed that CML comprises 27% α-helix, 12% β-sheets, 29% β-turns, and 32% random coils. Its binding specificity and secondary structure are similar to those of a fungal lectin from Arthrobotrys oligospora. However, the N-terminal amino acid sequence of CML differs greatly from those of other lectins. CML exhibits mitogenic activity against mouse splenocytes.

11.
Protein trafficking, or protein sorting, in eukaryotes is a complicated process that is carried out based on the information contained in the protein. Many methods have been reported for predicting the subcellular location of proteins from sequence information. However, most of these prediction methods use a flat structure or parallel architecture to perform prediction. In this work, we introduce ensemble classifiers with features that are extracted directly from full-length protein sequences to predict locations in the protein-sorting pathway hierarchically. Sequence-driven features, sequence-mapped features and sequence autocorrelation features were tested with ensemble learners, and their performances were compared. When evaluated by independent data testing, ensemble-based bagging algorithms with the sequence feature composition, transition and distribution (CTD) successfully classified two datasets with accuracies greater than 90%. We compared our results with similar published methods, and our method performed on par with the others at two levels in the secretory pathway. This study shows that the CTD feature extracted from protein sequences is effective in capturing biological features among compartments in secretory pathways.

12.
Zhang S  Ding S  Wang T 《Biochimie》2011,93(4):710-714
Information on the structural classes of proteins has been proven to be important in many fields of bioinformatics. Prediction of protein structural class for low-similarity sequences is a challenging problem. In this study, 11 features (including 8 re-used features and 3 newly designed features) are rationally utilized to reflect the general contents and spatial arrangements of the secondary structural elements of a given protein sequence. To evaluate the performance of the proposed method, jackknife cross-validation tests are performed on two widely used benchmark datasets, 1189 and 25PDB, with sequence similarity lower than 40% and 25%, respectively. Comparison of our results with other methods shows that the proposed method is very promising and may provide a cost-effective alternative for predicting protein structural class, in particular for low-similarity datasets.

13.
MOTIVATION: The solvent accessibility of amino acid residues plays an important role in tertiary structure prediction, especially in the absence of significant sequence similarity of a query protein to those with known structures. The prediction of solvent accessibility is less accurate than secondary structure prediction in spite of improvements in recent research. The k-nearest neighbor method, a simple but powerful classification algorithm, has never been applied to the prediction of solvent accessibility, although it has been used frequently for the classification of biological and medical data. RESULTS: We applied the fuzzy k-nearest neighbor method to solvent accessibility prediction, using PSI-BLAST profiles as feature vectors, and achieved high prediction accuracies. With leave-one-out cross-validation on the ASTRAL SCOP reference dataset constructed by sequence clustering, our method achieved 64.1% accuracy for a 3-state (buried/intermediate/exposed) prediction (thresholds of 9% for buried/intermediate and 36% for intermediate/exposed) and 86.7, 82.0, 79.0 and 78.5% accuracies for 2-state (buried/exposed) predictions (buried/exposed thresholds of 0, 5, 16 and 25%, respectively). Our method also showed slightly better accuracies than other methods, by about 2-5%, on the RS126 dataset and a benchmarking dataset with 229 proteins. AVAILABILITY: Program and datasets are available at http://biocom1.ssu.ac.kr/FKNNacc/ CONTACT: jul@ssu.ac.kr.
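
The state definitions quoted above can be written down directly. The sketch below maps a relative solvent accessibility value (in percent) to the 3-state and 2-state labels using the thresholds from the abstract. How a value exactly on a threshold is assigned is not stated in the abstract; treating it as the more buried state is an assumption, as are the function names.

```python
TWO_STATE_THRESHOLDS = (0.0, 5.0, 16.0, 25.0)  # percent, from the abstract


def three_state(rsa_percent, low=9.0, high=36.0):
    """Label a residue as buried, intermediate or exposed."""
    # Assumption: values equal to a threshold fall into the more buried state.
    if rsa_percent <= low:
        return "buried"
    if rsa_percent <= high:
        return "intermediate"
    return "exposed"


def two_state(rsa_percent, threshold):
    """Label a residue as buried or exposed for one threshold."""
    return "buried" if rsa_percent <= threshold else "exposed"
```

With a threshold of 0%, only residues with no accessible surface at all count as buried, which makes that 2-state problem the most imbalanced of the four.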

14.
Insect pests pose a significant and increasing threat to agricultural production worldwide. However, most existing recognition methods are built upon well-known convolutional neural networks, which limits the possibility of improving pest recognition accuracies. This research attempts to overcome this challenge from a novel perspective, constructing a simplified but very useful network for effective insect pest recognition by combining transformer architecture and convolution blocks. First, the representative features are extracted from the input image using a backbone convolutional neural network. Second, a new transformer attention-based classification head is proposed to sufficiently utilize spatial data from the features. With that, we explore different combinations for each module in our model and abstract our model into a simple and scalable architecture; we introduce more effective training strategies, pretrained models and data augmentation methods. Our model's performance was evaluated on the IP102 benchmark dataset and achieved classification accuracies of 74.897% and 75.583% with minimal implementation costs at image resolutions of 224 × 224 pixels and 480 × 480 pixels, respectively. Our model also attains accuracies of 99.472% and 97.935% on the D0 dataset and Li's dataset, respectively, with an image resolution of 224 × 224 pixels. The experimental results demonstrate that our method is superior to the state-of-the-art methods on these datasets. Accordingly, the proposed model can be deployed in practice and provides additional insights into the related research.

15.
The Aβ(1–42) peptide of Alzheimer's disease was studied by molecular modeling. The coordinates of the peptide were experimentally generated from solution-NMR spectroscopy, and the conformations were energy minimized using a combination of connectivity-based iterative partial equalization of orbital electronegativity with the MM+ force field.

There is a central folded domain in the Aβ peptide. This part is an apolar α-helix. The remaining residues form β-sheets. Aggregation requires that β-sheets interact by noncovalent bonding forces. The insoluble, aggregated complexes are energetically stable and have ordered structures.

A perspective in drug research is to design compounds that inhibit the hydrophobic cores of the individual Aβ peptides, thereby blocking the associations between the β-strands.

16.
《Theriogenology》2015,84(9):1469-1476
The pituitary LHβ and placental CGβ subunits are products of different genes in primates. The major structural difference between the two subunits is in the carboxy-terminal region, where the short carboxyl sequence of hLHβ is replaced by a longer O-glycosylated carboxy-terminal peptide in hCGβ. In association with this structural deviation, there are marked differences in the secretion kinetics and polarized routing of the two subunits. In equids, however, the CGβ and LHβ subunits are products of the same gene expressed in the placenta and pituitary (LHβ), and both contain a carboxy-terminal peptide. This unusual expression pattern intrigued us and led to our study of eLHβ subunit secretion by transfected Chinese hamster ovary and Madin–Darby canine kidney cells. In continuous labeling and pulse-chase experiments, the secretion of the eLHβ subunit from the transfected Chinese hamster ovary cells was inefficient (medium recovery of 16%–25%) and slow (t1/2 > 6.5 hours). This indicated that the secretion of the eLHβ subunit resembles that of hLHβ rather than hCGβ. In Madin–Darby canine kidney cells grown on Transwell filters, the eLHβ subunit was preferentially secreted from the apical side, similar to the hCGβ subunit secretory route (∼65% of the total protein secreted). Taken together, these data suggested that secretion of the eLHβ subunit integrates features of both hLHβ and hCGβ subunits. We propose that the evolution of this intracellular behavior may fulfill the physiological demands for biosynthesis of the LH and CG β-subunits in the pituitary and placenta, respectively.

17.

Background  

A number of sequence-based methods exist for protein secondary structure prediction. Protein secondary structures can also be determined experimentally from circular dichroism and infrared spectroscopic data using empirical analysis methods. It has been proposed that comparable accuracy can be obtained from sequence-based predictions as from these biophysical measurements. Here we have compared the secondary structure determination accuracies of sequence prediction methods with the empirically determined values from the spectroscopic data, on datasets of proteins for which both crystal structures and spectroscopic data are available.

18.
Subcellular location is an important functional annotation of proteins. An automatic, reliable and efficient prediction system for protein subcellular localization is necessary for large-scale genome analysis. This paper describes a protein subcellular localization method which extracts features from protein profiles rather than from amino acid sequences. The protein profile represents a protein family, discards the part of the sequence information that is not conserved throughout the family, and is therefore more sensitive than the amino acid sequence. The amino acid compositions of the whole profile and of the N-terminus of the profile are extracted to train and test the probabilistic neural network classifiers. On two benchmark datasets, the overall accuracies of the proposed method reach 89.1% and 68.9%, respectively. The prediction results show that the proposed method performs better than methods based on amino acid sequences. The prediction results of the proposed method are also compared with SubLoc on two redundancy-reduced datasets.

19.
The solid state secondary structure of myoglobin, RNase A, concanavalin A (Con A), poly(L-lysine), and two linear heterooligomeric peptides was examined by both far-UV CD spectroscopy and IR spectroscopy. The proteins associated from water solution on glass and mica surfaces into noncrystalline, amorphous films, as judged by transmission electron microscopy of carbon-platinum replicas of the surface and cross-fractured layers. The association into the solid state induced insignificant changes in the amide CD spectra of the all-α-helical myoglobin, decreased the molar ellipticity of the α/β RNase A, and increased the molar ellipticity of the all-β Con A, with no change in the positions of the bands' maxima. High-temperature exposure of the films induced permanent changes in the conformation of all proteins, resulting in less α-helix and more β-sheet structure. The results suggest that the protein α-helices are less stable in films and that the secondary structure may rearrange into β-sheets at high temperature. The two heterooligomeric peptides and poly(L-lysine), all in solution at neutral pH with “random coil” conformation, formed films with variable degrees of their secondary structure in β-sheets or β-turns. The result corresponded to the protein-derived Chou-Fasman amino acid propensities, and depended on both the temperature and the solvent used. The IR and CD spectra correlations of the peptides in the solid state indicate that the CD spectrum of a “random” structure in films differs from that of a random coil in solution. Formic acid treatment transformed the secondary structure of the protein and peptide films into stable α-helix or β-sheet conformations. The results indicate that the proteins aggregate into a noncrystalline, glass-like state with preserved secondary structure. The solid state secondary structure may undergo further irreversible transformations induced by heat or solvent. © 1993 John Wiley & Sons, Inc.

20.
In this paper, support vector machines (SVMs) are applied to predict nucleic-acid-binding proteins. We constructed two classifiers to differentiate DNA/RNA-binding proteins from non-nucleic-acid-binding proteins by using a conjoint triad feature, which extracts information directly from the amino acid sequence of a protein. Both self-consistency and jackknife tests show promising results on protein datasets in which the sequence identity is less than 25%. In the self-consistency test, the predictive accuracy is 90.37% for DNA-binding proteins and 89.70% for RNA-binding proteins. In the jackknife test, the predictive accuracies are 78.93% and 76.75%, respectively. Comparison results show that our method is very competitive, outperforming other previously published sequence-based prediction methods.
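
The abstract above does not spell out the conjoint triad encoding, so the sketch below follows the commonly used formulation: the 20 amino acids are grouped into 7 classes, and the frequencies of all 7 × 7 × 7 = 343 class triplets over consecutive residues form the feature vector. The particular grouping and the plain frequency normalisation used here are assumptions and may differ from what the authors used.

```python
# Seven residue classes often used with the conjoint triad encoding
# (assumption: the abstract does not specify the grouping).
CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
CLASS_OF = {aa: idx for idx, group in enumerate(CLASSES) for aa in group}


def conjoint_triad(sequence):
    """Return a 343-dimensional vector of class-triad frequencies."""
    # Residues outside the 20 standard amino acids are skipped (assumption).
    codes = [CLASS_OF[aa] for aa in sequence.upper() if aa in CLASS_OF]
    counts = [0] * 343
    for a, b, c in zip(codes, codes[1:], codes[2:]):
        counts[a * 49 + b * 7 + c] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * 343
    return [count / total for count in counts]
```

Grouping residues by class keeps the vector at 343 dimensions instead of the 8000 that raw tripeptide counts would need, which matters when the training set has only a few hundred proteins.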
