Similar articles
 Found 20 similar articles; search took 15 ms
1.
2.
3.
Developing automated methods for determining protein structures quickly and accurately is a critical challenge because of the widening gap between the number of sequence-known proteins and that of structure-known proteins in the post-genomic age. Knowledge of a protein's structural class can provide useful information towards the determination of its structure. Thus, it is highly desirable to develop computational methods for identifying the structural classes of newly found proteins based on their primary sequence. In this study, according to the concept of Chou's pseudo amino acid composition (PseAA), eight PseAA vectors are used to represent protein samples. Each of the PseAA vectors is a 40-D (dimensional) vector, which is constructed from the conventional amino acid composition (AA) and a series of sequence-order correlation factors as originally introduced by Chou. The difference among the eight PseAA representations is that different physicochemical properties are used to incorporate the sequence-order effects for the protein samples. Based on this framework, a dual-layer fuzzy support vector machine (FSVM) network is proposed to predict protein structural classes. In the first layer of the FSVM network, eight FSVM classifiers trained on different PseAA vectors are established. The second-layer FSVM classifier is applied to reclassify the outputs of the first layer. The results thus obtained are quite promising, indicating that the new method may become a useful tool for predicting not only the structural classification of proteins but also their other attributes.
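
The 40-D representation described above (20 composition terms plus 20 sequence-order correlation factors) can be sketched as follows. This is a minimal illustration of the commonly cited form of Chou's PseAA, not the authors' code: the Kyte-Doolittle hydropathy scale, the weight of 0.05, and the function names are assumptions, and the abstract does not say which eight physicochemical properties were used.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used only as a stand-in property scale (assumption).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    std = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (scale[a] - mean) / std for a in AMINO_ACIDS}


def pseaa(sequence, scale=HYDROPATHY, n_factors=20, weight=0.05):
    """Return a (20 + n_factors)-D pseudo amino acid composition vector."""
    if len(sequence) <= n_factors:
        raise ValueError("sequence must be longer than the number of factors")
    h = standardize(scale)
    counts = Counter(sequence)
    freqs = [counts[a] / len(sequence) for a in AMINO_ACIDS]
    # Sequence-order correlation factor for each lag: mean squared
    # property difference between residues that are `lag` positions apart.
    thetas = []
    for lag in range(1, n_factors + 1):
        pairs = zip(sequence, sequence[lag:])
        total = sum((h[a] - h[b]) ** 2 for a, b in pairs)
        thetas.append(total / (len(sequence) - lag))
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

Under these assumptions, swapping in a different property scale for `scale` gives a different PseAA vector for the same protein, which is the sense in which the eight representations differ.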

4.
MOTIVATION: Fold recognition is a key step in the protein structure discovery process, especially when traditional sequence comparison methods fail to yield convincing structural homologies. Although many methods have been developed for protein fold recognition, their accuracies remain low. This can be attributed to insufficient exploitation of fold discriminatory features. RESULTS: We have developed a new method for protein fold recognition using structural information of amino acid residues and amino acid residue pairs. Since protein fold recognition can be treated as a protein fold classification problem, we have developed a Support Vector Machine (SVM) based classifier approach that uses secondary structural state and solvent accessibility state frequencies of amino acids and amino acid pairs as feature vectors. Among the individual properties examined, secondary structural state frequencies of amino acids gave an overall accuracy of 65.2% for fold discrimination, which is better than the accuracy of any method reported so far in the literature. Combining secondary structural state frequencies with solvent accessibility state frequencies of amino acids and amino acid pairs further improved the fold discrimination accuracy to more than 70%, which is approximately 8% higher than the best available method. In this study we have also tested, for the first time, an all-together multi-class method known as the Crammer and Singer method for protein fold classification. Our studies reveal that the three multi-class classification methods, namely one versus all, one versus one and the Crammer and Singer method, yield similar predictions. AVAILABILITY: Dataset and stand-alone program are available upon request.

5.
Understanding the folding behavior of proteins is an important and challenging problem in modern molecular biology. In the present investigation, a large number of features representing protein sequences were developed based on sequence autocorrelation weighted by properties of amino acid residues. A genetic algorithm (GA) combined with multiple linear regression (MLR) was employed to select significant features related to protein folding rates and to build a global predictive model. Moreover, the local lazy regression (LLR) method was also used to predict protein folding rates. The results indicated that LLR performed much better than the global MLR model. The important properties of amino acid residues affecting protein folding rates were also analyzed. The results of this study will be helpful for understanding the mechanism of protein folding. Our results also demonstrate that amino acid sequence autocorrelation features are effective in representing the relationship between protein sequence and folding rates, and that the local method is a powerful tool for predicting protein folding rates.

6.
The protein sequence world is considerably larger than the structure world. Consequently, numerous unrelated sequences may adopt similar 3D folds, and different kinds of amino acids may thus be found in similar 3D structures. By grouping the 20 amino acids into a smaller number of representative residues with similar features, the sequence world can be simplified. This clustering defines a reduced amino acid alphabet (reduced AAA). Numerous works have shown that protein 3D structures are composed of a limited number of building blocks, defining a structural alphabet. We previously identified such an alphabet composed of 16 representative structural motifs (five residues in length) called Protein Blocks (PBs). This alphabet makes it possible to translate a 3D structure into a 1D sequence of PBs. Based on these two concepts, reduced AAA and PBs, we analyzed the distributions of the different kinds of amino acids and their equivalences in the structural context. Different reduced sets were considered. Recurrent amino acid associations were found in all the local structures, while others were specific to some local structures (PBs) (e.g., cysteine, histidine, threonine and serine for the alpha-helix Ncap). Some similar associations are found in other reduced AAAs, e.g., Ile with Val, or the hydrophobic aromatic residue Trp with Phe and Tyr. We also found interesting alternative associations. This highlights the dependence on the information considered (sequence or structure). This approach, equivalent to a substitution matrix, could be useful for designing protein sequences with different features (for instance, adaptation to environment) while largely preserving the 3D fold.
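
The idea of a reduced amino acid alphabet can be illustrated with a short sketch. The grouping below is hypothetical and chosen only for illustration (it keeps Ile with Val, and Trp with Phe and Tyr, as mentioned in the abstract); it is not one of the reduced sets derived in the study.

```python
# Hypothetical grouping of the 20 amino acids into 9 classes (assumption).
# Each class is represented by its first letter.
GROUPS = ["IVLM", "FWY", "AG", "ST", "NQ", "DE", "KRH", "C", "P"]


def build_mapping(groups):
    """Map every residue to the representative (first) letter of its group."""
    return {residue: group[0] for group in groups for residue in group}


def reduce_sequence(sequence, mapping):
    """Rewrite a sequence in the reduced alphabet."""
    return "".join(mapping[residue] for residue in sequence)


MAPPING = build_mapping(GROUPS)
reduced = reduce_sequence("VWYLE", MAPPING)  # "IFFID" under this grouping
```

Two sequences that differ only by substitutions within a group become identical after reduction, which is how the simplification behaves like a coarse substitution matrix.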

7.
The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence‐search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino‐acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. This library is used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence‐search methods, nor does it explicitly incorporate the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z‐score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at the genomic level. It has been implemented in the form of a web‐server that is freely available at http://www.bo‐protscience.fr/forsa .

8.
Bovine amyloid protein AA: isolation and amino acid sequence analysis   Cited by: 7 (self-citations: 0; citations by others: 7)
Amyloid-laden renal glomeruli were selectively isolated from a cow with a history of multiple organ inflammatory diseases which terminated in amyloid-induced glomerulopathy and severe proteinuria. Lyophilized amyloid fibrils obtained by water extraction procedures were dissolved in 6M guanidine hydrochloride and gel filtered on Sepharose CL6B and Sephacryl S-300 Superfine columns for slab gel electrophoresis, analytic isoelectric focusing, and amino acid sequence analyses. Electrophoresis of material from the major retarded peak of the elution profile revealed that bovine protein AA moves as one band with an apparent molecular mass of about 14,000 Daltons. Several distinct bands between approximately pH 4.0 and 5.0 were observed when this material was evaluated by analytic isoelectric focusing, a pattern resembling that of human and dog protein AA. A blocked N-terminus was demonstrated when protein from the major retarded peak was subjected to amino acid sequencing, but cyanogen bromide cleavage followed by gel filtration produced 3 peptide fragments for amino acid sequence analysis. These peptides had a high degree of homology with positions 4-14, 18-24 and 25-49 of human protein AA. Complete homology between bovine protein AA and protein AA from other species was apparent at positions 35-45, providing further evidence that this is a functionally significant part of the serum protein AA (SAA) molecule.

9.
The DA strain of Theiler's virus persists in the central nervous system of mice and causes chronic inflammation and demyelination. On the other hand, the GDVII strain causes an acute encephalitis and does not persist in surviving animals. A series of recombinants between infectious cDNA clones of the genomes of DA and GDVII viruses has been constructed. The analysis of the phenotypes of the recombinant viruses has shown that determinants of persistence and demyelination are present in the capsid proteins of DA virus. Chimeric viruses constructed by the different research groups gave consistent results, with one exception. Chimeras GD1B-2A/DAFL3 and GD1B-2C/DAFL3, which contain part of capsid protein VP2, capsid proteins VP3 and VP1, and different portions of P2 of GDVII in a DA background, were able to persist and cause demyelination. Chimera R4, whose genetic map is identical to that of GD1B-2A/DAFL3, was not. After exchanging the viral chimeras between laboratories and verifying each other's observations, new chimeras were generated in order to explain this difference. Here we report that the discrepancy can be attributed to a single amino acid difference in the sequence of the capsid protein VP2 of the two parental DA strains. DAFL3 (University of Chicago) and the chimeras derived from it, GD1B-2A/DAFL3 and GD1B-2C/DAFL3, contain a Lys at position 141, while TMDA (Institut Pasteur) and R4, the chimera derived from it, contain an Asn in that position. This amino acid is located at the tip of the EF loop, on the rim of the depression spanning the twofold axis of the capsid. These results show that a single amino acid change can confer the ability to persist and demyelinate to a chimeric Theiler's virus, and they pinpoint a region of the viral capsid that is important for this phenotype.

10.
Sulfotransferase (SULT) 1A3 catalyzes the sulfate conjugation of catecholamines and structurally related drugs. As a step toward studies of the possible contribution of inherited variation in SULT1A3 to the pathophysiology of human disease and/or variation in response to drugs related to catecholamines, we have resequenced all seven coding exons, three upstream non-coding exons, exon-intron splice junctions and the 5'-flanking region of SULT1A3 using DNA samples from 60 African-American (AA) and 60 Caucasian-American (CA) subjects. Eight single nucleotide polymorphisms (SNPs) were observed in AA and five in CA subjects, including one non-synonymous cSNP (Lys234Asn) that was observed only in AA subjects with an allele frequency of 4.2%. This change in amino acid sequence resulted in only 28 +/- 4.5% (mean +/- SEM) of the enzyme activity of the wild-type (WT) sequence after transient expression in COS-1 cells, with a parallel decrease (54 +/- 2.2% of WT) in level of SULT1A3 immunoreactive protein. Substrate kinetic studies failed to show significant differences in apparent Km values of the two allozymes for either dopamine (10.5 versus 10.2 μM for WT and variant, respectively) or the cosubstrate 3'-phosphoadenosine 5'-phosphosulfate (0.114 versus 0.122 μM, respectively). The decrease in level of immunoreactive protein in response to this single change in amino acid sequence was due, at least in part, to accelerated SULT1A3 degradation through a proteasome-mediated process. These observations raise the possibility of ethnic-specific inherited alterations in catecholamine sulfation in humans.
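
The reported allele frequency can be checked with simple arithmetic. The sketch below assumes diploid autosomal genotypes, so 60 subjects contribute 120 alleles; the abstract does not give the variant allele count, and 5 is simply the only whole number that rounds to 4.2% under that assumption.

```python
def allele_frequency(variant_alleles, subjects, ploidy=2):
    """Fraction of sampled chromosomes carrying the variant allele."""
    return variant_alleles / (subjects * ploidy)


# Inferred count (assumption): 5 variant alleles among 120 chromosomes.
frequency_percent = round(allele_frequency(5, 60) * 100, 1)
```

Counts of 4 or 6 would give 3.3% and 5.0%, respectively, so they are not consistent with the reported value.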

11.
The structural relationship between isoenzymes I and II of chloroplast glyceraldehyde-3-phosphate dehydrogenase (D-glyceraldehyde-3-phosphate: NADP+ oxidoreductase (phosphorylating), EC 1.2.1.13) has been established at the protein level. The complete primary structure of subunits A and B of glyceraldehyde-3-phosphate dehydrogenase I from Spinacia oleracea has been determined by sequence analysis of the corresponding tryptic peptides, aligned by fragments derived from cyanogen bromide and Staphylococcus proteinase V8 digestions and by partially sequencing each intact subunit. Subunit A has an Mr of 36,225 and consists of 337 amino acid residues, whilst subunit B (Mr 39,355) consists of 368 residues. The amino acid sequence of subunit B, as determined through direct analysis of the protein, is identical to that recently deduced at cDNA level (Brinkmann et al. (1989) Plant Mol. Biol. 13, 81-94). The two subunits share a common portion of amino acid sequence which differs by 66 amino acid residues. Subunit B has an extra C-terminal sequence of 31 amino acid residues. Chloroplast glyceraldehyde-3-phosphate dehydrogenase II was partially characterized by sequencing the N-terminal portion of the intact protein and some of its tryptic peptides. The sequences of all the examined fragments fit precisely that of the corresponding regions of subunit A from glyceraldehyde-3-phosphate dehydrogenase I.

12.
Zhang TL, Ding YS. Amino Acids 2007, 33(4):623-629
Compared with the conventional amino acid composition (AA), the pseudo amino acid composition (PseAA) as originally introduced by Chou can incorporate much more information about a protein sequence; this remarkably enhances the power of a discrete model for predicting various attributes of a protein. In this study, based on the concept of Chou's PseAA, a 46-D (dimensional) PseAA was formulated to represent a protein sample, and a new approach based on binary-tree support vector machines (BTSVMs) was proposed to predict the protein structural class. The BTSVM algorithm is capable of solving the problem of unclassifiable data points in multi-class SVMs. The results of both the 10-fold cross-validation and jackknife tests demonstrate that the predictive performance using the new PseAA (46-D) is better than that of AA (20-D), which is widely used in many algorithms for protein structural class prediction. The results obtained by the new approach are quite encouraging, indicating that it can at least play a complementary role to many of the existing methods and is a useful tool for predicting many other protein attributes as well.

13.
Shi JY, Zhang SW, Pan Q, Cheng YM, Xie J. Amino Acids 2007, 33(1):69-74
As more and more genomes have been sequenced in recent years, there is an urgent need to develop a reliable method to predict the subcellular localization of the rapidly growing number of newly found proteins. However, many well-known prediction methods based on amino acid composition have problems utilizing the sequence-order information. Here, based on the concept of Chou's pseudo amino acid composition (PseAA), a new feature extraction method, the multi-scale energy (MSE) approach, is introduced to incorporate the sequence-order information. First, a protein sequence was mapped to a digital signal using the amino acid index. Then, by wavelet transform, the mapped signal was broken down into several scales in which the energy factors were calculated and further formed into an MSE feature vector. Following this, combining this MSE feature vector with amino acid composition (AA), we constructed a series of MSEPseAA feature vectors to represent the protein subcellular localization sequences. Finally, according to a new kind of normalization approach, the MSEPseAA feature vectors were normalized to form the improved MSEPseAA vectors, named IEPseAA. Using the technique of IEPseAA, C-support vector machine (C-SVM) and three multi-class SVM strategies, quite promising results were obtained, indicating that MSE is quite effective in reflecting the sequence-order effects and might become a useful tool for predicting other attributes of proteins as well.
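
The first two steps described above (mapping a sequence to a numeric signal, then computing per-scale energies from a wavelet decomposition) can be sketched as follows. The abstract does not name the wavelet, the amino acid index, or the exact energy definition, so the Haar wavelet, the Kyte-Doolittle hydropathy index, and the function names used here are all assumptions made for illustration.

```python
import math

# Kyte-Doolittle hydropathy values, standing in for "the amino acid index" (assumption).
HYDROPATHY = {
    "A": 1.8, "C": 2.5, "D": -3.5, "E": -3.5, "F": 2.8,
    "G": -0.4, "H": -3.2, "I": 4.5, "K": -3.9, "L": 3.8,
    "M": 1.9, "N": -3.5, "P": -1.6, "Q": -3.5, "R": -4.5,
    "S": -0.8, "T": -0.7, "V": 4.2, "W": -0.9, "Y": -1.3,
}


def haar_step(signal):
    """One level of the orthonormal Haar transform: (approximation, detail)."""
    if len(signal) % 2:
        signal = signal + [signal[-1]]  # pad odd-length signals by repeating the last value
    root2 = math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def multiscale_energy(sequence, index=HYDROPATHY, levels=3):
    """Return the detail energy at each scale plus the final approximation energy."""
    signal = [index[residue] for residue in sequence]
    energies = []
    for _ in range(levels):
        signal, detail = haar_step(signal)
        energies.append(sum(d * d for d in detail))
    energies.append(sum(a * a for a in signal))
    return energies
```

In the spirit of the abstract, such an energy vector would then be concatenated with the 20-D amino acid composition and normalized; that step is omitted here because the normalization is not specified.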

14.
Complete amino acid sequence of yeast thioltransferase (glutaredoxin)   Cited by: 3 (self-citations: 0; citations by others: 3)
The amino acid sequence of a thioltransferase isolated from Saccharomyces cerevisiae was determined. The protein was cleaved by trypsin, Staphylococcus aureus V8 protease, and cyanogen bromide. The peptides generated were purified by reverse phase HPLC. Sequencing of the intact protein and its fragments was achieved by automated Edman degradation. The protein contains 106 amino acid residues with two cysteines. Yeast thioltransferase showed 51% structural similarity to pig liver thioltransferase and 34% to E. coli glutaredoxin.

15.
Neurogranin, formerly designated p17 (Baudier, J., Bronner, C., Kligman, D., and Cole, R. D. (1989) J. Biol. Chem. 264, 1824-1828), a brain-specific in vitro substrate for protein kinase C (PKC), has been purified to homogeneity from bovine forebrain. The purified protein has a molecular mass of 7837.1 +/- 0.5 Da, determined by electrospray mass spectrometry. In the absence of reducing agent, dimers and higher oligomers accumulated. On sodium dodecyl sulfate-polyacrylamide gels the protein monomer migrated abnormally with an apparent molecular mass of 15,000-19,000 Da, depending on the percentage of polyacrylamide. The native protein is blocked at its amino terminus. The majority of the primary amino acid sequence was determined following proteolytic and chemical fragmentation. A comparison of the amino acid sequence of neurogranin with that of the brain-specific PKC substrate neuromodulin revealed a strikingly conserved amino acid sequence AA(X)KIQA-SFRGH(X)(X)RKK(X)K. The two proteins are not related over the rest of their sequences. Neurogranin was shown to be phosphorylated in hippocampal slices incubated with 32Pi, and phorbol esters stimulated neurogranin phosphorylation, suggesting that neurogranin is likely to be an in vivo substrate for PKC. In vitro phosphorylation of neurogranin by PKC produced a shift of the isoelectric point of the protein (pI 5.6) to a more acidic value (pI 5.4). Tryptic digestion of the phosphorylated protein yielded a single phosphopeptide having the sequence IQASFR, where the serine residue is the phosphorylated amino acid. This phosphopeptide is part of the conserved sequence shared with neuromodulin and also corresponds to the PKC phosphorylation site on neuromodulin (Apel, E. D., Byford, M. F., Au, D., Walsh, K. A., and Storm, D. R. (1990) Biochemistry 29, 2330-2335). Evidence was obtained suggesting that neurogranin binds to calmodulin in the absence of Ca2+, a feature that also characterizes neuromodulin.
We propose that the amino acid sequence shared by neurogranin and neuromodulin reflects a functional relationship between these two proteins and that the consensus sequence represents a conserved PKC phosphorylation site and a calmodulin binding domain that characterizes a class of brain-specific PKC substrates.

16.
Hijikata A, Yura K, Noguti T, Go M. Proteins 2011, 79(6):1868-1877
In comparative modeling, the quality of amino acid sequence alignment still constitutes a major bottleneck in the generation of high quality models of protein three-dimensional (3D) structures. Substantial efforts have been made to improve alignment quality by revising the substitution matrix, introducing multiple sequences, replacing dynamic programming with hidden Markov models, and incorporating 3D structure information. Improvements in the gap penalty have not been a major focus, however, following the development of the affine gap penalty and of the secondary structure dependent gap penalty. We revisited the correlation between protein 3D structure and gap location in a large protein 3D structure data set, and found that the frequency of gap locations approximated an exponential function of the solvent accessibility of the inserted residues. The nonlinearity of the gap frequency as a function of accessibility corresponded well to the relationship between residue mutation pattern and residue accessibility. By introducing this relationship into the gap penalty calculation for pairwise alignment between template and target amino acid sequences, we were able to obtain a sequence alignment much closer to the structural alignment. The quality of the alignments was substantially improved on pairs of sequences with identity in the "twilight zone" between 20 and 40%. The relocation of gaps by our new method made a significant improvement in comparative modeling, exemplified here by the Bacillus subtilis yitF protein. The method was implemented in a computer program, ALAdeGAP (ALignment with Accessibility dependent GAp Penalty), which is available at http://cib.cf.ocha.ac.jp/target_protein/.
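
One way an exponential relationship between gap frequency and solvent accessibility could feed into a gap penalty is sketched below. This is not the ALAdeGAP formula, which the abstract does not give: the negative-log form, the constants `base` and `rate`, the assumption that gaps are more frequent at exposed positions, and the function names are all hypothetical.

```python
import math


def gap_frequency(accessibility, base=0.01, rate=3.0):
    """Hypothetical gap frequency, exponential in relative accessibility (0 to 1)."""
    return base * math.exp(rate * accessibility)


def gap_open_penalty(accessibility, base=0.01, rate=3.0):
    """Penalty as the negative log of the gap frequency (assumed form).

    Because the frequency is exponential in accessibility, the penalty
    falls linearly as the template residue becomes more exposed.
    """
    return -math.log(gap_frequency(accessibility, base, rate))


buried_penalty = gap_open_penalty(0.0)   # gaps at buried positions cost the most
exposed_penalty = gap_open_penalty(1.0)  # gaps at fully exposed positions cost less
```

In a pairwise alignment, a position-specific value like this would replace the constant gap-opening term of an affine penalty, using the accessibility computed from the template structure.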

17.
3 beta-Hydroxysteroid dehydrogenase/steroid isomerase has been purified to homogeneity from bovine adrenal glands. A single protein of molecular weight 42,090 +/- 40 containing both enzyme activities has been isolated. Approximately 86% of the amino acid sequence of the bovine adrenal 3 beta-hydroxysteroid dehydrogenase/steroid isomerase has been obtained by sequencing peptides isolated from digests with trypsin and lysyl endopeptidase and by chemical cleavage with CNBr. The sequence obtained is identical with that of the deduced amino acid sequence of the bovine ovarian 3 beta-hydroxysteroid dehydrogenase/steroid isomerase [Zhao et al. (1989) FEBS Lett. 259, 153-157], with the exception that the N-terminal methionine residue found in the bovine ovarian sequence is not present in the mature bovine adrenal enzyme. On the basis of the primary structure and comparisons with other NAD+ binding proteins, we propose a structural model of the bovine adrenal 3 beta-hydroxysteroid dehydrogenase/steroid isomerase localizing the NAD+ binding site as well as the membrane-anchoring segment.

18.
A method for identifying the amino acid sequence and D/L configuration of peptides using the fluorogenic Edman reagent 7-[(N, N-dimethylamino)sulfonyl]-2,1,3-benzoxadiazol-4-yl isothiocyanate (DBD-NCS) has been developed. This method is based on the Edman degradation principle with some modifications. A peptide or protein was coupled with DBD-NCS under basic conditions and then cyclized/cleaved by BF3, a Lewis acid, to produce a DBD-thiazolinone (TZ) derivative; BF3 could significantly suppress amino acid racemization. The liberated DBD-TZ amino acid was hydrolyzed to a DBD-thiocarbamoyl (TC) amino acid under weakly acidic conditions and then oxidized by NaNO2/H+ to a DBD-carbamoyl (CA) amino acid, which was stable and had strong fluorescence intensity. The individual DBD-CA amino acids were separated by reversed-phase high-performance liquid chromatography (RP-HPLC) for amino acid sequencing, and their enantiomers were resolved by chiral stationary-phase HPLC to identify their D/L configurations. By combining the two HPLC systems, the amino acid sequence and D/L configuration of peptides could be determined. This method will be useful for searching for D-amino-acid-containing peptides in animals.

19.
Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities.

20.
A poly(acrylic acid) (PAA)-multiwalled carbon nanotube (MWNT) composite-coated glassy-carbon disk electrode (GCE) (PAA-MWNTs/GCE) was proposed for the simultaneous determination of physiological levels of dopamine (DA) and uric acid (UA) in the presence of an excess of ascorbic acid (AA) in a pH 7.4 phosphate-buffered solution. The PAA-MWNTs composite was prepared by mixing MWNT powder into a 1 mg/ml PAA aqueous solution under sonication. The GCE surface was modified with a PAA-MWNTs film by casting. AA shows no voltammetric peak at PAA-MWNTs/GCE. The PAA-MWNTs composite has a high surface area and an affinity for DA and UA adsorption. DA exhibits a greatly improved electron-transfer rate and is electro-catalyzed at PAA-MWNTs/GCE. Moreover, electro-catalytic oxidation of UA at PAA-MWNTs/GCE is observed, which makes it possible to detect lower levels of UA. Enhanced electrocatalytic currents for DA and UA were therefore observed. The anodic peak currents at approximately 0.18 V and 0.35 V, which correspond to the voltammetric peaks of DA and UA, respectively, increase with increasing concentrations of DA and UA. The linear ranges are 40 nM to 3 μM DA and 0.3 μM to 10 μM UA in the presence of 0.3 mM AA. The lowest detection limits (S/N=3) were 20 nM DA and 110 nM UA.
