Similar articles
 20 similar articles found (search time: 15 ms)
1.
Prediction of RNA binding sites in proteins from amino acid sequence   Total citations: 3, self-citations: 0, citations by others: 3
RNA-protein interactions are vitally important in a wide range of biological processes, including regulation of gene expression, protein synthesis, and replication and assembly of many viruses. We have developed a computational tool for predicting which amino acids of an RNA binding protein participate in RNA-protein interactions, using only the protein sequence as input. RNABindR was developed using machine learning on a validated nonredundant data set of interfaces from known RNA-protein complexes in the Protein Data Bank. It generates a classifier that captures primary sequence signals sufficient for predicting which amino acids in a given protein are located in the RNA-protein interface. In leave-one-out cross-validation experiments, RNABindR identifies interface residues with >85% overall accuracy. It can be calibrated by the user to obtain either high specificity or high sensitivity for interface residues. RNABindR, implementing a Naive Bayes classifier, performs as well as a more complex neural network classifier (to our knowledge, the only previously published sequence-based method for RNA binding site prediction) and offers the advantages of speed, simplicity and interpretability of results. RNABindR predictions on the human telomerase protein hTERT are in good agreement with experimental data. The availability of computational tools for predicting which residues in an RNA binding protein are likely to contact RNA should facilitate design of experiments to directly test RNA binding function and contribute to our understanding of the diversity, mechanisms, and regulation of RNA-protein complexes in biological systems. (RNABindR is available as a Web tool from http://bindr.gdcb.iastate.edu)

2.
3.
Post-translational modifications (PTMs) occur on almost all proteins analyzed to date. The function of a modified protein is often strongly affected by these modifications and therefore increased knowledge about the potential PTMs of a target protein may increase our understanding of the molecular processes in which it takes part. High-throughput methods for the identification of PTMs are being developed, in particular within the fields of proteomics and mass spectrometry. However, these methods are still in their early stages, and it is indeed advantageous to cut down on the number of experimental steps by integrating computational approaches into the validation procedures. Many advanced methods for the prediction of PTMs exist and many are made publicly available. We describe our experiences with the development of prediction methods for phosphorylation and glycosylation sites and the development of PTM-specific databases. In addition, we discuss novel ideas for PTM visualization (exemplified by kinase landscapes) and improvements for prediction specificity (by using evolutionary stable sites, ESS). As an example, we present a new method for kinase-specific prediction of phosphorylation sites, NetPhosK, which extends our earlier and more general tool, NetPhos. The new server, NetPhosK, is made publicly available at the URL http://www.cbs.dtu.dk/services/NetPhosK/. The issues of underestimation, over-prediction and strategies for improving prediction specificity are also discussed.

4.
E V Barkovskiĭ 《Biofizika》1985,30(5):782-785
A two-dimensional representation of the sequences of 32 proteins with known three-dimensional structure has been obtained on a 20 × 20 matrix of the distribution of amino acid pairs (nearest neighbours). A prediction algorithm for the structural class of globular proteins has been worked out on the basis of a comparison of the 20 × 20 matrices of the distribution of amino acid pairs for proteins of different structural classes. The accuracy of the structural class predictions was assessed for 32 proteins (all of them taken from the set used to derive the algorithm).
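
The 20 × 20 nearest-neighbour pair matrix described in this abstract can be sketched in a few lines of Python. This is only an illustration: the abstract does not say how matrices are compared, so the `matrix_distance` function below (a sum of absolute differences between frequency-normalised matrices) is a hypothetical choice, and the residue ordering in `AMINO_ACIDS` is arbitrary.

```python
# One-letter codes of the 20 standard amino acids; the ordering is arbitrary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def neighbour_pair_matrix(sequence):
    """Count adjacent residue pairs (i, i+1) of a sequence in a 20 x 20 matrix."""
    matrix = [[0] * 20 for _ in range(20)]
    for first, second in zip(sequence, sequence[1:]):
        # Pairs containing a non-standard symbol are skipped.
        if first in INDEX and second in INDEX:
            matrix[INDEX[first]][INDEX[second]] += 1
    return matrix


def matrix_distance(a, b):
    """Hypothetical comparison: L1 distance between frequency-normalised matrices."""
    total_a = sum(map(sum, a)) or 1  # guard against empty matrices
    total_b = sum(map(sum, b)) or 1
    return sum(
        abs(a[i][j] / total_a - b[i][j] / total_b)
        for i in range(20)
        for j in range(20)
    )
```

Under these assumptions, a query protein could be assigned to the structural class whose reference matrix lies closest to its own; the original paper may use a different comparison.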

5.
6.
Prediction of amino acid sequence from structure   Total citations: 2, self-citations: 0, citations by others: 2
We have developed a method for the prediction of an amino acid sequence that is compatible with a three-dimensional backbone structure. Using only a backbone structure of a protein as input, the algorithm is capable of designing sequences that closely resemble natural members of the protein family to which the template structure belongs. In general, the predicted sequences are shown to have multiple sequence profile scores that are dramatically higher than those of random sequences, and sometimes better than some of the natural sequences that make up the superfamily. As anticipated, highly conserved but poorly predicted residues are often those that contribute to the functional rather than structural properties of the protein. Overall, our analysis suggests that statistical profile scores of designed sequences are a novel and valuable figure of merit for assessing and improving protein design algorithms.

7.
Hayat M  Khan A  Yeasin M 《Amino acids》2012,42(6):2447-2460
Knowledge of the types of membrane protein provides useful clues in deducing the functions of uncharacterized membrane proteins. An automatic method for efficiently identifying uncharacterized proteins is thus highly desirable. In this work, we have developed a novel method for predicting membrane protein types by exploiting the discrimination capability of the difference in amino acid composition at the N and C terminus through split amino acid composition (SAAC). We also show that the ensemble classification can better exploit this discriminating capability of SAAC. In this study, membrane protein types are classified using three feature extraction and several classification strategies. An ensemble classifier, Mem-EnsSAAC, is then developed using the best feature extraction strategy. Pseudo amino acid (PseAA) composition, discrete wavelet analysis (DWT), SAAC, and a hybrid model are employed for feature extraction. The nearest neighbor, probabilistic neural network, support vector machine, random forest, and Adaboost are used as individual classifiers. The predicted results of the individual learners are combined using a genetic algorithm to form an ensemble classifier, Mem-EnsSAAC, yielding an accuracy of 92.4% and 92.2% for the Jackknife and independent dataset tests, respectively. Performance measures such as MCC, sensitivity, specificity, F-measure, and Q-statistics show that SAAC-based prediction yields significantly higher performance compared to PseAA- and DWT-based systems, and is also the best reported so far. The proposed Mem-EnsSAAC is able to predict the membrane protein types with high accuracy and, consequently, can be very helpful in drug discovery. It can be accessed at http://111.68.99.218/membrane.
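
To make the idea of split amino acid composition concrete, the following Python sketch computes separate composition vectors for the N-terminal segment, the middle of the chain, and the C-terminal segment, and concatenates them. The abstract does not give the segment lengths or say whether a middle segment is used, so the `terminal_length=25` default and the three-way split are assumptions made for illustration only.

```python
# One-letter codes of the 20 standard amino acids; the ordering is arbitrary.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(segment):
    """Fraction of each of the 20 standard amino acids in a segment."""
    length = len(segment) or 1  # an empty segment yields an all-zero vector
    return [segment.count(aa) / length for aa in AMINO_ACIDS]


def split_aa_composition(sequence, terminal_length=25):
    """Return a 60-value feature vector: N-terminal, middle, C-terminal compositions.

    terminal_length is a hypothetical parameter, not a value from the abstract.
    """
    n_term = sequence[:terminal_length]
    middle = sequence[terminal_length:-terminal_length]
    c_term = sequence[-terminal_length:]
    return composition(n_term) + composition(middle) + composition(c_term)
```

A vector of this kind would then be passed to the individual classifiers; the ensemble and genetic-algorithm steps named in the abstract are not reproduced here.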

8.
9.
The amino acid sequence of the beta subunit of allophycocyanin   Total citations: 5, self-citations: 0, citations by others: 5
The complete amino acid sequence of the beta subunit of Anabaena variabilis allophycocyanin is: H2N-Ala-Gln-Asp-Ala-Ile-Thr-Ala-Val-Ile-Asn-Ser-Ala-Asp-Val-Gln-Gly-Lys-Tyr-Leu-Asp-Thr-Ala-Ala-Leu-Glu-Lys-Leu-Lys-Ala-Tyr-Phe-Ser-Thr-Gly-Glu-Leu-Arg-Val-Arg-Ala-Ala-Thr-Thr-Ile-Ser-Ala-Asn-Ala-Ala-Ala-Ile-Val-Lys-Glu-Ala-Val-Ala-Lys-Ser-Leu-Leu-Tyr-Ser-Asp-Ile-Thr-Arg-Pro-Gly-Gly-Asn-Met-Tyr-Thr-Thr-Arg-Arg-Tyr-Ala-Ala-Cys-Ile-Arg-Asp-Leu-Asp-Tyr-Tyr-Leu-Arg-Tyr-Ala-Thr-Tyr-Ala-Met-Leu-Ala-Gly-Asp-Pro-Ser-Ile-Leu-Asp-Glu-Arg-Val-Leu-Asn-Gly-Leu-Lys-Glu-Thr-Tyr-Asn-Ser-Leu-Gly-Val-Pro-Val-Gly-Ala-Thr-Val-Gln-Ala-Ile-Gln-Ala-Ile-Lys-Glu-Val-Thr-Ala-Ser-Leu-Val-Gly-Ala-Asp-Ala-Gly-Lys-Glu-Met-Gly-Ile-Tyr-Leu-Asp-Tyr-Ile-Ser-Ser-Gly-Leu-Ser-COOH. Phycocyanobilin is attached through a thioether linkage to cysteinyl residue 81, indicated by an asterisk. Comparison of this sequence with those of C-phycocyanins shows that there are 60 identities between corresponding subunits of these two biliproteins. Of the region between residues 79 and 120, 29 residues are identical in the beta subunits of allophycocyanin and phycocyanin. The character of all 10 charged residues in this region of the beta subunit sequences is completely conserved.

10.
The amino acid sequence of the beta subunit of rabbit lutropin (lLH) has been determined. The amino terminus of about 97% of the beta subunit has a two amino acid extension (pyro-Glu-Pro) compared to other lutropin beta sequences. Overlapping peptides from trypsin and chymotrypsin digestions of the performic acid-oxidized beta subunit and trypsin digestion of the S-aminoethylated cysteine beta subunit were isolated by chromatography on TSK Fractogel 40F and high-pressure liquid chromatography (HPLC). Sequencing was by a combination of the dansyl-Edman method and the direct Edman method. Amide placements were established by HPLC analysis of the PTH amino acid derivatives. The proposed sequence of lLH subunit is: This sequence is highly homologous to the other known lutropin beta subunits, especially rat and pig lutropin beta (91%). Partial cleavage of the peptide bond between Asp-79 and Pro-80 was observed during cyanogen bromide treatment. Rabbit thyrotropin and thyrotropin beta subunit copurified with lLH and lLH except at a final chromatography on Sephadex G-100.

11.
An algorithm was derived to relate the amino acid sequence of a collagen triple helix to its thermal stability. This calculation is based on the triple helical stabilization propensities of individual residues and their intermolecular and intramolecular interactions, as quantitated by melting temperature values of host-guest peptides. Experimental melting temperature values of a number of triple helical peptides of varying length and sequence were successfully predicted by this algorithm. However, predicted T(m) values are significantly higher than experimental values when there are strings of oppositely charged residues or concentrations of like charges near the terminus. Application of the algorithm to collagen sequences highlights regions of unusually high or low stability, and these regions often correlate with biologically significant features. The prediction of stability from sequence indicates an understanding of the major forces maintaining this protein motif. The use of highly favorable KGE and KGD sequences is seen to complement the stabilizing effects of imino acids in modulating stability and may become dominant in the collagenous domains of bacterial proteins that lack hydroxyproline. The effect of single amino acid mutations in the X and Y positions can be evaluated with this algorithm. An interactive collagen stability calculator based on this algorithm is available online.

12.
The amino acid sequences of both the alpha and beta subunits of human chorionic gonadotropin have been determined. The amino acid sequence of the alpha subunit is: Ala - Asp - Val - Gln - Asp - Cys - Pro - Glu - Cys-10 - Thr - Leu - Gln - Asp - Pro - Phe - Ser - Gln-20 - Pro - Gly - Ala - Pro - Ile - Leu - Gln - Cys - Met - Gly-30 - Cys - Cys - Phe - Ser - Arg - Ala - Tyr - Pro - Thr - Pro-40 - Leu - Arg - Ser - Lys - Lys - Thr - Met - Leu - Val - Gln-50 - Lys - Asn - Val - Thr - Ser - Glu - Ser - Thr - Cys - Cys-60 - Val - Ala - Lys - Ser - Thr - Asn - Arg - Val - Thr - Val-70 - Met - Gly - Gly - Phe - Lys - Val - Glu - Asn - His - Thr-80 - Ala - Cys - His - Cys - Ser - Thr - Cys - Tyr - Tyr - His-90 - Lys - Ser. Oligosaccharide side chains are attached at residues 52 and 78. In the preparations studied approximately 10 and 30% of the chains lack the initial 2 and 3 NH2-terminal residues, respectively. This sequence is almost identical with that of human luteinizing hormone (Sairam, M. R., Papkoff, H., and Li, C. H. (1972) Biochem. Biophys. Res. Commun. 48, 530-537). 
The amino acid sequence of the beta subunit is: Ser - Lys - Glu - Pro - Leu - Arg - Pro - Arg - Cys - Arg-10 - Pro - Ile - Asn - Ala - Thr - Leu - Ala - Val - Glu - Lys-20 - Glu - Gly - Cys - Pro - Val - Cys - Ile - Thr - Val - Asn-30 - Thr - Thr - Ile - Cys - Ala - Gly - Tyr - Cys - Pro - Thr-40 - Met - Thr - Arg - Val - Leu - Gln - Gly - Val - Leu - Pro-50 - Ala - Leu - Pro - Gln - Val - Val - Cys - Asn - Tyr - Arg-60 - Asp - Val - Arg - Phe - Glu - Ser - Ile - Arg - Leu - Pro-70 - Gly - Cys - Pro - Arg - Gly - Val - Asn - Pro - Val - Val-80 - Ser - Tyr - Ala - Val - Ala - Leu - Ser - Cys - Gln - Cys-90 - Ala - Leu - Cys - Arg - Arg - Ser - Thr - Thr - Asp - Cys-100 - Gly - Gly - Pro - Lys - Asp - His - Pro - Leu - Thr - Cys-110 - Asp - Asp - Pro - Arg - Phe - Gln - Asp - Ser - Ser - Ser - Ser - Lys - Ala - Pro - Pro - Pro - Ser - Leu - Pro - Ser-130 - Pro - Ser - Arg - Leu - Pro - Gly - Pro - Ser - Asp - Thr-140 - Pro - Ile - Leu - Pro - Gln. Oligosaccharide side chains are found at residues 13, 30, 121, 127, 132, and 138. The proteolytic enzyme, thrombin, which appears to cleave a limited number of arginyl bonds, proved helpful in the determination of the beta sequence.

13.
The complete amino acid sequence of beta 2-microglobulin   Total citations: 33, self-citations: 0, citations by others: 33

14.
Complete amino acid sequence of human alpha 1-microglobulin   Total citations: 4, self-citations: 0, citations by others: 4
The complete amino acid sequence of human α1-microglobulin has been established. It is composed of 167 amino acid residues and contains three carbohydrate attachment sites. No amino acid sequence heterogeneity was found.

15.
The amino acid sequence of bovine thymus prothymosin alpha   Total citations: 2, self-citations: 0, citations by others: 2
Prothymosin alpha has been purified from calf thymus and its amino acid sequence determined. It contains 109 amino acid residues and closely resembles human prothymosin alpha, with only two substitutions, glutamic acid for aspartic acid at position 31 and alanine for serine at position 83. This is in contrast to six differences between rat and bovine prothymosins, including four substitutions and two deletions. The structural similarity of the bovine and human polypeptides makes the former a good candidate for studies on the evaluation of the biological activities of prothymosin alpha in human systems.

16.
Prediction of protein structural class from the amino acid sequence   Total citations: 9, self-citations: 0, citations by others: 9
P Klein  C Delisi 《Biopolymers》1986,25(9):1659-1672
The multidimensional statistical technique of discriminant analysis is used to allocate amino acid sequences to one of four secondary structural classes: high α content, high β content, mixed α and β, low content of ordered structure. Discrimination is based on four attributes: estimates of percentages of α and β structures, and regular variations in the hydrophobic values of residues along the sequence, occurring with periods of 2 and 3.6 residues. The reliability of the method, estimated by classifying 138 sequences from the Brookhaven Protein Data Bank, is 80%, with no misallocations between α-rich and β-rich classes. The reliability can be increased to 84% by making no allocation for proteins classified with odds close to 1. Classification using previously developed secondary structural prediction methods is considerably less reliable, the best result being 64% obtained using predictions based on the Delphi method.
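
Two of the four attributes in this abstract are periodic variations in hydrophobicity with periods of 2 residues (typical of β-strands) and 3.6 residues (typical of α-helices). One common way to quantify such a periodicity is the Fourier power of the hydrophobicity profile at the chosen period, sketched below in Python. The abstract does not name the hydrophobicity scale or the exact statistic, so both the Kyte-Doolittle values and the normalisation used here are assumptions.

```python
import cmath
import math

# Kyte-Doolittle hydropathy values; the choice of scale is an assumption.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def periodicity_power(sequence, period):
    """Fourier power of the mean-centred hydrophobicity profile at a given period."""
    values = [KYTE_DOOLITTLE[aa] for aa in sequence if aa in KYTE_DOOLITTLE]
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    # Project the centred profile onto a complex wave with the requested period.
    total = sum(
        (h - mean) * cmath.exp(2j * math.pi * k / period)
        for k, h in enumerate(values)
    )
    return abs(total) ** 2 / len(values)
```

For a strictly alternating hydrophobic/hydrophilic sequence such as `"ID" * 10`, the power at period 2 is much larger than at period 3.6, which is the kind of contrast a discriminant analysis could exploit.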

17.
A set of aligned homologous protein sequences is divided into two groups consisting of m and n most related sequences. The value of position variability for homologous protein sequences is defined as a number of failures to coincide in the intergroup comparison of all possible m*n pairs of amino acid residues in that position divided by m*n. The position variability value plotted versus the sequence position number with a window of 10 positions gives the intergroup local variability profile. Area S of the figure included between the local variability profile and the straight line corresponding to the mean local variability value is compared with the average area Sr for 1000 random homologous protein families. If S is greater than Sr by more than 2 standard deviation units sigma r, the local variability profile is assumed to contain peaks and hollows corresponding to significant variable and conservative regions of the sequences. The profile extrema containing the area surplus delta S = S-(Sr+ 2 sigma r) are cut off by two straight lines to locate significant regions. The difference (S-Sr) given in standard deviation units sigma r is believed to be the amino acid substitution overall irregularity along the homologous protein sequences OI = (S-Sr)/sigma r. The significant conservative and variable regions of six homologous sequence families (phospholipase A2, cytochromes b, alpha-subunits of Na,K-ATPase, L- and M-subunits of photosynthetic bacteria photoreaction centre and human rhodopsins) were identified. It was shown that for artificial homologous protein sequences derived by k-fold lengthening of natural protein sequences, the OI value rises as square root of k. To compare the degree of substitution irregularity in homologous protein sequence families of different lengths L the value of standard substitution overall irregularity for L = 250 is proposed.
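
The quantities defined in this abstract translate fairly directly into code. The Python sketch below computes the intergroup position variability (mismatching pairs divided by m*n), a sliding-window profile, the area S about the mean, and OI = (S-Sr)/sigma r. Several details are assumptions rather than statements from the abstract: the window is applied as a simple moving average, gap characters are compared like any other symbol, the sample standard deviation is used for sigma r, and the areas of the random families are supplied by the caller because the abstract does not describe how those families are generated.

```python
import statistics


def position_variability(group_a, group_b):
    """Per-column fraction of mismatches over all m*n intergroup residue pairs."""
    m, n = len(group_a), len(group_b)
    length = len(group_a[0])  # all aligned sequences are assumed equally long
    profile = []
    for pos in range(length):
        mismatches = sum(a[pos] != b[pos] for a in group_a for b in group_b)
        profile.append(mismatches / (m * n))
    return profile


def smooth(profile, window=10):
    """Moving average of the variability profile over `window` positions."""
    return [
        sum(profile[i:i + window]) / window
        for i in range(len(profile) - window + 1)
    ]


def area_about_mean(profile):
    """Area S between the profile and the horizontal line at its mean value."""
    mean = sum(profile) / len(profile)
    return sum(abs(value - mean) for value in profile)


def overall_irregularity(s, random_areas):
    """OI = (S - Sr) / sigma_r, with Sr and sigma_r taken from random-family areas."""
    s_r = statistics.mean(random_areas)
    sigma_r = statistics.stdev(random_areas)
    return (s - s_r) / sigma_r
```

With this layout, a profile would be flagged as containing significant variable and conservative regions when `overall_irregularity` exceeds 2, matching the two-standard-deviation criterion in the abstract.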

18.
An efficient method for site-selective modification of proteins using an unnatural amino acid, 3-azidotyrosine, has been developed. This method utilizes the yeast amber suppressor tRNA(Tyr)/mutated tyrosyl-tRNA synthetase pair as a carrier of 3-azidotyrosine in an Escherichia coli cell-free translation system, and triarylphosphine derivatives for specific modification of the azido group. Using rat calmodulin (CaM) as a model protein, we prepared several unnatural CaM molecules, each carrying an azidotyrosine at one of the predetermined positions 72, 78, 80 or 100. Post-translational modification of these proteins with a conjugate compound of triarylphosphine and biotin produced site-selectively biotinylated CaM molecules. Reaction efficiency was similar among these proteins irrespective of the position of introduction, and site-specificity of biotinylation was confirmed using mass spectrometry. In addition, CBP-binding activity of the biotinylated CaMs was confirmed to be similar to that of wild-type CaM. This method is intrinsically versatile in that it should be easily applicable to introducing any other desirable compounds (e.g., probes and cross-linkers) into selected sites of proteins, as long as appropriate derivative compounds of triarylphosphine can be chemically synthesized. Elucidation of molecular mechanisms of protein functions and protein-to-protein networks will be greatly facilitated by making use of these site-selectively modified proteins.

19.
20.
The amino acid sequence of the alpha subunit of rabbit (lagomorph) lutropin (lLH) has been determined. Overlapping peptides from trypsin and chymotrypsin digestions were isolated by reverse-phase high-pressure liquid chromatography (HPLC). Sequencing was by the dansyl-Edman procedure. Amide placements were established by HPLC analysis of the PTH amino acid derivatives. The proposed sequence of lLH alpha subunit is (asterisks denote carbohydrate attachment sites): This proposed sequence is highly homologous with the porcine, murine, ovine, and bovine glycoprotein hormone alpha subunit sequences. Two unusual proteolytic cleavages were observed: (1) a cleavage by trypsin between Asn-77 and Ala-78, and (2) a cleavage by chymotrypsin between Ala-45 and Arg-46. Similar enzymatic cleavages were previously reported for equine chorionic gonadotropin alpha subunit by Ward et al. and for these sites in the ovine LH alpha subunit by Liu et al. Chymotrypsin cleaved on the carboxyl side of methionine sulfone residues at positions 51 and 75.
