20 similar articles found; search took 16 ms
1.
N-terminal N-myristoylation of proteins: prediction of substrate proteins from amino acid sequence (total citations: 5, self-citations: 0, citations by others: 5)
Myristoylation by the myristoyl-CoA:protein N-myristoyltransferase (NMT) is an important lipid anchor modification of eukaryotic and viral proteins. Automated prediction of N-terminal N-myristoylation from the substrate protein sequence alone is necessary for large-scale sequence annotation projects, but it requires a low rate of false positive hits in addition to sufficient sensitivity. Our previous analysis of substrate protein sequence variability, NMT sequences and 3D structures has revealed motif properties in addition to the known PROSITE motif that are utilized in a new predictor described here. The composite prediction function (with separate ad hoc parameterization (a) for queries from non-fungal eukaryotes and their viruses and (b) for sequences from fungal species) consists of terms evaluating amino acid type preferences at sequence positions close to the N terminus as well as terms penalizing deviations from the physical property pattern of amino acid side-chains encoded in multi-residue correlation within the motif sequence. The algorithm has been validated with a self-consistency and two jack-knife tests for the learning set as well as with kinetic data for model substrates. The sensitivity in recognizing documented NMT substrates is above 95% for both taxon-specific versions. The corresponding rate of false positive prediction (for sequences with an N-terminal glycine residue) is close to 0.5%; thus, the technique is applicable for large-scale automated sequence database annotation. The predictor is available as a public WWW server with the URL http://mendel.imp.univie.ac.at/myristate/.
Additionally, we propose a version of the predictor that identifies a number of proteolytic protein processing sites at internal glycine residues and that evaluates possible N-terminal myristoylation of the protein fragments. A scan of public protein databases revealed new potential NMT targets for which the myristoyl modification may be of critical importance for biological function. Among others, the list includes kinases, phosphatases, proteasomal regulatory subunit 4, kinase interacting proteins KIP1/KIP2, protozoan flagellar proteins, homologues of mitochondrial translocase TOM40, of the neuronal calcium sensor NCS-1 and of the cytochrome c-type heme lyase CCHL. Analyses of complete eukaryote genomes indicate that about 0.5% of all encoded proteins are apparent NMT substrates except for a higher fraction in Arabidopsis thaliana (approximately 0.8%).
2.
Harshvardan Khare Vivek Ratnaparkhi Sonali Chavan Valadi Jayraman 《Bioinformation》2012,8(24):1202-1205
Mannose is an abundant cell surface monosaccharide and has an important role in many biochemical processes. It binds to a great diversity of receptor proteins. In this study we have employed Random Forest for prediction of mannose binding sites. The mannose binding site is taken to be a sphere around the centroid of the ligand; the sphere is subdivided into different layers, and atom-wise and residue-wise features were extracted for each layer. The method achieves 95.59% accuracy using Random Forest with 10-fold cross-validation. Prediction of mannose binding sites will be quite useful in drug design.
3.
Rapid protein domain assignment from amino acid sequence using predicted secondary structure (total citations: 8, self-citations: 0, citations by others: 8)
Marsden RL McGuffin LJ Jones DT 《Protein science : a publication of the Protein Society》2002,11(12):2814-2824
The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to measuring the usefulness of these prediction methods in terms of their application to fully automatic domain assignment. Thus, the sensitivity of each domain assignment method was measured by calculating the number of correctly assigned top scoring predictions. We have implemented a new continuous domain identification method using the alignment of predicted secondary structures of target sequences against observed secondary structures of chains with known domain boundaries as assigned by Class Architecture Topology Homology (CATH). Taking top predictions only, the success rate of the method in correctly assigning domain number to the representative chain set is 73.3%. The top prediction for domain number and location of domain boundaries was correct for 24% of the multidomain set (±20 residues). These results have been put into context in relation to the results obtained from the other prediction methods assessed.
4.
Emanuelsson O 《Briefings in bioinformatics》2002,3(4):361-376
Predicting the subcellular localisation of proteins is an important part of the elucidation of their functions and interactions. Here, the amino acid sequence motifs that direct proteins to their proper subcellular compartment are surveyed, different methods for localisation prediction are discussed, and some benchmarks for the more commonly used predictors are presented.
5.
We describe a general, modular method for developing protocols to identify the amino acid residues that most likely define the division of a protein superfamily into two subsets. As one possibility, we use PROBE to gather superfamily members and perform an ungapped alignment. We then use a modified BLOSUM62 substitution matrix to determine the discriminating power of each column of aligned residues. The overall method is particularly useful for predicting amino acids responsible for substrate or binding specificity when no structures are available. We apply our method to three pairs of protein classes in three different superfamilies, and present our results, some of which have been experimentally verified. This approach may accelerate the elucidation of enzymic substrate specificity, which is critical for both mechanistic insights into biocatalysis and ultimate application.
6.
The crystal structure of a cysteine protease ervatamin B, isolated from the medicinal plant Ervatamia coronaria, has been determined at 1.63 Å. The unknown primary structure of the enzyme could also be traced from the high-quality electron density map. The final refined model, consisting of 215 amino acid residues, 208 water molecules, and a thiosulfate ligand molecule, has a crystallographic R-factor of 15.9% and a free R-factor of 18.2% for F > 2σ(F). The protein belongs to the papain superfamily of cysteine proteases and has some unique properties compared to other members of the family. Though the overall fold of the structure, comprising two domains, is similar to the others, a few natural substitutions of conserved amino acid residues at the interdomain cleft of ervatamin B are expected to increase the stability of the protein. The substitution of a lysine residue by an arginine (residue 177) in this region of the protein may be important, because Lys → Arg substitution is reported to increase the stability of proteins. Another substitution in this cleft region that helps to hold the domains together through hydrogen bonds is Ser36, replacing a conserved glycine residue in the others. There are also some substitutions in and around the active site cleft. Residues Tyr67, Pro68, Val157, and Ser205 in papain are replaced by Trp67, Met68, Gln156, and Leu208, respectively, in ervatamin B, which reduces the volume of the S2 subsite to almost one-fourth that of papain, and this in turn alters the substrate specificity of the enzyme.
7.
The contact number of an amino acid residue in a protein structure is defined by the number of Cβ atoms around the Cβ atom of the given residue, a quantity similar to, but different from, solvent accessible surface area. We present a method to predict the contact numbers of a protein from its amino acid sequence. The method is based on a simple linear regression scheme and predicts the absolute values of contact numbers. When single sequences are used for both parameter estimation and cross-validation, the present method predicts the contact numbers with a correlation coefficient of 0.555 on average. When multiple sequence alignments are used, the correlation increases to 0.627, which is a significant improvement over previous methods. In terms of discrete states prediction, the accuracies for 2-, 3-, and 10-state predictions are, respectively, 71.4%, 54.1%, and 18.9% with residue type-dependent unbiased thresholds, and 76.3%, 59.2%, and 21.8% with residue type-independent unbiased thresholds. The difference between accessible surface area and contact number from a prediction viewpoint and the application of contact number prediction to three-dimensional structure prediction are discussed.
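The contact-number definition in the abstract above can be illustrated with a short sketch. It counts, for each residue, the other Cβ atoms lying within a cutoff distance. The abstract does not state a radius, so the 12 Å default, the function name `contact_numbers`, and the toy coordinates are all assumptions made for illustration, not the authors' settings.

```python
import math

def contact_numbers(cb_coords, cutoff=12.0):
    """Count, for each residue, the other C-beta atoms within `cutoff` angstroms.

    `cutoff` is an assumed value; the abstract does not specify one.
    """
    counts = []
    for i, a in enumerate(cb_coords):
        n = sum(
            1
            for j, b in enumerate(cb_coords)
            if j != i and math.dist(a, b) <= cutoff
        )
        counts.append(n)
    return counts

# Hypothetical C-beta coordinates (angstroms) for a four-residue toy chain.
toy = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
toy_counts = contact_numbers(toy)
```

The quadratic loop is adequate for a single chain; a spatial index would be the usual choice for large structures.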
8.
9.
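[Objective] To investigate, by site-directed mutagenesis, the functions of residues A172 and S173 in the substrate channel of the biosynthetic alanine racemase Tt Alr from Thermoanaerobacter tengcongensis MB4. [Methods] Mutants were constructed by site-directed mutagenesis PCR, the enzymes were purified by affinity chromatography, and the activity and stability of each mutant protein were measured with a D-amino acid oxidase coupled assay. [Results] Eight mutants were obtained by site-directed mutagenesis PCR. Enzymatic characterization showed that mutating A172 to serine (S) increased the relative activity of the enzyme, but the stability of every enzyme carrying a mutation at this site dropped sharply; mutating S173 to aspartic acid (D) raised the optimal reaction temperature of the mutant protein by 15 °C and greatly extended its half-life, but markedly reduced its relative activity. [Conclusion] A172 and S173 in the substrate channel of the alanine racemase Tt Alr are both key sites affecting the catalytic activity and stability of the enzyme.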
10.
Amino acid sequence of a protease inhibitor isolated from Sarcophaga bullata determined by mass spectrometry.
I. A. Papayannopoulos K. Biemann 《Protein science : a publication of the Protein Society》1992,1(2):278-288
The amino acid sequence of a protease inhibitor isolated from the hemolymph of Sarcophaga bullata larvae was determined by tandem mass spectrometry. Homology considerations with respect to other protease inhibitors with known primary structures assisted in the choice of the procedure followed in the sequence determination and in the alignment of the various peptides obtained from specific chemical cleavage at cysteines and enzyme digests of the S. bullata protease inhibitor. The resulting sequence of 57 residues is as follows: Val Asp Lys Ser Ala Cys Leu Gln Pro Lys Glu Val Gly Pro Cys Arg Lys Ser Asp Phe Val Phe Phe Tyr Asn Ala Asp Thr Lys Ala Cys Glu Glu Phe Leu Tyr Gly Gly Cys Arg Gly Asn Asp Asn Arg Phe Asn Thr Lys Glu Glu Cys Glu Lys Leu Cys Leu.
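As a quick consistency check on the sequence reported above, the following sketch converts the three-letter residue list to one-letter code and counts residues. The mapping is the standard amino acid code; the names `THREE_TO_ONE`, `REPORTED`, and `to_one_letter` are introduced here for illustration only.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Sequence copied from the abstract.
REPORTED = (
    "Val Asp Lys Ser Ala Cys Leu Gln Pro Lys Glu Val Gly Pro Cys Arg Lys Ser Asp Phe "
    "Val Phe Phe Tyr Asn Ala Asp Thr Lys Ala Cys Glu Glu Phe Leu Tyr Gly Gly Cys Arg "
    "Gly Asn Asp Asn Arg Phe Asn Thr Lys Glu Glu Cys Glu Lys Leu Cys Leu"
)

def to_one_letter(three_letter_seq):
    """Convert a space-separated three-letter sequence to a one-letter string."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())

one_letter = to_one_letter(REPORTED)
residue_count = len(one_letter)        # the abstract states 57 residues
cysteine_count = one_letter.count("C")  # cysteines were the chemical cleavage sites
```

The residue count agrees with the 57 residues stated in the abstract.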
11.
Prediction of protein–protein interaction sites from weakly homologous template structures using meta‐threading and machine learning
The identification of protein–protein interactions is vital for understanding protein function, elucidating interaction mechanisms, and for practical applications in drug discovery. With the exponentially growing protein sequence data, fully automated computational methods that predict interactions between proteins are becoming essential components of system‐level function inference. A thorough analysis of protein complex structures demonstrated that binding site locations as well as the interfacial geometry are highly conserved across evolutionarily related proteins. Because the conformational space of protein–protein interactions is highly covered by experimental structures, sensitive protein threading techniques can be used to identify suitable templates for the accurate prediction of interfacial residues. Toward this goal, we developed eFindSitePPI, an algorithm that uses the three‐dimensional structure of a target protein, evolutionarily remotely related templates and machine learning techniques to predict binding residues. Using crystal structures, the average sensitivity (specificity) of eFindSitePPI in interfacial residue prediction is 0.46 (0.92). For weakly homologous protein models, these values only slightly decrease to 0.40–0.43 (0.91–0.92) demonstrating that eFindSitePPI performs well not only using experimental data but also tolerates structural imperfections in computer‐generated structures. In addition, eFindSitePPI detects specific molecular interactions at the interface; for instance, it correctly predicts approximately one half of hydrogen bonds and aromatic interactions, as well as one third of salt bridges and hydrophobic contacts. Comparative benchmarks against several dimer datasets show that eFindSitePPI outperforms other methods for protein‐binding residue prediction. It also features a carefully tuned confidence estimation system, which is particularly useful in large‐scale applications using raw genomic data. 
eFindSitePPI is freely available to the academic community at http://www.brylinski.org/efindsiteppi . Copyright © 2014 John Wiley & Sons, Ltd.
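For readers unfamiliar with the sensitivity (specificity) figures quoted in the abstract above, here is a minimal sketch of how the two quantities are computed from per-residue confusion counts. The counts used below are hypothetical, chosen only so that the result reproduces the 0.46 (0.92) pair; they are not data from the study.

```python
def sensitivity(tp, fn):
    """Fraction of true interface residues that were predicted as interface."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Fraction of non-interface residues that were predicted as non-interface."""
    return tn / (tn + fp)

# Hypothetical counts: 100 interface residues, 1000 non-interface residues.
example_sensitivity = sensitivity(tp=46, fn=54)
example_specificity = specificity(tn=920, fp=80)
```

A low sensitivity paired with a high specificity, as here, means the predictor misses many interface residues but rarely flags a non-interface residue.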
12.
Novel proteases from the genome of the carnivorous plant Drosera capensis: Structural prediction and comparative analysis
In his 1875 monograph on insectivorous plants, Darwin described the feeding reactions of Drosera flypaper traps and predicted that their secretions contained a “ferment” similar to mammalian pepsin, an aspartic protease. Here we report a high‐quality draft genome sequence for the cape sundew, Drosera capensis, the first genome of a carnivorous plant from order Caryophyllales, which also includes the Venus flytrap (Dionaea) and the tropical pitcher plants (Nepenthes). This species was selected in part for its hardiness and ease of cultivation, making it an excellent model organism for further investigations of plant carnivory. Analysis of predicted protein sequences yields genes encoding proteases homologous to those found in other plants, some of which display sequence and structural features that suggest novel functionalities. Because the sequence similarity to proteins of known structure is in most cases too low for traditional homology modeling, 3D structures of representative proteases are predicted using comparative modeling with all‐atom refinement. Although the overall folds and active residues for these proteins are conserved, we find structural and sequence differences consistent with a diversity of substrate recognition patterns. Finally, we predict differences in substrate specificities using in silico experiments, providing targets for structure/function studies of novel enzymes with biological and technological significance. Proteins 2016; 84:1517–1533. © 2016 Wiley Periodicals, Inc.
13.
Christian-Scott E. McCartney James A. MacLeod Peter A. Greer Peter L. Davies 《Biochimica et Biophysica Acta (BBA)/Molecular Cell Research》2018,1865(2):221-230
Calpain-1 and -2 are Ca2+-activated intracellular cysteine proteases that regulate a wide range of cellular functions through the cleavage of their protein substrates. Unlike degradative proteases, calpains make limited, transformative cleavages, typically in accessible sequences linking discrete subdomains, to irreversibly alter substrate functions. The biological roles of calpain and their interplay with calcium signaling are of significant biomedical interest as biomarkers and potential therapeutic targets in a growing number of diseases including Alzheimer's, cancer and fibrosis. Unfortunately, many of the colorimetric and fluorimetric assays that have been developed to study calpain activity suffer from low sensitivity and/or poor calpain specificity. To address the need for a highly sensitive and calpain-specific substrate suitable for in vitro and in vivo calpain activity analysis, we have developed a protein FRET probe. We inserted the optimized calpain cleavage sequence PLFAAR between cyan fluorescent protein (CFP) and yellow fluorescent protein (YFP) and modulated its flanking sequences for optimal calpain cleavage. We demonstrate greater sensitivity and calpain-specificity of an optimal 16-residue PLFAAR-based FRET substrate compared to a standard α-spectrin-based probe. The 16-residue PLFAAR protein FRET substrate is not significantly cleaved by trypsin, chymotrypsin, cathepsin-L or caspase-3, and is highly sensitive to both calpain-1 and -2. After transfection of the substrate gene into breast cancer cells the PLFAAR protein FRET product was cut in lysed wild-type cells but not in those with a calpain knock-out phenotype. Blockage of substrate cleavage in the lysates by endogenous and exogenous calpastatin was observed, and was overcome by adding extra calpain.
14.
We have modified and improved the GOR algorithm for the protein secondary structure prediction by using the evolutionary information provided by multiple sequence alignments, adding triplet statistics, and optimizing various parameters. We have expanded the database used to include the 513 non-redundant domains collected recently by Cuff and Barton (Proteins 1999;34:508-519; Proteins 2000;40:502-511). We have introduced a variable size window that allowed us to include sequences as short as 20-30 residues. A significant improvement over the previous versions of GOR algorithm was obtained by combining the PSI-BLAST multiple sequence alignments with the GOR method. The new algorithm will form the basis for the future GOR V release on an online prediction server. The average accuracy of the prediction of secondary structure with multiple sequence alignment and full jack-knife procedure was 73.5%. The accuracy of the prediction increases to 74.2% by limiting the prediction to 375 (of 513) sequences having at least 50 PSI-BLAST alignments. The average accuracy of the prediction of the new improved program without using multiple sequence alignments was 67.5%. This is approximately a 3% improvement over the preceding GOR IV algorithm (Garnier J, Gibrat JF, Robson B. Methods Enzymol 1996;266:540-553; Kloczkowski A, Ting K-L, Jernigan RL, Garnier J. Polymer 2002;43:441-449). We have discussed alternatives to the segment overlap (Sov) coefficient proposed by Zemla et al. (Proteins 1999;34:220-223).
15.
Phosphatidylinositol-specific phospholipase C from Bacillus cereus: improved purification, amino acid composition, and amino-terminal sequence (total citations: 3, self-citations: 0, citations by others: 3)
J J Volwerk P B Wetherwax L M Evans A Kuppe O H Griffith 《Journal of cellular biochemistry》1989,39(3):315-325
Phosphatidylinositol-specific phospholipase C was purified in a 27% yield from the culture medium of Bacillus cereus by a combination of ammonium sulfate precipitation and ion-exchange and hydrophobic interaction chromatography. The purified enzyme was free of other phospholipase C-type activities and exhibited a high specific activity of approximately 1,300 units/mg. Amino acid composition analysis and sodium dodecyl sulfate-polyacrylamide gel electrophoresis indicated a molecular weight of about 35 kDa. The sequence of the first 29 N-terminal amino acids was also determined.
16.
Ana Sagrera César Cobaleda Jose M. González De Buitrago Adolfo García-Sastre Enrique Villar 《Glycoconjugate journal》2001,18(4):283-289
The nucleotide sequence of the glycoprotein hemagglutinin-neuraminidase (HN) gene of the Newcastle disease virus (NDV) strain Clone-30 has been determined. The open reading frame of the HN gene contains 1731 nucleotides and encodes a protein of 577 amino acids. Three highly conserved patterns among all paramyxovirus HN glycoproteins, and one additional conserved species-specific region are present. The protein contains five potential N-glycosylation sites, all but one located in the C-terminal external domain. The secondary structure prediction shows that the C-terminal external domain is mostly arranged in β-sheets, while α-helices are predominantly located in the N-terminal domain. The nucleotide sequence data of the HN gene reported in this paper has been deposited in the GenBank database, under accession number AF098289.
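The relationship between the open reading frame length and the protein length quoted above is simple codon arithmetic, sketched below. The helper name `codon_count` is hypothetical, and the sketch assumes the 1731-nucleotide figure does not include a stop codon, which the abstract does not state either way.

```python
def codon_count(orf_length_nt):
    """Return the number of complete codons in an ORF of the given length.

    Raises ValueError if the length is not a multiple of three.
    """
    codons, remainder = divmod(orf_length_nt, 3)
    if remainder != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return codons

# 1731 nucleotides as reported for the HN gene.
hn_codons = codon_count(1731)
```

The result matches the 577 amino acids reported, under the stated assumption about the stop codon.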
17.
James O'Connell Zhixiu Li Jack Hanson Rhys Heffernan James Lyons Kuldip Paliwal Abdollah Dehzangi Yuedong Yang Yaoqi Zhou 《Proteins》2018,86(6):629-633
Designing protein sequences that can fold into a given structure is a well‐known inverse protein‐folding problem. One important characteristic to attain for a protein design program is the ability to recover wild‐type sequences given their native backbone structures. The highest average sequence identity accuracy achieved by current protein‐design programs in this problem is around 30%, achieved by our previous system, SPIN. SPIN is a program that predicts sequences compatible with a provided structure using a neural network with fragment‐based local and energy‐based nonlocal profiles. Our new model, SPIN2, uses a deep neural network and additional structural features to improve on SPIN. SPIN2 achieves over 34% in sequence recovery in 10‐fold cross‐validation and independent tests, a 4% improvement over the previous version. The sequence profiles generated from SPIN2 are expected to be useful for improving existing fold recognition and protein design techniques. SPIN2 is available at http://sparks-lab.org .
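The sequence recovery figure in the abstract above is, in its usual sense, the fraction of positions at which a designed sequence matches the native one. A minimal sketch under that assumption follows; the function name `sequence_recovery` and the two toy sequences are hypothetical and are not taken from SPIN or SPIN2.

```python
def sequence_recovery(native, designed):
    """Fraction of aligned positions where the designed residue equals the native one.

    Both sequences must have the same length, since the design is made on the
    native backbone and positions correspond one to one.
    """
    if len(native) != len(designed):
        raise ValueError("sequences must have equal length")
    if not native:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for n, d in zip(native, designed) if n == d)
    return matches / len(native)

# Toy example: 3 of 10 positions recovered.
toy_recovery = sequence_recovery("ACDEFGHIKL", "ACDWYWYWYW")
```

Averaging this value over a set of test proteins gives a figure comparable in kind to the percentages quoted in the abstract.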
18.
An analysis approach to identify specific functional sites in orthologous proteins using sequence and structural information: Application to neuroserpin reveals regions that differentially regulate inhibitory activity
The analysis of sequence conservation is commonly used to predict functionally important sites in proteins. We have developed an approach that first identifies highly conserved sites in a set of orthologous sequences using a weighted substitution‐matrix‐based conservation score and then filters these conserved sites based on the pattern of conservation present in a wider alignment of sequences from the same family and structural information to identify surface‐exposed sites. This allows us to detect specific functional sites in the target protein and exclude regions that are likely to be generally important for the structure or function of the wider protein family. We applied our method to two members of the serpin family of serine protease inhibitors. We first confirmed that our method successfully detected the known heparin binding site in antithrombin while excluding residues known to be generally important in the serpin family. We next applied our sequence analysis approach to neuroserpin and used our results to guide site‐directed polyalanine mutagenesis experiments. The majority of the mutant neuroserpin proteins were found to fold correctly and could still form inhibitory complexes with tissue plasminogen activator (tPA). Kinetic analysis of tPA inhibition, however, revealed altered inhibitory kinetics in several of the mutant proteins, with some mutants showing decreased association with tPA and others showing more rapid dissociation of the covalent complex. Altogether, these results confirm that our sequence analysis approach is a useful tool that can be used to guide mutagenesis experiments for the detection of specific functional sites in proteins. Proteins 2015; 83:135–152. © 2014 Wiley Periodicals, Inc.
19.
Vincent Murray Jon K. Chen Dong Yang Ben Shen 《Bioorganic & medicinal chemistry》2018,26(14):4168-4178
Bleomycin (BLM) is a cancer chemotherapeutic agent that cleaves cellular DNA at specific sequences. Using next-generation Illumina sequencing, the genome-wide sequence specificity of DNA cleavage by two BLM analogues, 6′-deoxy-BLM Z and zorbamycin (ZBM), was determined in human HeLa cells and compared with BLM. Over 200 million double-strand breaks were examined for each sample, and the 50,000 highest intensity cleavage sites were analysed. It was found that the DNA sequence specificity of the BLM analogues in human cells was different to BLM, especially at the cleavage site (position “0”) and the “+1” position. In human cells, the 6′-deoxy-BLM Z had a preference for 5′-GTGY*MC (where * is the cleavage site, Y is C or T, M is A or C); it was 5′-GTGY*MCA for ZBM; and 5′-GTGT*AC for BLM. With cellular DNA, the highest ranked tetranucleotides were 5′-TGC*C and 5′-TGT*A for 6′-deoxy-BLM Z; 5′-TGC*C, 5′-TGT*A and 5′-TGC*A for ZBM; and 5′-TGT*A for BLM. In purified human genomic DNA, the DNA sequence preference was 5′-TGT*A for 6′-deoxy-BLM, 5′-RTGY*AYR (where R is G or A) for ZBM, and 5′-TGT*A for BLM. Thus, the sequence specificity of the BLM analogue, 6′-deoxy-BLM Z, was similar to BLM in purified human DNA, while ZBM was different.
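The degenerate motifs quoted in the abstract above (Y is C or T, M is A or C, R is G or A, and * marks the cleavage site) can be turned into a pattern search with a short sketch. The helper names `motif_to_regex` and `cleavage_positions` and the toy DNA string are assumptions for illustration; this only locates motif occurrences in a sequence and is not the sequencing analysis used in the study.

```python
import re

# Only the degenerate codes that appear in the abstract are included.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "[CT]", "M": "[AC]", "R": "[AG]"}

def motif_to_regex(motif):
    """Return (compiled regex, cleavage offset) for a motif such as 'GTGY*MC'."""
    cut = motif.index("*")            # number of bases 5' of the cleavage site
    bases = motif.replace("*", "")
    return re.compile("".join(IUPAC[b] for b in bases)), cut

def cleavage_positions(dna, motif):
    """0-based index of the first base 3' of each predicted cleavage site."""
    pattern, cut = motif_to_regex(motif)
    # A lookahead is used so that overlapping motif occurrences are also found.
    lookahead = re.compile("(?=" + pattern.pattern + ")")
    return [m.start() + cut for m in lookahead.finditer(dna)]

# Toy sequence containing one GTGTAC site starting at index 2.
toy_sites = cleavage_positions("AAGTGTACAA", "GTGY*MC")
```

The same functions accept the other motifs from the abstract, for example `"RTGY*AYR"` or `"TGT*A"`.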
20.
The 987P fimbrial antigen was purified from a spontaneous overproducing variant. The protein was characterized with respect to Mr, amino acid sequence and partial covalent structure. The purified protein was used for vaccination trials, and excellent protection of piglets upon challenge with 987P positive enterotoxigenic strains was seen.