Similar articles
20 similar articles found (search time: 565 ms)
1.
Subtilases are members of the family of subtilisin-like serine proteases. Presently, more than 50 subtilases are known, more than 40 of them with complete amino acid sequences. We have compared these sequences and the available three-dimensional structures (subtilisin BPN', subtilisin Carlsberg, thermitase and proteinase K). The mature enzymes contain up to 1775 residues, with N-terminal catalytic domains ranging from 268 to 511 residues, and signal and/or activation-peptides ranging from 27 to 280 residues. Several members contain C-terminal extensions, relative to the subtilisins, which display additional properties such as sequence repeats, processing sites and membrane anchor segments. Multiple sequence alignment of the N-terminal catalytic domains allows the definition of two main classes of subtilases. A structurally conserved framework of 191 core residues has been defined from a comparison of the four known three-dimensional structures. Eighteen of these core residues are highly conserved, nine of which are glycines. While the alpha-helix and beta-sheet secondary structure elements show considerable sequence homology, this is less so for peptide loops that connect the core secondary structure elements. These loops can vary in length by more than 150 residues. While the core three-dimensional structure is conserved, insertions and deletions are preferentially confined to surface loops. From the known three-dimensional structures various predictions are made for the other subtilases concerning essential conserved residues, allowable amino acid substitutions, disulphide bonds, Ca(2+)-binding sites, substrate-binding site residues, ionic and aromatic interactions, proteolytically susceptible surface loops, etc. These predictions form a basis for protein engineering of members of the subtilase family, for which no three-dimensional structure is known.

2.
The complete nucleotide sequence of the gene encoding an alkaline serine proteinase (aprP) of Bacillus pumilus TYO-67 was determined. Sequence analysis showed an open reading frame of 1,149 bp (383 amino acids) that encoded a signal peptide of 29 residues and a propeptide of 79 residues. Three deduced amino acid residues, D32, H64, and S221, were identical to the three essential amino acids in the catalytic center of subtilases. The sequence around these residues revealed that APRP is a new member of the true subtilisin subgroup of the subtilisin family. The highest homology, 64.4% at the DNA sequence level, was found with subtilisin NAT. Residue S189 of APRP differed from the corresponding residues of other subtilases.

3.
Subtilisin-like serine proteases (subtilases) are a very diverse family of serine proteases with low sequence homology, often limited to regions surrounding the three catalytic residues. Starting with different Hidden Markov Models (HMM), based on sequence alignments around the catalytic residues of the S8 family (subtilisins) and S53 family (sedolisins), we iteratively searched all ORFs in the complete genomes of 313 eubacteria and archaea. In 164 genomes we identified a total of 567 ORFs with one or more of the conserved regions with a catalytic residue. The large majority of these contained all three regions around the "classical" catalytic residues of the S8 family (Asp-His-Ser), while 63 proteins were identified as S53 (sedolisin) family members (Glu-Asp-Ser). More than 30 proteins were found to belong to two novel subsets with other evolutionary variations in catalytic residues, and new HMMs were generated to search for them. In one subset the catalytic Asp is replaced by an equivalent Glu (i.e. Glu-His-Ser family). The other subset resembles sedolisins, but the conserved catalytic Asp is not located on the same helix as the nucleophile Glu, but rather on a beta-sheet strand in a topologically similar position, as suggested by homology modeling. The Prokaryotic Subtilase Database (www.cmbi.ru.nl/subtilases) provides access to all information on the identified subtilases, the conserved sequence regions, the proposed family subdivision, and the appropriate HMMs to search for them. Over 100 proteins were predicted to be subtilases for the first time by our improved searching methods, thereby improving genome annotation.  相似文献   

4.
The effectiveness of sequence alignment in detecting structural homology among protein sequences decreases markedly when pairwise sequence identity is low (the so‐called “twilight zone” problem of sequence alignment). Alternative sequence comparison strategies able to detect structural kinship among highly divergent sequences are necessary to address this need. Among them are alignment‐free methods, which use global sequence properties (such as amino acid composition) to identify structural homology in a rapid and straightforward way. We explore the viability of using tetramer sequence fragment composition profiles in finding structural relationships that lie undetected by traditional alignment. We establish a strategy to recast any given protein sequence into a tetramer sequence fragment composition profile, using a series of amino acid clustering steps that have been optimized for mutual information. Our method has the effect of compressing the set of 160,000 unique tetramers (if using the 20‐letter amino acid alphabet) into a more tractable number of reduced tetramers (~15–30), so that a meaningful tetramer composition profile can be constructed. We test remote homology detection at the topology and fold superfamily levels using a comprehensive set of fold homologs, culled from the CATH database that share low pairwise sequence similarity. Using the receiver‐operating characteristic measure, we demonstrate potentially significant improvement in using information‐optimized reduced tetramer composition, over methods relying only on the raw amino acid composition or on traditional sequence alignment, in homology detection at or below the “twilight zone”. Proteins 2010. © 2010 Wiley‐Liss, Inc.  相似文献   
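The reduced tetramer composition idea in the abstract above can be illustrated with a minimal sketch. The four-group clustering in `GROUPS` and the function name `tetramer_profile` are assumptions made for illustration only; the abstract describes clustering steps optimized for mutual information, which this sketch does not attempt, and its four groups give 256 reduced tetramers rather than the roughly 15 to 30 reported.

```python
from collections import Counter
from itertools import product

# Hypothetical clustering of the 20 amino acids into 4 groups.
# This is an illustrative assumption, not the optimized clustering
# described in the abstract.
GROUPS = {
    "h": "AVLIMFWC",  # hydrophobic
    "p": "STNQYGP",   # polar or small
    "+": "KRH",       # basic
    "-": "DE",        # acidic
}
REDUCE = {aa: label for label, members in GROUPS.items() for aa in members}


def tetramer_profile(seq, k=4):
    """Return the relative frequency of every reduced k-mer in seq."""
    # Map each residue to its group label, skipping unknown symbols.
    reduced = "".join(REDUCE[aa] for aa in seq.upper() if aa in REDUCE)
    counts = Counter(reduced[i:i + k] for i in range(len(reduced) - k + 1))
    total = sum(counts.values())
    # Enumerate all possible reduced k-mers so profiles share one layout.
    keys = ["".join(t) for t in product(GROUPS, repeat=k)]
    return {key: (counts[key] / total if total else 0.0) for key in keys}


# The full 20-letter alphabet yields 20**4 = 160,000 tetramers;
# the 4-letter reduced alphabet used here yields 4**4 = 256.
profile = tetramer_profile("AAAAKKKK")
```

Two such profiles can then be compared with any vector distance, which is what makes the approach alignment-free.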

5.
Comparative analysis of structure and function of macromolecules, such as proteins, is an integral part of modern evolutionary biology. The first and critical step in understanding evolution of homologous proteins is their amino acid sequence alignment. However, standard algorithms fail to provide unambiguous sequence alignment for proteins of poor homology. More reliable results can be provided by comparing experimental 3D structures obtained at atomic resolution with the aid of X-ray structural analysis. If such structures are lacking, homology modeling is used, which considers indirect experimental data on functional roles of individual amino acid residues. An important problem is that sequence alignment, which reflects genetic modifications, does not necessarily correspond to functional homology, which depends on 3D structures critical for natural selection. Since the alignment techniques relying only on the analysis of primary structures carry no information on the functional properties of proteins, the inclusion of 3D structures into consideration is of utmost importance. Here we consider several ion channels as examples to demonstrate that alignment of their 3D structures can significantly improve sequence alignment obtained by traditional methods.

6.
Summary Adenovirus E1A and c-myc genes are known to be capable of transforming primary rat cells when they occur in combination with either polyoma middle-T or T24 Harvey-ras 1 genes. There was a low level of amino acid sequence homology between the nuclear adenovirus-12 (Ad12) E1A protein product (289 amino acids) and the c-myc protein based on optimal alignment and percentage identity. In contrast to others [Ralston R, Bishop JM (1983) Nature 306:803–806], we concluded that this low level of amino acid sequence homology was not significant, since rabies glycoprotein (RGP), which has no transforming function and localizes to the cell surface, had a similar low level of amino acid sequence homology to the c-myc protein. Furthermore, dot-matrix analysis, when used to test the overall level of amino acid sequence homology, showed no significant homology between c-myc and Ad12 E1A, E1B, or RGP. Thus, low levels of amino acid sequence homology between two proteins may not be sufficient to predict structural and functional similarities between them reliably, even if the two proteins appear to share a common function.  相似文献   
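For readers unfamiliar with the dot-matrix analysis mentioned in the abstract above, the following is a minimal sketch of the general technique. The window size, the match threshold, and the function names are assumptions chosen for illustration; the abstract does not state the parameters the authors used.

```python
def dot_matrix(seq_a, seq_b, window=3, min_matches=3):
    """Return the set of (i, j) positions where the windows starting at
    seq_a[i] and seq_b[j] share at least min_matches identical residues."""
    dots = set()
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(seq_a[i + k] == seq_b[j + k] for k in range(window))
            if matches >= min_matches:
                dots.add((i, j))
    return dots


def longest_diagonal(dots):
    """Length of the longest unbroken diagonal run of dots."""
    best = 0
    for i, j in dots:
        if (i - 1, j - 1) in dots:
            continue  # not the start of a run
        run = 1
        while (i + run, j + run) in dots:
            run += 1
        best = max(best, run)
    return best


# Toy sequences (made up for this example) sharing the segment "MKTAY".
dots = dot_matrix("MKTAYIAK", "GGMKTAYG")
run_length = longest_diagonal(dots)
```

Long diagonal runs indicate regions of similarity; scattered isolated dots, as one would expect for unrelated proteins, do not.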

7.
A data base was compiled containing the amino acid sequences of 12 aspartate aminotransferases and 11 other aminotransferases. A comparison of these sequences by a standard alignment method confirmed the previously reported homology of all aspartate aminotransferases and Escherichia coli tyrosine aminotransferase. However, no significant similarity between these proteins and any of the other aminotransferases was detected. A more rigorous analysis, focusing on short sequence segments rather than the total polypeptide chain, revealed that rat tyrosine aminotransferase and Saccharomyces cerevisiae and Escherichia coli histidinol-phosphate aminotransferase share several homologous sequence segments with aspartate aminotransferases. For comparison of the complete sequences, a multiple sequence editor was developed to display the whole set of amino acid sequences in parallel on a single work-sheet. The editor allows gaps in individual sequences or a set of sequences to be introduced and thus facilitates their parallel analysis and alignment. Several clusters of invariant residues at corresponding positions in the amino acid sequences became evident, clearly establishing that the cytosolic and the mitochondrial isoenzyme of vertebrate aspartate aminotransferase, E. coli aspartate aminotransferase, rat and E. coli tyrosine aminotransferase, and S. cerevisiae and E. coli histidinol-phosphate aminotransferase are homologous proteins. Only 12 amino acid residues out of a total of about 400 proved to be invariant in all sequences compared; they are either involved in the binding of pyridoxal 5'-phosphate and the substrate, or appear to be essential for the conformation of the enzymes.  相似文献   

8.
Pectate lyases are plant virulence factors that degrade the pectate component of the plant cell wall. The enzymes share considerable sequence homology with plant pollen and style proteins, suggesting a shared structural topology and possibly functional relationships as well. The three-dimensional structures of two Erwinia chrysanthemi pectate lyases, C and E, have been superimposed and the structurally conserved amino acids have been identified. There are 232 amino acids that superimpose with a root-mean-square deviation of 3 Å or less. These amino acids have been used to correct the primary sequence alignment derived from evolution-based techniques. Subsequently, multiple alignment techniques have allowed the realignment of other extracellular pectate lyases as well as all sequence homologs, including pectin lyases and the plant pollen and style proteins. The new multiple sequence alignment reveals amino acids likely to participate in the parallel beta helix motif, those involved in binding Ca2+, and those invariant amino acids with potential catalytic properties. The latter amino acids cluster in two well-separated regions on the pectate lyase structures, suggesting two distinct enzymatic functions for extracellular pectate lyases and their sequence homologs.

9.
Remote homology detection refers to the detection of structure homology in evolutionarily related proteins with low sequence similarity. Supervised learning algorithms such as support vector machine (SVM) are currently the most accurate methods. In most of these SVM-based methods, efforts have been dedicated to developing new kernels to better use the pairwise alignment scores or sequence profiles. Moreover, amino acids’ physicochemical properties are not generally used in the feature representation of protein sequences. In this article, we present a remote homology detection method that incorporates two novel features: (1) a protein's primary sequence is represented using the physicochemical properties of its amino acids and (2) the similarity between two proteins is measured using recurrence quantification analysis (RQA). An optimization scheme was developed to select the amino acid indices (up to 10 for a protein family) that best characterize the given protein family. The selected amino acid indices may enable us to draw a better biological explanation of the protein family classification problem than other alignment-based methods. An SVM-based classifier then works on the space described by the RQA metrics. The classification scheme is named SVM-RQA. Experiments at the superfamily level of the SCOP1.53 dataset show that, without using alignment or sequence profile information, the features generated from amino acid indices are able to produce results that are comparable to those obtained by the published state-of-the-art SVM kernels. In the future, better prediction accuracies can be expected by combining the alignment-based features with our amino acid property-based features. Supplementary information including the raw dataset, the best-performing amino acid indices for each protein family and the computed RQA metrics for all protein sequences can be downloaded from http://ym151113.ym.edu.tw/svm-rqa.

10.
Wrabl JO  Grishin NV 《Proteins》2005,61(3):523-534
Understanding of amino acid type co-occurrence in trusted multiple sequence alignments is a prerequisite for improved sequence alignment and remote homology detection algorithms. Two objective approaches were used to investigate co-occurrence, both based on variance maximization of the weighted residue frequencies in columns taken from a large alignment database. The first approach discretely grouped amino acid types, and the second approach extracted orthogonal properties of amino acids using principal components analysis. The grouping results corresponded to amino acid physical properties such as side chain hydrophobicity, size, or backbone flexibility, and an optimal arrangement of approximately eight groups was observed. However, interpretation of the orthogonal properties was more complex. Although the principal components accounting for the largest variances exhibited modest correlations with hydrophobicity and conservation of glycine, in general principal components did not correspond to physical properties of amino acids. Although not intuitive, these amino acid mathematical properties were demonstrated to be robust and to improve local pairwise alignment accuracy, relative to 20 amino acid frequencies alone, for a simple test case.  相似文献   

11.
12.
MOTIVATION: Sequence alignment techniques have been developed into extremely powerful tools for identifying the folding families and function of proteins in newly sequenced genomes. For a sufficiently low sequence identity it is necessary to incorporate additional structural information to positively detect homologous proteins. We have carried out an extensive analysis of the effectiveness of incorporating secondary structure information directly into the alignments for fold recognition and identification of distant protein homologs. A secondary structure similarity matrix based on a database of three-dimensionally aligned proteins was first constructed. An iterative application of dynamic programming was used which incorporates linear combinations of amino acid and secondary structure sequence similarity scores. Initially, only primary sequence information is used. Subsequently contributions from secondary structure are phased in and new homologous proteins are positively identified if their scores are consistent with the predetermined error rate. RESULTS: We used the SCOP40 database, where only PDB sequences that have 40% homology or less are included, to calibrate homology detection by the combined amino acid and secondary structure sequence alignments. Combining predicted secondary structure with sequence information results in an 8-15% increase in homology detection within SCOP40 relative to the pairwise alignments using only amino acid sequence data at an error rate of 0.01 errors per query; a 35% increase is observed when the actual secondary structure sequences are used. Incorporating predicted secondary structure information in the analysis of six small genomes yields an improvement in the homology detection of approximately 20% over SSEARCH pairwise alignments, but no improvement in the total number of homologs detected over PSI-BLAST, at an error rate of 0.01 errors per query. However, because the pairwise alignments based on combinations of amino acid and secondary structure similarity are different from those produced by PSI-BLAST and the error rates can be calibrated, it is possible to combine the results of both searches. An additional 25% relative improvement in the number of genes identified at an error rate of 0.01 is observed when the data are pooled in this way. Similarly for the SCOP40 dataset, PSI-BLAST detected 15% of all possible homologs, whereas the pooled results increased the total number of homologs detected to 19%. These results are compared with recent reports of homology detection using sequence profiling methods. AVAILABILITY: Secondary structure alignment homepage at http://lutece.rutgers.edu/ssas CONTACT: anders@rutchem.rutgers.edu; ronlevy@lutece.rutgers.edu Supplementary Information: Genome sequence/structure alignment results at http://lutece.rutgers.edu/ss_fold_predictions.
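The linear combination of amino acid and secondary structure similarity scores described in the abstract above can be sketched as a single pass of global dynamic programming. Everything specific here is an assumption for illustration: the function name, the simple match/mismatch values standing in for the real similarity matrices, the linear gap penalty, and the weight `w`. The abstract describes an iterative scheme with calibrated error rates, which this sketch does not reproduce.

```python
def combined_alignment_score(seq_a, ss_a, seq_b, ss_b, w=0.5, gap=-2.0):
    """Global alignment score using (1 - w) * amino acid score
    + w * secondary structure score for each aligned pair.

    ss_a and ss_b are secondary structure strings (for example H/E/C)
    of the same length as seq_a and seq_b.
    """
    if len(seq_a) != len(ss_a) or len(seq_b) != len(ss_b):
        raise ValueError("sequence and secondary structure lengths differ")

    def pair_score(i, j):
        # Placeholder scores; a real application would use substitution
        # matrices for both alphabets.
        aa = 2.0 if seq_a[i] == seq_b[j] else -1.0
        ss = 1.0 if ss_a[i] == ss_b[j] else -1.0
        return (1.0 - w) * aa + w * ss

    n, m = len(seq_a), len(seq_b)
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = i * gap
    for j in range(1, m + 1):
        dp[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(
                dp[i - 1][j - 1] + pair_score(i - 1, j - 1),  # align pair
                dp[i - 1][j] + gap,                           # gap in seq_b
                dp[i][j - 1] + gap,                           # gap in seq_a
            )
    return dp[n][m]
```

Setting `w=0.0` recovers a sequence-only alignment, and raising `w` phases in the secondary structure contribution, so two proteins with dissimilar residues but matching secondary structure score progressively higher.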

13.
The gene (aprI) encoding alkaline serine protease (AprI; subtilase) from Alteromonas sp. strain O-7 was cloned and sequenced. The nucleotide sequence of aprI has been identified. The deduced amino acid sequence indicated that aprI codes for a precursor of 715 amino acids and the precursor is composed of four regions including a signal peptide, an N-terminal pro-region, a mature protease region and a C-terminal extension region of 215 amino acids as previously described for aprII [H. Tsujibo et al., Gene, 136, 247–251 (1993)]. The amino acid sequence of the mature AprI (AprI-M) showed high sequence homology with those of other class I subtilases. The C-terminal region was characterized by a repeat of 94 amino acid residues, which showed about 50% similarity with those of the C-terminal pro-region of several known proteases from Gram-negative bacteria.

14.
The superoxide dismutase (SOD) of Bacteroides gingivalis can use either iron or manganese as a cofactor in its catalytic activity. In this study, the complete amino acid sequence of this SOD purified from anaerobically maintained B. gingivalis cells was determined. The protein consisted of 191 amino acid residues and had a molecular mass of 21,500. The sequence of B. gingivalis SOD showed 44-51% homology with those of iron-specific SODs (Fe-SODs) and 40-45% homology with those of manganese-specific SODs (Mn-SODs) from several bacteria. However, this sequence homology was considerably less than that seen among the Fe-SOD (65-74%) or Mn-SOD family (42-60%). This indicates that B. gingivalis SOD, which accepts either iron or manganese as metal cofactor, is a structural intermediate between the Fe-SOD and Mn-SOD families.

15.
MOTIVATION: The observed correlations between pairs of homologous protein sequences are typically explained in terms of a Markovian dynamic of amino acid substitution. This model assumes that every location on the protein sequence has the same background distribution of amino acids, an assumption that is incompatible with the observed heterogeneity of protein amino acid profiles and with the success of profile multiple sequence alignment. RESULTS: We propose an alternative model of amino acid replacement during protein evolution based upon the assumption that the variation of the amino acid background distribution from one residue to the next is sufficient to explain the observed sequence correlations of homologs. The resulting dynamical model of independent replacements drawn from heterogeneous backgrounds is simple and consistent, and provides a unified homology match score for sequence-sequence, sequence-profile and profile-profile alignment.  相似文献   

16.
Amino acid residues that are involved in functional interactions in proteins have strong evolutionary pressure to remain unchanged and consequently their substitution patterns are different from those that are noninteracting. To characterize and quantify the differences between amino acid substitution patterns due to structural restraints and those under functional restraints, we have made a comparative analysis of families of homologous proteins. Residues classified as having the same amino acid type, secondary structure, accessibility, and side-chain hydrogen bonds are shown to be better conserved if they are close to the active site. We have focused on enzyme families for this analysis since they have functional sites that are easily defined by their catalytic residues. We have derived new sets of environment-specific substitution tables, which we term function-dependent environment-specific substitution tables, where amino acid residues are classified according to their distance from the functional sites. The residues that are within a distance of 9 Å from the active site have distinct amino acid substitution patterns when compared to the other sites. The function-dependent environment-specific substitution tables have been tested using the sequence-structure homology recognition program FUGUE and the results compared with the recognition performance obtained using the standard environment-specific substitution tables. Significant improvements are obtained in both recognition performance and alignment accuracy using the function-dependent environment-specific substitution tables (P-value = 0.02, according to the Wilcoxon signed rank test for alignment accuracy). The alignments near the active site are greatly improved with pronounced improvements at lower percentage identities (less than 30%).

17.
Glutathione synthetase from Escherichia coli B showed amino acid sequence homology with mammalian and bacterial dihydrofolate reductases over 40 residues, although these two enzymes are different in their reaction mechanisms and ligand requirements. The effects of ligands of dihydrofolate reductase on the reaction of E. coli B glutathione synthetase were examined to find resemblances in catalytic function to dihydrofolate reductase. The E. coli B enzyme was potently inhibited by 7,8-dihydrofolate, methotrexate, and trimethoprim. Methotrexate was studied in detail and proved to bind to an ATP binding site of the E. coli B enzyme with a Ki value of 0.1 mM. The homologous portion of the amino acid sequence in dihydrofolate reductases, which corresponds to the portion coded by exon 3 of mammalian dihydrofolate reductase genes, provided a binding site of the adenosine diphosphate moiety of NADPH in the crystal structure of dihydrofolate reductase. These analyses would indicate that the homologous portion of the amino acid sequence of the E. coli B enzyme provides the ATP binding site. This report gives experimental evidence that amino acid sequences related by sequence homology conserve functional similarity even in enzymes which differ in their catalytic mechanisms.

18.
A novel subtilase from common bean leaves
Popovic T  Puizdar V  Brzin J 《FEBS letters》2002,530(1-3):163-168
We describe the isolation of a protease from common bean leaves grown in the field. On the basis of its biochemical properties it was classified as a serine proteinase belonging to the subtilisin clan. Isoelectric focusing resulted in a single band at pH 4.6, and SDS–PAGE in a single band corresponding to Mr 72 kDa. The proteinase activity is maximal at pH 9.9 and shows high stability in the alkaline region. The relative activities of the proteinase for eight different synthetic substrates were determined. The requirement for Arg in the P1 position appeared obligatory. kcat/Km values indicate that, for highest catalytic efficiency, a basic amino acid is also required in the P2 position, presenting a motif typical of the cleavage site for the kexin family of subtilases. The sequence of the 17 N-terminal amino acids of this proteinase shows similarity to those of other plant subtilases, sharing the highest number of identical amino acids with proteinase C1 from soybean seedling cotyledons and a cucumisin-like proteinase from white gourd (Benincasa hispida).

19.
D'Amico S  Gerday C  Feller G 《Gene》2000,253(1):95-105
The alpha-amylase sequences contained in databanks were screened for the presence of amino acid residues Arg195, Asn298 and Arg/Lys337 forming the chloride-binding site of several specialized alpha-amylases allosterically activated by this anion. This search provides 38 alpha-amylases potentially binding a chloride ion. All belong to animals, including mammals, birds, insects, acari, nematodes, molluscs, crustaceans and are also found in three extremophilic Gram-negative bacteria. An evolutionary distance tree based on complete amino acid sequences was constructed, revealing four distinct clusters of species. On the basis of multiple sequence alignment and homology modeling, invariable structural elements were defined, corresponding to the active site, the substrate binding site, the accessory binding sites, the Ca(2+) and Cl(-) binding sites, a protease-like catalytic triad and disulfide bonds. The sequence variations within functional elements allowed engineering strategies to be proposed, aimed at identifying and modifying the specificity, activity and stability of chloride-dependent alpha-amylases.  相似文献   

20.
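The text first introduces the molecular biology underlying sequence alignment: nucleotides, the basic units of nucleic acid sequences, and amino acids, the basic units of protein sequences. Carefully designed figures and tables list the names, properties, and classification of the four nucleotides and twenty amino acids. Section 2 outlines the fundamentals of sequence alignment, including the basic concepts of similarity and homology, global and local alignment, the dot-matrix method, dynamic programming and heuristic algorithms, scoring matrices and gap penalties, and commonly used software and analysis platforms. Section 3 introduces DNAfull, a scoring matrix commonly used for nucleic acid sequence alignment, and BLOSUM62 and PAM250, scoring matrices commonly used for protein sequence alignment. Sections 4-8 use hemoglobin, peptide toxins, plant transcription factors, carcinoembryonic antigen, and sialidase as examples to illustrate concrete applications of pairwise sequence alignment. These examples show how to choose an analysis platform and alignment program, how to set the scoring matrix and gap penalties, and how to interpret alignment results and their biological significance. A brief summary concludes the text.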
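As a small illustration of local alignment by dynamic programming with a gap penalty, the sketch below computes a Smith-Waterman style local alignment score. The function name and the match, mismatch, and gap values are assumptions for illustration; a practical analysis would substitute a scoring matrix such as BLOSUM62 or DNAfull, which are not reproduced here.

```python
def local_alignment_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b (linear gap penalty)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                 # start a new local alignment
                prev[j - 1] + s,   # extend by aligning a[i-1] with b[j-1]
                prev[j] + gap,     # gap in b
                curr[j - 1] + gap, # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


# Toy DNA sequences (made up for this example) sharing the segment "ACGT".
score = local_alignment_score("GGGACGTCCC", "TTTACGTAAA")
```

The floor at zero is what distinguishes local from global alignment: a poorly matching prefix is discarded rather than penalized, so the score reflects only the best-matching region.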
