Similar literature
1.
Serine peptidases (SP) are peptidases with a uniquely activated serine residue in the substrate-binding site. SP can be classified into clans with distinct evolutionary histories, and each clan is further subdivided into families. We analyzed 79 proteins representing the S1A subfamily of human SP, obtained from different databases. Multiple alignment identified 87 highly conserved amino acid residues. In most cases of substitution, a residue of similar character was inserted, implying that the overall character of the local region was conserved. We also identified several conserved protein motifs. Between 7 and 13 cysteine positions, potentially forming disulfide bridges, were also found to be conserved. Most members are secreted as inactive (pro) forms with a trypsin-like cleavage site for activation. Substrate specificity was predicted to be trypsin-like for most members, with a few chymotrypsin-like proteins. Phylogenetic analysis enabled us to classify members of the S1A subfamily into structurally related groups; this might also help to sort members of this subfamily functionally and give an idea of their possible functions.

2.
A good system for the naming and classification of peptidases can contribute much to the study of these enzymes. Having already described the building of families and clans in the MEROPS system, we here focus on the lowest level in the hierarchy, in which the huge number of individual peptidase proteins are assigned to a smaller number of what we term 'species' of peptidases. Just over 2000 peptidase species are recognised today, but we estimate that 25 000 will one day be known. Each species is built around a peptidase protein that has been adequately characterised. The cluster of peptidase proteins that represent the single species is then assembled primarily by analysis of a sequence 'tree' for the family. Each peptidase species is given a systematic identifier, and a summary page of data regarding it is assembled. Because the characterisation of new peptidases lags far behind the sequencing, the majority of peptidase proteins are so far known only as amino acid sequences and cannot yet be assigned to species. We suggest that new forms of analysis of the sequences of the unassigned peptidases may give early indications of how they will cluster into the new species of the future.

3.
The MEROPS website (http://merops.sanger.ac.uk) includes information on peptidase inhibitors as well as on peptidases and their substrates. Displays have been put in place to link peptidases and inhibitors together. The classification of protein peptidase inhibitors is continually being revised, and currently inhibitors are grouped into 67 families based on comparisons of protein sequences. These families can be further grouped into 38 clans based on comparisons of tertiary structure. Small molecule inhibitors are important reagents for peptidase characterization and, with the increasing importance of peptidases as drug targets, they are also important to the pharmaceutical industry. Small molecule inhibitors are now included in MEROPS and over 160 summaries have been written.

4.
This review deals with structural and functional features of glycoside hydrolases, a widespread group of enzymes present in almost all living organisms. Their catalytic domains are grouped into 120 amino acid sequence-based families in the international classification of the carbohydrate-active enzymes (CAZy database). At a higher hierarchical level some of these families are combined into 14 clans. Enzymes of the same clan share a common evolutionary origin of their genes and the most important functional characteristics, such as the composition of the active center, the anomeric configuration of cleaved glycosidic bonds, and the molecular mechanism of the catalyzed reaction (either inverting or retaining). There are now extensive data in the literature concerning the relationships between glycoside hydrolase families belonging to different clans and/or included in none of them, as well as information on phylogenetic protein relationships within particular families. Summarizing these data allows us to propose a multilevel hierarchical classification of glycoside hydrolases and their homologs. It is shown that almost the whole variety of the enzyme catalytic domains can be brought into six main folds, large groups of proteins having the same three-dimensional structure and a supposed common evolutionary origin.

5.
6.
We classified the carboxylic ester hydrolases (CEHs) into families and clans by use of multiple sequence alignments, secondary structure analysis, and tertiary structure superpositions. Our work has for the first time fully established their systematic structural classification. Family members have similar primary, secondary, and tertiary structures, and their active sites and reaction mechanisms are conserved. Families may be gathered into clans when they have similar secondary and tertiary structures, even though the primary structures of members of different families are not similar. CEHs were gathered from public databases by use of the Basic Local Alignment Search Tool (BLAST) and divided into 91 families, with 36 families being grouped into five clans. Members of one clan have standard α/β‐hydrolase folds, while those of two other clans have similar folds but with different sequences of their β‐strands. The remaining two clans have members with six‐bladed β‐propeller and three‐α‐helix bundle tertiary structures. Those families not in clans have a large variety of structures or have no members with known structures. At the time of writing, the 91 families contained 321,830 primary structures and 1378 tertiary structures. From these data, we constructed an accessible database: CASTLE (CArboxylic eSTer hydroLasEs, http://www.castle.cbe.iastate.edu ).

7.
Thioesterases (TEs) are classified into EC 3.1.2.1 through EC 3.1.2.27 based on their activities on different substrates, with many remaining unclassified (EC 3.1.2.–). Analysis of primary and tertiary structures of known TEs casts a new light on this enzyme group. We used strong primary sequence conservation based on experimentally proved proteins as the main criterion, followed by verification with tertiary structure superpositions, mechanisms, and catalytic residue positions, to accurately define TE families. At present, TEs fall into 23 families almost completely unrelated to each other by primary structure. It is assumed that all members of the same family have essentially the same tertiary structure; however, TEs in different families can have markedly different folds and mechanisms. Conversely, the latter sometimes have very similar tertiary structures and catalytic mechanisms despite being only slightly or not at all related by primary structure, indicating that they have common distant ancestors and can be grouped into clans. At present, four clans encompass 12 TE families. The new constantly updated ThYme (Thioester‐active enzYmes) database contains TE primary and tertiary structures, classified into families and clans that are different from those currently found in the literature or in other databases. We review all types of TEs, including those cleaving CoA, ACP, glutathione, and other protein molecules, and we discuss their structures, functions, and mechanisms.

8.
Several studies based on the known three-dimensional (3-D) structures of proteins show that two homologous proteins with insignificant sequence similarity can adopt a common fold and may perform the same or similar biochemical functions. Hence, it is appropriate to use similarities in the 3-D structures of proteins rather than amino acid sequence similarities in modelling the evolution of distantly related proteins. Here we present an assessment of using 3-D structures in modelling the evolution of homologous proteins. Using a dataset of 108 protein domain families of known structure with at least 10 members per family, we compare the extent of structural and sequence dissimilarities among pairs of proteins, which are the inputs into the construction of phylogenetic trees. We find that the correlation between the structure-based dissimilarity measures and the sequence-based dissimilarity measures is usually good if the sequence similarity among the homologues is about 30% or more. For protein families with low sequence similarity among the members, the correlation between the sequence-based and the structure-based dissimilarities is poor. In these cases the structure-based dendrogram clusters proteins with the most similar biochemical functional properties better than the sequence-similarity-based dendrogram. In multi-domain protein families and disulphide-rich protein families, the correlation coefficient for the match of sequence-based and structure-based dissimilarity (SDM) measures can be poor even though the sequence identity may be higher than 30%. Hence it is suggested that protein evolution is best modelled using 3-D structures if the sequence similarities (SSM) of the homologues are very low.

9.
Evolutionary lines of cysteine peptidases   Total citations: 2, self-citations: 0, citations by others: 2
The proteolytic enzymes that depend upon a cysteine residue for activity have come from at least seven different evolutionary origins, each of which has produced a group of cysteine peptidases with distinctive structures and properties. We show here that the characteristic molecular topologies of the peptidases in each evolutionary line can be seen not only in their three-dimensional structures, but commonly also in the two-dimensional structures. Clan CA contains the families of papain (C1), calpain (C2), streptopain (C10) and the ubiquitin-specific peptidases (C12, C19), as well as many families of viral cysteine endopeptidases. Clan CD contains the families of clostripain (C11), gingipain R (C25), legumain (C13), caspase-1 (C14) and separin (C50). These enzymes have specificities dominated by the interactions of the S1 subsite. Clan CE contains the families of adenain (C5) from adenoviruses, the eukaryotic Ulp1 protease (C48) and the bacterial YopJ proteases (C55). Clan CF contains only pyroglutamyl peptidase I (C15). The picornains (C3) in clan PA have probably evolved from serine peptidases, which still form the majority of enzymes in the clan. The cysteine peptidase activities in clans PB and CH are autolytic only. In conclusion, we suggest that although almost all the cysteine peptidases depend for activity on catalytic dyads of cysteine and histidine, it is worth noting some important differences that they have inherited from their distant ancestral peptidases.

10.
Discovery of local packing motifs in protein structures   Total citations: 1, self-citations: 0, citations by others: 1
We present a language for describing structural patterns of residues in protein structures and a method for the discovery of such patterns that recur in a set of protein structures. The patterns impose restrictions on the spatial position of each residue, their order along the amino acid chain, and which amino acids are allowed in each position. Unlike other methods for comparing sets of protein structures, our method is not based on pairwise structure comparisons, which are often time-consuming and can produce inconsistent results. Instead, the method simultaneously takes into account information from all structures in the search for conserved structure patterns, which are potential structure motifs. The method is based on describing the spatial neighborhood of each residue in each structure as a string and applying a sequence pattern discovery method to find patterns common to subsets of these strings. Finally, it is checked whether the similarities between the neighborhood strings correspond to spatially similar substructures. We apply the method to analyze sets of very disparate proteins from four different protein families: serine proteases, cuprodoxins, cysteine proteinases, and ferredoxins. The motifs found by the method correspond well to the site and motif information given in the annotation of these proteins in PDB, Swiss-Prot, and PROSITE. Furthermore, the motifs are confirmed by using the motif data to constrain the structural alignment of the proteins obtained with the program SAP. This gave the best superposition/alignment of the proteins given the motif assignment.
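The idea of describing each residue's spatial neighborhood as a string can be illustrated with a minimal Python sketch. This is not the authors' encoding: the function name `neighborhood_strings`, the use of one coordinate per residue (for example the C-alpha atom), and the 8.0 Å default radius are all assumptions made for illustration.

```python
import math

def neighborhood_strings(sequence, coords, radius=8.0):
    """Return one string per residue listing, in chain order, the residues
    whose coordinates lie within `radius` of it (the residue itself included).

    sequence: one-letter amino acid codes, one per residue.
    coords:   one (x, y, z) tuple per residue (assumed: C-alpha positions).
    radius:   hypothetical distance cutoff, not taken from the paper.
    """
    strings = []
    for center in coords:
        near = [
            sequence[j]
            for j, point in enumerate(coords)
            if math.dist(center, point) <= radius
        ]
        strings.append("".join(near))
    return strings

# Toy chain: three residues spaced 3.8 units apart, one far away.
toy = neighborhood_strings(
    "ACDE",
    [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)],
    radius=5.0,
)
print(toy)
```

A sequence pattern discovery step would then search these strings for shared patterns; that step is not sketched here.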

11.
Wang Wei, Zheng Xiaoqi, Dou Yongchao, Liu Taigang, Zhao Juan, Wang Jun. 《生物信息学》, 2011, 9(2): 171-175, 180
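Knowing the subcellular location of a protein helps us understand its function and its interactions with other proteins, and can also support the development of new drugs. The subcellular location prediction methods in common use are based mainly on N-terminal sorting signals or on amino acid composition, but studies have shown that methods relying solely on either of these lose the sequence-order information. To overcome this shortcoming, this paper proposes a method for predicting protein subcellular location based on optimal splitting sites. First, each protein sequence is split into three parts: N-terminal, middle, and C-terminal. Then the amino acid composition, dipeptide composition, and physicochemical properties are extracted from each subsequence and from the whole sequence. Finally, these features are fused to form the feature of the whole sequence. In the jackknife test, the method achieved overall accuracies of 87.8% and 92.1%, respectively, on the NNPSL dataset.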
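The feature-extraction steps in the abstract above (split into three parts, compute composition features per part and for the whole sequence, then concatenate) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names, the fixed split lengths `n_cut` and `c_cut`, and the omission of the physicochemical-property features are choices made here, and the search for optimal splitting sites described by the authors is not attempted.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in seq (20 values)."""
    n = len(seq) or 1  # avoid division by zero for empty parts
    return [seq.count(a) / n for a in AMINO_ACIDS]

def dipeptide_composition(seq):
    """Fraction of each of the 400 ordered residue pairs in seq (400 values)."""
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    n = len(pairs) or 1
    return [pairs.count(a + b) / n for a, b in product(AMINO_ACIDS, repeat=2)]

def split_features(seq, n_cut, c_cut):
    """Fuse features from the N-terminal, middle and C-terminal parts and
    from the whole sequence into one vector.

    n_cut, c_cut: hypothetical split lengths chosen by the caller; the
    paper selects optimal split sites, which this sketch does not do.
    """
    c_start = len(seq) - c_cut
    parts = [seq[:n_cut], seq[n_cut:c_start], seq[c_start:], seq]
    features = []
    for part in parts:
        features += composition(part) + dipeptide_composition(part)
    return features

vector = split_features("MKTAYIAKQRQISFVKSHFSRQ", n_cut=5, c_cut=5)
print(len(vector))
```

Each of the four segments contributes 20 + 400 values, so the fused vector has 1680 entries; a classifier would then be trained on such vectors.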

12.
13.
The primary and secondary structures of human plasma apolipoprotein A-I and apolipoprotein E-3 have been analyzed to further our understanding of the secondary and tertiary conformation of these proteins and the structure and function of plasma lipoprotein particles. The primary sequences of these proteins were analyzed with computer programs: (a) to identify repeated patterns within these proteins on the basis of conservative substitutions and similarities in the physicochemical properties of each residue; (b) for local averaging, hydrophobic moment, and Fourier analysis of the physicochemical properties; and (c) for secondary structure prediction of each protein using homology-, statistics-, and information-theory-based methods. Circular dichroism was used to study purified lipid-protein complexes of each protein and to quantitate the secondary structure in a lipid environment. The data from these analyses were integrated into a single secondary structure prediction to derive a model of each protein. The sequence homology within apolipoproteins A-I, E-3, and A-IV is used to derive a consensus sequence for two 11 amino acid repeating sequences in this family of proteins.

14.
Human prolactin. cDNA structural analysis and evolutionary comparisons   Total citations: 33, self-citations: 0, citations by others: 33
Prolactin (Prl), growth hormone, and chorionic somatomammotropin form a set (the "Prl set") of hormones which is thought to have evolved from a common ancestral gene. This assumption is based on several lines of evidence: overlap in their biological and immunological properties, similarities in their amino acid sequences, and homologies in the nucleic acid sequences of their structural genes. In the current study we report the cloning, amplification in bacteria, and sequence analysis of DNA complementary to Prl mRNA isolated from human pituitary Prl-secreting adenomas. The cloned DNA contains 914 bases, which include the entire coding sequence of human prePrl as well as portions of the 5'- and 3'-untranslated regions of the mRNA. The amino acid sequence predicted by our data differs from a previously reported amino acid sequence in 8 positions. With the results of this study we can now compare in one species the nucleotide sequences of the structural gene coding for each of the hormones of the Prl set. The sequence divergence at replacement sites is used to establish an evolutionary clock for the Prl set of genes. Using this clock, we postulate that the chromosomal segregation of human Prl and human growth hormone occurred about 392 million years ago and that growth hormone and chorionic somatomammotropin underwent an intrachromosomal recombination within the last 10 million years.

15.
In a previous paper we obtained ten (orthogonal) factors, linear combinations of which can express the properties of the 20 naturally occurring amino acids. In this paper, we assume that the most important properties (linear combinations of these ten factors) that determine the three-dimensional structure of a protein are conserved properties, i.e., those that have been conserved during evolution. Two definitions of a conserved property are presented: (1) a conserved property for an average protein is defined as that linear combination of the ten factors that optimally expresses the similarity of one amino acid to another (hence, little change during evolution), as given by the relatedness odds matrix of Dayhoff et al.; (2) a conserved property for each position in the amino acid sequence (locus) of a specific family of homologous proteins (the cytochrome c family or the globin family) is defined as that linear combination of the ten factors that is common among a set of amino acids at a given locus when the sequences are properly aligned. When the specificity at each locus is averaged over all loci, the same features are observed for three expressions of these two definitions, namely the conserved property for an average protein, the average conserved property for the cytochrome c family, and the average conserved property for the globin family; we find that bulk and hydrophobicity (information about packing and long-range interactions) are more important than other properties, such as the preference for adopting a specific backbone structure (information about short-range interactions). We also demonstrate that the sequence profile of a conserved property, defined for each locus of a protein family (definition 2), corresponds uniquely to the three-dimensional structure, while the conserved property for an average protein (definition 1) is not useful for the prediction of protein structure.
The amino acid sequences of numerous proteins are searched to find those that are similar, in terms of the conserved properties (definition 2), to sequences of the same size from one of the homologous families (cytochrome c and globin, respectively) for whose loci the conserved properties were defined. Many similar sequences are found, the number of similarities decreasing with increasing size of the segment. However, the segments must be rather long (15 residues) before the comparisons become meaningful. As an example, one sufficiently large sequence (20 residues) from a protein of known structure (apo-liver alcohol dehydrogenase, which is not a member of either family) is found to be similar in the conserved properties to a particular sequence of a member of the family of human hemoglobin chains, and the two sequences have similar structures. This means that, since conserved properties are expected to be structure determinants, we can use the conserved properties to predict an initial protein structure for subsequent energy minimization for a protein for which the conserved properties are similar to those of a family of proteins with a sufficiently large number of homologous amino acid sequences; such a large number of homologous sequences is required to define a conserved property for each locus of the homologous protein family.

16.
A gonococcal inhibitor produced by Staphylococcus haemolyticus was separated into three components by reverse-phase h.p.l.c. The amino acid composition analysis of each of the three components indicated extensive similarities. N-Terminal sequence analysis of all three components allowed the identification of the first 27-30 residues of each. The complete primary structure of each component was determined from the sequence analysis of tryptic peptides and peptides generated by mild acid hydrolysis. Each component is composed of 44 amino acid residues, with evidence suggesting the presence of an N-terminal formylmethionine residue in each. The components I, II and III have respectively 33, 29 and 33 identical amino acid residues in their sequences, which represents 75%, 65.9% and 75% homology. These components contain a high proportion of hydrophobic amino acids, and their hydrophobicity profiles are closely related. Also, each of the three components contains a positively charged residue (lysine) as the third residue, followed by a core of hydrophobic residues. These results suggest that the three components are possible signal sequences of one or more secreted or membrane-associated proteins.
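The percentages quoted in the abstract above follow from dividing the number of identical residues by the 44-residue length of each component. A minimal Python check, where the helper name `percent_identity` is chosen here for illustration:

```python
def percent_identity(identical, length):
    """Percentage of identical residues, rounded to one decimal place."""
    return round(100 * identical / length, 1)

# Identical-residue counts for components I, II and III, as reported.
reported = {"I": 33, "II": 29, "III": 33}
percentages = {name: percent_identity(count, 44) for name, count in reported.items()}
print(percentages)
```

This reproduces 75% for components I and III and 65.9% for component II.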

17.
18.
To investigate the evolutionary impact of protein structure, the experimentally determined tertiary structure and the protein-coding DNA sequence were collected for each of 1,195 genes. These genes were studied via a model of sequence change that explicitly incorporates effects on evolutionary rates due to protein tertiary structure. In the model, these effects act via the solvent accessibility environments and pairwise amino acid interactions that are induced by tertiary structure. To compare the hypotheses that structure does and does not have a strong influence on evolution, Bayes factors were estimated for each of the 1,195 sequences. Most of the Bayes factors strongly support the hypothesis that protein structure affects protein evolution. Furthermore, both solvent accessibility and pairwise interactions among amino acids are inferred to have important roles in protein evolution. Our results also indicate that the strength of the relationship between tertiary structure and evolution has a weak but real correlation to the annotation information in the Gene Ontology database. Although their influences on rates of evolution vary among protein families, we find that the mean impacts of solvent accessibility and pairwise interactions are about the same.

19.
This paper reports the complete amino acid sequence of the variable region of heavy chains derived from A/J anti-p-azophenylarsonate antibodies bearing a cross-reactive idiotype. The structure of this induced idiotypically defined antibody is homogeneous and provides the first direct evidence that heritable idiotypes are defined chemical entities. There are certain similarities between the structure of this murine antibody and the human myeloma protein Eu as well as guinea pig anti-p-azophenylarsonate antibodies. The amino acid sequence of approximately 15% of the constant region of this IgG1 molecule is also described. Combined with our studies of the variable region of the light chains of these molecules, this study represents the first complete V domain structure of an induced idiotypically defined antibody with heritable characteristics.

20.
Digestive fluid of the araneid spider Argiope aurantia is known to contain zinc metallopeptidases. Using anion-exchange chromatography, size-exclusion chromatography, sucrose density gradient centrifugation, and gel electrophoresis, we isolated two lower-molecular-mass peptidases, designated p16 and p18. The N-terminal amino acid sequences of p16 (37 residues) and p18 (20 residues) are 85% identical over the first 20 residues and are most similar to the N-terminal sequences of the fully active form of meprin (beta subunits) from several vertebrates (47-52% and 50-60% identical, respectively). Meprin is a peptidase in the astacin (M12A) subfamily of the astacin (M12) family. Additionally, a 66-residue internal sequence obtained from p16 aligns with the conserved astacin subfamily domain. Thus, at least some spider digestive peptidases appear to be related to astacin of decapod crustaceans. However, important differences between spider and crustacean metallopeptidases with regard to isoelectric point and susceptibility to hemolymph-borne inhibitors are demonstrated. Anomalous behavior of the lower-molecular-mass Argiope peptidases during certain fractionation procedures indicates that these peptidases may take part in reversible associations with each other or with other proteins. A. aurantia digestive fluid also contains inhibitory activity effective against insect digestive peptidases. Here we present evidence for at least thirteen heat-stable serine peptidase inhibitors ranging in molecular mass from about 15 to 32 kDa.
