Similar articles
 20 similar articles found (search time: 15 ms)
1.
In the postgenomic era it is essential that protein sequences are annotated correctly in order to help in the assignment of their putative functions. Over 1300 proteins in current protein sequence databases are predicted to contain a PAS domain based upon amino acid sequence alignments. One of the problems with the current annotation of the PAS domain is that this domain exhibits limited similarity at the amino acid sequence level. It is therefore essential, when working with proteins of low sequence similarity, to apply profile hidden Markov model searches for the PAS domain-containing proteins, as is done for the PFAM database. From recent 3D X-ray and NMR structures, however, PAS domains appear to have a conserved 3D fold, as shown here by structural alignment of the six representative 3D structures from the PDB database. Large-scale modelling of the PAS sequences from the PFAM database against the 3D structures of these six structural prototypes was performed. All 3D models generated (> 5700) were evaluated using PROSA II. We conclude from our large-scale modelling studies that the PAS and PAC motifs (which are separately defined in the PFAM database) are directly linked and that these two motifs form the PAS fold. The existing subdivision into PAS and PAC motifs, as used by the PFAM and SMART databases, appears to be caused by major differences in sequences in the region connecting these two motifs. This region, as has been shown by Gardner and coworkers for human PAS kinase (Amezcua, C.A., Harper, S.M., Rutter, J. & Gardner, K.H. (2002) Structure 10, 1349-1361, [1]), is very flexible and adopts different conformations depending on the bound ligand. Some PAS sequences present in the PFAM database did not produce a good structural model, even after realignment using a structure-based alignment method, suggesting that these representatives are unlikely to have a fold resembling any of the structural prototypes of the PAS domain superfamily.

2.
The glutamyl-tRNA synthetase (GluRS) of Bacillus subtilis 168T aminoacylates with glutamate its homologous tRNA(Glu) and tRNA(Gln) in vivo and Escherichia coli tRNA(1Gln) in vitro (Lapointe, J., Duplain, L., and Proulx, M. (1986) J. Bacteriol. 165, 88-93). The gltX gene encoding this enzyme was cloned and sequenced. It encodes a protein of 483 amino acids with a Mr of 55,671. Alignment of the amino acid sequences of four bacterial GluRSs (from B. subtilis, Bacillus stearothermophilus, E. coli, and Rhizobium meliloti) gives 20% identity and reveals the presence of several short highly conserved motifs in the first two thirds of these proteins. Conserved motifs are found at corresponding positions in several other aminoacyl-tRNA synthetases. The only sequence similarity between the GluRSs of these Bacillus species and the E. coli glutaminyl-tRNA synthetase (GlnRS), which has no counterpart in the E. coli GluRS, is in a segment of 30 amino acids in the last third of these synthetases. In the three-dimensional structure of the E. coli tRNA(Gln).GlnRS.ATP complex, this conserved peptide is near the anticodon of tRNA(Gln) (Rould, M. A., Perona, J. J., Söll, D., and Steitz, T. A. (1989) Science 246, 1135-1142), suggesting that this region is involved in the specific interactions between these enzymes and the anticodon regions of their tRNA substrates.

3.
Signature sequences are contiguous patterns of amino acids 10-50 residues long that are associated with a particular structure or function in proteins. These may be of three types (by our nomenclature): superfamily signatures, remnant homologies, and motifs. We have performed a systematic search through a database of protein sequences to automatically and preferentially find remnant homologies and motifs. This was accomplished in three steps: 1. We generated a nonredundant sequence database. 2. We used BLAST3 (Altschul and Lipman, Proc. Natl. Acad. Sci. U.S.A. 87:5509-5513, 1990) to generate local pairwise and triplet sequence alignments for every protein in the database vs. every other. 3. We selected "interesting" alignments and grouped them into clusters. We find that most of the clusters contain segments from proteins which share a common structure or function. Many of them correspond to signatures previously noted in the literature. We discuss three previously recognized motifs in detail (FAD/NAD-binding, ATP/GTP-binding, and cytochrome b5-like domains) to demonstrate how the alignments generated by our procedure are consistent with previous work and make structural and functional sense. We also discuss two signatures (for N-acetyltransferases and glycerol-phosphate binding) which to our knowledge have not been previously recognized.

4.
On the basis of sequence alignments, the pseudouridine synthases were grouped into four families that share no statistically significant global sequence similarity, though some common sequence motifs were discovered [Koonin, E. V. (1996) Nucleic Acids Res. 24, 2411-2415; Gustafsson, C., Reid, R., Greene, P. J., and Santi, D. V. (1996) Nucleic Acids Res. 24, 3756-3762]. We have investigated the functional significance of these alignments by substituting the nearly invariant lysine and proline residues in Motif I of RluA and TruB, pseudouridine synthases belonging to different families. Contrary to our expectations, the altered enzymes display only very mild kinetic impairment. Substitution of the aligned lysine and proline residues does, however, reduce structural stability, consistent with a temperature-sensitive phenotype that results from substitution of the cognate proline residue in Cbf5p, a yeast homologue of TruB [Zebarjadian, Y., King, T., Fournier, M. J., Clarke, L., and Carbon, J. (1999) Mol. Cell. Biol. 19, 7461-7472]. Together, our data support a functional role for Motif I, as predicted by sequence alignments, though the effect of substituting the highly conserved residues was milder than we anticipated. By extrapolation, our findings also support the assignment of pseudouridine synthase function to certain physiologically important eukaryotic proteins that contain Motif I, including the human protein dyskerin, alteration of which leads to the disease dyskeratosis congenita.

5.
6.
Amino acid sequence of protein B23 phosphorylation site   Cited by: 9 (self-citations: 0; cited by others: 9)
A major phosphopeptide labeled in vivo was identified in nucleolar protein B23 (Mr/pI = 37,000/5.1) after tryptic digestion. This peptide was purified by high performance liquid chromatography using reverse-phase (C8 and C18) columns. The phosphopeptide contains 20 amino acids including 1 phosphoserine, 7 glutamic acids, and 4 aspartic acids. The amino acid sequence is: His-Leu-Val-Ala-Val-Glu-Glu-Asp-Ala-Glu-Ser(P)-Glu-Asp-Glu-Asp-Glu-Glu-Asp-Val-Lys. This amino acid sequence is similar to that of nucleolar phosphoprotein C23 (8 consecutive amino acids were identical), and to the regulatory subunit (RII) of cAMP-dependent protein kinase (7 consecutive amino acids were identical), which is phosphorylated by casein kinase II (Carmichael, D.F., Geahlen, R.L., Allen, S.M., and Krebs, E.G. (1982) J. Biol. Chem. 257, 10440-10445). The regions near these phosphorylation sites are enriched with glutamic and aspartic acids, suggesting that this acidic amino acid cluster may be essential for kinase recognition.
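The composition stated in this abstract can be checked directly against the printed sequence. The following is a minimal sketch in Python; the variable names are illustrative and not taken from the paper, and the peptide string is copied from the abstract with three-letter codes separated by hyphens.

```python
from collections import Counter

# Phosphopeptide sequence as printed in the abstract (three-letter codes).
PEPTIDE = (
    "His-Leu-Val-Ala-Val-Glu-Glu-Asp-Ala-Glu-Ser(P)-Glu-Asp-Glu-Asp-"
    "Glu-Glu-Asp-Val-Lys"
)

residues = PEPTIDE.split("-")      # one entry per residue
composition = Counter(residues)    # residue -> number of occurrences

# Fraction of acidic residues (Glu + Asp) in the peptide.
acidic_fraction = (composition["Glu"] + composition["Asp"]) / len(residues)

print(len(residues), composition["Glu"], composition["Asp"], composition["Ser(P)"])
print(f"acidic fraction: {acidic_fraction:.2f}")
```

The counts agree with the abstract (20 residues, 7 Glu, 4 Asp, 1 phosphoserine), and just over half of the peptide is acidic, which is the cluster the authors suggest may matter for kinase recognition.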

7.
The fructose-1,6-bisphosphate aldolase gene from the thermophilic bacterium Anoxybacillus gonensis G2 was cloned and sequenced. Nucleotide sequence analysis revealed an open reading frame coding for a 30.9 kDa protein of 286 amino acids. The amino acid sequence shared approximately 80-90% similarity to the Bacillus sp. class II aldolases. The motifs that are responsible for the binding of a divalent metal ion and for catalytic activity are completely conserved. The gene encoding aldolase was overexpressed under T7 promoter control in Escherichia coli and the recombinant protein was purified by nickel affinity chromatography. Kinetic characterization of the enzyme was performed at 60 °C, and Km and Vmax were found to be 576 µM and 2.4 µM min⁻¹ (mg protein)⁻¹, respectively. The enzyme exhibits maximal activity at pH 8.5. The activity of the enzyme was completely inhibited by EDTA.
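To make the reported kinetic constants concrete, the sketch below evaluates the standard Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), with the Km and Vmax values from the abstract. It assumes the enzyme follows simple Michaelis-Menten kinetics, which the abstract implies but does not state. The function name and the substrate concentrations are illustrative.

```python
KM_UM = 576.0   # Km from the abstract, in µM
VMAX = 2.4      # Vmax from the abstract, in µM min^-1 (mg protein)^-1


def michaelis_menten_rate(substrate_um: float, km: float = KM_UM, vmax: float = VMAX) -> float:
    """Initial rate for a substrate concentration given in µM (assumed simple MM kinetics)."""
    if substrate_um < 0:
        raise ValueError("substrate concentration must be non-negative")
    return vmax * substrate_um / (km + substrate_um)


# Illustrative substrate concentrations (µM); not taken from the paper.
for s in (57.6, 576.0, 5760.0):
    print(f"[S] = {s:7.1f} µM -> v = {michaelis_menten_rate(s):.3f}")
```

At [S] = Km the rate is half of Vmax by construction, which is a quick sanity check on the units used.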

8.
The arenavirus L protein has the characteristic sequence motifs conserved among the RNA-dependent RNA polymerase L proteins of negative-strand (NS) RNA viruses. Studies based on the use of reverse-genetics approaches have provided direct experimental evidence of the key role played by the arenavirus L protein in viral-RNA synthesis. Sequence alignment shows six conserved domains among L proteins of NS RNA viruses. The proposed polymerase module of L is located within its domain III, which contains highly conserved amino acids within motifs designated A and C. We have examined the role of these conserved residues in the polymerase activity of the L protein of the prototypic arenavirus, lymphocytic choriomeningitis virus (LCMV), in vivo using a minigenome rescue assay. We show here that the presence of sequence SDD, a characteristic of motif C of segmented NS RNA viruses, as well as the presence of the highly conserved D residue within motif A of L proteins, is strictly required for the polymerase activity of the LCMV L protein. The strong dominant negative phenotype associated with many of the mutants examined and results from coimmunoprecipitation studies provided genetic and biochemical evidence, respectively, for the requirement of the L-L interaction for the polymerase activity of the LCMV L protein.

9.
We have cloned and sequenced the gene that encodes archaerhodopsin, a light-driven H+ pump in Halobacterium sp. aus-1 (Mukohata, Y., Sugiyama, Y., Ihara, K., and Yoshida, M. (1988) Biochem. Biophys. Res. Commun. 151, 1339-1345). The nucleotide sequence of this gene contained an open reading frame which corresponded to a protein of 260 amino acids with a molecular mass of 27,851 daltons, including a precursor sequence of 6 amino acids at the amino terminus and 2 amino acids at the carboxyl terminus. The deduced amino acid sequence of archaerhodopsin exhibited 59 and 32% homology to the sequences of bacteriorhodopsin and halorhodopsin, respectively, from Halobacterium halobium. Three charged residues (Asp-121, Asp-218, and Lys-222) are conserved in the transmembrane segments among the three retinal proteins. Residues Asp-91 and Asp-102 which, it has been suggested, may be essential for the pumping of protons (Mogi, T., Stern, L. J., Marti, T., Chao, B. H., and Khorana, H. G. (1988) Proc. Natl. Acad. Sci. U. S. A. 85, 4148-4152) are conserved between archaerhodopsin and bacteriorhodopsin.

10.
11.
The genes for the ribosomal 5S rRNA binding protein L5 have been cloned from three extremely thermophilic eubacteria, Thermus flavus, Thermus thermophilus HB8 and Thermus aquaticus (Jahn et al, submitted). Genes for protein L5 from the three Thermus strains display 95% G/C in third positions of codons. Amino acid sequences deduced from the DNA sequence were shown to be identical for T flavus and T thermophilus, although the corresponding DNA sequences differed by two T to C transitions in the T thermophilus gene. Protein L5 sequences from T flavus and T thermophilus are 95% homologous to L5 from T aquaticus and 56.5% homologous to the corresponding E coli sequence. The lowest degrees of homology were found between the T flavus/T thermophilus L5 proteins and those of yeast L16 (27.5%), Halobacterium marismortui (34.0%) and Methanococcus vannielii (36.6%). From sequence comparison it becomes clear that thermostability of Thermus L5 proteins is achieved by an increase in hydrophobic interactions and/or by restriction of steric flexibility due to the introduction of amino acids with branched aliphatic side chains such as leucine. Alignment of the nine protein sequences equivalent to Thermus L5 proteins led to identification of a conserved internal segment, rich in acidic amino acids, which shows homology to subsequences of E coli L18 and L25. The occurrence of conserved sequence elements in 5S rRNA binding proteins and ribosomal proteins in general is discussed in terms of evolution and function.

12.
Computational methods such as sequence alignment and motif construction are useful in grouping related proteins into families, as well as helping to annotate new proteins of unknown function. These methods identify conserved amino acids in protein sequences, but cannot determine the specific functional or structural roles of conserved amino acids without additional study. In this work, we present 3MATRIX (http://3matrix.stanford.edu) and 3MOTIF (http://3motif.stanford.edu), a web-based sequence motif visualization system that displays sequence motif information in its appropriate three-dimensional (3D) context. This system is flexible in that users can enter sequences, keywords, structures or sequence motifs to generate visualizations. In 3MOTIF, users can search using discrete sequence motifs such as PROSITE patterns, eMOTIFs, or any other regular expression-like motif. Similarly, 3MATRIX accepts an eMATRIX position-specific scoring matrix, or will convert a multiple sequence alignment block into an eMATRIX for visualization. Each query motif is used to search the protein structure database for matches, in which the motif is then visually highlighted in three dimensions. Important properties of motifs such as sequence conservation and solvent accessible surface area are also displayed in the visualizations, using carefully chosen color shading schemes.
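To illustrate what a "regular expression-like motif" looks like in practice, the sketch below converts a simple PROSITE-style pattern into a Python regular expression and searches a sequence with it. It is not the 3MOTIF implementation, which the abstract does not describe. The function name is hypothetical, only the basic PROSITE elements are handled (`x`, `[...]`, `{...}`, repeat counts, and `<`/`>` anchors), and the test sequence is a toy string. The pattern used is an ATP/GTP-binding (P-loop) style pattern chosen purely as an example.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a simple PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):          # '<' anchors the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):            # '>' anchors the C-terminus
            suffix, element = "$", element[:-1]
        # Split the element into its core and an optional (n) or (n,m) repeat.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, low, high = match.groups()
        if core == "x":                      # 'x' means any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # {P} means any residue except P
        repeat = ""
        if low:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


p_loop = prosite_to_regex("[AG]-x(4)-G-K-[ST]")
hit = re.search(p_loop, "MTEYKLVVVGAGGVGKSALT")   # toy sequence for illustration
print(p_loop, hit.start() + 1, hit.group())
```

Positions are reported 1-based, the usual convention for protein sequences. A real tool would also need to handle ambiguity codes and overlapping matches, which this sketch ignores.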

13.
VISTAS is a suite of programs for protein sequence and structure analysis. The system allows the simultaneous display, in separate windows, of multiple sequence alignments, of known or model 3D structures, and of 2D graphic representations of sequence and/or alignment properties. The displays are fully integrated, and therefore manipulations in one window can be reflected in each of the others. Beyond its display facilities, VISTAS brings together a number of existing tools under a single, user-friendly umbrella: these include a fully functional interactive color alignment procedure, conserved motif selection, a range of database-scanning routines, and interactive access to the OWL composite sequence database and to the PRINTS protein fingerprint database. Exploration of the sequence database is thus straightforward, and predefined structural motifs from the fingerprint database may be readily visualized. Of particular note is the ability to calculate conservation criteria from sequence alignments and to display the information in a 3D context: this renders VISTAS a powerful tool for aiding mutagenesis studies and for facilitating refinement of molecular models.

14.
Since our characterization of the slit cDNA sequence, encoding a protein secreted by glial cells and involved in the formation of axonal pathways in Drosophila, we have discovered that the protein contains two additional sequence motifs that are highly conserved in a variety of proteins. A search of the GenPept database with the 73 amino acids at the carboxy terminus of slit revealed that this region contains significant similarity to a carboxy-terminal domain found in six other exported proteins. This observation has allowed us to define a new carboxy-terminal protein motif. In addition, comparison with a 202 amino acid domain residing between epidermal growth factor (EGF) repeats in slit shows this region to be conserved in laminin, agrin and perlecan and, strikingly, also to lie between EGF repeats in both agrin and perlecan. Our analysis suggests this motif is involved in mediating interactions among extracellular proteins. Consistent with our previous characterization of the slit protein, both new motifs are found only in extracellular proteins. The identification of these two conserved motifs in slit reveals that the entire 1469 amino acids of the protein are made up of modular regions similar to those conserved in other extracellular proteins.

15.
U/G and T/G mismatches commonly occur due to spontaneous deamination of cytosine and 5-methylcytosine in double-stranded DNA. This mutagenic effect is particularly strong for extreme thermophiles, since the spontaneous deamination reaction is much enhanced at high temperature. Previously, a U/G and T/G mismatch-specific glycosylase (Mth-MIG) was found on a cryptic plasmid of the archaeon Methanobacterium thermoautotrophicum, a thermophile with an optimal growth temperature of 65 °C. We report characterization of a putative DNA glycosylase from the hyperthermophilic archaeon Pyrobaculum aerophilum, whose optimal growth temperature is 100 °C. The open reading frame was first identified through a genome sequencing project in our laboratory. The predicted product of 230 amino acids shares significant sequence homology to [4Fe-4S]-containing Nth/MutY DNA glycosylases. The histidine-tagged recombinant protein was expressed in Escherichia coli and purified. It is thermostable and displays DNA glycosylase activities specific to U/G and T/G mismatches with an uncoupled AP lyase activity. It also processes U/7,8-dihydro-oxoguanine and T/7,8-dihydro-oxoguanine mismatches. We designate it Pa-MIG. Using sequence comparisons among complete bacterial and archaeal genomes, we have uncovered a putative MIG protein from another hyperthermophilic archaeon, Aeropyrum pernix. The unique conserved amino acid motifs of MIG proteins are proposed to distinguish MIG proteins from the closely related Nth/MutY DNA glycosylases.

16.
Members of the RNA-binding protein superfamily contain RNA-binding domains of about 90 amino acids with a highly conserved motif 'GFGF'. Using the conserved motif with some variations, G-(F/Y)-(G/A)-(F/Y)-(V/I)-X-(F/Y), as a probe, we screened protein sequences carrying identical amino acids in an NBRF-protein database. It has been found that the C-terminal portion of clustered asparagine-rich protein (CARP), a malaria antigen from Plasmodium falciparum, shows an unexpected sequence similarity with the RNA-binding protein superfamily for the C-terminal half of the RNA-binding domain. Dot matrix comparisons and alignment of these sequences as well as a statistical test have revealed highly significant sequence similarities. From these analyses, we conclude that the malaria antigen CARP belongs to a large family of the RNA-binding proteins. An evolutionary implication of the sequence similarity is also discussed.
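The degenerate probe in this abstract, G-(F/Y)-(G/A)-(F/Y)-(V/I)-X-(F/Y), maps directly onto a regular expression. The sketch below shows one way to screen sequences with it. It is not the authors' screening procedure against the NBRF database, which the abstract does not detail. The function name and the example sequences are made up for illustration.

```python
import re

# G-(F/Y)-(G/A)-(F/Y)-(V/I)-X-(F/Y), with X standing for any residue.
PROBE = re.compile(r"G[FY][GA][FY][VI].[FY]")


def find_probe_hits(sequence: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched segment) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in PROBE.finditer(sequence.upper())]


# Made-up sequences: the first two satisfy the probe, the third has L where V/I is required.
for name, seq in [("toy_1", "MKRGFGFVTFAA"), ("toy_2", "GYAYIEY"), ("toy_3", "MKRGFGFLTFAA")]:
    print(name, find_probe_hits(seq))
```

`finditer` reports non-overlapping matches only, which is adequate for a seven-residue probe but worth keeping in mind for shorter or more degenerate patterns.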

17.
The complete amino acid sequence of coagulogen purified from the hemocytes of the horseshoe crab Carcinoscorpius rotundicauda was determined by characterization of the NH2-terminal sequence and the peptides generated after digestion of the protein with lysyl endopeptidase, Staphylococcus aureus protease V8 and trypsin. Upon sequencing the peptides by the automated Edman method, the following sequence was obtained: A D T N A P L C L C D E P G I L G R N Q L V T P E V K E K I E K A V E A V A E E S G V S G R G F S L F S H H P V F R E C G K Y E C R T V R P E H T R C Y N F P P F V H F T S E C P V S T R D C E P V F G Y T V A G E F R V I V Q A P R A G F R Q C V W Q H K C R Y G S N N C G F S G R C T Q Q R S V V R L V T Y N L E K D G F L C E S F R T C C G C P C R N Y. Carcinoscorpius coagulogen consists of a single polypeptide chain with a total of 175 amino acid residues and a calculated molecular weight of 19,675. The secondary structure calculated by the method of Chou and Fasman reveals the presence of an alpha-helix region in the peptide C segment (residue Nos. 19 to 46), which is released during the proteolytic conversion of coagulogen to coagulin gel. The beta-sheet structure and the 16 half-cystines found in the molecule appear to yield a compact protein stable to acid and heat. The amino acid sequences of coagulogen of four species of limulus have been compared and the interspecies evolutionary differences are discussed.
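The residue count, the number of half-cystines and the extent of the peptide C segment given in this abstract can be checked against the printed sequence. The sketch below does so in Python. The sequence is transcribed from the abstract in blocks of ten, and the helper name `segment` is illustrative.

```python
# Coagulogen sequence transcribed from the abstract, in blocks of ten residues.
COAGULOGEN = "".join([
    "ADTNAPLCLC", "DEPGILGRNQ", "LVTPEVKEKI", "EKAVEAVAEE", "SGVSGRGFSL",
    "FSHHPVFREC", "GKYECRTVRP", "EHTRCYNFPP", "FVHFTSECPV", "STRDCEPVFG",
    "YTVAGEFRVI", "VQAPRAGFRQ", "CVWQHKCRYG", "SNNCGFSGRC", "TQQRSVVRLV",
    "TYNLEKDGFL", "CESFRTCCGC", "PCRNY",
])


def segment(sequence: str, first: int, last: int) -> str:
    """Residues first..last, using the 1-based inclusive numbering of the abstract."""
    return sequence[first - 1:last]


total_residues = len(COAGULOGEN)
half_cystines = COAGULOGEN.count("C")
peptide_c = segment(COAGULOGEN, 19, 46)   # peptide C segment, residues 19 to 46

print(total_residues, half_cystines, len(peptide_c), peptide_c)
```

The totals match the abstract: 175 residues and 16 half-cystines. Residue 18 and residue 46 are both arginine, so the 28-residue peptide C segment is flanked by arginyl bonds on each side.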

18.
The primary structure of the basic isoform of Acanthamoeba profilin   Cited by: 6 (self-citations: 0; cited by others: 6)
Acanthamoeba profilin-II [Kaiser, D.A., Sato, M., Ebert, R. F. and Pollard, T.D. (1986) J. Cell Biol. 102, 221-226] was digested with trypsin or cleaved by 2-(2-nitrophenylsulphenyl)-3-methyl-3-bromoindolenine. The tryptic peptides were purified by reversed-phase high-performance liquid chromatography and completely sequenced using automated gas-phase sequence analysis. The complete profilin-II sequence was deduced by ordering the tryptic peptides using the sequence information of the tryptophan-cleavage products. Acanthamoeba profilin-II was found to be homologous to the previously determined profilin-I sequence [Ampe, C., Vandekerckhove, J., Brenner, L., Tobacman, L. and Korn, E.D. (1985) J. Biol. Chem. 260, 834-840]. Like profilin-I, profilin-II consists of 125 amino acids and has a blocked NH2 terminus and a trimethyllysine residue at position 103. Profilin-II differs in at least 21 positions from one of the profilin-I isoforms. The amino acid exchanges are mainly concentrated in the middle part of the sequence. Profilin-II contains two more basic residues than profilin-I, which explains its higher isoelectric point.

19.
We have purified a novel GTP-binding protein (G protein) with a Mr of about 24,000 to homogeneity from bovine brain membranes (Kikuchi, A., Yamashita, T., Kawata, M., Yamamoto, K., Ikeda, K., Tanimoto, T., and Takai, Y. (1988) J. Biol. Chem. 263, 2897-2904). In the present studies, we have isolated and sequenced the cDNA of this G protein from a bovine brain cDNA library using oligonucleotide probes designed from the partial amino acid sequences. The cDNA of the G protein has an open reading frame encoding a protein of 220 amino acids with a calculated Mr of 24,954. This G protein is designated as the smg-25A protein (smg p25A). The amino acid sequence deduced from the smg-25A cDNA contains the consensus sequences of GTP-binding and GTPase domains. smg p25A shares about 28 and 44% amino acid homology with the ras and ypt1 proteins, respectively. In addition to this cDNA, we have isolated two other homologous cDNAs encoding G proteins of 219 and 227 amino acids with calculated Mr values of 24,766 and 25,975, respectively. These G proteins are designated as the smg-25B and smg-25C proteins (smg p25B and smg p25C), respectively. The amino acid sequences deduced from the three smg-25 cDNAs are highly homologous with one another in the overall sequences except for the C-terminal 32 amino acids. Moreover, the three smg p25s have a consensus C-terminal sequence, Cys-X-Cys, which is different from the known C-terminal consensus sequences of the ras and ypt1 proteins, Cys-X-X-X and Cys-Cys, respectively. These results, together with the biochemical properties of smg p25A described previously, indicate that the three smg p25s constitute a novel G protein family.

20.
The primary structure of a 61-amino-acid residue peptide from the pancreas of the European eel (Anguilla anguilla) has been established as E E K S G(5)L Y R K P(10)S C G E M(15)S A M H A(20)C P M N F(25)A P V C G(30)T D G N T(35)Y P N E C(40)S L C F Q(45)R Q N T K(50)T D I L I(55)T K D D R(60)C. There was no indication of microheterogeneity. This peptide shows structural similarity to pancreatic secretory trypsin inhibitors from several mammalian species and to a cholecystokinin-releasing peptide isolated from rat pancreatic juice. A comparison of the amino acid sequences of the peptides has identified a domain in the central region of the molecules that has been strongly conserved during evolution. In contrast, the amino acid sequence in the region corresponding to the reactive centre of the mammalian trypsin inhibitors is very poorly conserved in the eel peptide. The P1-P1' reactive site lysine-isoleucine (or arginine-isoleucine) bond in the mammalian trypsin inhibitors is replaced by a methionine-asparagine bond. This region does, however, show limited homology to the reactive centre of human alpha 1-protease inhibitor, suggesting that the eel peptide may function as an inhibitor of other proteolytic enzymes in the pancreas.
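The sequence in this abstract is printed with position markers every five residues, such as `G(5)` and `P(10)`. The sketch below strips those markers, checks that each one agrees with the running residue count, and confirms the stated length. The function name is illustrative, and the annotated string is copied from the abstract.

```python
import re

ANNOTATED = (
    "E E K S G(5)L Y R K P(10)S C G E M(15)S A M H A(20)C P M N F(25)"
    "A P V C G(30)T D G N T(35)Y P N E C(40)S L C F Q(45)R Q N T K(50)"
    "T D I L I(55)T K D D R(60)C"
)


def parse_annotated(text: str) -> tuple[str, bool]:
    """Return the plain sequence and whether every position marker is consistent."""
    residues = []
    markers_consistent = True
    # Each match is one residue letter, optionally followed by a "(n)" marker.
    for letter, number in re.findall(r"([A-Z])(?:\((\d+)\))?", text):
        residues.append(letter)
        if number and int(number) != len(residues):
            markers_consistent = False
    return "".join(residues), markers_consistent


sequence, markers_ok = parse_annotated(ANNOTATED)
print(len(sequence), markers_ok, sequence)
```

The parsed sequence has 61 residues and all twelve markers line up, so the printed sequence is internally consistent with the stated length.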
