Similar literature
Found 20 similar articles; search took 15 ms
1.
We have used cluster analysis to identify recurring sequence patterns that transcend protein family boundaries. A subset of these patterns occurs predominantly in a single type of local structure in proteins. Here we characterize the three-dimensional structures and contexts in which these sequence patterns occur, with particular attention to the interactions responsible for their structural selectivity.

2.
We have purified a 21-kDa protein, designated as P1, from Rehmannia glutinosa to homogeneity by ammonium sulfate precipitation, anion exchange chromatography, hydrophobic interaction chromatography, and preparative native PAGE. The purified P1 had chitin degradation activity. The N-terminal amino acid sequence of P1 indicated that it is very similar to those of thaumatin and other reported thaumatin-like proteins.

3.
In a previous paper we obtained ten (orthogonal) factors, linear combinations of which can express the properties of the 20 naturally occurring amino acids. In this paper, we assume that the most important properties (linear combinations of these ten factors) that determine the three-dimensional structure of a protein are conserved properties, i.e., are those that have been conserved during evolution. Two definitions of a conserved property are presented: (1) a conserved property for an average protein is defined as that linear combination of the ten factors that optimally expresses the similarity of one amino acid to another (hence, little change during evolution), as given by the relatedness odds matrix of Dayhoff et al.; (2) a conserved property for each position in the amino acid sequence (locus) of a specific family of homologous proteins (the cytochrome c family or the globin family) is defined as that linear combination of the ten factors that is common among a set of amino acids at a given locus when the sequences are properly aligned. When the specificity at each locus is averaged over all loci, the same features are observed for three expressions of these two definitions, namely the conserved property for an average protein, the average conserved property for the cytochrome c family, and the average conserved property for the globin family; we find that bulk and hydrophobicity (information about packing and long-range interactions) are more important than other properties, such as the preference for adopting a specific backbone structure (information about short-range interactions). We also demonstrate that the sequence profile of a conserved property, defined for each locus of a protein family (definition 2), corresponds uniquely to the three-dimensional structure, while the conserved property for an average protein (definition 1) is not useful for the prediction of protein structure.
The amino acid sequences of numerous proteins are searched to find those that are similar, in terms of the conserved properties (definition 2), to sequences of the same size from one of the homologous families (cytochrome c and globin, respectively) for whose loci the conserved properties were defined. Many similar sequences are found, the number of similarities decreasing with increasing size of the segment. However, the segments must be rather long (15 residues) before the comparisons become meaningful. As an example, one sufficiently large sequence (20 residues) from a protein of known structure (apo-liver alcohol dehydrogenase that is not a member of either family) is found to be similar in the conserved properties to a particular sequence of a member of the family of human hemoglobin chains, and the two sequences have similar structures. This means that, since conserved properties are expected to be structure determinants, we can use the conserved properties to predict an initial protein structure for subsequent energy minimization for a protein for which the conserved properties are similar to those of a family of proteins with a sufficiently large number of homologous amino acid sequences; such a large number of homologous sequences is required to define a conserved property for each locus of the homologous protein family.

4.
R A Broglia, G Tiana. Proteins, 2001, 45(4): 421-427
While all the information required for the folding of a protein is contained in its amino acid sequence, one has not yet learned how to extract this information to predict the detailed, biologically active, three-dimensional structure of a protein whose sequence is known. Using insight obtained from lattice model simulations of the folding of small proteins (fewer than 100 residues), in particular of the fact that this phenomenon is essentially controlled by conserved contacts (Mirny et al., Proc Natl Acad Sci USA 1995;92:1282) among (few) strongly interacting ("hot") amino acids (Tiana et al., J Chem Phys 1998;108:757-761), which also stabilize local elementary structures formed early in the folding process and leading to the (postcritical) folding core when they assemble together (Broglia et al., Proc Natl Acad Sci USA 1998;95:12930, Broglia & Tiana, J Chem Phys 2001;114:7267), we have worked out a successful strategy for reading the three-dimensional structure of lattice model-designed proteins from the knowledge of only their amino acid sequence and of the contact energies among the amino acids.

5.
The mRNA of a putative small hydrophobic protein (SH) of mumps virus was identified in mumps virus-infected Vero cells, and its complete nucleotide sequence was determined by sequencing the genomic RNA and cDNA clones and partial sequencing of mRNA. The SH mRNA is 310 nucleotides long excluding the poly(A) and contains a single open reading frame encoding a protein of 57 amino acids with a calculated molecular weight of 6,719. The predicted protein is highly hydrophobic and contains a stretch of 25 hydrophobic amino acids near the amino terminus which could act as a membrane anchor region. There is no homology between the putative SH protein of mumps virus and the SH protein of simian virus 5, even though the SH genes are located in the same locus in the corresponding genome. One interesting observation is that the hydrophobic domain of simian virus 5 SH protein is at the carboxyl terminus, whereas that of mumps virus putative SH protein is near the amino terminus.

6.
1. The two cysteine residues forming the disulphide bridge that comprises part of the active site of lipoamide dehydrogenase from pig heart were specifically labelled with iodo[2-(14)C]acetic acid. 2. A tryptic peptide containing these carboxymethylcysteine residues was isolated from digests of reduced and S-carboxymethylated lipoamide dehydrogenase and its amino acid sequence of 23 residues was determined. 3. The sequence is highly homologous with a similar sequence containing the active-site disulphide bridge of lipoamide dehydrogenase derived from the 2-oxoglutarate dehydrogenase complex of Escherichia coli (Crookes strain) and it is probable that, as in the bacterial enzyme, the disulphide bridge forms an intrachain loop containing six residues. The results indicate that the bacterial and mammalian proteins have a common genetic origin. 4. Amino acid sequences containing six other unique carboxymethylcysteine residues were also partly determined. 5. The analysis of the primary structure thus far is consistent with the view that the enzyme (mol.wt. approx. 110000) is composed of two identical polypeptide chains.

7.

Background  

A reliable prediction of the Xaa-Pro peptide bond conformation would be a useful tool for many protein structure calculation methods. We have analyzed the Protein Data Bank and show that the combined use of sequential and structural information has a predictive value for the assessment of the cis versus trans peptide bond conformation of Xaa-Pro within proteins. For the analysis of the data sets different statistical methods such as the calculation of the Chou-Fasman parameters and occurrence matrices were used. Furthermore we analyzed the relationship between the relative solvent accessibility and the relative occurrence of prolines in the cis and in the trans conformation.
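As an illustration of the kind of Chou-Fasman-style parameter mentioned in this abstract, the sketch below computes a cis propensity for each residue type Xaa preceding proline. It is not the authors' procedure: the function name `cis_propensity`, the input format, and the toy observations are assumptions made for the example, and the propensity is taken to be the cis fraction for a given Xaa divided by the overall cis fraction.

```python
from collections import Counter


def cis_propensity(observations):
    """Return {xaa: propensity} for a list of (xaa, is_cis) pairs.

    Each pair describes one Xaa-Pro peptide bond (hypothetical input format).
    Propensity = (cis fraction among bonds with this Xaa)
                 / (cis fraction among all Xaa-Pro bonds).
    Values above 1 mean the residue type is enriched before cis prolines.
    """
    observations = list(observations)
    total = len(observations)
    total_cis = sum(1 for _, is_cis in observations if is_cis)
    if total == 0 or total_cis == 0:
        raise ValueError("need at least one cis observation")
    overall_cis_fraction = total_cis / total

    per_xaa = Counter(xaa for xaa, _ in observations)
    per_xaa_cis = Counter(xaa for xaa, is_cis in observations if is_cis)
    return {
        xaa: (per_xaa_cis[xaa] / n) / overall_cis_fraction
        for xaa, n in per_xaa.items()
    }


# Made-up observations, for illustration only (not taken from the Protein Data Bank).
toy = [("Y", True), ("Y", False), ("A", False), ("A", False),
       ("G", False), ("G", False), ("W", True), ("A", False)]
print(cis_propensity(toy))
```

With real data, the pairs would come from a structural survey, and small counts per residue type would need some statistical caution before being interpreted.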

8.
Summary. The snake venom protein echistatin contains the cell recognition sequence Arg-Gly-Asp and is a potent inhibitor of platelet aggregation. The three-dimensional structure of echistatin and the dynamics of the active RGD site are presented. A set of structures was determined using the Distance Geometry method and subsequently refined by Molecular Dynamics and energy minimization. Disulfide pairings are suggested, based on violations of experimental constraints. The structures satisfy 230 interresidue distance constraints, derived from nuclear Overhauser effect measurements, five hydrogen-bonding constraints, and 21 torsional constraints from vicinal spin-spin coupling constants. The segment from Gly5 to Cys20 and from Asp30 to Asn42 has a well-defined conformation and the Arg-Gly-Asp sequence, which adopts a turn-like structure, is located at the apex of a nine-residue loop connecting the two strands of a distorted β-sheet. The mobility of the Arg-Gly-Asp site has been quantitatively characterized by 15N relaxation measurements. The overall correlation time of echistatin was determined from fluorescence measurements, and was used in a model-free analysis to determine internal motional parameters. The active site has order parameters of 0.3–0.5, i.e., among the smallest values ever observed at the active site of a protein. Correlation of the flexible region of the protein as characterized by relaxation experiments and the NMR solution structures was made by calculating generalized order parameters from the ensemble of three-dimensional structures. The motion of the RGD site detected experimentally is more extensive than a simple RGD loop wagging motional model, suggested by an examination of superposed solution structures.

9.
The 102 amino acid residues of CNBr 4, the largest of 5 cyanogen bromide peptides from the Lactobacillus casei thymidylate synthetase were completely sequenced by means of limited tryptic, tryptic, chymotryptic, and staphylococcal protease peptides. CNBr 4 contains both of the cysteines in an enzyme subunit, with the 5-fluorodeoxyuridylate-reactive cysteine at residue 198 and the other at residue 244.

10.
The nucleotide sequence of the fabA gene encoding beta-hydroxydecanoyl thioester dehydrase, a key enzyme of the unsaturated fatty acid synthesis pathway of Escherichia coli, has been determined by the dideoxynucleotide sequencing technique. Most of the sequence was obtained by sequencing intragenic insertions of the transposon, Tn1000, isolated in vivo. A synthetic primer complementary to a portion of the inverted repeat sequences at the ends of the transposon was used to prime DNA synthesis into the flanking fabA sequences. The gene is composed of 516 nucleotides (171 amino acid residues) encoding a protein with a molecular weight of 18,800. Approximately half of the derived amino acid sequence was confirmed by automated Edman sequencing of peptides obtained by cyanogen bromide cleavage. The active site histidine residue (His-70) has been identified by analysis of the peptides labeled by reaction with 14C-labeled 3-decynoyl-N-acetylcysteamine, a specific mechanism-activated inhibitor. A cysteine residue (Cys-69) adjacent to the active site histidine may play the role in catalysis previously assigned to a tyrosine residue. We also report a simplified purification process for the dehydrase beginning with extracts of a strain which greatly overproduces the enzyme.
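The gene length and residue count quoted in this abstract can be checked with a little arithmetic. The sketch below assumes, as an interpretation not stated in the abstract, that the 516 nucleotides include the stop codon; the helper name `residues_from_orf` is hypothetical.

```python
def residues_from_orf(n_nucleotides, includes_stop=True):
    """Number of amino acid residues encoded by an open reading frame.

    Assumes the reading frame is a whole number of codons; if the length
    includes the stop codon, that codon encodes no residue.
    """
    if n_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = n_nucleotides // 3
    return codons - 1 if includes_stop else codons


residues = residues_from_orf(516)      # 172 codons, one of them the stop codon
mean_residue_mass = 18800 / residues   # average mass per residue in Da
print(residues, round(mean_residue_mass, 1))
```

Under that assumption the count comes out at 171 residues, and the average of roughly 110 Da per residue is in the usual range for proteins, so the quoted figures are mutually consistent.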

11.
12.
Three distinct species of IGFBP in porcine serum were identified by NH2-terminal amino acid sequence analysis. The IGFBPs identified include pIGFBP-2 (34 kDa), three isoforms of pIGFBP-3 (43, 40 and 30 kDa) and two isoforms of pIGFBP-4 (30 and 26 kDa). The three isoforms of pIGFBP-3 were found to have a common NH2-terminal amino acid sequence, as were the two isoforms of pIGFBP-4. These results indicate that porcine serum contains a truncated form of IGFBP-3 and two forms of pIGFBP-4, similar to those previously isolated from human and rat serum. Furthermore, the presence of a truncated form(s) of the GH-dependent IGFBP-3 in porcine serum suggests that elucidating its origin and function may be important in understanding how IGFBPs affect the somatogenic actions of GH.

13.
Prediction of amino acid sequence from structure   Cited by: 2 (self-citations: 0; citations by others: 2)
We have developed a method for the prediction of an amino acid sequence that is compatible with a three-dimensional backbone structure. Using only a backbone structure of a protein as input, the algorithm is capable of designing sequences that closely resemble natural members of the protein family to which the template structure belongs. In general, the predicted sequences are shown to have multiple sequence profile scores that are dramatically higher than those of random sequences, and sometimes better than some of the natural sequences that make up the superfamily. As anticipated, highly conserved but poorly predicted residues are often those that contribute to the functional rather than structural properties of the protein. Overall, our analysis suggests that statistical profile scores of designed sequences are a novel and valuable figure of merit for assessing and improving protein design algorithms.

14.
To study the structure and function of reptile lysozymes, we have previously reported their purification, and in this study we have established the amino acid sequences of three egg white lysozymes from soft-shelled turtle eggs (SSTL A and SSTL B from Trionyx sinensis, ASTL from Amyda cartilaginea) by using the rapid peptide mapping method. The established amino acid sequences of SSTL A, SSTL B, and ASTL showed substitutions of 43, 42, and 44 residues, respectively, when compared with the HEWL (hen egg white lysozyme) sequence. Among these reptile lysozymes, SSTL A had one substitution compared with SSTL B (Gly126Asp) and had an extra N-terminal Gly and 11 substitutions compared with ASTL. SSTL B had an extra N-terminal Gly and 10 residues different from ASTL. The sequence of SSTL B was identical to that of the soft-shelled turtle lysozyme STL (from Trionyx sinensis japonicus). The Ile residue at position 93 of ASTL is the first reported at this position in any C-type lysozyme. Furthermore, amino acid substitutions (Phe34His, Arg45Tyr, Thr47Arg, and Arg114Tyr) were also found at subsites E and F when compared with HEWL. The time course using N-acetylglucosamine pentamer as a substrate exhibited a reduction of the rate constant of glycosidic cleavage and an increase of binding free energy for subsites E and F, which indicates that the amino acids mentioned above contribute to substrate binding at subsites E and F. Interestingly, the variable binding free energy values observed for ASTL may result from substitutions outside subsites E and F.

15.
A two amino acid (hydrophobic and polar) scheme is used to perform the design on target conformations corresponding to the native states of 20 single chain proteins. Strikingly, the percentage of successful identification of the nature of the residues benchmarked against naturally occurring proteins and their homologues is around 75%, independent of the complexity of the design procedure. Typically, the lowest success rate occurs for residues such as alanine that have a high secondary structure functionality. Using a simple lattice model, we argue that one possible shortcoming of the model studied may involve the coarse-graining of the 20 kinds of amino acids into just two effective types. Proteins 32:80–87, 1998. © 1998 Wiley-Liss, Inc.
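The benchmark described in this abstract, the fraction of positions whose hydrophobic or polar character is identified correctly, can be sketched as follows. The abstract does not say which residues were treated as hydrophobic, so the `HYDROPHOBIC` set below is an assumed classification, and `to_hp` and `hp_agreement` are hypothetical helper names.

```python
# Assumed two-letter classification; the study's own H/P assignment is not given here.
HYDROPHOBIC = frozenset("AVLIMFWC")


def to_hp(sequence):
    """Coarse-grain a one-letter amino acid sequence into an H/P string."""
    return "".join("H" if aa in HYDROPHOBIC else "P" for aa in sequence.upper())


def hp_agreement(designed_hp, native_sequence):
    """Fraction of positions where the designed H/P string matches the native one."""
    native_hp = to_hp(native_sequence)
    if not native_hp or len(designed_hp) != len(native_hp):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(d == n for d, n in zip(designed_hp, native_hp))
    return matches / len(native_hp)


# Toy example: a six-residue native fragment and a designed H/P pattern.
print(to_hp("MKVLAT"))                    # HPHHHP
print(hp_agreement("HPHHPP", "MKVLAT"))   # 5 of 6 positions agree
```

Borderline residues such as alanine fall on one side of any two-class split, so the agreement figure depends on the chosen classification.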

16.
17.
A Y Wang, D W Grogan, J E Cronan. Biochemistry, 1992, 31(45): 11020-11028
Cyclopropane fatty acid (CFA) synthase of Escherichia coli catalyzes a modification of the acyl chains of phospholipid bilayers. We report (i) identification of the CFA synthase protein, (ii) overproduction (> 600-fold) and purification to essential homogeneity of the enzyme, and (iii) the amino acid sequence of CFA synthase as deduced from the nucleotide sequence of the cfa gene. CFA synthase was overproduced by use of the T7 promoter/RNA polymerase system under closely defined conditions. The enzyme was readily purified by a two-step procedure requiring only ammonium sulfate fractionation and binding to phospholipid vesicles followed by flotation in sucrose density gradients. The deduced amino acid sequence predicts a protein of 43,913 Da (382 residues) that lacks long hydrophobic segments. The CFA synthase sequence has no significant similarity to known proteins except for sequences found in other enzymes that utilize S-adenosyl-L-methionine. We also report inhibitor studies of the enzyme active site.

18.
Membrane proteins: amino acid sequence and membrane penetration   Cited by: 26 (self-citations: 0; citations by others: 26)
A computer study shows that the membrane-penetrating portion of the erythrocyte surface MN-glycoprotein (Winzler, 1969; Marchesi et al., 1972) is distinguishable by informal cluster analysis from other segments of globular proteins when sequence length is plotted against hydrophobicity. This analysis further suggests the possibility that other membrane-penetrating segments of proteins can be identified in the same way.
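To make the idea of scoring sequence segments by hydrophobicity concrete, the sketch below averages a per-residue hydropathy value over a segment and over a sliding window. It uses the Kyte-Doolittle hydropathy scale, which was published after this study and is swapped in here purely for illustration; the abstract does not state which scale the study used. The function names and the window length are assumptions.

```python
# Kyte-Doolittle hydropathy values (a later, widely used scale; not the study's own).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(segment):
    """Average hydropathy of a segment given in one-letter code."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(KD[aa] for aa in segment.upper()) / len(segment)


def window_profile(sequence, window=19):
    """Mean hydropathy of every window of the given length along the sequence."""
    return [
        mean_hydropathy(sequence[i:i + window])
        for i in range(len(sequence) - window + 1)
    ]


# Toy sequence: a hydrophobic stretch followed by a charged stretch.
print(window_profile("LLLLKKKK", window=4))
```

Plotting segment length against such an average for many segments gives the kind of two-dimensional picture in which an unusually long, unusually hydrophobic segment would stand apart from the rest.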

19.
Summary. Adenovirus E1A and c-myc genes are known to be capable of transforming primary rat cells when they occur in combination with either polyoma middle-T or T24 Harvey-ras 1 genes. There was a low level of amino acid sequence homology between the nuclear adenovirus-12 (Ad12) E1A protein product (289 amino acids) and the c-myc protein based on optimal alignment and percentage identity. In contrast to others [Ralston R, Bishop JM (1983) Nature 306:803–806], we concluded that this low level of amino acid sequence homology was not significant, since rabies glycoprotein (RGP), which has no transforming function and localizes to the cell surface, had a similar low level of amino acid sequence homology to the c-myc protein. Furthermore, dot-matrix analysis, when used to test the overall level of amino acid sequence homology, showed no significant homology between c-myc and Ad12 E1A, E1B, or RGP. Thus, low levels of amino acid sequence homology between two proteins may not be sufficient to predict structural and functional similarities between them reliably, even if the two proteins appear to share a common function.
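The two comparison measures named in this abstract, percentage identity over an alignment and dot-matrix analysis, can be sketched in a few lines. This is a generic illustration, not the authors' software: counting identity only over columns without gaps is one of several conventions, and the window and threshold values are arbitrary assumptions.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percentage of identical residues over aligned columns without gaps."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


def dot_matrix(seq_a, seq_b, window=3, min_matches=3):
    """Boolean matrix: cell [i][j] is True when the windows starting at
    seq_a[i] and seq_b[j] share at least min_matches identical positions."""
    rows = len(seq_a) - window + 1
    cols = len(seq_b) - window + 1
    return [
        [
            sum(seq_a[i + k] == seq_b[j + k] for k in range(window)) >= min_matches
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Toy sequences, for illustration only.
print(percent_identity("AC-GT", "ACTGA"))   # 3 of 4 ungapped columns match
print(dot_matrix("ABCD", "XABC"))
```

A diagonal run of True cells in the matrix indicates a shared stretch; scattered isolated cells are what unrelated sequences of similar composition tend to produce, which is the kind of background the abstract argues must be considered before calling a low identity significant.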

20.
Statistical analysis of the occurrence of tetrapeptides in 35 globular proteins was performed. It was found that the amino acids along the polypeptide chain are close to being randomly distributed and that the same tetrapeptide segments exist in different types of secondary structure. Therefore, a new method was proposed for locating 'microdomains' in protein interiors. Amino acid replacements in the hydrophobic core of six proteins were analyzed. The results show that the locations of amino acids belonging to defined microdomains are extremely conserved. It is suggested that the structures found may play a role as nucleation centers in protein folding.
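A tetrapeptide occurrence analysis of the kind described in this abstract can be sketched by counting overlapping four-residue windows and comparing each count with what a random arrangement of the same amino acid composition would give. The study's actual statistics are not described in the abstract; the independence model used for `expected_count` and both function names are assumptions.

```python
from collections import Counter
from math import prod


def tetrapeptide_counts(sequences, k=4):
    """Count every overlapping k-residue segment across a set of sequences."""
    counts = Counter()
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def expected_count(peptide, sequences):
    """Expected occurrences of a peptide if residues were placed independently
    according to the overall amino acid composition of the data set."""
    composition = Counter("".join(sequences))
    total = sum(composition.values())
    windows = sum(max(len(s) - len(peptide) + 1, 0) for s in sequences)
    return windows * prod(composition[aa] / total for aa in peptide)


# Toy data, for illustration only.
toy = ["AAAAA", "AAGA"]
print(tetrapeptide_counts(toy)["AAAA"], expected_count("AAAA", toy))
```

Observed counts close to the expected ones across most tetrapeptides would correspond to the abstract's statement that residues along the chain are close to randomly distributed.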
