Similar literature
1.
The amino acid sequences of some fiber proteins possibly have a periodic structure. This periodicity can be analyzed using the Fourier transform of the mathematical image of the symbol sequence of amino acid residues in proteins. One of several possible methods of Fourier transform has been chosen as optimal for the given study. This optimal Fourier transform has been used to analyze the periodic structures in several fiber proteins of bacteriophage T4. Amino acids from some groups form sequences of alternating elements with a relatively small period (T=15); those from other groups form sequences with other small periods (T=10 and T=8). Relatively large periods of amino acid arrangement, with the entire amino acid sequence of the protein being divided between them into four or six equal parts, are a new finding. The data on protein structural periodicity make it possible to align the amino acid sequences according to the periodic structures of both types. The results obtained agree with the results of previous crystallographic and electron microscopic studies. Translated from Molekulyarnaya Biologiya, Vol. 39, No. 2, 2005, pp. 321–329. Original Russian Text Copyright © 2005 by Simakova, Simakov.

2.
This article is in the area of protein sequence investigation. It studies protein sequence periodicity. The notion of latent periodicity is introduced. A mathematical method for searching for latent periodicity in protein sequences is developed. Implementation of the method for known cases of perfect and imperfect periodicity is demonstrated. Latent periodicity of many protein sequences from the SWISS-PROT data bank is revealed by the method, and examples of latent periodicity of amino acid sequences are demonstrated for: the translation initiation factor EIF-2B (epsilon subunit) of Saccharomyces cerevisiae from the E2BE_YEAST sequence; the E. coli ferrienterochelin receptor from the FEPA_ECOLI sequence; the lysozyme of bacteriophage SF6 from the LY_BPSF6 sequence; lipoamide dehydrogenase of Azotobacter vinelandii from the DLDH_AZOVI sequence. These protein sequences have latent periods equal to six, two, seven and 19 amino acids, respectively. We propose that a possible purpose of the amino acid sequence latent periodicity is to determine certain protein structures.

3.
The structure of membrane proteins specifies their functional properties, which are important for medicine and pharmacology, and is therefore of significant interest. The repetition of transmembrane regions that consist of hydrophobic amino acids is a characteristic and organic feature of polytopic membrane proteins. The ordered repetition (periodicity) can be detected by the Fourier method applied to a digital image of the symbolic amino acid sequence of a protein. In the present work, this investigation was carried out for 24 transmembrane proteins (successfully for 14 of them). If the repetition of transmembrane regions is aperiodic, it can be revealed by another method: reiterated (four to five times) averaging of the protein hydrophobicity function in a window of 9–11 amino acids that moves along the sequence. This novel method was applied to the 24 transmembrane proteins (successfully for 19 of them) and demonstrated higher suitability than the Fourier method for predicting the secondary structure of these proteins and the corresponding functional properties.
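The reiterated window averaging described in this abstract can be illustrated with a minimal sketch. The abstract does not name a hydrophobicity scale or say how the sequence ends are handled, so the Kyte-Doolittle scale, the truncated window at the ends, and the function names `moving_average` and `smoothed_hydrophobicity` are all assumptions made for illustration, not the authors' implementation.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the abstract does not specify one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def moving_average(values, window):
    """Centered moving average; the window is truncated at the sequence ends (assumption)."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def smoothed_hydrophobicity(seq, window=9, passes=4):
    """Apply the moving average `passes` times to the per-residue hydrophobicity profile."""
    profile = [KD[aa] for aa in seq.upper()]
    for _ in range(passes):
        profile = moving_average(profile, window)
    return profile
```

With an odd window of 9 to 11 residues and four or five passes, broad maxima of the returned profile would be candidate hydrophobic (possibly transmembrane) regions; any threshold for calling a region is left open here because the abstract does not give one.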

4.
The amino acid sequences of sulphur-rich proteins derived from the matrix substance of wool keratin have been analysed for internal and external homologies, and the nature of the repeating patterns of residues has been investigated in the proteins termed B2A and BIIIA3. The Fourier transform method was used to identify preferred positions in the pentapeptide periodicity that exists in these materials, and the structural and functional implications of the results are discussed.

5.
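Three-base periodicity is the main feature of the coding regions of most genomic sequences. This paper proposes a fast method that computes the Fourier spectrum only at the 1/3 frequency point and uses it to analyze the three-base periodicity of DNA sequences; filtering with a wavelet transform at a certain scale is then used to predict the coding regions of DNA sequences. Theoretical analysis and extensive computer experiments confirm the effectiveness of the method, and the prediction results are good. The method is fast, requires no training set, and does not depend on information from existing databases.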
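Evaluating the spectrum only at the 1/3 frequency point can be sketched as follows. This is a minimal illustration under stated assumptions: the four binary indicator sequences (one per base), the division by sequence length, and the function name `period3_power` are choices made here, and the wavelet filtering step of the abstract is not shown.

```python
import cmath


def period3_power(seq):
    """Sum over A, C, G, T of |X_b|^2 / N, where X_b is the Fourier sum of the
    indicator sequence of base b evaluated at frequency 1/3 only."""
    n = len(seq)
    w = cmath.exp(-2j * cmath.pi / 3)
    # exp(-2*pi*i*k/3) depends only on k mod 3, so three factors are enough.
    twiddles = (1, w, w * w)
    total = 0.0
    for base in "ACGT":
        acc = 0j
        for i, ch in enumerate(seq.upper()):
            if ch == base:
                acc += twiddles[i % 3]
        total += abs(acc) ** 2
    return total / n
```

Because the complex factor takes only three values, each base contributes through its counts at the three codon positions, so the cost is linear in sequence length and no full transform is needed. Sliding this over windows of a longer sequence would give a position-dependent period-3 signal.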

6.
The phenotypes of biological systems are to some extent robust to genotypic changes. Such robustness exists on multiple levels of biological organization. We analyzed this robustness for two categories of amino acids in proteins. Specifically, we studied the codons of amino acids that bind or do not bind small molecular ligands. We asked to what extent codon changes caused by mutation or mistranslation may affect physicochemical amino acid properties or protein folding. We found that the codons of ligand-binding amino acids are on average more robust than those of non-binding amino acids. Because mistranslation is usually more frequent than mutation, we speculate that selection for error mitigation at the translational level stands behind this phenomenon. Our observations suggest that natural selection can affect the robustness of very small units of biological organization.

7.
Amino acids at helix-helix parallel interfaces influence the arrangement of helices and interhelical angles. Parallel interfaces in 79 proteins were considered. The location of amino acids at the positions analogous to a and d in GCN4 leucine zipper nomenclature shows that certain combinations of amino acids characteristic of parallel packing occur more often than could be expected by chance. Repeating sequence combinations occur at a and d positions of parallel helix-helix interfaces with similar values of interhelical angles not only in homologous proteins but also within the same protein and in nonhomologous proteins. Within each group of observed combinations, a correlation exists between the size of the amino acid and the magnitude of the interhelical angle.

8.
We have determined the amino acid sequence of a small copper protein isolated from cucumber peelings. This cupredoxin contains 137 amino acids including a pyroglutamate as the first residue. The N-terminal 110 amino acid-long domain shows 30-37% identity to 2 other cupredoxins, stellacyanin and cucumber basic blue protein. A unique feature of this protein is a 27 amino acid-long C-terminal domain rich in 4-hydroxyproline and serine and resembling certain plant cell wall proteins. The prolines in this domain are hydroxylated to a different extent depending on the surrounding sequence.

9.
Nucleotide sequence of the coding region of the mouse N-myc gene.
Y Taya, S Mizusawa, S Nishimura. The EMBO Journal, 1986, 5(6): 1215-1219
A genomic clone for the mouse N-myc gene was isolated and the total nucleotide sequence (4807 bp) of the two coding exons and an intron located between them was determined. The amino acid sequence of the N-myc protein was deduced from the DNA sequence. This protein is composed of 462 amino acids, slightly larger than human and mouse c-myc proteins, and is rich in proline like the c-myc protein. Comparison of the amino acid sequences of the mouse N-myc and c-myc proteins showed that conserved sequences are located in eight regions: four regions are in the N-terminal half of the N-myc protein and are separated from each other by regions poorly homologous to those of the c-myc protein, and the four others are located in the C-terminal half, throughout which certain homology exists. A remarkable sequence containing 13 successive acidic amino acids is present in one of the conserved regions located in the middle of the N-myc protein.

10.
Methods to determine periodicity in protein sequences are useful for inferring function. Fourier transformation is one approach but care is required to ensure the periodicity is genuine. Here we have shown that empirically-derived statistical tables can be used as a measure of significance. Genuine protein sequence data rather than randomly generated sequences were used as the statistical backdrop. The method has been applied to G-protein coupled receptor (GPCR) sequences, by Fourier transformation of hydrophobicity values, codon frequencies and the extent of over-representation of codon pairs, the latter being related to translational step times. Genuine periodicity was observed in the hydrophobicity whereas the apparent periodicity (as inferred from previously reported measures) in the translation step times was not validated statistically. GCR2 has recently been proposed as the plant GPCR receptor for the hormone abscisic acid. It has homology to the Lanthionine synthetase C-like family of proteins, an observation confirmed by fold recognition. Application of the Fourier transform algorithm to the GCR2 family revealed strongly predicted sevenfold periodicity in hydrophobicity, suggesting why GCR2 has been reported to be a GPCR, despite negative indications in most transmembrane prediction algorithms. The underlying multiple sequence alignment, also required for the Fourier transform analysis of periodicity, indicated that the hydrophobic regions around the 7 GXXG motifs commence near the C-terminal end of each of the 7 inner helices of the alpha-toroid and continue to the N-terminal region of the helix. The results clearly explain why GCR2 has been understandably but erroneously predicted to be a GPCR.
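One simple way to score a Fourier peak against an empirical background is sketched below. The abstract does not describe how its statistical tables were built, so this is only an assumed stand-in: `background` is taken to be a list of peak values computed from genuine (non-periodic) protein sequences, and the add-one correction and the name `empirical_p_value` are illustrative choices.

```python
def empirical_p_value(observed, background):
    """Fraction of background peak values at least as large as the observed one,
    with an add-one correction so the estimate is never exactly zero (assumption)."""
    exceed = sum(1 for b in background if b >= observed)
    return (exceed + 1) / (len(background) + 1)
```

Drawing the background from real sequences, as the abstract describes, keeps composition and local correlations of genuine proteins in the null distribution, which shuffled or randomly generated sequences would not.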

11.
Many studies of biological sequence data have examined sequence structure in terms of periodicity, and various methods for measuring periodicity have been suggested for this purpose. This paper compares two such methods, autocorrelation and the Fourier transform, using synthetic periodic sequences, and explains the differences in periodicity estimates produced by each. A hybrid autocorrelation/integer-period discrete Fourier transform is proposed that combines the advantages of both techniques. Collectively, this representation and a recently proposed variant on the discrete Fourier transform offer alternatives to the widely used autocorrelation for the periodicity characterization of sequence data. Finally, these methods are compared for various tetramers of interest in C. elegans chromosome I.
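The two baseline measures compared in this abstract can be sketched on a binary indicator sequence (1 where a motif occurs, 0 elsewhere). This is not the hybrid transform the paper proposes; the unnormalized definitions and the function names `autocorrelation` and `integer_period_power` are assumptions made for illustration.

```python
import cmath


def autocorrelation(x, lag):
    """Unnormalized autocorrelation: number of positions i with x[i] = x[i + lag] = 1."""
    return sum(a * b for a, b in zip(x, x[lag:]))


def integer_period_power(x, period):
    """Squared magnitude of the Fourier sum of x evaluated at frequency 1/period."""
    s = sum(v * cmath.exp(-2j * cmath.pi * i / period) for i, v in enumerate(x))
    return abs(s) ** 2


# Synthetic sequence with an exact period of 4.
x = [1, 0, 0, 0] * 6
```

On this synthetic input the autocorrelation is non-zero at lags 4 and 8 (multiples of the true period), while the Fourier power is equally large at periods 4 and 2 (the period and its divisor) and vanishes at period 3. That difference in where each measure responds is one reason the two can give different period estimates.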

12.
The distal part of the long tail fibers of the Escherichia coli phage T4 consists of a dimer of protein 37. A fragment of the corresponding gene, encoding 253 amino acids, was inserted into several different sites within the cloned gene for the 325-residue outer membrane protein OmpA. In plasmid pTU T4-5 the fragment was inserted once and in pTU T4-10 tandemly twice between the codons for residues 153 and 154 of the OmpA protein. In pTU T4-22 two fragments were present, in tandem, between the codons for residues 45 and 46 of this protein. In pIN T4-6 one fragment was inserted into the ompA gene immediately following the part encoding the signal sequence. The corresponding mature proteins consist, in this order, of 605, 860, 835, and 279 amino acid residues. All precursor proteins were processed and translocated across the plasma membrane. Hence, not only can the OmpA protein serve as a vehicle for export of a nonsecretory protein, but the signal sequence alone can also mediate export of such a protein. Export of the pro-OmpA protein depends on the SecA protein. Export of the tail fiber fragment expressed from pIN T4-6 remained SecA dependent. Thus, the secA pathway in this case is chosen by the signal peptide. It is proposed that a signal peptide can mediate translocation of nonsecretory proteins as long as they are export-compatible. The inability of a signal sequence to mediate export of some proteins appears to be due to export incompatibility of the protein rather than to the absence of information, within the mature part of the polypeptide, which would be required for translocation.

13.
The sequences of hydrophobic segments of exported bacterial proteins, some serine proteinases and all known plastocyanins are examined in order to identify subsequences that differ in amino acid composition and primary structure regularities. It is established that the extension in the protein precursor that is cleaved by proteolysis (the so-called P-sequence) shows a higher share of the usual amino acids (Thr, Pro, Ala, Ser, Arg, Gly, Leu, Val, Glu, Asp) and more clearly expressed periodicity than the mature protein (M-sequence). The results support the concept that primitive proteins comprised a small number of amino acids and realized preferential bonding (between amino acids that are identical or very similar in a structure-function-evolution sense).

14.
Understanding the patterns and causes of protein sequence evolution is a major challenge in evolutionary biology. One of the critical unresolved issues is the relative contribution of selection and genetic drift to the fixation of amino acid sequence differences between species. Molecular homoplasy, the independent evolution of the same amino acids at orthologous sites in different taxa, is one potential signature of selection; however, relatively little is known about its prevalence in eukaryotic proteomes. To quantify the extent and type of homoplasy among evolving proteins, we used phylogenetic methodology to analyze 8 genome-scale data matrices from clades of different evolutionary depths that span the eukaryotic tree of life. We found that the frequency of homoplastic amino acid substitutions in eukaryotic proteins was more than 2-fold higher than expected under neutral models of protein evolution. The overwhelming majority of homoplastic substitutions were parallelisms that involved the most frequently exchanged amino acids with similar physicochemical properties and that could be reached by a single-mutational step. We conclude that the role of homoplasy in shaping the protein record is much larger than generally assumed, and we suggest that its high frequency can be explained by both weak positive selection for certain substitutions and purifying selection that constrains substitutions to a small number of functionally equivalent amino acids.

15.
A preliminary study of coding sequences lacking 3-base periodicity
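Fourier spectrum analysis of 120 relatively short coding sequences (<1,200 bp) shows that 3-base periodicity is not always present in short coding sequences. Statistical analysis suggests that whether a coding sequence shows 3-base periodicity is related to the base composition and distribution of the sequence, to the choice and order of amino acids in the encoded protein, and to synonymous codon usage. In general, A+U content is higher than G+C content in non-period-3 sequences, whereas the opposite holds for period-3 sequences; bases are distributed more evenly over the three codon positions in non-period-3 sequences than in period-3 sequences; and codon and amino acid usage bias is weaker in non-period-3 sequences than in period-3 sequences. These phenomena should be fully taken into account when Fourier analysis is used to predict genes and exons in DNA sequences.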

16.
Spider dragline silk is a remarkably strong fiber with impressive mechanical properties, which were thought to result from the specific structures of the underlying proteins and their molecular size. In this study, silk protein 11R26 from the dragline silk protein of Nephila clavipes was used to analyze the potential effects of the special amino acids on the function of 11R26. Three protein derivatives, ZF4, ZF5, and ZF6, were obtained by site-directed mutagenesis, based on the sequence of 11R26; in these derivatives, serine was replaced with cysteine, isoleucine, and arginine, respectively. After these were expressed and purified, the mechanical performance of the fibers derived from the four proteins was tested. Both hardness and average elastic modulus of ZF4 fiber increased 2.2 times compared with those of 11R26. The number of disulfide bonds in ZF4 protein was 4.67 times that of 11R26, which implied that disulfide bonds outside the poly-Ala region affect the mechanical properties of spider silk more efficiently. The results indicated that the mechanical performance of spider silk proteins with small molecular size can be enhanced by modification of the amino acid residues. Our research not only has shown the feasibility of large-scale production of spider silk proteins but also provides valuable information for rational protein design.

17.
Alpha-helices within proteins are often terminated (capped) by distinctive configurations of the polypeptide chain. Two common arrangements are the Schellman motif and the alternative alpha(L) motif. Rose and coworkers developed stereochemical rules to identify the locations of such motifs in proteins of unknown structure based only on their amino acid sequences. To check the effectiveness of these rules, they made specific predictions regarding the structural and thermodynamic consequences of certain mutations in T4 lysozyme. We have constructed these mutants and show here that they have neither the structure nor the stability that was predicted. The results show the complexity of the protein-folding problem. Comparison of known protein structures may show that a characteristic sequence of amino acids (a sequence motif) corresponds to a conserved structural motif. In any particular protein, however, changes in other parts of the sequence may result in a different conformation. The structure is determined by the sequence as a whole, not by parts considered in isolation.

18.
Cells of methanococci are covered by a single layer of protein subunits (S-layer) in hexagonal arrangement, which are directly exposed to the environment and which cannot be stabilized by cellular components. We have isolated S-layer proteins from cells of Methanococcus vannielii (T_opt = 37 °C), Methanococcus thermolithotrophicus (T_opt = 65 °C), and Methanococcus jannaschii (T_opt = 85 °C). The primary structure of the S-layer proteins was determined by sequencing the corresponding genes. According to the predicted amino acid sequence, the molecular masses of the S-layer proteins of the different methanococci are in a small range between 59,064 and 60,547 Da. Compared with its mesophilic counterparts, it is worth noting that in the S-layer protein of the extreme thermophile Mc. jannaschii the acidic amino acid Asp is predominant, the basic amino acid Lys occurs in higher amounts, and Cys and His are only present in this organism. Despite the differences in the growth optima and the predominance of some amino acids, the comparative total primary structure revealed a relatively high degree of identity (38%-45%) between the methanococci investigated. This observation indicates that the amino acid sequence of the S-layer proteins is significantly conserved from the mesophilic to the extremely thermophilic methanococci.

19.
The score matrix from a structure comparison program (SAP) was used to search for repeated structures using a Fourier analysis. When tested with artificial data, a simple Fourier transform of the smoothed matrix provided a clear signal of the repeat periodicity that could be used to extract the repeating units with the SAP program. The strength of the Fourier signal was calibrated against the signal from model proteins. The most useful of these was the novel random-walk approach employed to generate realistic 'fake' structures. On the basis of these it was possible to conclude that only a small proportion of protein structures have an unexpected degree of symmetry. Artificially generated 'ideal' folds provided an upper limit on the strength of signal that could be expected from a 'perfectly' repeating compact structure. Unexpectedly, some of the very regular β-propeller folds attained the same strength but the majority of symmetric structures lay below this region. When native proteins were ranked by the power of their spectrum, a wide variety of fold types were seen to score highly. In the βα class, these included the globular βα proteins and the more repetitive leucine-rich βα folds. In the all-β class, β-propellers, β-prisms and β-helices were found as well as the more globular γ-crystallin domains. When this ranked list was filtered to remove proteins that contained detectable internal sequence similarity (using the program REPRO), the list became exclusively composed of just globular βα class proteins and in the top 50 re-ranked proteins, only a single 4-fold propeller structure remained.

20.
Protein evolution can be seen as the successive replacement of amino acids by other amino acids. In general, it is a very slow process which is triggered by point mutations in the nucleotide sequence. These mutations can transform into single nucleotide polymorphisms (SNPs) within populations and diverging proteins between species. It is well known that in many cases amino acids can be replaced by others without impeding the functioning of the protein, even if these are of quite different physico-chemical character. In some cases, however, almost any replacement would result in a functionally deficient protein. Based upon comprehensive published SNP data and applying correlation analysis, we quantified the two antagonistic factors controlling the process of amino acid replacement and thus protein evolution: first, the degenerate structure of the genetic code, which facilitates the exchange of certain amino acids, and second, the physico-chemical forces which limit the range of possible exchanges to maintain a functional protein. We found that the observed frequencies of amino acid exchanges within species are best explained by the genetic code and that the conservation of physico-chemical properties plays a subordinate role, but nevertheless has to be considered a key factor. Between moderately diverged species, the genetic code and physico-chemical properties exert comparable influence on amino acid exchanges. We furthermore studied amino acid exchanges in more detail for six species (four mammals, one bird, and one insect) and found that the profiles are highly correlated across all examined species despite their large evolutionary divergence of up to 800 million years. The species-specific exchange profiles are also correlated with the exchange profile observed between different species. The huge body of SNP data currently available makes it possible to characterize the role of two major shaping forces of protein evolution more quantitatively than before.
