Similar articles
 Found 20 similar articles; search time 218 ms
1.
The amino acid sequences of some fiber proteins may have a periodic structure. To analyze this periodicity, the Fourier transform of a mathematical image of the symbolic amino acid sequence of a protein is sometimes used. In this work we employed one particular way (out of a few possible) of performing the Fourier transform, chosen as the most straightforward and optimal. Using this method we analyzed the periodicity of fiber proteins in bacteriophage T4 and confirmed that a certain periodicity exists in the investigated proteins. For a number of proteins, elements of the same group alternate in the amino acid sequence with a rather small period T = 15, whereas for some other proteins the alternations have small periods of 10 and 8. The new result is the discovery of relatively large periods of amino acid alternation, which divide the amino acid sequence of the protein into 4 or 6 equal parts. These data on amino acid periodicity allowed us to align amino acid sequences in accordance with the established periods of both types, in agreement with certain results obtained in X-ray crystallography and electron microscopy experiments.
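
As a rough illustration of the kind of analysis this abstract describes, the sketch below maps a sequence to a 0/1 indicator signal for one residue group and evaluates the Fourier power at integer periods. The grouping `HYDROPHOBIC`, the function names, and the 2 to 20 search range are assumptions made for this example; the abstract does not specify the authors' exact transform or residue groups.

```python
import cmath
import math

# Hypothetical residue group, chosen only for illustration.
HYDROPHOBIC = set("AVLIMFWC")


def indicator(seq, group):
    """Map a sequence to 1.0 where the residue belongs to `group`, else 0.0."""
    return [1.0 if aa in group else 0.0 for aa in seq]


def power_at_period(signal, period):
    """Fourier power of the mean-centred signal at frequency 1/period."""
    n = len(signal)
    mean = sum(signal) / n
    total = sum(
        (x - mean) * cmath.exp(-2j * math.pi * i / period)
        for i, x in enumerate(signal)
    )
    return abs(total) ** 2 / n


def strongest_period(signal, min_period=2, max_period=20):
    """Return the integer period in the given range with the largest power."""
    return max(
        range(min_period, max_period + 1),
        key=lambda p: power_at_period(signal, p),
    )


# Synthetic sequence: a 15-residue block (7 hydrophobic, 8 glycine) repeated.
toy_sequence = "LLLLLLLGGGGGGGG" * 10
print(strongest_period(indicator(toy_sequence, HYDROPHOBIC)))
```

Restricting the scan to integer periods keeps the example short; a real analysis would also need a significance estimate, which this sketch does not attempt.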

2.
Many studies of biological sequence data have examined sequence structure in terms of periodicity, and various methods for measuring periodicity have been suggested for this purpose. This paper compares two such methods, autocorrelation and the Fourier transform, using synthetic periodic sequences, and explains the differences in the periodicity estimates produced by each. A hybrid autocorrelation/integer-period discrete Fourier transform is proposed that combines the advantages of both techniques. Collectively, this representation and a recently proposed variant of the discrete Fourier transform offer alternatives to the widely used autocorrelation for the periodicity characterization of sequence data. Finally, these methods are compared for various tetramers of interest in C. elegans chromosome I.
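
To make the autocorrelation side of this comparison concrete, here is a minimal sketch that measures how often a word (a single base or a k-mer such as a tetramer) recurs at a given lag. The function names and the normalisation (fraction of occurrences that recur) are assumptions for illustration, not the paper's definition, and the hybrid method it proposes is not reproduced here.

```python
def word_autocorrelation(seq, word, lag):
    """Fraction of occurrences of `word` that are followed by another
    occurrence exactly `lag` positions later."""
    starts = [
        i for i in range(len(seq) - lag)
        if seq.startswith(word, i)
    ]
    if not starts:
        return 0.0
    hits = sum(1 for i in starts if seq.startswith(word, i + lag))
    return hits / len(starts)


def autocorrelation_profile(seq, word, max_lag):
    """Autocorrelation of `word` for every lag from 1 to max_lag."""
    return {
        lag: word_autocorrelation(seq, word, lag)
        for lag in range(1, max_lag + 1)
    }


# Synthetic example: a perfect period-4 repeat.
toy_dna = "ACGT" * 5
print(autocorrelation_profile(toy_dna, "A", 6))
```

Peaks in the profile at multiples of a lag suggest a period; unlike a Fourier spectrum, the profile is indexed directly by integer lag.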

3.
This article concerns the investigation of periodicity in protein sequences. The notion of latent periodicity is introduced, and a mathematical method for searching for latent periodicity in protein sequences is developed. Application of the method to known cases of perfect and imperfect periodicity is demonstrated. The method reveals latent periodicity in many protein sequences from the SWISS-PROT data bank, and examples are given for: the translation initiation factor EIF-2B (epsilon subunit) of Saccharomyces cerevisiae from the E2BE_YEAST sequence; the E. coli ferrienterochelin receptor from the FEPA_ECOLI sequence; the lysozyme of Bacteriophage SF6 from the LY_BPSF6 sequence; and lipoamide dehydrogenase of Azotobacter vinelandii from the DLDH_AZOVI sequence. These protein sequences have latent periods equal to six, two, seven and 19 amino acids, respectively. We propose that a possible purpose of the latent periodicity of amino acid sequences is to determine certain protein structures.

4.
5.
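A preliminary study of coding sequences lacking 3-base periodicity   Total citations: 4, self-citations: 0, citations by others: 4
Analysis of the Fourier spectra of 120 relatively short coding sequences (<1,200 bp) shows that 3-base periodicity is not always present in short coding sequences. Statistical analysis suggests that whether a coding sequence exhibits 3-base periodicity is related to the base composition and distribution of the sequence, to the choice and order of amino acids in the encoded protein, and to synonymous codon usage. In general, A+U content exceeds G+C content in non-period-3 sequences, whereas the opposite holds for period-3 sequences; bases are distributed more evenly over the three codon positions in non-period-3 sequences than in period-3 sequences; and codon and amino acid usage bias is weaker in non-period-3 sequences than in period-3 sequences. These phenomena should be taken into account when Fourier analysis is used to predict genes and exons in DNA sequences.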

6.
Sequence and characterization of 6 Lea proteins and their genes from cotton   Total citations: 33, self-citations: 0, citations by others: 33
Lea genes code for mRNAs and proteins that are late embryogenesis abundant in higher plant seed embryos. They appear to be ubiquitous in higher plants and may be induced to high levels of expression in other tissues and at other times of ontogeny by ABA and/or desiccation. Presented here are the genomic and cDNA sequences for 6 of these genes from cotton seed embryos and the derived amino acid sequences of the corresponding proteins. The Lea genes contain the standard sequence features of eucaryotic genes (TATA box and poly(A) addition sequences) and have 1 or more introns. Sequence differences between cDNA and genomic DNA confirm the existence of small multigene families for several Lea genes. The amino acid composition and sequence of the Lea proteins are unusual. Five are extremely hydrophilic, four contain no cys or trp and four have sequence domains that suggest amphiphilic helical structures. Hypothetical functions in desiccation survival, based on amino acid sequence, are discussed.

7.
Many proteins exhibit sequence periodicity, often correlated with a visible structural periodicity. The statistical significance of such periodicity can be assessed by means of a chi-squared-based test, with significance thresholds being calculated from shuffled sequences. Comparison of the complete proteomes of 45 species reveals striking differences in the proportion of periodic proteins and the intensity of the most significant periodicities. Eukaryotes tend to have a higher proportion of periodic proteins than eubacteria, which in turn tend to have more than archaea. The intensity of periodicity in the most periodic proteins is also greatest in eukaryotes. By contrast, the relatively small group of periodic proteins in archaea also tends to be weakly periodic compared to those of eukaryotes and eubacteria. Exceptions to this general rule are found in those prokaryotes with multicellular life-cycle phases, e.g., Methanosarcina sp. or Anabaena sp., which have more periodicities than prokaryotes in general, and in unicellular eukaryotes, which have fewer than multicellular eukaryotes. Significantly periodic proteins in eukaryotes are distributed over a wide range of period lengths, whereas prokaryotic proteins typically have a more limited set of period lengths. This is further investigated by repeating the analysis on the NRL-3D database of proteins of solved structure. Some short-range periodicities are explicable in terms of basic secondary structure, e.g., alpha helices, while middle-range periodicities are frequently found to consist of known short Pfam domains, e.g., leucine-rich repeats, tetratricopeptides or armadillo domains. However, not all can be explained in this way. Reviewing Editor: Dr. John Oakeshott

8.
The amino acid sequences of sulphur-rich proteins derived from the matrix substance of wool keratin have been analysed for internal and external homologies, and the nature of the repeating patterns of residues has been investigated in the proteins termed B2A and BIIIA3. The Fourier transform method was used to identify preferred positions in the pentapeptide periodicity that exists in these materials, and the structural and functional implications of the results are discussed.

9.
An earlier reported method for revealing latent periodicity of nucleotide sequences has been considerably modified for the case of small samples by applying a Monte Carlo method. This improved method has been used to search for latent periodicity in some nucleotide sequences of the EMBL data bank. The existence of latent periodicity in nucleotide sequences has been shown for some genes. The results obtained imply that the periodicity of gene structure is projected onto the periodicity of primary amino acid sequences and, further, onto spatial protein conformation. Even though the periodic structure of gene sequences has been eroded, it is still retained in the primary and/or spatial structures of the corresponding proteins. Furthermore, in a few cases the study of gene periodicity has suggested a possible evolutionary origin by multifold duplications of gene fragments.

10.
We perform a statistical analysis of solvent accessibility and hydrophobicity profiles of a representative set of proteins. The joint probability distribution is well fitted to a multivariable Gaussian, which takes a relatively simple form when expressed in terms of the Fourier transforms of the profiles. This allows us to quantify the asymmetric manner by which these profiles influence each other. For example, the α‐helix periodicity in sequence hydrophobicity is dictated by the solvent accessibility of structures, and not vice versa, possibly indicating the faster evolution of sequences compared to structures. The decorrelated hydrophobicity and solvent accessibility profiles show distinct behaviors at long periods, where sequence hydrophobicity fluctuates less, while solvent accessibility fluctuates more than average. The correlations between the two profiles can be interpreted as the Boltzmann weight of the solvation energy at room temperature, consistent with earlier observations. Proteins 2006. © 2005 Wiley‐Liss, Inc.

11.
Periodicity was quantified in 4289 Escherichia coli K12 confirmed and putative protein sequences, using a simple chi-square technique previously shown to reveal triplet periodicity in coding DNA. Periodicities were calculated from period n = 2 to period n = 50 in nine different alphabetic representations of the proteins. By comparison with a randomly generated proteome of the same compositional content, the E. coli proteome does not contain a significant excess of periodic proteins. However, 60 proteins do appear to be significantly periodic in at least one alphabetic representation, after Bonferroni correction, at p < 0.01, and 30 at p < 0.001. These are compared with significantly periodic proteins of solved three-dimensional structure, detected by an identical analysis of the sequences from a protein structure database. It is concluded that there is no evidence for the presence of a proteome-wide quasi-periodicity as predicted by the duplication and divergence model of protein evolution and that the major periodicity detected is a consequence of the repetitive tendencies within α-helices. However, it is not possible to explain all sequence periodicities in terms of observable secondary structure: in cases where sequence periodicity can be compared to solved structure, there is often no structural regularity that would provide an obvious explanation in terms of natural selection on protein function.
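
One plausible reading of a chi-square periodicity measure is a contingency test of symbol against position modulo the candidate period. The sketch below implements that reading; it is an assumption about the general idea, not the authors' published procedure, and it omits the Bonferroni correction and the comparison with a randomly generated proteome. The function name is hypothetical.

```python
from collections import Counter


def periodicity_chi_square(seq, period):
    """Chi-square statistic for the symbol x (position mod period) table.

    Large values mean that symbols prefer particular phases of the period.
    """
    n = len(seq)
    symbol_totals = Counter(seq)
    phase_totals = Counter(i % period for i in range(n))
    observed = Counter((s, i % period) for i, s in enumerate(seq))

    chi2 = 0.0
    for symbol, symbol_total in symbol_totals.items():
        for phase, phase_total in phase_totals.items():
            expected = symbol_total * phase_total / n
            chi2 += (observed[(symbol, phase)] - expected) ** 2 / expected
    return chi2


# Scan candidate periods 2..6 on a perfectly alternating toy sequence.
toy_protein = "AB" * 10
for candidate in range(2, 7):
    print(candidate, round(periodicity_chi_square(toy_protein, candidate), 3))
```

To turn the statistic into a significance call, one would compare it against values obtained from shuffled sequences of the same composition, which is left out here.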

12.
In treating the Volterra-Verhulst prey-predator system with time-dependent coefficients, we ask how far this deterministic system represents or approximates the dynamics of a population evolving in a realistic environment, which is stochastic in nature. We consider a stochastic system with small Gaussian noise type fluctuations. It is shown that the higher moments of the deviation of the deterministic system from the stochastic one approach zero as the strength δ of the perturbation decays to zero. For any δ > 0 and all T > 0, ε > 0, the sample population paths that stay within ε distance of the deterministic path during [0, T] form a collection of positive probability. In comparing the stationary distributions of the two systems, we show that the weak limits of those of the stochastic system form a subset of those of the deterministic system. This is in analogy with a result of May connected with the stability of the two systems. Plant and rodent populations possess periodic parameters and exhibit periodic behavior. We establish this periodicity theoretically under periodicity conditions on the coefficients and perturbing random forces. We also establish a central limit property for the prey-predator system.

13.
Summary. We examine in this paper one of the expected consequences of the hypothesis that modern proteins evolved from random heteropeptide sequences. Specifically, we investigate the lengthwise distributions of amino acids in a set of 1,789 protein sequences with little sequence identity using the run test statistic (r_o) of Mood (1940, Ann. Math. Stat. 11, 367–392). The probability density of r_o for a collection of random sequences has mean = 0 and variance = 1 [the N(0,1) distribution] and can be used to measure the tendency of amino acids of a given type to cluster together in a sequence relative to that of a random sequence. We implement the run test using binary representations of protein sequences in which the amino acids of interest are assigned a value of 1 and all others a value of 0. We consider individual amino acids and sets of various combinations of them based upon hydrophobicity (4 sets), charge (3 sets), volume (4 sets), and secondary structure propensity (3 sets). We find that any sequence chosen randomly has a 90% or greater chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. We regard this as strong support for the random-origin hypothesis. However, we do observe significant deviations from the random expectation, as might be expected after billions of years of evolution. Two important global trends are found: (1) Amino acids with a strong α-helix propensity show a strong tendency to cluster whereas those with β-sheet or reverse-turn propensity do not. (2) Clustered rather than evenly distributed patterns tend to be preferred by the individual amino acids and this is particularly so for methionine. Finally, we consider the problem of reconciling the random nature of protein sequences with structurally meaningful periodic “patterns” that can be detected by sliding-window, autocorrelation, and Fourier analyses. Two examples, rhodopsin and bacteriorhodopsin, show that such patterns are a natural feature of random sequences.
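
The binary-representation run test can be sketched as follows. The standardisation used here is the common Wald-Wolfowitz form (observed number of runs minus its expectation, divided by its standard deviation); this is an assumption and may differ in detail from the r_o statistic of Mood cited in the abstract. The function names are hypothetical.

```python
import math


def binary_image(seq, group):
    """Assign 1 to residues of interest and 0 to all others."""
    return [1 if aa in group else 0 for aa in seq]


def runs_z_score(binary):
    """Standardised number of runs in a 0/1 sequence.

    Negative values indicate clustering (fewer runs than expected at random);
    positive values indicate an unusually even, alternating arrangement.
    """
    n1 = sum(binary)
    n0 = len(binary) - n1
    n = n1 + n0
    if n1 == 0 or n0 == 0:
        raise ValueError("both symbols must occur at least once")

    # A new run starts at every change of symbol.
    runs = 1 + sum(1 for a, b in zip(binary, binary[1:]) if a != b)
    mean = 2 * n1 * n0 / n + 1
    variance = 2 * n1 * n0 * (2 * n1 * n0 - n) / (n * n * (n - 1))
    return (runs - mean) / math.sqrt(variance)


# Methionine clustered at one end versus methionine alternating with glycine.
print(runs_z_score(binary_image("MMMMMGGGGG", {"M"})))
print(runs_z_score(binary_image("MGMGMGMGMG", {"M"})))
```

For sequences as short as these toy inputs the normal approximation is crude; the example only shows the sign convention.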

14.
Methods to determine periodicity in protein sequences are useful for inferring function. Fourier transformation is one approach, but care is required to ensure the periodicity is genuine. Here we have shown that empirically derived statistical tables can be used as a measure of significance. Genuine protein sequence data rather than randomly generated sequences were used as the statistical backdrop. The method has been applied to G-protein coupled receptor (GPCR) sequences, by Fourier transformation of hydrophobicity values, codon frequencies and the extent of over-representation of codon pairs, the latter being related to translational step times. Genuine periodicity was observed in the hydrophobicity whereas the apparent periodicity (as inferred from previously reported measures) in the translation step times was not validated statistically. GCR2 has recently been proposed as the plant GPCR receptor for the hormone abscisic acid. It has homology to the Lanthionine synthetase C-like family of proteins, an observation confirmed by fold recognition. Application of the Fourier transform algorithm to the GCR2 family revealed strongly predicted seven-fold periodicity in hydrophobicity, suggesting why GCR2 has been reported to be a GPCR, despite negative indications in most transmembrane prediction algorithms. The underlying multiple sequence alignment, also required for the Fourier transform analysis of periodicity, indicated that the hydrophobic regions around the 7 GXXG motifs commence near the C-terminal end of each of the 7 inner helices of the alpha-toroid and continue to the N-terminal region of the helix. The results clearly explain why GCR2 has been understandably but erroneously predicted to be a GPCR.

15.
It has been suggested (Doolittle et al., 1977) that portions of the α-, β- and γ-chains of fibrinogen form a coiled-coil rope of α-helices and that this rope connects globular domains of the molecule. A fast Fourier transform analysis of the relevant amino acid sequences has shown that there is a significant 3.5-residue period in the linear disposition of the apolar residues in all three chains. This periodicity is characteristic of amino acid sequences of α-fibrous proteins, such as α-tropomyosin and α-keratin, where the tertiary structure is closely related to a coiled-coil of α-helices. However, a detailed study of the fibrinogen sequences shows that the structure is likely to contain several regions which do not have a simple secondary structure. The detailed conformation of the postulated rodlike region of fibrinogen is therefore complex and may approximate a coiled-coil only over relatively short lengths. An important question to emerge from this analysis is whether correct positioning of apolar residues in a pseudo-repeating heptad is sufficiently important to override the low α-helix-favouring potential of other residues in the heptad.

16.
17.
D S Horne. Biopolymers 1988, 27(3):451-477
It is demonstrated that protein α-helix content can be predicted from an autocorrelation analysis of the protein hydrophobicity sequence. The Fourier transform of the autocorrelation function yields the spectral densities or weights of the various frequencies contributing to the autocorrelation function. Using sequence and secondary structure data from more than 160 proteins and domains, a linear relationship was found between spectral density at periodicity 3.7 and protein α-helix content (r = 0.83). This relation permits prediction of the helix content (x) of proteins of known sequence to within ± 15%, i.e., as (x ± 15)%. Predictions based on the autocorrelation procedure are compared with values obtained by other methods.

18.
The amino acid sequences of fragments from light meromyosin and heavy meromyosin subfragment-2 have been analysed and structural features noted. As with other α-fibrous protein sequences, there is a regular disposition of apolar residues in positions a and d of the heptapeptide-type repeat characteristic of the coiled-coil conformation. The common occurrence of acidic and basic residues in the e and g positions, respectively, gives rise to a maximum number of interchain ionic interactions when the two parallel chains of myosin are in axial register. Although the quasi-repeating heptapeptides in the sequences both have two points of discontinuity (unlike that in most other α-fibrous proteins), secondary structure prediction methods indicate that the fragments will be 90 to 100% α-helical. Fast Fourier transform techniques have revealed a significant periodicity of about 27.4 ± 0.3 residues (~41 Å) in the linear disposition of the acidic residues and the basic residues in both of the fragments. This period is compatible with similarly directed myosin molecules in the thick filament being axially staggered with respect to one another by an odd multiple of 143 Å. Preliminary evidence is also presented to show that the sequence of the rod region of myosin may have a 28-residue gene duplication repeat.

19.
Liang HK, Huang CM, Ko MT, Hwang JK. Proteins 2005, 59(1):58-63
Structural analysis is useful in elucidating structural features responsible for enhanced thermal stability of proteins. However, due to the rapid increase of sequenced genomic data, there are far more protein sequences than corresponding three-dimensional (3D) structures. The usual sequence-based amino acid composition analysis provides useful but simplified clues about the amino acid types related to thermal stability of proteins. In this work, we developed a statistical approach to identify the significant amino acid coupling sequence patterns in thermophilic proteins. The amino acid coupling sequence pattern is defined as any 2 types of amino acids separated by 1 or more amino acids. Using this approach, we construct the rho profiles for the coupling patterns. The rho value gives a measure of the relative occurrence of a coupling pattern in thermophiles compared with mesophiles. We found that thermophiles and mesophiles exhibit significant bias in their amino acid coupling patterns. We showed that such bias is mainly due to temperature adaptation instead of species or GC content variations. Though no single outstanding coupling pattern can adequately account for protein thermostability, we can use a group of amino acid coupling patterns having strong statistical significance (p values < 10^-7) to distinguish between thermophilic and mesophilic proteins. We found a good correlation between the optimal growth temperatures of the genomes and the occurrences of the coupling patterns (the correlation coefficient is 0.89). Furthermore, we can separate the thermophilic proteins from their mesophilic orthologs using the amino acid coupling patterns. These results may be useful in the study of the enhanced stability of proteins from thermophiles, especially when structural information is scarce. Proteins 2005. © 2005 Wiley-Liss, Inc.

20.
Amphiphilic alpha-helices play a major role in membrane dependent processes and are manifested in the primary structure of a protein by the periodic appearance of hydrophobic residues. Based on these periodic sequences, the hydrophobic moment, <μH>, was introduced, which essentially treats the hydrophobicity of amino acid residues as a two-dimensional vector sum and provides a measure of amphiphilicity within regular repeat structures. To identify putative amphiphilic alpha-helix forming sequences, hydrophobic moment analysis assumes an amino acid residue periodicity of 100° and scans protein primary structures to find the 11-residue window with maximal <μH>. Taken with the window's mean hydrophobicity, <H>, hydrophobic moment plot analysis uses the coordinate pair [<μH>, <H>] to classify alpha-helices as either surface active, globular or transmembrane. More recently, this latter analysis has been extended to recognize candidate oblique orientated alpha-helices. Here, the hydrophobic moment is reviewed and data to query the logic of using a fixed window length and a fixed residue angular periodicity in hydrophobic moment analysis are provided. In addition, problems associated with the use of such analysis to predict alpha-helix structure/function relationships are considered.
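
The vector-sum idea behind the hydrophobic moment can be sketched in a few lines: each residue contributes a vector whose length is its hydrophobicity and whose direction advances by a fixed angle (100° for an ideal α-helix) per residue. The function names below are hypothetical, the input is assumed to be a list of per-residue hydrophobicity values from whatever scale the reader prefers, and dividing by the window length to obtain a mean moment is an assumption of this example.

```python
import math


def hydrophobic_moment(values, angle_deg=100.0):
    """Magnitude of the 2-D vector sum of per-residue hydrophobicities."""
    delta = math.radians(angle_deg)
    sin_sum = sum(h * math.sin(n * delta) for n, h in enumerate(values))
    cos_sum = sum(h * math.cos(n * delta) for n, h in enumerate(values))
    return math.hypot(sin_sum, cos_sum)


def best_window(values, window=11, angle_deg=100.0):
    """Scan all windows and return (start, mean moment, mean hydrophobicity)
    for the window with the largest mean hydrophobic moment."""
    best_start, best_moment = 0, -1.0
    for start in range(len(values) - window + 1):
        segment = values[start:start + window]
        mean_moment = hydrophobic_moment(segment, angle_deg) / window
        if mean_moment > best_moment:
            best_start, best_moment = start, mean_moment
    segment = values[best_start:best_start + window]
    return best_start, best_moment, sum(segment) / window


# Toy profile: an ideally amphiphilic 11-residue stretch flanked by neutral residues.
amphiphilic = [math.cos(math.radians(100.0 * k)) for k in range(11)]
profile = [0.0] * 5 + amphiphilic + [0.0] * 5
print(best_window(profile))
```

Both `window` and `angle_deg` are exposed as parameters because the review questions the practice of fixing them at 11 residues and 100°.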
