Similar literature
 Found 20 similar documents (search time: 31 ms)
1.
Latent amino acid repeats seem to be widespread in genetic sequences and to reflect their structure, function, and evolution. We have recently identified latent periodicity in more than 150 protein families, including protein kinases and various nucleotide-binding proteins. The latent repeats in these families were correlated with their structure and evolution. However, latent periodicity was not detected in the majority of known protein families by our search algorithm. The most likely reason was the inability of our techniques to identify periodicities interspersed with insertions and deletions. We designed a new latent periodicity search algorithm that is capable of taking insertions and deletions into account. As a result, we identified many novel cases of latent periodicity peculiar to protein families. Possible origins of the periodic structure of these families are discussed. In summary, we presume that latent periodicity is present in a substantial portion of known protein families. The latent periodicity matrices and the results of Swiss-Prot scans are available from http://bioinf.narod.ru/del/.

2.
A method of noise decomposition has been developed. This method allows for the identification of a latent periodicity with symbol insertions and deletions that is specific for all or most amino acid sequences belonging to the same protein family or protein domain. The latent periodicity has been identified in catalytic domains of 85% of serine/threonine and tyrosine protein kinases. Similar results have been obtained for 22 other protein families. The possible role of latent periodicity in protein families is discussed. Translated from Molekulyarnaya Biologiya, Vol. 39, No. 3, 2005, pp. 420–436. Original Russian Text Copyright © 2005 by Laskin, Kudryashov, Skryabin, Korotkov.

3.
The information decomposition (ID) method has been used for searching dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14 766 sequences with a periodicity of two nucleotides have been found. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80 396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.

4.
This article concerns the investigation of protein sequences and, specifically, their periodicity. The notion of latent periodicity is introduced, and a mathematical method for searching for latent periodicity in protein sequences is developed. Application of the method to known cases of perfect and imperfect periodicity is demonstrated. The method reveals latent periodicity in many protein sequences from the SWISS-PROT data bank, and examples of latent periodicity of amino acid sequences are demonstrated for: the translation initiation factor EIF-2B (epsilon subunit) of Saccharomyces cerevisiae from the E2BE_YEAST sequence; the E. coli ferrienterochelin receptor from the FEPA_ECOLI sequence; the lysozyme of bacteriophage SF6 from the LY_BPSF6 sequence; and lipoamide dehydrogenase of Azotobacter vinelandii from the DLDH_AZOVI sequence. These protein sequences have latent periods equal to six, two, seven and 19 amino acids, respectively. We propose that a possible purpose of the latent periodicity of amino acid sequences is to determine certain protein structures.

5.
The information decomposition (ID) method has been used for searching dinucleotide periodicities, including latent ones, in plant genomes. In nucleotide sequences of genomes of various plants from the GenBank database, 14 766 sequences with a periodicity of two nucleotides have been found at a high level of statistical significance. Classification of the periodicity matrices of the detected DNA sequences has yielded 141 classes of dinucleotide periodicity. Since ID does not detect periodicities with nucleotide deletions or insertions, modified profile analysis (MPA) has been applied to the obtained classes to reveal DNA sequences with dinucleotide periodicities containing nucleotide deletions and insertions. Combined use of ID and MPA has permitted the detection of 80 396 DNA sequences with dinucleotide periodicities in the genomes of various plants. The biological role of dinucleotide periodicity in the detected sequences is discussed.

6.
A web server for searching for latent periodicity, based on the method of modified profile analysis, has been developed. This method allows latent periodicity to be found in the presence of insertions and deletions. The search uses periodicity classes that we found earlier for various groups of organisms. Period lengths range from 2 to 20 nt, excluding triplet periodicity. The results obtained are subjected to various filtration steps to ensure their statistical significance. Availability: Use of the web server is free for non-commercial users. No registration is required. The URL of the server is http://victoria.biengi.ac.ru/lepscan. The current software version is 1.06.

7.
A previously reported method for revealing latent periodicity in nucleotide sequences has been considerably modified for the case of small samples by applying a Monte Carlo method. This improved method has been used to search for latent periodicity in some nucleotide sequences of the EMBL data bank. The existence of latent periodicity in nucleotide sequences has been shown for some genes. The results obtained imply that the periodicity of gene structure is projected onto the periodicity of primary amino acid sequences and, further, onto spatial protein conformation. Even though the periodic structure of gene sequences has been eroded, it is still retained in the primary and/or spatial structures of the corresponding proteins. Furthermore, in a few cases the study of gene periodicity has suggested a possible evolutionary origin of the genes by multiple duplications of gene fragments.

8.
Latent sequence periodicity of some oncogenes and DNA-binding protein genes   Cited by: 2 (self-citations: 0, citations by others: 2)
A method of latent periodicity search is developed. We use mutual information to reveal the latent periodicity of mRNA sequences. The latent periodicity of an mRNA sequence is a periodicity with a low level of similarity between any two periods inside the mRNA sequence. The mutual information between an artificial numerical sequence and an mRNA sequence is calculated. The length of the artificial sequence period is varied from 2 to 150. The high level of the mutual information between artificial and mRNA sequences allows us to find any type of latent periodicity of mRNA sequence. The latent periodicity of many mRNA coding regions has been found. For example, the retinoblastoma gene of HSRBS clone contains a region with a latent period equal to 45 bases. The A-RAF oncogene of HSARAFIR clone contains a region with a latent period equal to 84 bases. Integrated sequences for the regions with latent periodicity are determined. The potential significance of latent periodicity is discussed.
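
The mutual-information scan described in this abstract can be illustrated with a minimal sketch. It assumes the artificial numerical sequence is simply the position index 0, 1, ..., T-1 repeated along the sequence, and it reports raw mutual information in bits; the abstract does not give the authors' exact formula or significance test, so the function names `mutual_information` and `scan_periods` and the normalization are illustrative assumptions only.

```python
import math
from collections import Counter


def mutual_information(seq: str, period: int) -> float:
    """Mutual information (bits) between the symbols of seq and the
    positions 0..period-1 of an artificial periodic index sequence.
    Assumption: raw MI, with no small-sample correction."""
    n = len(seq)
    joint = Counter((i % period, ch) for i, ch in enumerate(seq))
    position_counts = Counter(i % period for i in range(n))
    symbol_counts = Counter(seq)
    mi = 0.0
    for (pos, ch), count in joint.items():
        expected = position_counts[pos] * symbol_counts[ch] / n
        mi += (count / n) * math.log2(count / expected)
    return mi


def scan_periods(seq: str, min_period: int = 2, max_period: int = 150) -> dict:
    """Score every candidate period; the range 2..150 follows the abstract.
    Periods longer than half the sequence are skipped (an assumption)."""
    upper = min(max_period, len(seq) // 2)
    return {t: mutual_information(seq, t) for t in range(min_period, upper + 1)}
```

For the perfect repeat `"ACGT" * 10`, `mutual_information` gives 2.0 bits at period 4 and 1.0 bit at period 2. Multiples of the true period score equally high, and raw mutual information grows with the period length in short sequences, so a real scan would need a significance estimate before periods are compared.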

9.
Insertions and deletions of nucleotides in the genes encoding the variable domains of antibodies are natural components of the hypermutation process, which may expand the available repertoire of hypervariable loop lengths and conformations. Although insertion of amino acids has also been utilized in antibody engineering, little is known about the functional consequences of such modifications. To investigate this further, we have introduced single-codon insertions and deletions as well as more complex modifications in the complementarity-determining regions of human antibody fragments with different specificities. Our results demonstrate that single amino acid insertions and deletions are generally well tolerated and permit production of stably folded proteins, often with retained antigen recognition, despite the fact that the thus modified loops carry amino acids that are disallowed at key residue positions in canonical loops of the corresponding length or are of a length not associated with a known canonical structure. We have thus shown that single-codon insertions and deletions can efficiently be utilized to expand structure and sequence space of the antigen-binding site beyond what is encoded by the germline gene repertoire.

10.
The amino acid sequences of some fiber proteins may have a periodic structure. To analyze this periodicity, a Fourier transform of a mathematical image of the symbolic amino acid sequence of a protein is sometimes used. In this work we employed one of the few possible ways of performing the Fourier transform, which we consider the most straightforward and optimal. Using this method, we analyzed the periodicity of fiber proteins in bacteriophage T4 and confirmed that a certain periodicity exists in the investigated proteins. For a number of proteins, elements of the same group alternate in the amino acid sequence with a rather small period T = 15, whereas for some other proteins the alternations have small periods of 10 and 8. The new result is the discovery of relatively large periods of amino acid alternation, which divide the amino acid sequence of the protein into 4 or 6 equal parts. These data on amino acid periodicity allowed us to align the amino acid sequences in accordance with the established periods of both types, in agreement with certain results obtained in X-ray crystallography and electron microscopy experiments.
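
One simple way to Fourier-analyze a symbolic sequence is sketched below. It assumes the "mathematical image" is a 0/1 indicator of membership in a chosen amino acid group; the abstract does not specify the encoding actually used, so the function name `group_power_at_period`, the mean subtraction, and the normalization by sequence length are all illustrative assumptions.

```python
import cmath


def group_power_at_period(seq: str, group: set, period: float) -> float:
    """Spectral power of a group-membership indicator at frequency 1/period.

    Each residue is mapped to 1.0 if it belongs to `group` and 0.0 otherwise;
    the mean is subtracted so that composition alone contributes no power.
    """
    n = len(seq)
    indicator = [1.0 if ch in group else 0.0 for ch in seq]
    mean = sum(indicator) / n
    amplitude = sum(
        (x - mean) * cmath.exp(-2j * cmath.pi * k / period)
        for k, x in enumerate(indicator)
    )
    return abs(amplitude) ** 2 / n


# Toy sequence in which "A" recurs every 5 residues (not real protein data).
toy = ("A" + "G" * 4) * 6
power_at_5 = group_power_at_period(toy, {"A"}, 5)
power_at_4 = group_power_at_period(toy, {"A"}, 4)
```

In this toy case the power at period 5 is much larger than at period 4, which is the kind of peak that would flag a candidate period. Grouping residues by physicochemical class before building the indicator is one plausible reading of "elements of the same group" in the abstract.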

11.
During membrane fusion, the influenza A virus hemagglutinin (HA) adopts an extended helical structure that contains the viral transmembrane and fusion peptide domains at the same end of the molecule. The peptide segments that link the end of this rod-like structure to the membrane-associating domains are approximately 10 amino acids in each case, and their structure at the pH of fusion is currently unknown. Here, we examine mutant HAs and influenza viruses containing such HAs to determine whether these peptide linkers are subject to specific length requirements for the proper folding of native HA and for membrane fusion function. Using pairwise deletions and insertions, we show that the region flanking the fusion peptide appears to be important for the folding of the native HA structure but that mutant proteins with small insertions can be expressed on the cell surface and are functional for membrane fusion. HA mutants with deletions of up to 10 residues and insertions of as many as 12 amino acids were generated for the peptide linker to the viral transmembrane domain, and all folded properly and were expressed on the cell surface. For these mutants, it was possible to designate length restrictions for efficient membrane fusion, as functional activity was observed only for mutants containing linkers with insertions or deletions of eight residues or fewer. The linker peptide mutants are discussed with respect to requirements for the folding of native HAs and length restrictions for membrane fusion activity.

12.
To detect latent periodicity in protein families responsible for various biological functions, the methods of information decomposition, cyclic profile alignment, and noise decomposition have been used. Latent periodicity specific to a particular family was recognized in 94 of 110 analyzed protein families. Family-specific periodicity was found for more than 70% of amino acid sequences in each of these families. Based on such sequences, the characteristic profile of the latent periodicity has been deduced for each family. The possible relationship between the recognized latent periodicity, the evolution of proteins, and their structural organization is discussed.

13.
It is established that the sequences of all the different proteins of the E. coli ribosome, as well as two protein biosynthesis initiation factors, two ribosome-associated DNA-binding proteins, and the elongation factor EF-Tu from the same source, possess a periodicity that is more weakly expressed than, and different from, that found earlier for a number of proteins representing 18 superfamilies. The statistical significance of the observed periodicity was checked by comparing the area below the periodicity curve of every protein examined with that of computer-generated sequences having the same amino acid composition and length. The results for the proteins of the small and large ribosomal subunits are compared. The conclusions support and supplement the concept of a trend in protein molecular evolution from universal (Gly, Ala) to specialized (Phe, Tyr, Trp, Cys) amino acids.
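
The significance check described here, comparison against computer-generated sequences with the same composition and length, can be sketched as a shuffling test. The statistic below (`lag_matches`, a count of positions that match the residue a fixed lag later) is a hypothetical stand-in for the "area below the periodicity curve", whose definition the abstract does not give; `shuffle_p_value` and its parameters are likewise illustrative.

```python
import random
from typing import Callable


def lag_matches(seq: str, lag: int) -> int:
    """Hypothetical periodicity statistic: positions i with seq[i] == seq[i + lag]."""
    return sum(1 for i in range(len(seq) - lag) if seq[i] == seq[i + lag])


def shuffle_p_value(
    seq: str,
    statistic: Callable[[str], float],
    n_shuffles: int = 1000,
    seed: int = 0,
) -> float:
    """Empirical p-value of statistic(seq) against shuffled copies of seq.

    Shuffling preserves both the composition and the length of the sequence.
    The +1 terms count the observed sequence itself, so the p-value is never 0.
    """
    rng = random.Random(seed)
    observed = statistic(seq)
    symbols = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(symbols)
        if statistic("".join(symbols)) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)


# Usage on a toy sequence with an exact period of 2.
p = shuffle_p_value("AG" * 20, lambda s: lag_matches(s, 2), n_shuffles=200)
```

Any other periodicity score can be passed in as `statistic`. The number of shuffles bounds the smallest p-value that can be reported, which is 1 / (n_shuffles + 1).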

14.
Runs of identical amino acids encoded by triplet repeats (homopolymers) are components of numerous proteins, yet their role is poorly understood. Large numbers of homopolymers are present in the Drosophila melanogaster mastermind (mam) protein surrounding several unique charged amino acid clusters. Comparison of mam sequences from D. virilis and D. melanogaster reveals a high level of amino acid conservation in the charged clusters. In contrast, significant divergence is found in repetitive regions resulting from numerous amino acid replacements and large insertions and deletions. It appears that repetitive regions are under less selective pressure than unique regions, consistent with the idea that homopolymers act as flexible spacers separating functional domains in proteins. Notwithstanding extensive length variation in intervening homopolymers, there is extreme conservation of the amino acid spacing of specific charge clusters. The results support a model where homopolymer length variability is constrained by natural selection. Correspondence to: B. Yedvobnick

15.
Internal repeats in protein sequences have wide-ranging implications for the structure and function of proteins. A careful analysis of the repeats in protein sequences may help us to better understand the structural organization of proteins and their evolutionary relations. In this paper, a mathematical method for searching for latent periodicity in protein sequences is developed. Using this method, we identified simple sequence repeats in the alkaline proteases and found that the sequences could show the same periodicity as their tertiary structures. This result may help to reduce difficulties in the study of the relationship between sequences and their structures.

16.
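A preliminary study of coding sequences lacking 3-base periodicity   Cited by: 4 (self-citations: 0, citations by others: 4)
Analysis of the Fourier spectra of 120 relatively short coding sequences (< 1 200 bp) shows that 3-base periodicity is not always present in short coding sequences. Statistical analysis suggests that whether a coding sequence exhibits 3-base periodicity is related to the base composition and distribution of the sequence, to the choice and order of amino acids in the encoded protein, and to synonymous codon usage. In general, the A+U content is higher than the G+C content in non-period-3 sequences, whereas the opposite holds for period-3 sequences; bases are distributed more evenly across the three codon positions in non-period-3 sequences than in period-3 sequences; and the codon and amino acid usage biases of non-period-3 sequences are weaker than those of period-3 sequences. These phenomena should be taken fully into account when Fourier analysis is used to predict genes and exons in DNA sequences.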

17.
Amino acid similarity often needs to be considered in DNA sequence comparison to elucidate gene functions. We propose a Smith-Waterman-like algorithm which considers amino acid similarity and insertions/deletions in sequences at the DNA level and at the protein level in a hybrid manner. The algorithm is applied to cDNA sequences of Oryza sativa and those of Arabidopsis thaliana. The results are compared with the results of application of NCBI's tblastx program (which compares the sequences in the BLAST manner after translation). It is shown that the present algorithm is very helpful in discovering nucleotide insertions/deletions originating from experimental errors as well as amino acid insertions/deletions due to evolutionary reasons.
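
For orientation, the classic Smith-Waterman recurrence that the proposed algorithm builds on can be sketched as follows. This is plain local alignment over a single alphabet with a linear gap penalty, not the hybrid DNA/protein scoring of the paper, which the abstract does not describe in detail; the function name and the scoring parameters (`match`, `mismatch`, `gap`) are arbitrary illustrative assumptions.

```python
def smith_waterman_score(
    a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2
) -> int:
    """Best local alignment score between a and b (linear gap penalty).

    Only two rows of the dynamic-programming table are kept, so memory use
    is O(len(b)); the alignment itself is not reconstructed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score = max(
                0,                # start a new local alignment here
                diagonal,         # align a[i-1] with b[j-1]
                prev[j] + gap,    # gap in b
                cur[j - 1] + gap, # gap in a
            )
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


# Usage: the shared core "ACGT" is found despite unrelated flanking bases.
score = smith_waterman_score("TTACGTTT", "GGACGTGG")
```

A hybrid scheme of the kind the abstract mentions would presumably add transitions that score translated codons with an amino acid similarity matrix alongside nucleotide-level gaps, but that extension is not attempted here.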

18.
As a protein evolves, not every part of the amino acid sequence has an equal probability of being deleted or of allowing insertions, because not every amino acid plays an equally important role in maintaining the protein structure. However, the most prevalent models in fold recognition methods treat every amino acid deletion and insertion as equally probable events. We have analyzed the alignment patterns for homologous and analogous sequences to determine patterns of insertion and deletion, and used that information to determine the statistics of insertions and deletions for different amino acids of a target sequence. We define these patterns as insertion/deletion (indel) frequency arrays (IFAs). By applying IFAs to the protein threading problem, we have been able to improve the alignment accuracy, especially for proteins with low sequence identity. We have also demonstrated that the application of this information can lead to an improvement in fold recognition.

19.
A series of yeast mitochondrial mit- mutants with defects in the oli2 gene, coding for subunit 6 of the mitochondrial ATPase complex, has been analyzed at the DNA sequence level. Fifteen of sixteen primary mit- mutants were shown to contain frameshift or nonsense mutations predicting truncated subunit 6 polypeptides, in various strains ranging from about 20% to 95% of the wild-type length of 259 amino acids. In only one strain could the defect in subunit 6 function be assigned to amino acid substitution in an otherwise full-length subunit 6. Many mutants carried multiple base substitutions or insertions/deletions, presumably arising from the manganese chloride mutagenesis treatment. Revertants from three of the mit- mutants were analyzed: all contained full-length subunit 6 proteins with one or more amino acid substitutions. The preponderance of truncated proteins as opposed to substituted full-length proteins in oli2 mit- mutants is suggested to reflect the ability of subunit 6 to accommodate amino acid substitutions at many locations, with little or no change in its functional properties in the membrane FO-sector of the ATPase complex.

20.
Archaea-specific radA primers were used with PCR to amplify fragments of radA genes from 11 cultivated archaeal species and one marine sponge tissue sample that contained essentially an archaeal monoculture. The amino acid sequences encoded by the PCR fragments, three RadA protein sequences previously published (21), and two new complete RadA sequences were aligned with representative bacterial RecA proteins and eucaryal Rad51 and Dmc1 proteins. The alignment supported the existence of four insertions and one deletion in the archaeal and eucaryal sequences relative to the bacterial sequences. The sizes of three of the insertions were found to have taxonomic and phylogenetic significance. Comparative analysis of the RadA sequences, omitting amino acids in the insertions and deletions, shows a cladal distribution of species which mimics to a large extent that obtained by a similar analysis of archaeal 16S rRNA sequences. The PCR technique also was used to amplify fragments of 15 radA genes from uncultured natural sources. Phylogenetic analysis of the amino acid sequences encoded by these fragments reveals several clades with affinity, sometimes only distant, to the putative RadA proteins of several species of Crenarchaeota. The two most deeply branching archaeal radA genes found had some amino acid deletion and insertion patterns characteristic of bacterial recA genes. Possible explanations are discussed. Finally, signature codons are presented to distinguish among RecA protein family members.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.)  ICP license: 京ICP备09084417号