Similar literature
 20 similar articles found (search time: 15 ms)
1.
Latent amino acid repeats seem to be widespread in genetic sequences and to reflect their structure, function, and evolution. We have recently identified latent periodicity in more than 150 protein families, including protein kinases and various nucleotide-binding proteins. The latent repeats in these families were correlated with their structure and evolution. However, a majority of known protein families were not identified with our latent periodicity search algorithm. The presumed main reason for this was the inability of our techniques to identify periodicities interspersed with insertions and deletions. We designed a new latent periodicity search algorithm that is capable of taking insertions and deletions into account. As a result, we identified many novel cases of latent periodicity peculiar to protein families. Possible origins of the periodic structure of these families are discussed. In summary, we presume that latent periodicity is present in a substantial portion of known protein families. The latent periodicity matrices and the results of Swiss-Prot scans are available from http://bioinf.narod.ru/del/.

2.
Huang Y  Xiao Y 《Proteins》2007,68(1):267-272
Protein folds may evolve from short peptide ancestors via gene duplication and fusion. For proteins with internal structural symmetry, this means that their sequences should be made up of identical repeats. However, many of these repeat signals can so far be seen only at the structural level. Motivated by the fact that proteins may have similar structures if their sequences have more than 25% identical amino acids, we suggest a method to detect the sequence repeats of proteins directly from their sequences. Using this method, we show that the internal repetitions of the immunoglobulin folds could be identified directly at the sequence level.
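The abstract does not describe the detection method itself, so the following is only a minimal sketch of the general idea of finding internal repeats by self-comparison. It assumes an ungapped, exhaustive comparison of fixed-length windows and borrows the 25% identity figure from the abstract as a threshold; the function name `best_internal_repeat` and its parameters are hypothetical.

```python
def best_internal_repeat(seq, window, min_identity=0.25):
    """Return (identity, i, j) for the most similar pair of non-overlapping
    windows of length `window`, or None if no pair reaches `min_identity`.

    Assumption: ungapped comparison, identity = fraction of equal residues.
    """
    best = (0.0, -1, -1)
    n = len(seq)
    for i in range(n - 2 * window + 1):
        first = seq[i:i + window]
        # The second window starts at or after the end of the first one.
        for j in range(i + window, n - window + 1):
            second = seq[j:j + window]
            matches = sum(a == b for a, b in zip(first, second))
            identity = matches / window
            if identity > best[0]:
                best = (identity, i, j)
    return best if best[0] >= min_identity else None


# Two copies of a 10-residue segment, one substitution, separated by "PP".
example = "ACDEFGHIKL" + "PP" + "ACDEYGHIKL"
print(best_internal_repeat(example, window=10))  # (0.9, 0, 12)
```

The scan is quadratic in sequence length, which is acceptable for single proteins; a realistic detector would also need gaps and a substitution matrix, neither of which is attempted here.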

3.
An earlier reported method for revealing latent periodicity of nucleotide sequences has been considerably modified for the case of small samples by applying a Monte Carlo method. This improved method has been used to search for latent periodicity in some nucleotide sequences of the EMBL data bank. The existence of latent periodicity has been shown for some genes. The results obtained imply that periodicity of gene structure is projected onto the periodicity of primary amino acid sequences and, further, onto spatial protein conformation. Even though the periodic structure of gene sequences has been eroded, it is still retained in the primary and/or spatial structures of the corresponding proteins. Furthermore, in a few cases the study of gene periodicity has suggested a possible evolutionary origin by multifold duplication of gene fragments.
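How the Monte Carlo step is applied is not spelled out in the abstract. As a rough illustration of the general technique, the sketch below estimates the significance of a candidate period by comparing an observed statistic with its distribution over shuffled copies of the same sequence. The statistic used here (the number of positions that match the base one period downstream) is an assumption chosen for simplicity, not the statistic of the original method, and both function names are hypothetical.

```python
import random


def shift_matches(seq, period):
    """Count positions i where seq[i] equals seq[i + period]."""
    return sum(a == b for a, b in zip(seq, seq[period:]))


def monte_carlo_p_value(seq, period, trials=1000, seed=0):
    """Estimate how often a shuffled sequence scores at least as high.

    Shuffling preserves base composition and destroys positional order,
    which makes it a usable null model for short sequences.
    """
    rng = random.Random(seed)
    observed = shift_matches(seq, period)
    letters = list(seq)
    hits = 0
    for _ in range(trials):
        rng.shuffle(letters)
        if shift_matches(letters, period) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (trials + 1)


gene = "ACG" * 20  # toy sequence with a perfect period of 3
print(shift_matches(gene, 3), monte_carlo_p_value(gene, 3, trials=200))
```

Because the null distribution is generated from the sequence itself, no asymptotic approximation is needed, which is the usual reason for preferring this approach on small samples.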

4.
Repetitiousness is often observed in the primary and tertiary structures of proteins. We are intrigued by the potential role played by periodicity in the evolution of proteins and have created artificial repetitious proteins from repeats of short DNA sequences (microgenes). In this paper we characterize the physicochemical properties of six such artificially created proteins, which are the translated products of repeats of three microgenes. Three of the six proteins contain beta-sheet-like structures and are rather hydrophobic in nature. These proteins form macroscopic membranous structures in the presence of monovalent cations, suggesting they have the capacity to promote strong intermolecular interactions. Of the other three proteins, one is composed of alpha-helices and two have disordered structures. Small angle X-ray scattering analysis indicates that the artificial proteins do not fold as tightly as natural proteins, but are more compact than if completely denatured. One alpha-helical protein whose microgene unit was designed from coiled coil proteins was crystallized, demonstrating that repetitious artificial proteins can undergo transition to a more ordered state under appropriate conditions. Application of this approach to the development of a novel protein engineering system is discussed.

5.
The amino acid sequences of some fiber proteins may have a periodic structure. To analyze this periodicity, a Fourier transform of a mathematical representation of the symbolic amino acid sequence of a protein is sometimes used. In this work we employed one particular way (out of a few possible) of performing the Fourier transform, chosen as the most straightforward and optimal. Using this Fourier transform method, we analyzed the periodicity of fiber proteins in bacteriophage T4 and confirmed that a certain periodicity exists in the investigated proteins. For a number of proteins, elements of the same group alternate in the amino acid sequence with a rather small period T = 15, whereas for some other proteins the alternations have small periods of 10 and 8. The new result is the discovery of relatively large periods of amino acid alternation, which divide the amino acid sequence of the protein into 4 or 6 equal parts. These data on amino acid periodicity allowed us to align amino acid sequences in accordance with the established periods of both types, in agreement with certain results obtained in X-ray crystallography and electron microscopy experiments.
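The abstract does not say which numerical encoding or which variant of the Fourier transform was chosen. The sketch below shows one common way to set up such an analysis, under the assumption that the sequence is encoded as a 0/1 indicator of membership in a residue group and analyzed with a plain discrete Fourier transform. The group used in the example and the names `power_spectrum` and `dominant_period` are hypothetical.

```python
import cmath


def power_spectrum(indicator):
    """Squared DFT magnitude of a mean-centred sequence for k = 1 .. N // 2."""
    n = len(indicator)
    mean = sum(indicator) / n
    centred = [x - mean for x in indicator]
    spectrum = {}
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(centred)
        )
        spectrum[k] = abs(coeff) ** 2
    return spectrum


def dominant_period(seq, group):
    """Period (in residues) of the strongest spectral peak for `group`."""
    indicator = [1.0 if aa in group else 0.0 for aa in seq]
    spectrum = power_spectrum(indicator)
    k = max(spectrum, key=spectrum.get)
    return len(seq) / k


# Toy sequence: 7 leucines followed by 8 glycines, repeated 8 times.
toy = ("L" * 7 + "G" * 8) * 8
print(dominant_period(toy, group={"L"}))  # 15.0
```

The direct sum costs O(N^2) operations, which is fine for single proteins. Frequency index k corresponds to period N / k, so a peak at a small k, such as 4 or 6, would correspond to a long period that divides the sequence into that many equal parts.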

6.
To detect the latent periodicity of protein families responsible for various biological functions, the methods of information decomposition, cyclic profile alignment, and noise decomposition have been used. Latent periodicity specific to a particular family is recognized in 94 of 110 analyzed protein families. Family-specific periodicity was found for more than 70% of amino acid sequences in each of these families. Based on such sequences, the characteristic profile of the latent periodicity has been deduced for each family. A possible relationship between the recognized latent periodicity, the evolution of proteins, and their structural organization is discussed.

7.
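Xiao Yi  Feng Jianhui  Huang Yanzhao 《生命科学》2010,(11):1129-1137
From an evolutionary standpoint, the symmetry of protein structures is the result of gene duplication and fusion. However, because of amino acid mutations accumulated over long-term evolution, the vast majority of present-day protein sequences have lost this overt repetitiveness. This paper briefly reviews methods developed internationally for finding repeated segments in protein sequences. It focuses on the authors' own methods for analyzing the symmetry of protein sequences and structures and on their preliminary work on the formation mechanism of symmetric protein structures. It also systematically analyzes the sequence-structure relationships of the various classes of symmetric folds and finds that their sequences all possess a latent symmetry identical to that of the structure; in other words, the symmetry of the sequence determines the symmetry of the structure.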

8.
In recent years, a number of new protein structures that possess tandem repeats have emerged. Many of these proteins are comprised of tandem arrays of β-hairpins. Today, the amount and variety of the data on these β-hairpin repeat (BHR) structures have reached a level that requires detailed analysis and further classification. In this paper, we classified the BHR proteins, compared structures, sequences of repeat motifs, functions and distribution across the major taxonomic kingdoms of life and within organisms. As a result, we identified six different BHR folds in tandem repeat proteins of Class III (elongated structures) and one BHR fold (up-and-down β-barrel) in Class IV (“closed” structures). Our survey reveals the high incidence of the BHR proteins among bacteria and viruses and their possible relationship to the structures of amyloid fibrils. It indicates that BHR folds will be an attractive target for future structural studies, especially in the context of age-related amyloidosis and emerging infectious diseases. This work allowed us to update the RepeatsDB database, which contains annotated tandem repeat protein structures and to construct sequence profiles based on BHR structural alignments.

9.
10.
Conserved DNA structures in origins of replication.   Total citations: 15, self-citations: 7, citations by others: 8
According to the model of Bramhill and Kornberg, initiation of DNA replication in prokaryotes involves binding of an initiator protein to origin DNA and subsequent duplex opening of adjacent direct repeat sequences. In this report, we have used computer analysis to examine the higher-order DNA structure of a variety of origins of replication from plasmids, phages, and bacteria in order to determine whether these sequences are localized in domains of altered structure. The results demonstrate that the primary sites of initiator protein binding lie in discrete domains of DNA bending, while the direct repeats lie within well-defined boundaries of an unusual anti-bent domain. The anti-bent structures arise from a periodicity of A3 and T3 tracts which avoids the 10-11 bp bending periodicity. Since DNA fragments which serve as replicators in yeast also contain these two conserved structural elements, the results provide new insight into the universal role of conserved DNA structures in DNA replication.

11.
This article is in the area of protein sequence investigation. It studies protein sequence periodicity. The notion of latent periodicity is introduced. A mathematical method for searching for latent periodicity in protein sequences is developed. Implementation of the method developed for known cases of perfect and imperfect periodicity is demonstrated. Latent periodicity of many protein sequences from the SWISS-PROT data bank is revealed by the method and examples of latent periodicity of amino acid sequences are demonstrated for: the translation initiation factor EIF-2B (epsilon subunit) of Saccharomyces cerevisiae from the E2BE_YEAST sequence; the E. coli ferrienterochelin receptor from the FEPA_ECOLI sequence; the lysozyme of Bacteriophage SF6 from the LY_BPSF6 sequence; lipoamide dehydrogenase of Azotobacter vinelandii from the DLDH_AZOVI sequence. These protein sequences have latent periods equal to six, two, seven and 19 amino acids, respectively. We propose that a possible purpose of the amino acid sequence latent periodicity is to determine certain protein structures.
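The mathematical method is not given in the abstract. One way such a search could be framed, offered here only as an assumption, is to tabulate residue counts by position within a candidate period and to score each period by the mutual information between position and residue. The names `periodicity_matrix`, `mutual_information`, and `scan_periods` are hypothetical, and the toy sequence is not taken from any of the proteins listed above.

```python
import math
from collections import Counter


def periodicity_matrix(seq, period):
    """Counts of each residue at each position (index modulo `period`)."""
    return Counter((i % period, aa) for i, aa in enumerate(seq))


def mutual_information(seq, period):
    """Mutual information (nats) between position in the period and residue."""
    n = len(seq)
    matrix = periodicity_matrix(seq, period)
    position_totals = Counter()
    residue_totals = Counter()
    for (pos, aa), count in matrix.items():
        position_totals[pos] += count
        residue_totals[aa] += count
    return sum(
        count / n
        * math.log(count * n / (position_totals[pos] * residue_totals[aa]))
        for (pos, aa), count in matrix.items()
    )


def scan_periods(seq, max_period):
    """Score every candidate period from 2 to `max_period` inclusive."""
    return {p: mutual_information(seq, p) for p in range(2, max_period + 1)}


toy = "ADKLGS" * 10  # perfect period of six residues
scores = scan_periods(toy, max_period=10)
print(max(scores, key=scores.get))  # 6
```

Raw mutual information tends to grow with the period length even for random sequences, because longer periods leave fewer observations per matrix cell. Comparing periods on real data therefore needs a normalization step, such as a Z-score against shuffled sequences, which this sketch omits.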

12.
13.
WD40-repeat proteins (WD40s), as one of the largest protein families in eukaryotes, play vital roles in assembling protein-protein/DNA/RNA complexes. WD40s fold into similar β-propeller structures despite diversified sequences. A program, WDSP (WD40 repeat protein Structure Predictor), has been developed to accurately identify WD40 repeats and predict their secondary structures. The method is designed specifically for WD40 proteins by incorporating both local residue information and non-local family-specific structural features. It overcomes the problem of highly diversified protein sequences and variable loops. In addition, WDSP achieves a better prediction in identifying multiple WD40-domain proteins by taking the global combination of repeats into consideration. In secondary structure prediction, the average Q3 accuracy of WDSP in a jack-knife test reaches 93.7%. A disease-related protein, LRRK2, was used as a representative example to demonstrate the structure prediction.

14.
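Proteins are highly diverse in sequence, structure, and function. Numerous studies have shown that the structure of a protein is related to the ordering of its amino acid sequence, and that the local amino acid sequence environment has some influence on the structure. This paper proposes a new method for predicting protein structural type based on the statistical torsion-angle preferences of 5-mer amino acid segments; the method predicts structural type from the statistical torsion-angle preferences of the central amino acid of 5-mers in the PDB database. The new method can, through computer...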

15.
Repeated motifs of amino acids within proteins are an abundant feature of eukaryotic sequences and may catalyze the rapid production of genetic and even phenotypic variation among organisms. The completion of the genome sequencing projects of 12 distinct Drosophila species provides a unique dataset to study these intriguing sequence features on a phylogeny with a variety of timescales. We show that there is a higher percentage of proteins containing repeats within the Drosophila genus than in most other eukaryotes, including non-Drosophila insects, which makes this collection of species particularly useful for the study of protein repeats. We also find that proteins containing repeats are overrepresented in functional categories involving developmental processes, signaling, and gene regulation. Using the set of 1-to-1 ortholog alignments for the 12 Drosophila species, we test the ability of repeats to act as reliable phylogenetic signals and find that they resolve the generally accepted phylogeny despite the noise caused by their accelerated rate of evolution. We also determine that in general the position of repeats within a protein sequence is non-random, with repeats more often being absent from the middle regions of sequences. Finally, we find evidence to suggest that the presence of repeats is associated with an increase in evolutionary rate across the entire sequence in which they are embedded. With additional evidence to suggest a corresponding elevation in positive selection, we propose that some repeats may be inducing compensatory substitutions in their surrounding sequence.

16.
Alpha-solenoids are flexible protein structural domains formed by ensembles of alpha-helical repeats (Armadillo and HEAT repeats among others). While homology can be used to detect many of these repeats, some alpha-solenoids have very little sequence homology to proteins of known structure and we expect that many remain undetected. We previously developed a method for detection of alpha-helical repeats based on a neural network trained on a dataset of protein structures. Here we improved the detection algorithm and updated the training dataset using recently solved structures of alpha-solenoids. Unexpectedly, we identified occurrences of alpha-solenoids in solved protein structures that escaped attention, for example within the core of the catalytic subunit of PI3KC. Our results expand the current set of known alpha-solenoids. Application of our tool to the protein universe allowed us to detect their significant enrichment in proteins interacting with many proteins, confirming that alpha-solenoids are generally involved in protein-protein interactions. We then studied the taxonomic distribution of alpha-solenoids to discuss an evolutionary scenario for the emergence of this type of domain, speculating that alpha-solenoids have emerged in multiple taxa in independent events by convergent evolution. We observe a higher rate of alpha-solenoids in eukaryotic genomes and in some prokaryotic families, such as Cyanobacteria and Planctomycetes, which could be associated with increased cellular complexity. The method is available at http://cbdm.mdc-berlin.de/~ard2/.

17.
The proteomes that make up the collection of proteins in contemporary organisms evolved through recombination and duplication of a limited set of domains. These protein domains are essentially the main components of globular proteins and are the most principal level at which protein function and protein interactions can be understood. An important aspect of domain evolution is their atomic structure and biochemical function, which are both specified by the information in the amino acid sequence. Changes in this information may bring about new folds, functions and protein architectures. With the present and still increasing wealth of sequences and annotation data brought about by genomics, new evolutionary relationships are constantly being revealed, unknown structures modeled and phylogenies inferred. Such investigations not only help predict the function of newly discovered proteins, but also assist in mapping unforeseen pathways of evolution and reveal crucial, co-evolving inter- and intra-molecular interactions. In turn this will help us describe how protein domains shaped cellular interaction networks and the dynamics with which they are regulated in the cell. Additionally, these studies can be used for the design of new and optimized protein domains for therapy. In this review, we aim to describe the basic concepts of protein domain evolution and illustrate recent developments in molecular evolution that have provided valuable new insights in the field of comparative genomics and protein interaction networks.

18.
Nucleoporins with phenylalanine-glycine repeats (FG Nups) function at the nuclear pore complex (NPC) to facilitate nucleocytoplasmic transport. In Saccharomyces cerevisiae, each FG Nup contains a large natively unfolded domain that is punctuated by FG repeats. These FG repeats are surrounded by hydrophilic amino acids (AAs) common to disordered protein domains. Here we show that the FG domain of Nups from human, fly, worm, and other yeast species is also enriched in these disorder-associated AAs, indicating that structural disorder is a conserved feature of FG Nups and likely serves an important role in NPC function. Despite the conservation of AA composition, FG Nup sequences from different species show extensive divergence. A comparison of the AA substitution rates of proteins with syntenic orthologs in four Saccharomyces species revealed that FG Nups have evolved at twice the rate of average yeast proteins with most substitutions occurring in sequences between FG repeats. The rapid evolution of FG Nups is poorly explained by parameters known to influence AA substitution rate, such as protein expression level, interactivity, and essentiality; instead their rapid evolution may reflect an intrinsic permissiveness of natively unfolded structures to AA substitutions. The overall lack of AA sequence conservation in FG Nups is sharply contrasted by discrete stretches of conserved sequences. These conserved sequences highlight known karyopherin and nucleoporin binding sites as well as other uncharacterized sites that may have important structural and functional properties.

19.
By controlling the growth of inorganic crystals, macro-biomolecules, including proteins, play pivotal roles in modulating biomineralization. Natural proteins that promote biomineralization are often composed of simple repeats of peptide sequences; however, the relationship between these repetitive structures and their functions remains largely unknown. Here we show that an artificial protein containing a repeated peptide sequence allows NaCl, KCl, CuSO4 and sucrose to form a variety of macroscopic structures, as represented by their dendritic configurations. Mutational analyses revealed that the physicochemical characteristics of the protein, not the peptide sequence per se, were responsible for formation of the dendritic structures. This suggests that proteins that modulate crystal growth may have evolved as repeat-containing forms at a relatively high rate. These observations could serve as the basis for developing new genetic programming systems for creation of artificial proteins able to modulate crystal growth from inorganic compounds, and may thus provide a new tool for nano-biotechnology.

20.
The formation and properties of lepidopteran silk fibers depend on amino acid repeats in the principal protein, heavy chain fibroin (H-fibroin). In H-fibroins of the "bombycoid" type, concatenations of alanine or of the GAGAGS crystalline motifs (1st tier repeats) and adjacent sequences breaking periodicity make 2nd tier repeats. Two to six such repeats comprise a 3rd tier assembly, and 12 assemblies, linked by an amorphous sequence, constitute the repetitive H-fibroin region. Heterogeneity in the repeat length and intercalation of amorphous regions prevent excessive crystallization. In the "pyraloid" H-fibroins, iterations of simple motifs are absent and assemblies of several complex motifs constitute highly regular repeats that are organized in about 12 highest order reiterations without specific spacers. Repeat homogeneity appears crucial for the alignment and interaction of the disjunct motifs that must be registered precisely to form crystallites; repeat heterogeneity is associated with decreased fiber strength. Both H-fibroin types are typically hydrophobic, and their secretion requires disulfide linkage to light chain fibroin and participation of another protein, P25. These auxiliary proteins are absent in saturniid moths with amphiphilic H-fibroin repeats. The selection at nucleic acid and protein levels and the availability of nutrients play roles in H-fibroin evolution.
