Similar articles
 Found 20 similar articles (search time: 171 ms)
1.
Messenger RNA is a key component of an intricate regulatory network of its own. It accommodates numerous nucleotide signals that overlap protein coding sequences and are responsible for multiple levels of regulation and generation of biological complexity. The wealth of structural and regulatory information that mRNA carries in addition to the encoded amino acid sequence raises the question of how these signals and overlapping codes are delineated along non-synonymous and synonymous positions in protein coding regions, especially in eukaryotes. Silent or synonymous codon positions, which do not determine amino acid sequences of the encoded proteins, define mRNA secondary structure and stability and affect the rate of translation, folding and post-translational modifications of nascent polypeptides. RNA-level selection acts on synonymous sites in both prokaryotes and eukaryotes and is more common than previously thought. Selection pressure on coding gene regions follows a three-nucleotide periodic pattern of nucleotide base-pairing in mRNA, which is imposed by the genetic code. Synonymous positions of the coding regions have a higher level of hybridization potential relative to non-synonymous positions, and are multifunctional in their regulatory and structural roles. Recent experimental evidence and analysis of mRNA structure and interspecies conservation suggest that there is an evolutionary tradeoff between selective pressure acting at the RNA and protein levels. Here we provide a comprehensive overview of the studies that define the role of silent positions in regulating RNA structure and processing that exert downstream effects on proteins and their functions.

2.
Shestopalov BV. Tsitologiia, 2003, 45(7): 702-706
The calculation of protein three-dimensional structure from the amino acid sequence is a fundamental problem that remains to be solved. This paper presents the principles of the code theory of protein secondary structure and their consequence, the amino acid code of protein secondary structure. The doublet code model of protein secondary structure, developed earlier by the author (Shestopalov, 1990), is part of this theory. The bases of the theory are: 1) the name secondary structure is assigned to the conformation stabilized only by the nearest (intraresidual) and middle-range (at a distance no more than that between residues i and i + 5) interactions; 2) the secondary structure consists of regular (alpha-helical and beta-structural) and irregular (coil) segments; 3) the alpha-helices, beta-strands and coil segments are encoded, respectively, by residue pairs (i, i + 4), (i, i + 2), (i, i + 1), according to the numbers of residues per period, 3.6, 2, 1; 4) all such pairs in the amino acid sequence are codons for elementary structural elements, or structurons; 5) the codons are divided into 21 types depending on their strength, i.e. their encoding capability; 6) overlaps of structurons of one and the same structure generate the longer segments of this structure; 7) overlapping of structurons of different structures is forbidden, so codon selection is required, and this selection is hierarchic; 8) the code theory of protein secondary structure generates six variants of the amino acid code of protein secondary structure. There are two possible kinds of model construction based on the theory: the physical one, using physical properties of amino acid residues, and the statistical one, using results of statistical analysis of a large body of structural data.
Some evident consequences of the theory are: a) the theory can be used for calculating the secondary structure from the amino acid sequence as a partial solution of the problem of calculating protein three-dimensional structure from the amino acid sequence, and the calculated secondary structure and codon strength distribution can be used for simulating the next step of protein folding; b) one can propose that the same secondary structures can be folded into different tertiary structures and, vice versa, different secondary structures can be folded into the same tertiary structures, provided codon distributions are also considered; c) codons can be considered the first elements of a language of protein three-dimensional structure.
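The pairing rule in point 3 of the abstract above can be illustrated with a minimal sketch. It only enumerates the residue pairs (i, i + 4), (i, i + 2) and (i, i + 1) that the theory treats as candidate codons. It does not implement the 21 codon strength types or the hierarchic codon selection, which the abstract does not specify. The function name `candidate_pairs`, the dictionary `PAIR_OFFSETS` and the example sequence are hypothetical.

```python
from typing import Dict, List, Tuple

# Offsets taken from the abstract: helix pairs (i, i + 4),
# strand pairs (i, i + 2), coil pairs (i, i + 1).
PAIR_OFFSETS: Dict[str, int] = {"helix": 4, "strand": 2, "coil": 1}


def candidate_pairs(sequence: str) -> Dict[str, List[Tuple[int, str, str]]]:
    """Return, per structure type, every residue pair at the matching offset.

    Positions are 0-based. Each entry is (i, residue at i, residue at i + offset).
    """
    pairs: Dict[str, List[Tuple[int, str, str]]] = {}
    for structure, offset in PAIR_OFFSETS.items():
        pairs[structure] = [
            (i, sequence[i], sequence[i + offset])
            for i in range(len(sequence) - offset)
        ]
    return pairs


if __name__ == "__main__":
    # Hypothetical nine-residue sequence, used only to show the output shape.
    for structure, found in candidate_pairs("MKVLAAGIV").items():
        print(structure, len(found), found[:2])
```

A real model would then score each pair by codon strength and resolve conflicts between overlapping structurons of different types.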

3.
Left-handed polyproline II (PPII) helices commonly occur in globular proteins in segments of 4-8 residues. This paper analyzes the structural conservation of PPII-helices in 3 protein families: serine proteinases, aspartic proteinases, and immunoglobulin constant domains. Calculations of the number of conserved segments based on structural alignment of homologous molecules yielded similar results for the PPII-helices, the alpha-helices, and the beta-strands. The PPII-helices are consistently conserved at the level of 80-100% in the proteins with sequence identity above 20% and RMS deviation of structure alignments below 3.0 Å. The most structurally important PPII segments are conserved even below this level of sequence identity. These results suggest that the PPII-helices, in addition to the other 2 secondary structure classes, should be identified as part of structurally conserved regions in proteins. This is supported by similar values for the local RMS deviations of the aligned segments for the structural classes of PPII-helices, alpha-helices, and beta-strands. The PPII-helices are shown to participate in supersecondary elements such as PPII-helix/alpha-helix. The conservation of PPII-helices depends on the conservation of a supersecondary element as a whole. PPII-helices also form links, possibly flexible, in the interdomain regions. The role of the PPII-helices in model building by homology is twofold: they serve as additional conserved elements in the structure, allowing improvement of the accuracy of a model, and they provide correct chain geometry for modeling of the segments equivalenced to them in a target sequence. The improvement in model building is demonstrated in 2 test studies.

4.
Tricodon regions on messenger RNAs corresponding to a set of proteins from Escherichia coli were scrutinized for their translation speed. The fractional frequency values of the individual codons as they occur in mRNAs of highly expressed genes from Escherichia coli were taken as an indicative measure of the translation speed. The tricodons were classified by the sum of the frequency values of the constituent codons. Examination of the conformation of the encoded amino acid residues in the corresponding protein tertiary structures revealed a correlation between codon usage in mRNA and topological features of the encoded proteins. Alpha helices in proteins tend to be preferentially coded by translationally fast mRNA regions, while the slow segments often code for beta strands and coil regions. Fast regions correspondingly avoid coding for beta strands and coil regions, while the slow regions similarly move away from encoding alpha helices. Structural and mechanistic aspects of the ribosome peptide channel support the relevance of sequence fragment translation and subsequent conformation. A discussion is presented relating the observation to the reported kinetic data on the formation and stabilization of protein secondary structural types during protein folding. The observed absence of such strong positive selection for codons in non-highly expressed genes is compatible with existing theories that mutation pressure may well dominate codon selection in non-highly expressed genes.
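The tricodon scoring described in the abstract above, the sum of the frequency values of three constituent codons, can be sketched as follows. This is a minimal illustration only. Taking tricodons as a sliding window that advances one codon at a time is an assumption, since the abstract does not say how the tricodons were delimited. The function name `tricodon_scores` and the frequency table `toy_freq` are hypothetical and do not come from the study.

```python
from typing import Dict, List, Tuple


def tricodon_scores(
    cds: str, codon_freq: Dict[str, float]
) -> List[Tuple[int, str, float]]:
    """Score each window of three consecutive codons by summed codon frequency.

    Returns (index of first codon, tricodon sequence, score) tuples.
    Trailing bases that do not fill a whole codon are ignored.
    A codon missing from codon_freq raises KeyError.
    """
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    scores = []
    for k in range(len(codons) - 2):
        window = codons[k:k + 3]
        score = sum(codon_freq[c] for c in window)
        scores.append((k, "".join(window), score))
    return scores


if __name__ == "__main__":
    # Hypothetical fractional frequencies, NOT real E. coli values.
    toy_freq = {"GCT": 0.5, "GCC": 0.25, "AAA": 0.75, "AAG": 0.25}
    for entry in tricodon_scores("GCTAAAGCCAAG", toy_freq):
        print(entry)
```

Under this scheme, windows with high scores would correspond to the "fast" regions and windows with low scores to the "slow" regions discussed in the abstract.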

5.
Flavors of protein disorder   (Total citations: 1; self-citations: 0; citations by others: 1)
Intrinsically disordered proteins are characterized by long regions lacking 3-D structure in their native states, yet they have so far been associated with 28 distinguishable functions. Previous studies showed that protein predictors trained on disorder from one type of protein often achieve poor accuracy on disorder of proteins of a different type, thus indicating significant differences in sequence properties among disordered proteins. Important biological problems are identifying different types, or flavors, of disorder and examining their relationships with protein function. Innovative use of computational methods is needed in addressing these problems due to the relative scarcity of experimental data and background knowledge related to protein disorder. We developed an algorithm that partitions protein disorder into flavors based on competition among increasing numbers of predictors, with prediction accuracy determining both the number of distinct predictors and the partitioning of the individual proteins. Using 145 variously characterized proteins with long (>30 amino acids) disordered regions, 3 flavors, called V, C, and S, were identified by this approach, with the V subset containing 52 segments and 7743 residues, C containing 39 segments and 3402 residues, and S containing 54 segments and 5752 residues. The V, C, and S flavors were distinguishable by amino acid compositions, sequence locations, and biological function. For the sequences in SwissProt and 28 genomes, their protein functions exhibit correlations with the commonness and usage of different disorder flavors, suggesting different flavor-function sets across these protein groups. Overall, the results herein support the flavor-function approach as a useful complement to structural genomics as a means for automatically assigning possible functions to sequences.

6.
enod40 is a plant gene that participates in the regulation of symbiotic interaction between leguminous plants and bacteria or fungi. Furthermore, it has been suggested to play a general role in non-symbiotic plant development. Although enod40 seems to have multiple functions, being present in many land plants, the molecular mechanisms of its activity are unclear; they may be determined, though, by short peptides and/or RNA structures encoded in the enod40 genes. We utilized conserved RNA structures in enod40 sequences to search nucleotide sequence databases and identified a number of new enod40 homologues in plant species belonging both to families already known to contain enod40 and to families not previously known to contain it. RNA secondary structure predictions and comparative sequence analysis of enod40 RNAs allowed us to determine the most conserved structural features, present in all known enod40 genes. Remarkably, the topology and evolution of one of the conserved structural domains are similar to those of the expansion segments found in structural RNAs such as rRNAs, RNase P and SRP RNAs. Surprisingly, the enod40 RNA structural elements are much more strongly conserved than the encoded peptides. This finding suggests that some general functions of the enod40 gene could be determined by the encoded RNA structure, whereas short peptides may be responsible for more diverse functions found only in certain plant families.

7.
We have identified multiple distinct splicing enhancer elements within protein-coding sequences of the constitutively spliced human β-globin pre-mRNA. Each of these highly conserved sequences is sufficient to activate the splicing of a heterologous enhancer-dependent pre-mRNA. One of these enhancers is activated by and binds to the SR protein SC35, whereas at least two others are activated by the SR protein SF2/ASF. A single base mutation within another enhancer element inactivates the enhancer but does not change the encoded amino acid. Thus, overlapping protein coding and RNA recognition elements may be coselected during evolution. These studies provide the first direct evidence that SR protein-specific splicing enhancers are located within the coding regions of constitutively spliced pre-mRNAs. We propose that these enhancers function as multisite splicing enhancers to specify 3′ splice-site selection.

8.
Some of the most serious diseases are characterized by the presence of a specific secondary structure within DNA or RNA, often in the promoter or the coding region of the responsible gene, that enhances or disrupts expression of the protein. Structural elements that impact cellular function may also be formed in other genomic regions such as telomeres. Compounds that interact with such structural elements may be useful in diagnosis or treatment of patients. In this report, we present a FRET melting assay that allows testing of libraries of compounds against four different nucleic acid structures. Compounds are tested to determine whether they stabilize preformed secondary structures (i.e., whether they increase the melting temperature, Tm). This property is described by the ΔTm parameter, which is the difference between the Tm of the compound-stabilized structure and the Tm of the unbound structure. Model oligonucleotides are labeled with FAM as a fluorescent donor and TAMRA as an acceptor. The intensity of FAM fluorescence is recorded as a function of temperature. Melting temperatures are determined by the FRET method in 96-well plates; this assay could easily be converted into 384-well format.
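The ΔTm parameter defined in the abstract above can be sketched numerically. The abstract does not state how the melting temperature is read off the fluorescence curve, so this sketch assumes it is the temperature at which the normalized FAM signal first rises through half of its range, found by linear interpolation. It also assumes that FAM fluorescence increases as the structure melts. The function names `melting_temperature` and `delta_tm` and the example curves are hypothetical.

```python
from typing import Sequence


def melting_temperature(
    temps: Sequence[float], fluorescence: Sequence[float]
) -> float:
    """Estimate the melting temperature from a melting curve.

    Assumption: the melting temperature is where min-max normalized
    fluorescence first crosses 0.5 on the way up (linear interpolation).
    """
    lo, hi = min(fluorescence), max(fluorescence)
    if hi == lo:
        raise ValueError("flat curve: no melting transition")
    norm = [(f - lo) / (hi - lo) for f in fluorescence]
    for k in range(len(temps) - 1):
        n0, n1 = norm[k], norm[k + 1]
        if n0 < 0.5 <= n1:
            t0, t1 = temps[k], temps[k + 1]
            return t0 + (0.5 - n0) * (t1 - t0) / (n1 - n0)
    raise ValueError("curve never rises through half-maximal fluorescence")


def delta_tm(
    temps: Sequence[float],
    unbound_curve: Sequence[float],
    bound_curve: Sequence[float],
) -> float:
    """Melting temperature with compound minus melting temperature without it."""
    return melting_temperature(temps, bound_curve) - melting_temperature(
        temps, unbound_curve
    )


if __name__ == "__main__":
    # Hypothetical, idealized curves, used only to show the call pattern.
    temps = [20.0, 30.0, 40.0, 50.0, 60.0]
    print(delta_tm(temps, [0, 0, 1, 1, 1], [0, 0, 0, 1, 1]))
```

A positive result indicates that the compound stabilizes the structure. Real plate-reader data would normally need smoothing or curve fitting before this step.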

9.

Background  

Disordered regions are segments of the protein chain which do not adopt stable structures. Such segments are often of interest because they have a close relationship with protein expression and functionality. As such, protein disorder prediction is important for protein structure prediction, structure determination and function annotation.

10.
11.
Although intrinsically disordered proteins are prevalent and functionally important, it has never been asked whether structural disorder should be considered as a separate structural category on its own or merely as a lack of secondary and/or tertiary structure. We address this issue by showing that its length distribution in the human proteome follows a power law, with many short regions but also a significant incidence of very long disordered regions. This behavior is in sharp contrast with that of conventional secondary structural elements and is highly reminiscent of the distribution of tertiary structural units in proteins. We interpret this finding by the direct functional involvement of disorder, which distinguishes it from secondary structural elements and endows it with tertiary structural attributes.

12.
De S, Sur K, Dasgupta S. Biopolymers, 2005, 79(2): 63-73
Nonstructured regions in proteins that provide the link between two regular structured regions play a significant role in maintaining the scaffold of the protein. Not only do they act as connectors between two regular secondary structural elements of proteins but they also provide the necessary turn or reversal in the polypeptide chain. This incorporates flexibility in the structure. Thus an understanding of the structural aspects of the nonregular regions is necessary to have a better insight into these features. We can assume the nonregular region to be a contorted polypeptide segment tethered by regular secondary structured regions at both ends. To describe the undulating nature of the nonregular regions, we introduce a parameter called the "contortion index." This index describes how tortuously the region is organized. Our analysis shows that the contortion index is related to other physicochemical parameters and can be used to characterize the nonregular regions of proteins.

13.
Synonymous codon replacement can change protein structure and function, indicating that protein structure depends on DNA sequence. During heterologous protein expression, low expression or formation of insoluble aggregates may be attributable to differences in synonymous codon usage between expression and natural hosts. This discordance may be particularly important during translation of the domain boundaries (link/end segments) that separate elements of higher ordered structure. Within such regions, ribosomal progression slows as the ribosome encounters clusters of infrequently used codons that preferentially encode a subset of amino acids. To replicate the modulation of such localized translation rates during heterologous expression, we used known relationships between codon usage frequencies and secondary protein structure to develop an algorithm ("codon harmonization") for identifying regions of slowly translated mRNA that are putatively associated with link/end segments. It then recommends synonymous replacement codons having usage frequencies in the heterologous expression host that are less than or equal to the usage frequencies of native codons in the native expression host. For protein regions other than these putative link/end segments, it recommends synonymous substitutions with codons having usage frequencies matched as nearly as possible to the native expression system. Previous application of this algorithm facilitated E. coli expression, manufacture and testing of two Plasmodium falciparum vaccine candidates. Here we describe the algorithm in detail and apply it to E. coli expression of three additional P. falciparum proteins. Expression of the "recoded" genes exceeded that of the native genes by 4- to 1,000-fold, representing levels suitable for vaccine manufacture. The proteins were soluble and reacted with a variety of functional conformation-specific mAbs suggesting that they were folded properly and had assumed native conformation. 
Codon harmonization may further provide a general strategy for improving the expression of soluble functional proteins during heterologous expression in hosts other than E. coli.
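The two substitution rules stated in the abstract above can be sketched for a single codon. This is a simplified illustration, not the published algorithm. The abstract gives the constraints (host usage less than or equal to native usage inside putative link/end segments, closest match elsewhere), but does not say which codon is picked when several satisfy the first rule or what happens when none does. Both tie-breaking choices below are assumptions. The function name `harmonize_codon` and all usage values are hypothetical.

```python
from typing import Dict, List


def harmonize_codon(
    native_codon: str,
    synonyms: List[str],
    native_usage: Dict[str, float],
    host_usage: Dict[str, float],
    in_link_segment: bool,
) -> str:
    """Pick a synonymous replacement codon for the expression host.

    native_usage: codon usage frequencies in the native organism.
    host_usage: codon usage frequencies in the expression host.
    synonyms: all codons for the same amino acid (including native_codon).
    """
    target = native_usage[native_codon]
    if in_link_segment:
        # Rule from the abstract: host frequency must not exceed the
        # native codon's frequency in the native host.
        allowed = [c for c in synonyms if host_usage[c] <= target]
        if not allowed:
            # Assumption: fall back to the rarest host codon.
            return min(synonyms, key=lambda c: host_usage[c])
        # Assumption: among allowed codons, take the one closest to target.
        return max(allowed, key=lambda c: host_usage[c])
    # Elsewhere: match the native usage frequency as nearly as possible.
    return min(synonyms, key=lambda c: abs(host_usage[c] - target))


if __name__ == "__main__":
    # Hypothetical usage tables for three leucine codons.
    synonyms = ["CTG", "TTA", "CTA"]
    native = {"TTA": 0.375}
    host = {"CTG": 0.5, "TTA": 0.125, "CTA": 0.0625}
    print(harmonize_codon("TTA", synonyms, native, host, in_link_segment=True))
    print(harmonize_codon("TTA", synonyms, native, host, in_link_segment=False))
```

Identifying which positions belong to link/end segments is the harder part of the method and is not covered here.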

14.
Biased usage of synonymous codons has long been studied from the perspective of cellular tRNA abundance. Taking advantage of publicly available gene expression data for Saccharomyces cerevisiae, a systematic analysis of the codon and amino acid usages in two different coding regions, corresponding to the regular (helix and strand) and the irregular (coil) protein secondary structures, has been performed. Our analyses suggest that apart from tRNA abundance, mRNA folding stability is another major evolutionary force in shaping the codon and amino acid usage differences between the highly and lowly expressed genes in the S. cerevisiae genome and, surprisingly, it depends on the coding regions corresponding to the secondary structures of the encoded proteins. This is obviously a new paradigm in understanding codon usage in S. cerevisiae. Differential amino acid usage between highly and lowly expressed genes in the regions coding for the irregular protein secondary structure in S. cerevisiae is explained by the stability of the mRNA folded structure. Irrespective of the protein secondary structural type, the highly expressed genes always tend to encode cheaper amino acids in order to reduce the overall biosynthetic cost of production of the corresponding protein. This study supports the hypothesis that tRNA abundance is a consequence of, and not a reason for, the biased usage of amino acids between highly and lowly expressed genes.

15.
As many diseases can be traced back to altered protein function, studying the effect of genetic variations at the level of proteins can provide clues to how changes at the DNA level lead to various diseases. Cellular processes rely not only on proteins with well-defined structure but can also involve intrinsically disordered proteins (IDPs) that exist as highly flexible ensembles of conformations. Disordered proteins are mostly involved in signaling and regulatory processes, and their functional repertoire largely complements that of globular proteins. However, it was also suggested that protein disorder entails an increased biological cost. This notion was supported by a set of individual IDPs involved in various diseases, especially in cancer, and the increased amount of disorder observed among disease-associated proteins. In this work, we tested whether there is any biological risk associated with protein disorder at the level of single nucleotide mutations. Specifically, we analyzed the distribution of mutations within ordered and disordered segments. Our results demonstrated that while neutral polymorphisms were more likely to occur within disordered segments, cancer-associated mutations had a preference for ordered regions. Additionally, we proposed an alternative explanation for the association between protein disorder and involvement in cancer that takes functional annotations into consideration. Individual examples also suggested that although disordered segments are fundamental functional elements, their presence is not necessarily accompanied by an increased mutation rate in cancer. The presented study can help to understand how the different structural properties of proteins influence the consequences of genetic mutations.

16.
A new approach for evaluating the secondary structure of proteins by CD spectroscopy of overlapping peptide segments is applied to porcine adenylate kinase (AK1) and yeast guanylate kinase (GK3). One hundred seventy-six peptide segments, each 15 residues long, overlapping by 13 residues and covering the complete sequences of AK1 and GK3, were synthesized in order to evaluate their secondary structure composition by CD spectroscopy. The peptides were prepared by the solid phase multiple peptide synthesis method using the 9-fluorenylmethoxycarbonyl/tert-butyl strategy. The individual peptide secondary structures were studied with CD spectroscopy in a mixture of 30% trifluoroethanol in phosphate buffer (pH 7) and subsequently compared with x-ray data of AK1 and GK3. Peptide segments that cover α-helical regions of the AK1 or GK3 sequence mainly showed CD spectra with increasing and decreasing Cotton effects that were typical for appearing and disappearing α-helical structures. For segments with dominating β-sheet conformation, however, the application of this method is limited due to the stability and clustering of β-sheet segments in solution and due to the difficult interpretation of β-sheet CD signals superimposed with random-coil signals. Nevertheless, the results of this method, especially for α-helical segments, are very impressive. All α-helical and 71% of the β-sheet containing regions of AK1 and GK3 could be identified. Moreover, it was shown that CD spectra of consecutive peptide content reveal the appearance and disappearance of α-helical secondary structure elements and help localize them on the sequence string. © 1997 John Wiley & Sons, Inc. Biopoly 41: 213–231, 1997
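The peptide scanning layout described in the abstract above (segments of 15 residues overlapping by 13, so each segment starts 2 residues after the previous one) can be sketched as a simple windowing function. The function name `overlapping_peptides` and the example sequence are hypothetical. How the original study handled a leftover residue at the C-terminus is not stated, so this sketch simply stops at the last full-length segment.

```python
from typing import List


def overlapping_peptides(
    sequence: str, length: int = 15, overlap: int = 13
) -> List[str]:
    """Split a protein sequence into fixed-length overlapping peptides.

    Consecutive peptides share `overlap` residues, so the start position
    advances by (length - overlap) residues each time.
    """
    if not 0 <= overlap < length:
        raise ValueError("overlap must be non-negative and smaller than length")
    step = length - overlap
    return [
        sequence[start:start + length]
        for start in range(0, len(sequence) - length + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical 21-residue sequence, used only to show the layout.
    for peptide in overlapping_peptides("ACDEFGHIKLMNPQRSTVWYA"):
        print(peptide)
```

With the defaults, a sequence of N residues (N at least 15) yields floor((N - 15) / 2) + 1 peptides.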

17.
The interface of a protein molecule that is involved in binding another protein, DNA or RNA has been characterized in terms of the number of unique secondary structural segments (SSSs), made up of stretches of helix, strand and non-regular (NR) regions. On average 10-11 segments define the protein interface in protein-protein (PP) and protein-DNA (PD) complexes, while the number is higher (14) for protein-RNA (PR) complexes. While the length of helical segments in PP interaction increases with the interface area, this is not the case in PD and PR complexes. The propensities of residues to occur in the three types of secondary structural elements (SSEs) in the interface relative to the corresponding elements in the protein tertiary structures have been calculated. Arg, Lys, Asn, Tyr, His and Gln are preferred residues in PR complexes; in addition, Ser and Thr are also favoured in PD interfaces.

18.
Protein structure is generally more conserved than sequence, but does this hold true for regions that can adopt different structures in different environments? How structurally disordered regions evolve altered secondary structure element propensities, and how conformational flexibility evolves among paralogs, are fundamental questions for our understanding of protein structural evolution. We have investigated the evolutionary dynamics of structural disorder in protein families containing both orthologs and paralogs using phylogenetic tree reconstruction, protein structure disorder prediction, and secondary structure prediction in order to shed light upon these questions. Our results indicate that the extent and location of structurally disordered regions are not universally conserved. As structurally disordered regions often have high conformational flexibility, this is likely to have an effect on how protein structure evolves, as spatially altered conformational flexibility can also change the secondary structure propensities for homologous regions in a protein family.

19.
In genetic language a peculiar arrangement of biological information is provided by overlapping genes in which the same region of DNA can code for functionally unrelated messages. In this work, the informational content of overlapping genes belonging to prokaryotic and eukaryotic viruses was analyzed. Using information theory indices, we identified in the regions of overlap a first pattern, exhibiting a more uniform base composition and more severe constraints in base ordering with respect to the nonoverlapping regions. This pattern was found to be peculiar to coliphage, avian hepatitis B virus, human lentivirus, and plant luteovirus families. A second pattern, characterized by the occurrence of similar compositional constraints in both types of coding regions, was found to be limited to plant tymoviruses. At the level of codon usage, a low degree of correlation between overlapping and nonoverlapping coding regions characterized the first pattern, whereas a close link was found in tymoviruses, indicating a fine adaptation of the overlapping frame to the original codon choice of the virus. As a result of codon usage correlation analysis, deductions concerning the origin and evolution of several overlapping frames were also proposed. Comparison of amino acid composition revealed an increased frequency of amino acid residues with a high level of degeneracy (arginine, leucine, and serine) in the proteins encoded by overlapping genes; this peculiar feature of overlapping genes can be viewed as a way in which they may expand their coding ability and gain new, specialized functions. Received: 28 October 1996 / Accepted: 29 January 1997
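The abstract above does not name the information theory indices it used. As one plausible illustration of how uniformity of base composition can be quantified, the sketch below computes the Shannon entropy of the base frequencies in a sequence. This choice of index is an assumption, and the function name `base_entropy` is hypothetical. A perfectly uniform composition of the four bases gives 2 bits, and a single repeated base gives 0 bits.

```python
import math
from collections import Counter


def base_entropy(sequence: str) -> float:
    """Shannon entropy (in bits) of the base composition of a sequence.

    Higher values mean a more uniform base composition. This measures
    composition only and says nothing about constraints on base ordering.
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence.upper())
    total = sum(counts.values())
    return -sum(
        (n / total) * math.log2(n / total) for n in counts.values()
    )


if __name__ == "__main__":
    # Hypothetical sequences, used only to compare two compositions.
    print(base_entropy("ACGTACGTACGT"))
    print(base_entropy("AAAAAAAAAACG"))
```

Comparing this value between overlapping and nonoverlapping regions would capture only the compositional part of the first pattern. Constraints on base ordering would need a higher-order measure, for example entropy over dinucleotides.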

20.
Messenger RNA sequences often have to preserve functional secondary structure elements in addition to coding for proteins. We present a statistical analysis of retroviral mRNA which supports the hypothesis that the natural genetic code is adapted to such complementary coding. These sequences are still able to explore efficiently the space of possible proteins by point mutations. This is borne out by the observation that, in stem regions of retroviral mRNA foldings, silent mutations on one strand are preferentially accompanied by conservative mutations on the other. Distances between amino acids based on physicochemical properties are used to quantify the conservation of protein function under the constraint of maintained RNA secondary structure. We find that preservation of RNA secondary structure by compensatory mutations is evolutionarily compatible with the efficient search for new variants on the protein level. Received: 4 June 1999 / Accepted: 12 October 1999
