Similar articles
1.
The secondary structure is a fundamental feature of both non-coding RNAs (ncRNAs) and messenger RNAs (mRNAs). However, our understanding of the secondary structures of mRNAs, especially those of the coding regions, remains limited, likely due to translation and the lack of RNA-binding proteins that sustain the consensus structure like those binding to ncRNAs. Indeed, mRNAs have recently been found to adopt diverse alternative structures, but the overall functional significance remains untested. We approach this problem by estimating the folding specificity, i.e., the probability that a fragment of an mRNA folds back to the same partner once refolded. We show that the folding specificity of mRNAs is lower than that of ncRNAs and exhibits moderate evolutionary conservation. Notably, we find that specific rather than alternative folding is likely evolutionarily adaptive, since specific folding is frequently associated with functionally important genes or sites within a gene. Additional analysis in combination with ribosome density suggests the ability to modulate ribosome movement as one potential functional advantage provided by specific folding. Our findings reveal a novel facet of the RNA structurome with important functional and evolutionary implications and indicate a potential method for distinguishing the mRNA secondary structures maintained by natural selection from molecular noise.
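
The notion of folding specificity described above (the chance that a position pairs with the same partner when the molecule refolds) can be illustrated with a small sketch. This is not the authors' estimator: it assumes a hypothetical sample of pseudoknot-free dot-bracket structures for one sequence, skips samples in which the position is unpaired (an assumption), and reports the probability that two independent draws agree on the partner. The function names `pair_table` and `refold_agreement` are made up for illustration.

```python
from collections import Counter


def pair_table(dot_bracket):
    """Map each paired position to its partner (pseudoknot-free dot-bracket)."""
    stack, partner = [], {}
    for i, ch in enumerate(dot_bracket):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            j = stack.pop()
            partner[i], partner[j] = j, i
    return partner


def refold_agreement(structures, position):
    """Probability that two independent draws pair `position` with the same partner.

    Assumption: samples where the position is unpaired are ignored.
    Returns None if the position is never paired in the sample.
    """
    partners = [pair_table(s).get(position) for s in structures]
    partners = [p for p in partners if p is not None]
    if not partners:
        return None
    total = len(partners)
    return sum((n / total) ** 2 for n in Counter(partners).values())


# Hypothetical toy sample of structures for one 8-nt fragment.
samples = ["((....))", "((....))", "(.(...))", "........"]
for pos in (0, 3, 6):
    print(pos, refold_agreement(samples, pos))
```

In this toy sample, position 0 always pairs with position 7, position 6 switches between two partners, and position 3 is never paired, so the three calls show the specific, alternative, and undefined cases respectively.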

2.
S Rackovsky. Proteins 1990, 7(4): 378-402
We address herein the problem of delineating the relationships between the known protein structures. In order to study this problem, methods have been developed to represent arbitrarily sized fragments of biopolymer backbone, and to compare distributions of such fragments. These methods are applied to a classification of 123 structures representing the entire set of known x-ray structures. The resulting data are analyzed (on the four-C alpha length scale) to determine both the large-scale organization of the set of known structures (i.e., the relationships between large groups of structures, each comprised of proteins that are structurally related) and its local structure (i.e., the quantitative degree of similarity between any two specific structures). It is shown that the set of structures forms a continuum of structural types, ranging from all-helical to all-sheet/barrel proteins. It is further demonstrated that the density of protein structures is not uniform across this continuum, but rather that structures cluster in certain regions, separated by regions of lower population. The properties of the various regions of the structural space are determined. The existence is demonstrated of strong quantitative correlations between the contents of different types of four-C alpha fragments within protein structures, which imply significant constraints on the types of architecture that can occur in proteins. Analysis of the distribution of structures demonstrates some hitherto unsuspected similarities and suggests that, in some circumstances, neither structural similarity nor sequence homology may be necessary conditions for evolutionary relationship between proteins. It is also suggested that these unsuspected similarities may imply similar folding mechanisms for structures of apparently different global architecture. Cases are also noted in which apparently similar structures may fold by different mechanisms. 
The connection between structure and dynamic properties is discussed, and a possible role of dynamics in the evolution of protein structures is suggested. The sensitivity of the methods presented herein to anomalies of structure refinement is demonstrated. It is suggested that the present results provide a framework for analyzing experimental results on structural similarity obtained using vibrational circular dichroism spectra, which are sensitive to local backbone structure.

3.
A Dong, B Caughey, W S Caughey, K S Bhat, J E Coe. Biochemistry 1992, 31(39): 9364-9370
The secondary structure of hamster female protein (FP) in aqueous solutions in the presence or absence of calcium and phosphorylcholine has been investigated using Fourier transform infrared spectroscopy. Our present studies provide the first evaluation of the secondary structure of FP and its calcium- and phosphorylcholine-dependent conformational changes. Quantitative analysis indicated that FP is composed of 50% beta-sheet, 11% alpha-helix, 29% beta-turn, and 10% random structures. Calcium- and phosphorylcholine-dependent infrared spectral changes were observed in regions assigned to beta-sheet, alpha-helix, turn, and random structures. The infrared-based secondary structure compositions were used as constraints to compute theoretical locations for the different secondary structures along the amino acid sequence of the FP protein. Two putative calcium-binding sites were proposed for FP (residues 93-109 and 150-168) as well as other members of the pentraxin family on the basis of the theoretical secondary structure predictions and the similarity in sequence between the pentraxins and EF-hand calcium-binding proteins. The changes in protein conformation detected upon binding of calcium and phosphorylcholine provide a mechanism for the effects of these ligands on physiologically important properties of the protein, e.g., activation of complement and association with amyloids.

4.
Synonymous constraint elements (SCEs) are protein-coding genomic regions with very low synonymous mutation rates believed to carry additional, overlapping functions. Thousands of such potentially multi-functional elements were recently discovered by analyzing the levels and patterns of evolutionary conservation in human coding exons. These elements provide a good opportunity to improve our understanding of how the redundant nature of the genetic code is exploited in the cell. Our premise is that the protein segments encoded by such elements might better comply with the increased functional demands if they are structurally less constrained (i.e. intrinsically disordered). To test this idea, we investigated the protein segments encoded by SCEs with computational tools to describe the underlying structural properties. In addition to SCEs, we examined the level of disorder, secondary structure, and sequence complexity of protein regions overlapping with experimentally validated splice regulatory sites. We show that multi-functional gene regions translate into protein segments that are significantly enriched in structural disorder and compositional bias, while they are depleted in secondary structure and domain annotations compared to reference segments of similar lengths. This tendency suggests that relaxed protein structural constraints provide an advantage when accommodating multiple overlapping functions in coding regions.

5.
Influenza A is a negative sense RNA virus of significant public health concern. While much is understood about the life cycle of the virus, knowledge of RNA secondary structure in influenza A virus is sparse. Predictions of RNA secondary structure can focus experimental efforts. The present study analyzes coding regions of the eight viral genome segments in both the (+) and (-) sense RNA for conserved secondary structure. The predictions are based on identifying regions of unusual thermodynamic stabilities and are correlated with studies of suppression of synonymous codon usage (SSCU). The results indicate that secondary structure is favored in the (+) sense influenza RNA. Twenty regions with putative conserved RNA structure have been identified, including two previously described structured regions. Of these predictions, eight have high thermodynamic stability and SSCU, with five of these corresponding to current annotations (e.g., splice sites), while the remaining 12 are predicted by the thermodynamics alone. Secondary structures with high conservation of base-pairing are proposed within the five regions having known function. A combination of thermodynamics, amino acid and nucleotide sequence comparisons along with SSCU was essential for revealing potential secondary structures.

6.
7.
8.
9.
10.
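Recognition of splice sites in human 5' untranslated regions based on support vector machines   (cited 5 times in total: 0 self-citations, 5 by others)
Recognizing splice sites in the non-coding regions of genes is a very challenging problem in gene finding, especially for splice sites in the 5' untranslated region (5' UTR). Unlike ordinary splice sites, splice sites in the 5' UTR are not flanked by a state transition from coding to non-coding sequence, so conventional splice-site recognition algorithms perform poorly in untranslated regions. This paper uses a support vector machine (SVM) based method to recognize splice sites in the 5' UTR. To improve recognition accuracy, it adopts a kernel-parameter selection method based on a matrix similarity measure, which determines suitable kernel parameters simply and quickly and thereby improves the recognition performance of the kernel. Experiments show that the SVM with the selected parameters recognizes 5' UTR splice sites well.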
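
As a rough illustration of choosing a kernel parameter with a matrix similarity measure, the sketch below scores each candidate RBF width by kernel-target alignment, i.e. the cosine similarity between the Gram matrix and the ideal matrix built from the labels. The abstract does not say which similarity measure the paper uses, so kernel-target alignment is an assumption here, and the numeric feature vectors, labels, and candidate values are hypothetical placeholders rather than encoded splice-site sequences.

```python
import math


def rbf_kernel_matrix(X, gamma):
    """Gram matrix with K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(xi, xj))) for xj in X]
        for xi in X
    ]


def frobenius_inner(A, B):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))


def kernel_target_alignment(K, y):
    """Cosine similarity between K and the ideal kernel y y^T (labels +1 / -1)."""
    ideal = [[yi * yj for yj in y] for yi in y]
    norm = math.sqrt(frobenius_inner(K, K) * frobenius_inner(ideal, ideal))
    return frobenius_inner(K, ideal) / norm


def select_gamma(X, y, candidates):
    """Return the candidate gamma whose Gram matrix aligns best with the labels."""
    return max(
        candidates,
        key=lambda g: kernel_target_alignment(rbf_kernel_matrix(X, g), y),
    )


# Hypothetical toy data: two well-separated classes in two dimensions.
X = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
y = [1, 1, -1, -1]
best_gamma = select_gamma(X, y, [0.01, 0.1, 1.0, 10.0, 100.0])
print(best_gamma)
```

The appeal of such a criterion is that it needs only the Gram matrix and the labels, so no SVM has to be trained for each candidate value; the chosen parameter would then be passed to whatever SVM implementation is in use.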

11.

Introduction

The Hepatitis B Virus (HBV) genome contains four ORFs, S (surface), P (polymerase), C (core) and X. S is completely overlapped by P and as a consequence the overlapping region is subject to distinctive evolutionary constraints compared to the remainder of the genome. Specifically, a non-synonymous substitution in one coding frame may produce a synonymous substitution in the alternative frame, suggesting a possible conflict between requirements for diversifying and purifying forces. To examine how these contrasting requirements are balanced within this region, we investigated the relationship amongst positive selection sites, conserved regions, epitopes and elements of protein structure to consider how HBV balances the contrasting evolutionary pressures.

Methodology/Results

323 HBV genotype D genome sequences were collected and analyzed to identify sites under positive selection and highly conserved regions. Epitope sequences were retrieved from previously published experimental studies stored in the Immune Epitope Database. Predicted secondary structures were used to investigate the association between structure and conservation. Entropy was used as a measure of conservation and bivariate logistic regression was used to investigate the relationship between positive selection/conserved sites and epitope/secondary structure regions. Our results indicate: (i) conservation in S is primarily dictated by α-helix elements in the protein structure, (ii) variable residues are mainly located in PreS, the major hydrophilic region (MHR) and the C-terminus, (iii) epitopes in S, which are directly targeted by the host immune system, are significantly associated with sites under positive selection.

Conclusions

The highly variable spacer domain in P, which corresponds to PreS in S, appears to act as a harbor for the accumulation of mutations that can provide flexibility for conformational changes and responding to immune pressure.
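
The use of entropy as a conservation measure in the entry above can be sketched as per-column Shannon entropy over a multiple sequence alignment, where low entropy means a conserved site. The abstract does not give the exact formula or gap handling the authors used, so the choices below (base-2 logarithm, gaps ignored) are assumptions, and the four-sequence alignment is a made-up toy example.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of one alignment column; gap characters are ignored."""
    residues = [c for c in column if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def alignment_entropy(sequences):
    """Per-column entropy for a list of equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*sequences)]


# Hypothetical toy alignment: columns 0 and 2 are fully conserved.
toy_alignment = ["MENI", "MENT", "MQNA", "MENS"]
entropies = alignment_entropy(toy_alignment)
print([round(h, 3) for h in entropies])
```

Columns with entropy near zero would be candidates for "highly conserved" sites, while the last column, where every sequence differs, reaches the maximum of 2 bits for four sequences.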

12.
Riboflavin, an essential cofactor for all organisms, is biosynthesized in plants, fungi and microorganisms. The penultimate step in the pathway is catalyzed by the enzyme lumazine synthase. One of the most distinctive characteristics of this enzyme is that it is found in different species in two different quaternary structures, pentameric and icosahedral, built from practically the same structural monomeric unit. In fact, the icosahedral structure is best described as a capsid of twelve pentamers. Despite this noticeable difference, the active sites are virtually identical in all structurally studied members. Furthermore, the main regions involved in the catalysis are located at the interface between adjacent subunits in the pentamer. Thus, the two quaternary forms of the enzyme must meet similar structural requirements to achieve their function, but, at the same time, they should differ in the sequence traits responsible for the different quaternary structures observed. Here, we present a combined analysis that includes sequence-structure and evolutionary studies to find the sequence determinants of the different quaternary assemblies of this enzyme. A data set containing 86 sequences of the lumazine synthase family was recovered by sequence similarity searches. Seven of them had resolved three-dimensional structures. A subsequent phylogenetic reconstruction by maximum parsimony (MP) allowed division of the total set into two clusters in accord with their quaternary structure. The comparison between the patterns of three-dimensional contacts derived from the known three-dimensional structures and variation in sequence conservation revealed a significant shift in structural constraints of certain positions. Also, to explore the changes in functional constraints between the two groups, site-specific evolutionary rate shifts were analyzed. We found that the positions involved in icosahedral contacts suffer a larger increase in constraints than the rest. 
We found eight sequence sites that would be the most important icosahedral sequence determinants. We discuss our results and compare them with previous work. These findings should contribute to refinement of the current structural data, to the design of assays that explore the role of these positions, to the structural characterization of new sequences, and to initiation of a study of the underlying evolutionary mechanisms.

13.
14.
Developmental changes in the levels of N-methyl-D-aspartate (NMDA) receptor subunit mRNAs were identified in rat brain using solution hybridization/RNase protection assays. Pronounced increases in the levels of mRNAs encoding NR1 and NR2A were seen in the cerebral cortex, hippocampus, and cerebellum between postnatal days 7 and 20. In cortex and hippocampus, the expression of NR2B mRNA was high in neonatal rats and remained relatively constant over time. In contrast, in cerebellum, the level of NR2B mRNA was highest at postnatal day 1 and declined to undetectable levels by postnatal day 28. NR2C mRNA was not detectable in cerebellum before postnatal day 11, after which it increased to reach adult levels by postnatal day 28. In cortex, the expression of NR2A and NR2B mRNAs corresponds to the previously described developmental profile of NMDA receptor subtypes having low and high affinities for ifenprodil, i.e., a delayed expression of NR2A correlating with the late expression of low-affinity ifenprodil sites. In cortex and hippocampus, the predominant splice variants of NR1 were those without the 5' insert and with or without both 3' inserts. In cerebellum, however, the major NR1 variants were those containing the 5' insert and lacking both 3' inserts. The results show that the expression of NR1 splice variants and NR2 subunits is differentially regulated in various brain regions during development. Changes in subunit expression are likely to underlie some of the changes in the functional and pharmacological properties of NMDA receptors that occur during development.

15.
Comparative analysis is one of the most powerful methods available for understanding the diverse and complex systems found in biology, but it is often limited by a lack of comprehensive taxonomic sampling. Despite the recent development of powerful genome technologies capable of producing sequence data in large quantities (witness the recently completed first draft of the human genome), there has been relatively little change in how evolutionary studies are conducted. The application of genomic methods to evolutionary biology is a challenge, in part because gene segments from different organisms are manipulated separately, requiring individual purification, cloning, and sequencing. We suggest that a feasible approach to collecting genome-scale data sets for evolutionary biology (i.e., evolutionary genomics) may consist of combination of DNA samples prior to cloning and sequencing, followed by computational reconstruction of the original sequences. This approach will allow the full benefit of automated protocols developed by genome projects to be realized; taxon sampling levels can easily increase to thousands for targeted genomes and genomic regions. Sequence diversity at this level will dramatically improve the quality and accuracy of phylogenetic inference, as well as the accuracy and resolution of comparative evolutionary studies. In particular, it will be possible to make accurate estimates of normal evolution in the context of constant structural and functional constraints (i.e., site-specific substitution probabilities), along with accurate estimates of changes in evolutionary patterns, including pairwise coevolution between sites, adaptive bursts, and changes in selective constraints. These estimates can then be used to understand and predict the effects of protein structure and function on sequence evolution and to predict unknown details of protein structure, function, and functional divergence. 
In order to demonstrate the practicality of these ideas and the potential benefit for functional genomic analysis, we describe a pilot project we are conducting to simultaneously sequence large numbers of vertebrate mitochondrial genomes.

16.
When aligning RNAs, it is important to consider both the secondary structure similarity and primary sequence similarity to find an accurate alignment. However, algorithms that can handle RNA secondary structures typically have high computational complexity that limits their utility. For this reason, there have been a number of attempts to find useful alignment constraints that can reduce the computations without sacrificing the alignment accuracy. In this paper, we propose a new method for finding effective alignment constraints for fast and accurate structural alignment of RNAs, including pseudoknots. In the proposed method, we use a profile-HMM to identify the “seed” regions that can be aligned with high confidence. We also estimate the position range of the aligned bases that are located outside the seed regions. The location of the seed regions and the estimated range of the alignment positions are then used to establish the sequence alignment constraints. We incorporated the proposed constraints into the profile context-sensitive HMM (profile-csHMM) based RNA structural alignment algorithm. Experiments indicate that the proposed method can make the alignment speed up to 11 times faster without degrading the accuracy of the RNA alignment.

17.
RNA secondary structure plays a central role in the replication and metabolism of all RNA viruses, including retroviruses like HIV-1. However, structures with known function represent only a fraction of the secondary structure reported for HIV-1 NL4-3. One tool to assess the importance of RNA structures is to examine their conservation over evolutionary time. To this end, we used SHAPE to model the secondary structure of a second primate lentiviral genome, SIVmac239, which shares only 50% sequence identity at the nucleotide level with HIV-1 NL4-3. Only about half of the paired nucleotides are paired in both genomic RNAs and, across the genome, just 71 base pairs form with the same pairing partner in both genomes. On average the RNA secondary structure is thus evolving at a much faster rate than the sequence. Structure at the Gag-Pro-Pol frameshift site is maintained but in a significantly altered form, while the impact of selection for maintaining a protein binding interaction can be seen in the conservation of pairing partners in the small RRE stems where Rev binds. Structures that are conserved between SIVmac239 and HIV-1 NL4-3 also occur at the 5′ polyadenylation sequence, in the plus strand primer sites, PPT and cPPT, and in the stem-loop structure that includes the first splice acceptor site. The two genomes are adenosine-rich and cytidine-poor. The structured regions are enriched in guanosines, while unpaired regions are enriched in adenosines, and functionally important structures have stronger base pairing than nonconserved structures. We conclude that much of the secondary structure is the result of fortuitous pairing in a metastable state that reforms during sequence evolution. However, secondary structure elements with important function are stabilized by higher guanosine content that allows regions of structure to persist as sequence evolution proceeds, and, within the confines of selective pressure, allows structures to evolve.

18.
Sequence-specific nuclear magnetic resonance (NMR) assignments have been determined for the peptide alphaS2-CN(2-20) containing the multiphosphorylated motif -8Ser(P)-Ser(P)-Ser(P)-Glu-Glu12- in the presence of molar excess Ca2+. The secondary structure of the peptide was characterized by sequential (i,i + 1), medium-range (i,i + 2/3/4) nOes and H alpha chemical shifts. Molecular modelling of the peptide based on these constraints suggests a nascent helix for residues Ser(P)9 to Glu12. The spectral data for alphaS2-CN(2-20) were compared with those of other casein phosphopeptides beta-CN(1-25) and alphaS1-CN(59-79) that also contain the multiphosphorylated motif. This comparison revealed a similar pattern of secondary amide chemical shifts in the multiphosphorylated motif. However, the patterns of medium-range nOe connectivities in the three peptides suggest they have distinctly different conformations in the presence of Ca2+ despite having a high degree of sequential similarity.

19.
Measuring the (dis)similarity between RNA secondary structures is critical for the study of RNA secondary structures and has implications for RNA functional characterization. Although a number of methods have been developed for comparing RNA structural similarities, their applications have been limited by the complexity of the required computation. In this paper, we present a novel method for comparing the similarity of RNA secondary structures generated from the same RNA sequence, i.e., a secondary structure ensemble, using a matrix representation of the RNA structures. Relevant features of the RNA secondary structures can be easily extracted through singular value decomposition (SVD) of the representing matrices. We have mapped the feature vectors of the singular values to a kernel space, where (dis)similarities among the mapped feature vectors become more evident, making clustering of RNA secondary structures easier to handle. The pair-wise comparison of RNA structures is achieved through computing the distance between the singular value vectors in the kernel space. We have applied a fuzzy kernel clustering method, using this similarity metric, to cluster the RNA secondary structure ensembles. Our application results suggest that our fuzzy kernel clustering method is highly promising for classifications of RNA structure ensembles, because of its low computational complexity and high clustering accuracy.

20.