Similar literature
Found 20 similar articles (search time: 15 ms)
1.
The coexistence of multiple codes in the genome of human immunodeficiency virus type 1 (HIV-1) was analyzed. We explored factors constraining the variability of the virus genome primarily in relation to conserved RNA secondary structures overlapping coding sequences, and used a simple combination of algorithms for RNA secondary structure prediction based on the nearest-neighbor thermodynamic rules and a statistical approach. In our previous study, we applied this combination to a non-redundant data set of env nucleotide sequences, confirmed the conserved secondary structure of the rev-responsive element (RRE) and found a new RNA structure in the first conserved (C1) region of the env gene. In this study, we analyzed the variability of putative RNA secondary structures inside the nef gene of HIV-1 by applying these algorithms to a non-redundant data set of 104 nef sequences retrieved from the Los Alamos HIV database, and predicted the existence of a novel functional RNA secondary structure in the β3/β4 regions of nef. The predicted RNA fold in the β3/β4 region of nef appears in two forms with different loop sizes. The loop of the first fold consists of seven nucleotides (positions 494–500), with consensus UCAAGCU appearing in 79% of sequences. The other has a five-base loop (positions 495–499) with consensus CAAGC. The difference in size between these two loops may reflect the difference between respective counterparts in the hairpin recognition. This may also have an adaptive biological significance.

2.
The functional structure of all biologically active molecules is dependent on intra- and inter-molecular interactions. This is especially evident for RNA molecules whose functionality, maturation, and regulation require formation of correct secondary structure through encoded base-pairing interactions. Unfortunately, intra- and inter-molecular base-pairing information is lacking for most RNAs. Here, we marry classical nuclease-based structure mapping techniques with high-throughput sequencing technology to interrogate all base-paired RNA in Arabidopsis thaliana and identify ∼200 new small (sm)RNA–producing substrates of RNA–DEPENDENT RNA POLYMERASE6. Our comprehensive analysis of paired RNAs reveals conserved functionality within introns and both 5′ and 3′ untranslated regions (UTRs) of mRNAs, as well as a novel population of functional RNAs, many of which are the precursors of smRNAs. Finally, we identify intra-molecular base-pairing interactions to produce a genome-wide collection of RNA secondary structure models. Although our methodology reveals the pairing status of RNA molecules in the absence of cellular proteins, previous studies have demonstrated that structural information obtained for RNAs in solution accurately reflects their structure in ribonucleoprotein complexes. Furthermore, our identification of RNA–DEPENDENT RNA POLYMERASE6 substrates and conserved functional RNA domains within introns and both 5′ and 3′ untranslated regions (UTRs) of mRNAs using this approach strongly suggests that RNA molecules are correctly folded into their secondary structure in solution. Overall, our findings highlight the importance of base-paired RNAs in eukaryotes and present an approach that should be widely applicable for the analysis of this key structural feature of RNA.

3.
Rotaviruses are a major cause of acute, often fatal, gastroenteritis in infants and young children world-wide. Virions contain an 11 segment double-stranded RNA genome. Little is known about the cis-acting sequences and structural elements of the viral RNAs. Using a database of 1621 full-length sequences of mammalian group A rotavirus RNA segments, we evaluated the codon, sequence and RNA structural conservation of the complete genome. Codon conservation regions were found in eight ORFs, suggesting the presence of functional RNA elements. Using ConStruct and RNAz programmes, we identified conserved secondary structures in the positive-sense RNAs including long-range interactions (LRIs) at the 5′ and 3′ terminal regions of all segments. In RNA9, two mutually exclusive structures were observed suggesting a switch mechanism between a conserved terminal LRI and an independent 3′ stem–loop structure. In RNA6, a conserved stem–loop was found in a region previously reported to have translation enhancement activity. Biochemical structural analysis of RNA11 confirmed the presence of terminal LRIs and two internal helices with high codon and sequence conservation. These extensive in silico and in vitro analyses provide evidence of the conservation, complexity, multi-functionality and dynamics of rotavirus RNA structures which likely influence RNA replication, translation and genome packaging.

4.
Messenger RNA (mRNA) processing plays important roles in gene expression in all domains of life. A number of cases of mRNA cleavage have been documented in Archaea, but available data are fragmentary. We have examined RNAs present in Methanocaldococcus (Methanococcus) jannaschii for evidence of RNA processing upstream of protein-coding genes. Of 123 regions covered by the data, 31 were found to be processed, with 30 including a cleavage site 12–16 nucleotides upstream of the corresponding translation start site. Analyses with 3′-RACE (rapid amplification of cDNA ends) and 5′-RACE indicate that the processing is endonucleolytic. Analyses of the sequences surrounding the processing sites for functional sites, sequence motifs, or potential RNA secondary structure elements did not reveal any recurring features except for an AUG translation start codon and (in most cases) a ribosome binding site. These properties differ from those of all previously described mRNA processing systems. Our data suggest that the processing alters the representation of various genes in the RNA pool and may therefore play a significant role in defining the balance of proteins in the cell.

5.
A bacterial RNA functioning as both tRNA and mRNA, transfer-messenger RNA (tmRNA) rescues stalled ribosomes and clears the cell of incomplete polypeptides. For function, Escherichia coli tmRNA requires an elaborate interplay between a tRNA-like structure and an internal mRNA domain that are connected by a 295 nt long compact secondary structure. The tRNA-like structure is surrounded by 16 unpaired nt, including 10 residues that are >95% conserved among the known 140 tmRNA sequences. All these residues were mutated to define their putative role(s) in trans-translation. Both the extent of aminoacylation and the alanine incorporation into the tag sequence, reflecting the two functions of tmRNA, were measured in vitro for all variants. As anticipated from the low sequence conservation, mutating positions 8–12 and position 15 affects neither aminoacylation nor protein tagging. Mutating a set of two conserved positions 13 and 14 abolishes both functions. Probing the solution conformation indicates that this defective mutant adopts an alternate conformation of its acceptor stem that can no longer be aminoacylated and is thus inactive in protein tagging. Selected point mutations at the conserved nucleotide stretches 16–20 and 333–335 seriously impair protein tagging with only minor changes in their solution conformations and aminoacylation. Point mutations at conserved positions 19 and 334 abolish trans-translation and 70S ribosome binding, although retaining nearly normal aminoacylation capacities. Two proteins that are known to interact with tmRNA were purified, and their interactions with the defective RNA variants were examined in vitro. Based on phylogenetic and functional data, an additional structural motif consisting of a quartet composed of non-Watson–Crick base pairs 5′-YGAC-3′:5′-GGAC-3′ involving some of the conserved nucleotides next to the tRNA-like portion is proposed. Overall, the highly conserved nucleotides around the tRNA-like portion are maintained for both structural and functional requirements during evolution.

6.
Single-stranded regions in RNA secondary structure are important for RNA–RNA and RNA–protein interactions. We present a probability profile approach for the prediction of these regions based on a statistical algorithm for sampling RNA secondary structures. For the prediction of phylogenetically-determined single-stranded regions in secondary structures of representative RNA sequences, the probability profile offers substantial improvement over the minimum free energy structure. In designing antisense oligonucleotides, a practical problem is how to select a secondary structure for the target mRNA from the optimal structure(s) and many suboptimal structures with similar free energies. By summarizing the information from a statistical sample of probable secondary structures in a single plot, the probability profile not only presents a solution to this dilemma, but also reveals ‘well-determined’ single-stranded regions through the assignment of probabilities as measures of confidence in predictions. In antisense application to the rabbit β-globin mRNA, a significant correlation between hybridization potential predicted by the probability profile and the degree of inhibition of in vitro translation suggests that the probability profile approach is valuable for the identification of effective antisense target sites. Coupling computational design with DNA–RNA array technique provides a rational, efficient framework for antisense oligonucleotide screening. This framework has the potential for high-throughput applications to functional genomics and drug target validation.
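The probability-profile idea in entry 6 can be illustrated with a minimal Python sketch. It assumes the sampled secondary structures are already available as equal-length dot-bracket strings (the sampling algorithm itself is not reproduced here), and it scores each window by the fraction of sampled structures in which every base of the window is unpaired. The function name `probability_profile`, the `width` parameter, and the toy sample are illustrative assumptions, not details taken from the abstract.

```python
from typing import List


def probability_profile(structures: List[str], width: int = 1) -> List[float]:
    """For each window start, return the fraction of sampled structures
    in which all `width` consecutive bases are unpaired ('.').

    Assumption: structures are equal-length dot-bracket strings drawn from
    some statistical sample; how they were sampled is outside this sketch.
    """
    if not structures:
        raise ValueError("need at least one sampled structure")
    length = len(structures[0])
    if any(len(s) != length for s in structures):
        raise ValueError("all structures must have the same length")
    if not 1 <= width <= length:
        raise ValueError("width must be between 1 and the sequence length")

    profile = []
    for start in range(length - width + 1):
        # Count structures whose whole window is single-stranded.
        window_open = sum(
            1 for s in structures
            if all(ch == "." for ch in s[start:start + width])
        )
        profile.append(window_open / len(structures))
    return profile


# Hypothetical toy sample: four structures for a 9-nt sequence.
SAMPLE = ["((...))..", "((...))..", ".(...)...", "(((.))).."]

per_base = probability_profile(SAMPLE, width=1)
per_triplet = probability_profile(SAMPLE, width=3)
print(per_base)
print(per_triplet)
```

In this toy sample, the window covering positions 2 to 4 (0-based) is unpaired in three of the four structures, so it would rank as the most accessible 3-nt site. A real application would use a much larger sample and a window width matched to the oligonucleotide length.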

7.
It is important to control CRISPR/Cas9 when sufficient editing is obtained. In the current study, rational engineering of guide RNAs (gRNAs) is performed to develop small-molecule-responsive CRISPR/Cas9. For our purpose, the sequences of gRNAs are modified to introduce ligand binding sites based on the rational design of ligand–RNA pairs. Using short target sequences, we demonstrate that the engineered RNA provides an excellent scaffold for binding small molecule ligands. Although the ‘stem–loop 1’ variants of gRNA induced variable cleavage activity for different target sequences, all ‘stem–loop 3’ variants are well tolerated for CRISPR/Cas9. We further demonstrate that this specific ligand–RNA interaction can be utilized for functional control of CRISPR/Cas9 in vitro and in human cells. Moreover, chemogenetic control of gene editing in human cells transfected with all-in-one plasmids encoding Cas9 and designer gRNAs is demonstrated. The strategy may become a general approach for generating switchable RNA or DNA for controlling other biological processes.

8.
We use evolutionary conservation derived from structure alignment of polypeptide sequences along with structural and physicochemical attributes of protein–RNA interfaces to probe the binding hot spots at protein–RNA recognition sites. We find that the degree of conservation varies across the RNA binding proteins; some evolve rapidly compared to others. Additionally, irrespective of the structural class of the complexes, residues at the RNA binding sites are evolutionarily better conserved than those at the solvent exposed surfaces. For recognitions involving duplex RNA, residues interacting with the major groove are better conserved than those interacting with the minor groove. We identify multi-interface residues participating simultaneously in protein–protein and protein–RNA interfaces in complexes where more than one polypeptide is involved in RNA recognition, and show that they are better conserved compared to any other RNA binding residues. We find that the residues at water preservation sites are better conserved than those at hydrated or at dehydrated sites. Finally, we develop a Random Forests model using structural and physicochemical attributes for predicting binding hot spots. The model accurately predicts 80% of the instances of experimental ΔΔG values in a particular class, and provides a stepping-stone towards the engineering of protein–RNA recognition sites with desired affinity.

9.
10.
RNA is known to be involved in several cellular processes; however, it is only active when it is folded into its correct 3D conformation. The folding, bending and twisting of an RNA molecule is dependent upon the multitude of canonical and non-canonical secondary structure motifs. These motifs contribute to the structural complexity of RNA but also serve important integral biological functions, such as serving as recognition and binding sites for other biomolecules or small ligands. One of the most prevalent types of RNA secondary structure motif is the single mismatch, which occurs when two canonical pairs are separated by a single non-canonical pair. To determine sequence–structure relationships and to identify structural patterns, we have systematically located, annotated and compared all available occurrences of the 30 most frequently occurring single mismatch-nearest neighbor sequence combinations found in experimentally determined 3D structures of RNA-containing molecules deposited into the Protein Data Bank. Hydrogen bonding, stacking and interaction of nucleotide edges for the mismatched and nearest neighbor base pairs are described and compared, allowing for the identification of several structural patterns. Such a database and comparison will allow researchers to gain insight into the structural features of unstudied sequences and to quickly look up studied sequences.

11.
Currently there is no successful computational approach for identification of genes encoding novel functional RNAs (fRNAs) in genomic sequences. We have developed a machine learning approach using neural networks and support vector machines to extract common features among known RNAs for prediction of new RNA genes in the unannotated regions of prokaryotic and archaeal genomes. The Escherichia coli genome was used for development, but we have applied this method to several other bacterial and archaeal genomes. Networks based on nucleotide composition were 80–90% accurate in jackknife testing experiments for bacteria and 90–99% for hyperthermophilic archaea. We also achieved a significant improvement in accuracy by combining these predictions with those obtained using a second set of parameters consisting of known RNA sequence motifs and the calculated free energy of folding. Several known fRNAs not included in the training datasets were identified as well as several hundred predicted novel RNAs. These studies indicate that there are many unidentified RNAs in simple genomes that can be predicted computationally as a precursor to experimental study. Public access to our RNA gene predictions and an interface for user predictions are available via the web.

12.
13.
In allostery, a binding event at one site in a protein modulates the behavior of a distant site. Identifying residues that relay the signal between sites remains a challenge. We have developed predictive models using support-vector machines, a widely used machine-learning method. The training data set consisted of residues classified as either hotspots or non-hotspots based on experimental characterization of point mutations from a diverse set of allosteric proteins. Each residue had an associated set of calculated features. Two sets of features were used, one consisting of dynamical, structural, network, and informatic measures, and another of structural measures defined by Daily and Gray [1]. The resulting models performed well on an independent data set consisting of hotspots and non-hotspots from five allosteric proteins. For the independent data set, our top 10 models using Feature Set 1 recalled 68–81% of known hotspots, and among total hotspot predictions, 58–67% were actual hotspots. Hence, these models have precision P = 58–67% and recall R = 68–81%. The corresponding models for Feature Set 2 had P = 55–59% and R = 81–92%. We combined the features from each set that produced models with optimal predictive performance. The top 10 models using this hybrid feature set had R = 73–81% and P = 64–71%, the best overall performance of any of the sets of models. Our methods identified hotspots in structural regions of known allosteric significance. Moreover, our predicted hotspots form a network of contiguous residues in the interior of the structures, in agreement with previous work. In conclusion, we have developed models that discriminate between known allosteric hotspots and non-hotspots with high accuracy and sensitivity. Moreover, the pattern of predicted hotspots corresponds to known functional motifs implicated in allostery, and is consistent with previous work describing sparse networks of allosterically important residues.
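Entry 13 reports precision P (the share of predicted hotspots that are real) and recall R (the share of real hotspots that were predicted). The short Python sketch below shows how the two numbers are computed from sets of residue indices. The function name `precision_recall` and the residue numbers are hypothetical and are not data from the study.

```python
from typing import Set, Tuple


def precision_recall(predicted: Set[int], actual: Set[int]) -> Tuple[float, float]:
    """Return (precision, recall) for predicted vs. experimentally known hotspots."""
    true_positives = len(predicted & actual)
    # Precision: of everything predicted as a hotspot, how much is real?
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: of all real hotspots, how many did the model find?
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical residue numbers, for illustration only.
known_hotspots = {10, 25, 31, 47}
predicted_hotspots = {10, 25, 31, 52, 60}

p, r = precision_recall(predicted_hotspots, known_hotspots)
print(f"P = {p:.0%}, R = {r:.0%}")
```

Here three of the five predictions are correct (P = 60%) and three of the four known hotspots are recovered (R = 75%), which shows why a model can trade one measure against the other, as the two feature sets in the abstract do.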

14.
15.
Many well-characterized examples of antisense RNAs from prokaryotic systems involve hybridization of the looped regions of stem–loop RNAs, presumably due to the high thermodynamic stability of the resulting loop–loop and loop–linear interactions. In this study, the identification of RNA stem–loops that inhibit U1A protein binding to the hpII RNA through RNA–RNA interactions was attempted using a bacterial reporter system based on phage λ N-mediated antitermination. As a result, loop sequences possessing 7–8 base complementarity to the 5′ region of the boxA element important for functional antitermination complex formation, but not the U1 hpII loop, were identified. In vitro and in vivo mutational analysis strongly suggested that the selected loop sequences were binding to the boxA region, and that the structure of the antisense stem–loop was important for optimal inhibitory activity. Next, in an attempt to demonstrate the ability to inhibit the interaction between the U1A protein and the hpII RNA, the rational design of an RNA stem–loop that inhibits U1A-binding to a modified hpII was carried out. Moderate inhibitory activity was observed, showing that it is possible to design and select antisense RNA stem–loops that disrupt various types of RNA–protein interactions.

16.
RNA function is determined by its structural organization. The RNA structure consists of the combination of distinct secondary structure motifs connected by junctions that play an essential role in RNA folding. Selective 2′-hydroxyl acylation analyzed by primer extension (SHAPE) probing is an established methodology to analyze the secondary structure of long RNA molecules in solution, which provides accurate data about unpaired nucleotides. However, the residues located at the junctions of RNA structures usually remain undetected. Here we report an RNA probing method based on the use of a novel open-paddlewheel diruthenium (OPW-Ru) compound [Ru2Cl2(µ-DPhF)3(DMSO)] (DPhF = N,N′-diphenylformamidinate). This compound has four potential coordination sites in a singular disposition to establish covalent bonds with substrates. As a proof of concept, we have analyzed the reactivity of OPW-Ru toward RNA using two viral internal ribosome entry site (IRES) elements whose function depends on the structural organization of the molecule. Our study suggests that the compound OPW-Ru preferentially attacks at positions located one or two nucleotides away from junctions or bulges of the RNA structure. The OPW-Ru fingerprinting data differ from those obtained with other chemical reagents and provide new information about RNA structure features.

17.
Clinical usage of lentiviral vectors is now established and increasing but remains constrained by vector titer with RNA packaging being a limiting factor. Lentiviral vector RNA is packaged through specific recognition of the packaging signal on the RNA by the viral structural protein Gag. We investigated structurally informed modifications of the 5′ leader and gag RNA sequences in which the extended packaging signal lies, to attempt to enhance the packaging process by facilitating vector RNA dimerization, a process closely linked to packaging. We used in-gel SHAPE to study the structures of these mutants in an attempt to derive structure-function correlations that could inform optimized vector RNA design. In-gel SHAPE of both dimeric and monomeric species of RNA revealed a previously unreported direct interaction between the U5 region of the HIV-1 leader and the downstream gag sequences. Our data suggest a structural equilibrium exists in the dimeric viral RNA between a metastable structure that includes a U5–gag interaction and a more stable structure with a U5–AUG duplex. Our data provide clarification for the previously unexplained requirement for the 5′ region of gag in enhancing genomic RNA packaging and provide a basis for design of optimized HIV-1 based vectors.

18.
The rapid evolution of RNA viruses has been long considered to result from a combination of high copying error frequencies during RNA replication, short generation times and the consequent extensive fixation of neutral or adaptive changes over short periods. While both the identities and sites of mutations are typically modelled as being random, recent investigations of sequence diversity of SARS coronavirus 2 (SARS-CoV-2) have identified a preponderance of C->U transitions, proposed to be driven by an APOBEC-like RNA editing process. The current study investigated whether this phenomenon could be observed in datasets of other RNA viruses. Using a 5% divergence filter to infer directionality, 18 of 36 datasets of aligned coding region sequences from a diverse range of mammalian RNA viruses (including Picornaviridae, Flaviviridae, Matonaviridae, Caliciviridae and Coronaviridae) showed a >2-fold base composition normalised excess of C->U transitions compared to U->C (range 2.1x–7.5x), with a consistently observed favoured 5′ U upstream context. The presence of genome scale RNA secondary structure (GORS) was the only other genomic or structural parameter significantly associated with C->U/U->C transition asymmetries by multivariable analysis (ANOVA), potentially reflecting RNA structure dependence of sites targeted for C->U mutations. Using the association index metric, C->U changes were specifically over-represented at phylogenetically uninformative sites, potentially paralleling extensive homoplasy of this transition reported in SARS-CoV-2. Although mechanisms remain to be functionally characterised, excess C->U substitutions accounted for 11–14% of standing sequence variability of structured viruses and may therefore represent a potent driver of their sequence diversification and longer-term evolution.
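The "base composition normalised" C->U/U->C ratio in entry 18 can be made concrete with a small Python sketch. The abstract does not give the formula, so the normalisation used here (each transition count divided by the number of available source bases) is an assumption, as is treating a single reference sequence as the ancestral state. The study's 5% divergence filter is not modelled. The name `transition_asymmetry` and the sequences are hypothetical.

```python
from collections import Counter
from typing import List


def transition_asymmetry(reference: str, variants: List[str]) -> float:
    """Composition-normalised ratio of C->U to U->C changes.

    Assumed formula (not taken from the abstract):
        (n_C>U / n_C) / (n_U>C / n_U)
    where n_C and n_U are base counts in the reference, which is treated
    here as the ancestral sequence.
    """
    ref = reference.upper().replace("T", "U")
    c_to_u = u_to_c = 0
    for seq in variants:
        seq = seq.upper().replace("T", "U")
        if len(seq) != len(ref):
            raise ValueError("variants must be aligned to the reference")
        for ref_base, var_base in zip(ref, seq):
            if ref_base == "C" and var_base == "U":
                c_to_u += 1
            elif ref_base == "U" and var_base == "C":
                u_to_c += 1

    composition = Counter(ref)
    if u_to_c == 0 or composition["C"] == 0 or composition["U"] == 0:
        raise ValueError("ratio undefined without U->C changes and both bases")
    return (c_to_u / composition["C"]) / (u_to_c / composition["U"])


# Hypothetical aligned sequences: the reference has 6 C and 3 U.
REFERENCE = "CCCCCCUUUAAG"
VARIANTS = [
    "UCCCCCUUUAAG",  # one C->U change
    "CUCCCCUUUAAG",  # one C->U change
    "CCCCCCCUUAAG",  # one U->C change
]

ratio = transition_asymmetry(REFERENCE, VARIANTS)
print(ratio)
```

The raw counts are 2 C->U against 1 U->C, but the reference holds twice as many C as U, so the normalised ratio is 1: the apparent excess is explained by composition alone. Values above 2, as reported for 18 datasets, would indicate an excess beyond what composition predicts under this assumed formula.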

19.
20.
Although tetraloops are one of the most frequently occurring secondary structure motifs in RNA, less than one-third of the 30 most frequently occurring RNA tetraloops have been thermodynamically characterized. Therefore, 24 stem–loop sequences containing common tetraloops were optically melted, and the thermodynamic parameters ΔH°, ΔS°, ΔG°37, and TM for each stem–loop were determined. These new experimental values, on average, are 0.7 kcal/mol different from the values predicted for these tetraloops using the model proposed by Vecenie CJ, Morrow CV, Zyra A, Serra MJ. 2006. Biochemistry 45: 1400–1407. The data for the 24 tetraloops reported here were then combined with the data for 28 tetraloops that were published previously. A new model, independent of terminal mismatch data, was derived to predict the free energy contribution of previously unmeasured tetraloops. The average absolute difference between the measured values and the values predicted using this proposed model is 0.4 kcal/mol. These new experimental data and the updated predictive model allow for more accurate calculations of the free energy of RNA stem–loops containing tetraloops and, furthermore, should allow for improved prediction of secondary structure from sequence. It was also shown that tetraloops within the sequence 5′-GCCNNNNGGC-3′ are, on average, 0.6 kcal/mol more stable than the same tetraloop within the sequence 5′-GGCNNNNGCC-3′. More systematic studies are required to determine the full extent of non-nearest-neighbor effects on tetraloop stability.
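The four parameters named in entry 20 are linked by standard thermodynamic relations: ΔG°37 = ΔH° − TΔS° at T = 310.15 K, and, for a unimolecular two-state hairpin transition, TM = ΔH°/ΔS°. The Python sketch below applies both. The ΔH° and ΔS° values are hypothetical and are not measurements from the study, and the two-state assumption may not hold for every stem–loop.

```python
T_37_KELVIN = 310.15  # 37 degrees Celsius in kelvin


def delta_g_37(delta_h_kcal: float, delta_s_eu: float) -> float:
    """Free energy at 37 C in kcal/mol.

    delta_h_kcal: enthalpy change in kcal/mol
    delta_s_eu:   entropy change in cal/(mol*K), converted to kcal below
    """
    return delta_h_kcal - T_37_KELVIN * delta_s_eu / 1000.0


def melting_temp_celsius(delta_h_kcal: float, delta_s_eu: float) -> float:
    """Melting temperature of a unimolecular two-state transition, in Celsius.

    At TM the folded and unfolded states are equally populated, so
    delta G = 0 and TM = delta H / delta S (in kelvin).
    """
    return delta_h_kcal * 1000.0 / delta_s_eu - 273.15


# Hypothetical stem-loop parameters, for illustration only.
dh, ds = -40.0, -110.0

dg = delta_g_37(dh, ds)
tm = melting_temp_celsius(dh, ds)
print(f"dG37 = {dg:.2f} kcal/mol, TM = {tm:.1f} C")
```

With these inputs the hairpin has ΔG°37 of about −5.88 kcal/mol and melts near 90.5 °C. Differences of 0.4 to 0.7 kcal/mol, like those quoted in the abstract, are therefore a meaningful fraction of a typical stem–loop's total stability.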

