Similar literature
 20 similar articles found.
1.
Sequences from ribosomal RNA (rRNA) genes have made a huge contribution to our current understanding of metazoan phylogeny and indeed the phylogeny of all of life. That said, some parts of this rRNA-based phylogeny remain unresolved. One approach to increase the resolution of these trees would be to use more appropriate models of sequence evolution in phylogenetic analysis. RNAs transcribed from rRNA genes have a complex secondary structure mediated by base pairing between sometimes distant regions of the rRNA molecule. The pairing between the stem nucleotides has important consequences for their evolution, which differs from that of unpaired loop nucleotides. These differences in evolution should ideally be accounted for when using rRNA sequences for phylogeny estimation. We use a novel permutation approach to demonstrate the significant superiority of models of sequence evolution that allow stem and loop regions to evolve according to separate models and, in common with previous studies, we show that 16-state models that take base pairing of stems into account are significantly better than simpler, 4-state, single-nucleotide models. One of these 16-state models has been applied to the phylogeny of the Bilateria using small subunit rRNA (SSU) sequences. Our optimal tree largely echoes previous results based on SSU, in particular supporting the tripartite Bilaterian tree of deuterostomes, lophotrochozoans, and ecdysozoans. There are also a number of differences, however, perhaps the most important of which is the observation of a clade consisting of the gastrotrichs plus platyhelminthes that is basal to all other lophotrochozoan taxa. Use of 16-state models also appears to reduce the Bayesian support given to certain biologically improbable groups found using standard 4-state models.

2.
To establish relationships among the genera Blepharocalyx, Luma, and Myrceugenia, the phylogeny of tribe Myrteae was reconstructed based on secondary structures of the sequences of ITS (ITS1-5.8S-ITS2) and ETS regions from 93 taxa belonging to 29 genera. The DNA sequences were aligned according to their secondary structure and analysed with Bayesian inference (BI) using base substitution models for RNA, which can differentiate between the regions consisting of base pairs (helices) and unpaired bases (loops). The best-fit models were RNA7C for ITS and RNA7B for ETS, which differ in how they predict double substitutions and symmetry of the frequency of base pairs. The phylogenetic hypothesis was compared with results obtained using classical 4-state models, the results of maximum parsimony, and using sequence alignment without considering the secondary structure. Analyses show low support along the backbone of the consensus trees, with those using substitution models of paired sites showing more highly supported derived clades than those using substitution models of independent sites. All tests were consistent and differed in the positioning of certain groups, but they also showed that Blepharocalyx, Luma, and Myrceugenia arise from three independent lineages that are not closely related. The substitution model for RNA shows that Luma is a monophyletic group and sister to the other genera of Myrteae, except for Myrtus communis. Myrceugenia appears as a monophyletic group, except for M. fernandeziana, which is related to Blepharocalyx, a genus that appears to be biphyletic.

3.
RNA hairpin loop stability depends on closing base pair.   Total citations: 7 (self-citations: 4, citations by others: 3)
Thermodynamic parameters are reported for hairpin formation in 1 M NaCl by RNA sequences of the type GGXAUAAUAYCC, where X and Y are CG, GC, AU, UA, GU, or UG. A nearest neighbor analysis of the data indicates the free energy change for loop formation at 37 °C, $\Delta G^\circ_{l,37}$, averages 3.4 kcal/mol for hairpin loops closed with C.G, G.C, and G.U pairs. In contrast, $\Delta G^\circ_{l,37}$ averages 4.6 kcal/mol for loops closed with A.U, U.A, or U.G pairs. Thus the stability of an RNA hairpin depends on the closing base pair. The hairpin with a GA mismatch that is formed by GGCGUAAUAGCC is more stable than the corresponding hairpin with an AA mismatch. Thus hairpin stability also depends on loop sequence. These effects are not included in current algorithms for prediction of RNA structure from sequence.

4.
The internal transcribed spacers (ITS) of nuclear ribosomal DNA are widely used for phylogenetic inference. Several characteristics, including the influence of RNA secondary structure on the mutational dynamics of ITS, may affect the accuracy of phylogenies estimated from these regions. Here, we develop RNA secondary structure predictions for representatives of the angiosperm family Myrtaceae. On this basis, we assess the utility of structural (stem vs. loop) partitioning, and RNA-specific (paired-sites) models for a 76-taxon Syzygium alignment, and for a broader, family-wide Myrtaceae ITS data set. We use a permutation approach to demonstrate that structural partitioning significantly improves the likelihood of the data. Similarly, models that account for the non-independence of stem-pairs in RNA structure have a higher likelihood than those that do not. The best-fit RNA models for ITS are those that exclude simultaneous double substitutions in stem-pairs, which suggests an absence of strong selection against non-canonical (G.U/U.G) base-pairs at a high proportion of stem-paired sites. We apply the RNA-specific models to the phylogeny of Syzygium and Myrtaceae and contrast these with hypotheses derived using standard 4-state models. There is little practical difference amongst relationships inferred for Syzygium, although for Myrtaceae there are several differences. The RNA-specific approach finds topologies that are less resolved but are more consistent with conventional views of myrtaceous relationships, compared with the 4-state models.

5.
Vecenie CJ, Morrow CV, Zyra A, Serra MJ. Biochemistry 2006, 45(5):1400-1407
Thermodynamic parameters are reported for hairpin formation in 1 M NaCl by RNA sequences of the types GCGXUAAUYCGC and GGUXUAAUYACC with Watson-Crick loop closure, where XY is the set of 10 possible mismatch base pairs. A nearest-neighbor analysis of the data indicates the free energy of loop formation at 37 °C varies from 3.1 to 5.1 kcal/mol. These results agree with the model previously developed [Vecenie, C. J., and Serra, M. J. (2004) Biochemistry 43, 11813] to predict the stability of RNA hairpin loops: $\Delta G^\circ_{37L}(n) = \Delta G^\circ_{37i}(n) + \Delta G^\circ_{37MM} - 0.8$ (if first mismatch is GA or UU) $- 0.8$ (if first mismatch is GG and loop is closed on the 5' side by a purine). Here, $\Delta G^\circ_{37i}(n)$ is the free energy for initiating a loop of n nucleotides, and $\Delta G^\circ_{37MM}$ is the free energy for the interaction of the first mismatch with the closing base pair. Thermodynamic parameters are also reported for hairpin formation in 1 M NaCl by RNA sequences of the types GACGXUAAUYUGUC and GGUXUAAUYGCC with GU base pair closure, where XY is the set of 10 possible mismatch base pairs. A nearest-neighbor analysis of the data indicates the free energy of loop formation at 37 °C varies from 3.6 to 5.3 kcal/mol. These results allow the development of a model for predicting the stability of hairpin loops closed by GU base pairs: $\Delta G^\circ_{37L}(n)$ (kcal/mol) $= \Delta G^\circ_{37i}(n) - 0.8$ (if the first mismatch is GA) $- 0.8$ (if the first mismatch is GG and the loop is closed on the 5' side by a purine). Note that for these hairpins, the stability of the loops does not depend on $\Delta G^\circ_{37MM}$. For hairpin loops closed by GU base pairs, the $\Delta G^\circ_{37i}(n)$ values, when n = 4, 5, 6, 7, and 8, are 4.9, 5.0, 4.6, 5.0, and 4.8 kcal/mol, respectively. The model gives good agreement when tested against six naturally occurring hairpin sequences.
Thermodynamic values for terminal mismatches adjacent to GC, GU, and UG base pairs are also reported.
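The model for GU-closed hairpin loops quoted in this abstract can be sketched as a small lookup plus two conditional bonuses. This is only an illustrative sketch: the function name, the two-letter mismatch encoding (5' base followed by 3' base), and the rounding to one decimal place are assumptions made here, not part of the cited work. The initiation values are the ones listed in the abstract for n = 4 to 8.

```python
# Loop initiation free energies (kcal/mol) for GU-closed hairpin loops,
# as quoted in the abstract for loop sizes n = 4..8.
GU_LOOP_INITIATION = {4: 4.9, 5: 5.0, 6: 4.6, 7: 5.0, 8: 4.8}
PURINES = {"A", "G"}


def gu_closed_loop_dg(n, first_mismatch, five_prime_closing_base):
    """Estimate the loop free energy at 37 C (kcal/mol) for a GU-closed hairpin.

    n: number of nucleotides in the loop (only 4..8 are tabulated here).
    first_mismatch: hypothetical two-letter encoding of the first mismatch,
        5' base then 3' base (for example "GA").
    five_prime_closing_base: base of the closing pair on the 5' side of the loop.
    """
    dg = GU_LOOP_INITIATION[n]
    if first_mismatch == "GA":
        dg -= 0.8  # bonus for a GA first mismatch
    if first_mismatch == "GG" and five_prime_closing_base in PURINES:
        dg -= 0.8  # bonus for GG when the 5' closing base is a purine
    return round(dg, 1)


print(gu_closed_loop_dg(6, "GA", "U"))
```

Loop sizes outside 4 to 8 raise a `KeyError` on purpose, since the abstract gives no initiation values for them.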

6.
MOTIVATION: Most phylogenetic methods assume that the sequences of nucleotides or amino acids have evolved under stationary, reversible and homogeneous conditions. When these assumptions are violated by the data, there is an increased probability of errors in the phylogenetic estimates. Methods to examine aligned sequences for these violations are available, but they are rarely used, possibly because they are not widely known or because they are poorly understood. RESULTS: We describe and compare the available tests for symmetry of k-dimensional contingency tables from homologous sequences, and develop two new tests to evaluate different aspects of the evolutionary processes. For any pair of sequences, we consider a partition of the test for symmetry into a test for marginal symmetry and a test for internal symmetry. The proposed tests can be used to identify appropriate models for estimation of evolutionary relationships under a Markovian model. Simulations under more or less complex evolutionary conditions were done to display the performance of the tests. Finally, the tests were applied to an alignment of small-subunit ribosomal RNA sequences of five species of bacteria to outline the evolutionary processes under which they evolved. AVAILABILITY: Programs written in R to do the tests on nucleotides are available from http://www.maths.usyd.edu.au/u/johnr/testsym/
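For a pair of aligned sequences, a test of symmetry compares the off-diagonal cells of the 4 x 4 table of aligned nucleotide pairs. A minimal sketch of one standard statistic of this kind, Bowker's matched-pairs test of symmetry, is given below. Whether this is the exact variant used by the authors is an assumption, and the function names are hypothetical; the authors' own programs are the R scripts at the URL above.

```python
from itertools import combinations

BASES = "ACGT"


def divergence_matrix(seq1, seq2):
    """Count aligned nucleotide pairs (a in seq1, b in seq2), skipping gaps and ambiguity codes."""
    counts = {(a, b): 0 for a in BASES for b in BASES}
    for a, b in zip(seq1, seq2):
        if a in BASES and b in BASES:
            counts[(a, b)] += 1
    return counts


def bowker_statistic(seq1, seq2):
    """Return (statistic, degrees_of_freedom) for Bowker's test of symmetry.

    Under symmetry the statistic is approximately chi-square distributed,
    with one degree of freedom per off-diagonal cell pair that has
    a nonzero combined count.
    """
    counts = divergence_matrix(seq1, seq2)
    stat, df = 0.0, 0
    for a, b in combinations(BASES, 2):
        n_ab, n_ba = counts[(a, b)], counts[(b, a)]
        if n_ab + n_ba > 0:
            stat += (n_ab - n_ba) ** 2 / (n_ab + n_ba)
            df += 1
    return stat, df


print(bowker_statistic("AAAACCGT", "CCCAACGT"))
```

The sketch stops at the statistic and its degrees of freedom; turning these into a p-value needs a chi-square survival function from a statistics library.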

7.

Background  

As one of the most widely used parsimony methods for ancestral reconstruction, the Fitch method minimizes the total number of hypothetical substitutions along all branches of a tree to explain the evolution of a character. Due to the extensive usage of this method, it has become a scientific endeavor in recent years to study the reconstruction accuracies of the Fitch method. However, most studies are restricted to 2-state evolutionary models and a study for higher-state models is needed since DNA sequences take the format of 4-state series and protein sequences even have 20 states.
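The bottom-up pass of the Fitch method for a single character can be sketched in a few lines. The tree representation used here (nested 2-tuples with single-character leaf states) and the function name are illustrative assumptions, not taken from the study.

```python
def fitch(tree):
    """Bottom-up Fitch pass on a rooted binary tree for one character.

    A leaf is a string holding its observed state (for example "A");
    an internal node is a 2-tuple (left_subtree, right_subtree).
    Returns (candidate_state_set, minimum_substitution_count).
    """
    if isinstance(tree, str):
        return {tree}, 0
    left, right = tree
    left_set, left_cost = fitch(left)
    right_set, right_cost = fitch(right)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra substitution.
        return common, left_cost + right_cost
    # The children disagree: take the union and count one substitution.
    return left_set | right_set, left_cost + right_cost + 1


states, cost = fitch((("A", "C"), ("A", "G")))
print(states, cost)
```

Nothing in the recursion depends on the alphabet size, so the same code applies to 2-state, 4-state, or 20-state characters.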

8.
Nguyen MT, Schroeder SJ. Biochemistry 2010, 49(49):10574-10581
Consecutive GU pairs at the ends of RNA helices provide significant thermodynamic stability between -1.0 and -3.8 kcal/mol at 37 °C, which is equivalent to approximately 2 orders of magnitude in the value of a binding constant. The thermodynamic stabilities of GU pairs depend on the sequence, stacking orientation, and position in the helix. In contrast to GU pairs in the middle of a helix that may be destabilizing, all consecutive terminal GU pairs contribute favorable thermodynamic stability. This work presents measured thermodynamic stabilities for 30 duplexes containing two, three, or four consecutive GU pairs at the ends of RNA helices and a model to predict the thermodynamic stabilities of terminal GU pairs. Imino proton NMR spectra show that the terminal GU nucleotides form hydrogen-bonded pairs. Different orientations of terminal GU pairs can have different conformations with equivalent thermodynamic stabilities. These new data and prediction model will help improve RNA secondary structure prediction, identification of miRNA target sequences with GU pairs, and efforts to understand the fundamental physical forces directing RNA structure and energetics.

9.
We describe here the error specificity of mammalian DNA polymerase eta (pol eta), an enzyme that performs translesion DNA synthesis and may participate in somatic hypermutation of immunoglobulin genes. Both mouse and human pol eta lack intrinsic proofreading exonuclease activity and both copy undamaged DNA inaccurately. Analysis of more than 1500 single-base substitutions by human pol eta indicates that error rates for all 12 mismatches are high and variable depending on the composition and symmetry of the mismatch and its location. pol eta also generates tandem base substitutions at an unprecedented rate, and kinetic analysis indicates that it extends a tandem double mismatch about as efficiently as other replicative enzymes extend single-base mismatches. This ability to use an aberrant primer terminus and the high rate of single and double-base substitutions support the idea that pol eta may forego strict shape complementarity in order to facilitate highly efficient lesion bypass. Relaxed discrimination is further indicated by pol eta infidelity for a wide variety of nucleotide deletion and addition errors. The nature and location of these errors suggest that some may be initiated by strand slippage, while others result from additional mechanisms.

10.
The 135-nucleotide Drosophila melanogaster 5 S RNA precursor is processed by removal of 15 nucleotides from its 3' end before incorporation into the large ribosomal subunit. Mature 5 S RNA consists of five helical stem-loops; stem IV and part of V are dispensable, whereas stem III and the 1/118 G-C base pair closest to the processing site at nucleotide 120 are required for processing (Preiser, P., and Levinger, L. (1991) J. Biol. Chem. 266, 7509-7516; Preiser, P., and Levinger, L. (1991) J. Biol. Chem. 266, 23602-23605). We have investigated the effects of stem I and loop A transversions, transitions, selected additions and deletions on 5 S RNA processing. Stem I single substitutions generally prevent processing, whereas compensatory double substitutions restore a range of processing rates. Proximal to the processing site, stem I double substitutions inhibit processing. In the distal portion of stem I and loop A, the processing effect of paired sequence changes varies widely in an irregular pattern. The 7/112 GU pair and nucleotide 13A least tolerate sequence changes; several mutations clustered close to the stem I-loop A boundary stimulate processing. We interpret these results in terms of the RNA helix path and possible RNA-protein contacts.

11.
Ribosomal DNA internal transcribed spacers (ITS) and partial external transcribed spacers (ETSf) are popularly used to infer evolutionary hypotheses. However, there is generally little consideration given to the secondary structures of these small RNA molecules and their potential effects on sequence alignment and phylogenetic analyses. Intergeneric relationships amongst three of the four major lineages in the Sapindaceae (the Dodonaeoideae, Hippocastanoideae and Xanthoceroideae) were assessed by first generating secondary structure predictions for ITS and partial ETSf sequences and then using these predictions to assist alignment of the sequences. The alignment was then analyzed using RNA-specific models of sequence evolution that account for the variation in nucleotide evolution between the independent loop regions and the covarying stem regions of the ribosomal spacers. These models and the phylogeny drawn from these analyses were compared with those from analyses using ‘traditional’ 4-state models and from previous plastid analyses. These analyses showed that paired-site models, developed to deal specifically with stem structures in RNA-encoding sequences, account for the evolutionary history of the sequences more appropriately than traditional 4-state substitution models.

12.
Empirical models of substitution are often used in protein sequence analysis because the large alphabet of amino acids requires that many parameters be estimated in all but the simplest parametric models. When information about structure is used in the analysis of substitutions in structured RNA, a similar situation occurs. The number of parameters necessary to adequately describe the substitution process increases in order to model the substitution of paired bases. We have developed a method to obtain substitution rate matrices empirically from RNA alignments that include structural information in the form of base pairs. Our data consisted of alignments from the European Ribosomal RNA Database of Bacterial and Eukaryotic Small Subunit and Large Subunit Ribosomal RNA (Wuyts et al. 2001. Nucleic Acids Res. 29:175-177; Wuyts et al. 2002. Nucleic Acids Res. 30:183-185). Using secondary structural information, we converted each sequence in the alignments into a sequence over a 20-symbol code: one symbol for each of the four individual bases, and one symbol for each of the 16 ordered pairs. Substitutions in the coded sequences are defined in the natural way, as observed changes between two sequences at any particular site. For given ranges (windows) of sequence divergence, we obtained substitution frequency matrices for the coded sequences. Using a technique originally developed for modeling amino acid substitutions (Veerassamy, Smith, and Tillier. 2003. J. Comput. Biol. 10:997-1010), we were able to estimate the actual evolutionary distance for each window. The actual evolutionary distances were used to derive instantaneous rate matrices, and from these we selected a universal rate matrix. The universal rate matrices were incorporated into the Phylip Software package (Felsenstein 2002. http://evolution.genetics.washington.edu/phylip.html), and we analyzed the ribosomal RNA alignments using both distance and maximum likelihood methods.
The empirical substitution models performed well on simulated data, and produced reasonable evolutionary trees for 16S ribosomal RNA sequences from sequenced Bacterial genomes. Empirical models have the advantage of being easily implemented, and the fact that the code consists of 20 symbols makes the models easily incorporated into existing programs for protein sequence analysis. In addition, the models are useful for simulating the evolution of RNA sequence and structure simultaneously.
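The recoding into a 20-symbol alphabet (4 unpaired bases plus 16 ordered pairs) can be illustrated with a short sketch. Several details here are assumptions that the abstract does not state: the structure is supplied in dot-bracket notation, each paired position is written as a two-letter token (its own base followed by its partner's base), and the pair is recorded at both of its positions. The function names are hypothetical.

```python
def pair_table(structure):
    """Map each paired index to its partner for a dot-bracket structure string."""
    stack, partner = [], {}
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced structure: unmatched ')'")
            j = stack.pop()
            partner[i], partner[j] = j, i
    if stack:
        raise ValueError("unbalanced structure: unmatched '('")
    return partner


def encode_20_state(sequence, structure):
    """Recode an RNA sequence as tokens over a 20-symbol alphabet.

    Unpaired positions keep their base ("A", "C", "G", "U"); paired
    positions become an ordered pair token such as "GC" or "GU".
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    partner = pair_table(structure)
    return [
        sequence[i] + sequence[partner[i]] if i in partner else sequence[i]
        for i in range(len(sequence))
    ]


print(encode_20_state("GCAAAGC", "((...))"))
```

Because the alphabet has exactly 20 symbols, each token could then be mapped onto one amino acid letter so that protein-oriented software can read the recoded alignment, which is the reuse the abstract points to.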

13.
Thirteen strains of industrial bacterial cultures of the genus Lactobacillus (from a collection of the Gabrichevsky Research Institute of Epidemiology and Microbiology) were studied. These strains have been used for decades in the Russian Federation for food and drug production, as ferments for lactic acid products, and for production of probiotics and biologically active and veterinary preparations. Complex analysis of data on cultures obtained using microbiological and molecular-genetic methods was conducted for the first time. Biochemical characteristics of these cultures were studied and the sequence of the proximal region of the 16S ribosomal RNA gene was determined. The test system API-50CHL was shown to allow more accurate biochemical identification of bacteria belonging to the genus Lactobacillus than the set ANAEROTEST-23. According to a comparative analysis of nucleotide sequences of the 16S rRNA gene, all strains examined show 97-99% homology of the proximal region of this gene with that of the type representatives of the studied species. These data allowed taxonomic reclassification of the species position of cultures with consideration of the more advanced level of systematics. Nucleotide sequences of gene fragments of the examined lactobacilli strains were recorded in the NCBI database (accession numbers of deposits GU560031, GU560032, GU560033, GU560034, GU560035, GU560036, GU560037, GU560038, GU560039, GU560040, GU560041, GU560042, GU560043).

14.
The assembly and maturation of viruses with icosahedral capsids must be coordinated with icosahedral symmetry. The icosahedral symmetry also imposes restrictions on the cooperative specific interactions between genomic RNA/DNA and coat proteins, which should be reflected in quasi-regular segmentation of viral genomic sequences. Combining discrete direct and double Fourier transforms, we studied the quasi-regular large-scale segmentation in genomic sequences of different ssRNA, ssDNA, and dsDNA viruses. The particular representatives included satellite tobacco mosaic virus (STMV) and the strains of satellite tobacco necrosis virus (STNV), STNV-C, STNV-1, STNV-2, Escherichia phages MS2, φX174, α3, and HK97, and Simian virus 40. In all their genomes, we found significant quasi-regular segmentation of genomic sequences related to the virion assembly and the genome packaging within the icosahedral capsid. We also found good correspondence between our results and available cryo-electron microscopy data on capsid structures and genome packaging in these viruses. Fourier analysis of genomic sequences provides additional insight into mechanisms of hierarchical genome packaging and may be used for verification of the concepts of 3-fold or 5-fold intermediates in virion assembly. The results of sequence analysis should be taken into account in the choice of models and in data interpretation. They also may be helpful for the development of antiviral drugs.

15.
Here we present a model of nucleotide substitution in protein-coding regions that also encode the formation of conserved RNA structures. In such regions, apparent evolutionary context dependencies exist, both between nucleotides occupying the same codon and between nucleotides forming a base pair in the RNA structure. The overlap of these fundamental dependencies is sufficient to cause "contagious" context dependencies which cascade across many nucleotide sites. Such large-scale dependencies challenge the use of traditional phylogenetic models in evolutionary inference because they explicitly assume evolutionary independence between short nucleotide tuples. In our model we address this by replacing context dependencies within codons by annotation-specific heterogeneity in the substitution process. Through a general procedure, we fragment the alignment into sets of short nucleotide tuples based on both the protein coding and the structural annotation. These individual tuples are assumed to evolve independently, and the different tuple sets are assigned different annotation-specific substitution models shared between their members. This allows us to build a composite model of the substitution process from components of traditional phylogenetic models. We applied this to a data set of full-genome sequences from the hepatitis C virus where five RNA structures are mapped within the coding region. This allowed us to partition the effects of selection on different structural elements and to test various hypotheses concerning the relation of these effects. Of particular interest, we found evidence of a functional role of loop and bulge regions, as these were shown to evolve according to a different and more constrained selective regime than the nonpairing regions outside the RNA structures. Other potential applications of the model include comparative RNA structure prediction in coding regions and RNA virus phylogenetics.

16.
Summary: Three measures of sequence dissimilarity have been compared on a computer-generated model system in which substitutions in random sequences were made at randomly selected sites and the replacement character was chosen at random from the set of characters different from the original occupant of the site. The three measures were the conventional mismatch count between aligned sequences (AMC=m) and two measures not requiring prior sequence alignment. The latter two measures were the squared Euclidean distance between vectors of counts of t-tuples (t=1–6) of characters in the two sequences (multiplet distribution distances or MDD=d) and counts of characters not covered by word structures of statistically significant length common to the two sequences (common long words or CLW=SIB, SIS, or SAB). Average MDD distances were found to be two times average mismatch counts in the simulated sequences for all values of t from 1 to 6 and all degrees of substitution from one per sequence to so many as to produce, effectively, random sequences. This simple relation held independently of sequence length and of sequence composition. The relation was confirmed by exact results on small model systems and by formal asymptotic results in the limit of so few substitutions that no double hits occur and in the limit of two random sequences. The coefficient of variation for MDD distances was greater than that for mismatch counts for singlets but both measures approached the same low value for sextets. Needleman-Wunsch alignment produced incorrect mismatch counts at higher degrees of substitution. The model satisfied the conditions for the derivation of the Jukes-Cantor asymptotic adjustment, but its application produced increasingly bad results with increasing degrees of substitution in accord with earlier results on model and natural sequences.
This fact was a consequence of the increase with increasing degrees of substitution of the sensitivity of the adjustment to error in the observations. Average CLW distances for a variety of common word structures were more or less parallel to MDD distances for appropriately long t-tuples. These results on model systems supported the validity of the two dissimilarity measures not requiring sequence alignment that was found in earlier work on natural sequences (Blaisdell 1989).
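Two of the measures compared in this summary, the aligned mismatch count (m) and the multiplet distribution distance (d), are simple enough to sketch directly from their definitions. The function names are illustrative assumptions, and counting t-tuples over overlapping windows is also an assumption, since the summary does not say how tuples are enumerated.

```python
from collections import Counter


def mismatch_count(a, b):
    """Aligned mismatch count (m): number of differing positions in equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def tuple_counts(seq, t):
    """Count every overlapping t-tuple (word of length t) in seq."""
    return Counter(seq[i:i + t] for i in range(len(seq) - t + 1))


def multiplet_distance(a, b, t):
    """Multiplet distribution distance (d): squared Euclidean distance between t-tuple count vectors."""
    ca, cb = tuple_counts(a, t), tuple_counts(b, t)
    # Counter returns 0 for words that are absent from one of the sequences.
    return sum((ca[w] - cb[w]) ** 2 for w in set(ca) | set(cb))


print(mismatch_count("ACGT", "ACGA"), multiplet_distance("ACGT", "ACGA", 1))
```

In this toy pair a single substitution lowers one singlet count by one and raises another by one, so d = 2 while m = 1. The factor of two reported in the summary is an average over simulated sequences, which this sketch does not reproduce.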

17.
An understanding of the stability of nucleic acid folding is critical for applications involving RNA viruses, small molecule–RNA binding, and therapeutics, for example. To explore factors that affect this stability, hairpins made from oligonucleotides containing both a GAAA tetraloop and three to five complements in the stem have been used as models where locked nucleic acids (LNAs) have been substituted into the sequence. UV spectroscopy was used to obtain melting curves in 20% by volume formamide, and the enthalpies and entropies of melting were determined. Although LNA substitutions typically increase the stability of a hybrid, we have found a decrease in stability for DNA and RNA GAAA hairpins when LNA is substituted into the loop. Tetraloops synthesized from natural bases show higher enthalpies and entropies of melting compared to the LNA substituted sequences indicating that LNA substitutions can destabilize a hairpin but stabilize the corresponding double stranded structure.

18.
19.
The mitochondrial 16S ribosomal RNA (rRNA) gene sequences from 93 cyprinid fishes were examined to reconstruct the phylogenetic relationships within the diverse and economically important subfamily Cyprininae. Within the subfamily a biased nucleotide composition (A>T, C>G) was observed in the loop regions of the gene, and in stem regions apparent selective pressures of base pairing showed a bias in favor of G over C and T over A. The bias may be associated with transition-transversion bias. Rates of nucleotide substitution were lower in stems than in loops. Analysis of compensatory substitutions across these taxa demonstrates 68% covariation in the gene, and a logical weighting factor to account for dependence in mutations for phylogenetic inference should be 0.66. Comparisons of varied stem-loop weighting schemes indicate that the down-weightings for stem regions could improve the phylogenetic analysis and the degree of non-independence of stem substitutions was not as important as expected. Bayesian inference under four models of nucleotide substitution indicated that likelihood-based phylogenetic analyses were more effective in improving the phylogenetic performance than was weighted parsimony analysis. In Bayesian analyses, the resolution of phylogenies under the 16-state models for paired regions, incorporating GTR + G + I models for unpaired regions, was better than that under other models. The subfamily Cyprininae was resolved as a monophyletic group, as well as tribe Labein and several genera. However, the monophyly of the currently recognized tribes, such as Schizothoracin, Barbin, Cyprinion + Onychostoma lineages, and some genera was rejected.
Furthermore, comparisons of the parsimony and Bayesian analyses and results of variable length bootstrap analysis indicate that the mitochondrial 16S rRNA gene should contain important character variation to recover well-supported phylogeny of cyprinid taxa whose divergences occurred within the recent 8 MY, but could not provide resolution power for deep phylogenies spanning 10-19 MYA.

20.
Packaging of the Cystovirus φ8 genome into the polymerase complex is catalysed by the hexameric P4 packaging motor. The motor is located at the fivefold vertices of the icosahedrally symmetric polymerase complex, and the symmetry mismatch between them may be critical for function. We have developed a novel image-processing approach for the analysis of symmetry-mismatched structures and applied it to cryo-electron microscopy images of P4 bound to the polymerase complex. This approach allowed us to solve the three-dimensional structure of the P4 in situ to 15-Å resolution. The C-terminal face of P4 was observed to interact with the polymerase complex, supporting the current view on RNA translocation. We suggest that the symmetry mismatch between the two components may facilitate the ring opening required for RNA loading prior to its translocation.
