Similar articles
 20 similar articles found (search time: 31 ms)
1.

Background  

Rare genomic changes (RGCs) that are thought to comprise derived shared characters of individual clades are becoming an increasingly important class of markers in genome-wide phylogenetic studies. Recently, we proposed a new type of RGCs designated RGC_CAMs (after Conserved Amino acids-Multiple substitutions) that were inferred using genome-wide identification of amino acid replacements that: i) are located in unambiguously aligned regions of orthologous genes, ii) are shared by two or more taxa in positions that contain a different, conserved amino acid in a much broader range of taxa, and iii) require two or three nucleotide substitutions. When applied to animal phylogeny, the RGC_CAM approach supported the coelomate clade that unites deuterostomes with arthropods as opposed to the ecdysozoan (molting animals) clade. However, a non-negligible level of homoplasy was detected.
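
Criterion iii) above can be illustrated with a small sketch that counts the minimum number of nucleotide substitutions needed to convert any codon of one amino acid into any codon of another. This is not the authors' pipeline: the standard genetic code is assumed (the abstract does not say which code was used), and the names `CODON_TABLE`, `codons_for`, and `min_nt_changes` are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code (assumed), with codons enumerated in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons_for(aa):
    """Return all codons that encode the one-letter amino acid `aa`."""
    return [codon for codon, a in CODON_TABLE.items() if a == aa]


def min_nt_changes(aa1, aa2):
    """Minimum number of differing codon positions over all codon pairs."""
    return min(
        sum(x != y for x, y in zip(c1, c2))
        for c1 in codons_for(aa1)
        for c2 in codons_for(aa2)
    )


print(min_nt_changes("F", "L"))  # a single substitution suffices
print(min_nt_changes("M", "W"))  # ATG vs TGG differ at two positions
```

A replacement would pass this one criterion when `min_nt_changes` returns 2 or 3; criteria i) and ii) need alignment and taxon-sampling information that this sketch does not model.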

2.
The genomic era has seen a remarkable increase in the number of genomes being sequenced and annotated. Nonetheless, annotation remains a serious challenge for compositionally biased genomes. For the preliminary annotation, popular nucleotide and protein comparison methods such as BLAST are widely employed. These methods make use of matrices to score alignments such as the amino acid substitution matrices. Since a nucleotide bias leads to an overall bias in the amino acid composition of proteins, it is possible that a genome with nucleotide bias may have introduced atypical amino acid substitutions in its proteome. Consequently, standard matrices fail to perform well in sequence analysis of these genomes. To address this issue, we examined the amino acid substitution in the AT-rich genome of Plasmodium falciparum, chosen as a reference, and reconstituted a substitution matrix in the genome's context. The matrix was used to generate protein sequence alignments for the parasite proteins that improved across the functional regions. We attribute this to the consistency that may have been achieved between the target and background frequencies calculated exclusively in our study. This study has important implications for the annotation of proteins that are of experimental interest but give poor sequence alignments with standard matrices.
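
The relationship between target and background frequencies mentioned above can be sketched as a log-odds score, the usual form of an amino acid substitution score. This is a minimal illustration under stated assumptions, not the study's procedure: the function name `log_odds_score`, the half-bit scaling, and the example frequencies are all hypothetical.

```python
import math


def log_odds_score(q_ij, p_i, p_j, scale=2.0):
    """Rounded log-odds score for an aligned residue pair.

    q_ij  : target frequency of the pair in trusted alignments
    p_i/j : background frequencies of the two residues
    scale : 2.0 gives half-bit units (an assumed convention)
    """
    return round(scale * math.log2(q_ij / (p_i * p_j)))


# A pair seen twice as often as chance predicts scores positive;
# a pair seen half as often as chance predicts scores negative.
print(log_odds_score(0.02, 0.1, 0.1))
print(log_odds_score(0.005, 0.1, 0.1))
```

If background frequencies are taken from an AT-rich proteome while target frequencies come from unbiased alignments, the ratio inside the logarithm becomes inconsistent, which is the kind of mismatch a genome-specific matrix tries to avoid.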

3.
We examined a broad selection of protein-coding loci from a diverse array of clades and genomes to quantify three factors that determine whether nucleotide or amino acid characters should be preferred for phylogenetic inference. First, we quantified the difference in observed character-state space between nucleotides and amino acids. Second, we quantified the loss of potential phylogenetic signal from silent substitutions when amino acids are used. Third, we used the disparity index to quantify the relative compositional heterogeneity of nucleotides and amino acids and then determined how commonly convergent (rather than unique) shifts in nucleotide and amino acid composition occur in a phylogenetic context. The greater potential phylogenetic signal for nucleotide characters was found to be enormous (on average 440% that of amino acids), whereas the greater observed character-state space for amino acids was less impressive (on average 150.4% that of nucleotides). While matrices of amino acid sequences had less compositional heterogeneity than their corresponding nucleotide sequences, heterogeneity in amino acid composition may be more homoplasious than heterogeneity in nucleotide composition. Given the ability of increased taxon sampling to better utilize the greater potential phylogenetic signal of nucleotide characters and decrease the potential for artifacts caused by heterogeneous nucleotide composition among taxa, we suggest that increased taxon sampling be performed whenever possible instead of restricting analyses to amino acid characters.

4.
The use of some multiple-sequence alignments in phylogenetic analysis, particularly those that are not very well conserved, requires the elimination of poorly aligned positions and divergent regions, since they may not be homologous or may have been saturated by multiple substitutions. A computerized method that eliminates such positions and at the same time tries to minimize the loss of informative sites is presented here. The method is based on the selection of blocks of positions that fulfill a simple set of requirements with respect to the number of contiguous conserved positions, lack of gaps, and high conservation of flanking positions, making the final alignment more suitable for phylogenetic analysis. To illustrate the efficiency of this method, alignments of 10 mitochondrial proteins from several completely sequenced mitochondrial genomes belonging to diverse eukaryotes were used as examples. The percentages of removed positions were higher in the most divergent alignments. After removing divergent segments, the amino acid composition of the different sequences was more uniform, and pairwise distances became much smaller. Phylogenetic trees show that topologies can be different after removing conserved blocks, particularly when there are several poorly resolved nodes. Strong support was found for the grouping of animals and fungi but not for the position of more basal eukaryotes. The use of a computerized method such as the one presented here reduces to a certain extent the necessity of manually editing multiple alignments, makes the automation of phylogenetic analysis of large data sets feasible, and facilitates the reproduction of the final alignment by other researchers.
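
The block-selection idea described above can be sketched in a heavily simplified form: keep only runs of contiguous columns that are gap-free and conserved. This is not the published method, which applies additional rules (for example on flanking positions); the function name `select_blocks`, the thresholds, and the toy alignment are assumptions for illustration.

```python
from collections import Counter


def select_blocks(alignment, min_conserved=0.5, min_block=3):
    """Return (start, end) half-open column ranges of retained blocks.

    A column is kept when it has no gaps and its most common residue
    occurs in more than `min_conserved` of the sequences. Runs of kept
    columns shorter than `min_block` are discarded.
    """
    keep = []
    for j in range(len(alignment[0])):
        column = [seq[j] for seq in alignment]
        if "-" in column:
            keep.append(False)
            continue
        top_count = Counter(column).most_common(1)[0][1]
        keep.append(top_count / len(column) > min_conserved)

    blocks, start = [], None
    for j, kept in enumerate(keep + [False]):  # sentinel closes the last run
        if kept and start is None:
            start = j
        elif not kept and start is not None:
            if j - start >= min_block:
                blocks.append((start, j))
            start = None
    return blocks


toy = ["ACDEFG-HIK", "ACDEYGAHIK", "ACDEWG-HIR"]
for start, end in select_blocks(toy):
    print(start, end, [seq[start:end] for seq in toy])
```

Because the selection is rule-based, rerunning it with the same thresholds reproduces the same trimmed alignment, which is the reproducibility benefit the abstract points to.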

5.
Yu Z, Wright SI, Bureau TE. Genetics 2000, 156(4): 2019-2031
While genome-wide surveys of abundance and diversity of mobile elements have been conducted for some class I transposable element families, little is known about the nature of class II transposable elements on this scale. In this report, we present the results from analysis of the sequence and structural diversity of Mutator-like elements (MULEs) in the genome of Arabidopsis thaliana (Columbia). Sequence similarity searches and subsequent characterization suggest that MULEs exhibit extreme structure, sequence, and size heterogeneity. Multiple alignments at the nucleotide and amino acid levels reveal conserved, potentially transposition-related sequence motifs. While many MULEs share structural features with Mu elements in maize, some groups lack characteristic long terminal inverted repeats. High sequence similarity and phylogenetic analyses based on nucleotide sequence alignments indicate that many of these elements with diverse structural features may remain transpositionally competent and that multiple MULE lineages may have been evolving independently over long time scales. Finally, there is evidence that MULEs are capable of the acquisition of host DNA segments, which may have implications for adaptive evolution, both at the element and host levels.

6.
For their apparent morphological simplicity, the Platyhelminthes or “flatworms” are a diverse clade found in a broad range of habitats. Their body plans have however made them difficult to robustly classify. Molecular evidence is only beginning to uncover the true evolutionary history of this clade. Here we present nine novel mitochondrial genomes from the still undersampled orders Polycladida and Rhabdocoela, assembled from short Illumina reads. In particular we present for the first time in the literature the mitochondrial sequence of a Rhabdocoel, Bothromesostoma personatum (Typhloplanidae, Mesostominae). The novel mitochondrial genomes examined generally contained the 36 genes expected in the Platyhelminthes, with all possessing 12 of the 13 protein-coding genes normally found in metazoan mitochondrial genomes (ATP8 being absent from all Platyhelminth mtDNA sequenced to date), along with two ribosomal RNA genes. The majority presented possess 22 transfer RNA genes, and a single tRNA gene was absent from two of the nine assembled genomes. By comparison of mitochondrial gene order and phylogenetic analysis of the protein coding and ribosomal RNA genes contained within these sequences with those of previously sequenced species we are able to gain a firm molecular phylogeny for the inter-relationships within this clade. Our phylogenetic reconstructions, using both nucleotide and amino acid sequences under several models and both Bayesian and Maximum Likelihood methods, strongly support the monophyly of Polycladida, and the monophyly of Acotylea and Cotylea within that clade. They also allow us to speculate on the early emergence of Macrostomida, the monophyly of a “Turbellarian-like” clade, the placement of Rhabditophora, and that of Platyhelminthes relative to the Lophotrochozoa (=Spiralia).
The data presented here therefore represent a significant advance in our understanding of platyhelminth phylogeny, and will form the basis of a range of future research in the still-disputed classifications within this taxon.

7.
Hughes AL, Piontkivska H, Foppa I. Gene 2007, 399(2): 152-161
Phylogenetic analysis of complete genomes of West Nile virus (WNV) by a variety of methods supported the hypothesis that North American isolates of WNV constitute a monophyletic group, together with an isolate from Israel and one from Hungary. We used ancestral sequence reconstruction in order to obtain evidence for evolutionary changes that might be correlated with increased virulence in this clade (designated the N.A. clade). There was one amino acid change (I→T at residue 356 of the NS3 protein) that occurred in the ancestor of the N.A. clade and remained conserved in all N.A. clade genomes analyzed. There were four changes in the upstream portion of the 3' noncoding region (the AT-enriched region) that occurred in the ancestor of the N.A. clade and remained conserved in all N.A. clade genomes analyzed, changes predicted to alter RNA secondary structure. The AT-enriched region showed a higher rate of substitution in the branch ancestral to the N.A. clade, relative to polymorphism, than did the remainder of the noncoding regions, synonymous sites in coding regions, or nonsynonymous sites in coding regions. The high rate of occurrence of fixed nucleotide substitutions in this region suggests that positive Darwinian selection may have acted on this portion of the 3'NCR and that these fixed changes, possibly in concert with the amino acid change in NS3, may underlie phenotypic effects associated with increased virulence in North American WNV.

8.
MOTIVATION: Multiple sequence alignments of homologous proteins are useful for inferring their phylogenetic history and for revealing functionally important regions in the proteins. Functional constraints may lead to co-variation of two or more amino acids in the sequence, such that a substitution at one site is accompanied by compensatory substitutions at another site. It is not sufficient to find the statistical correlations between sites in the alignment because these may be the result of several undetermined causes. In particular, phylogenetic clustering will lead to many strong correlations. RESULTS: A procedure is developed to detect statistical correlations stemming from functional interaction by removing the strong phylogenetic signal that leads to the correlations of each site with many others in the sequence. Our method relies upon the accuracy of the alignment but it does not require any assumptions about the phylogeny or the substitution process. The effectiveness of the method was verified using computer simulations and then applied to predict functional interactions between amino acids in the Pfam database of alignments.
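
The raw statistical correlation between two alignment columns, before any correction, can be sketched with mutual information. The abstract does not name the statistic the authors use, so mutual information is a swapped-in stand-in here, and the sketch deliberately omits the phylogenetic-signal removal that is the paper's actual contribution. The function name `mutual_information` is hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two equal-length columns."""
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        mi += p_ab * math.log2(p_ab / ((count_a[a] / n) * (count_b[b] / n)))
    return mi


# Perfectly co-varying columns versus statistically independent ones.
print(mutual_information("AABB", "CCDD"))
print(mutual_information("AABB", "CDCD"))
```

Two columns can score highly simply because closely related sequences share residues at both sites, which is why an uncorrected score of this kind is not sufficient evidence of functional interaction.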

9.
Conflict between Amino Acid and Nucleotide Characters   Cited by: 5 (self-citations: 0; citations by others: 5)
Slowly evolving characters, such as amino acids and replacement substitutions, have generally been favored over faster evolving characters for inferring phylogenetic relationships. However, amino acids constitute composite characters and, because of the degenerate genetic code, are subject to convergence. Based on an analysis of atpB and rbcL in 567 seed plants, we show that silent substitutions may be more phylogenetically informative than replacement substitutions and that artifacts caused by composite characters and/or convergence cause clades on amino acid trees to conflict with nucleotide trees and independent evidence. These findings indicate that coding nucleotide sequences only as amino acid characters for phylogenetic analysis provides little benefit and may yield misleading results.

10.
Both traditional methods and 10 more recent methods of coding characters from exons of protein-coding genes are reviewed. The more recent methods collectively blur the distinction between nucleotide and amino-acid coding and enable investigators to carefully quantify the effects of different sources of phylogenetic signal as well as their potential biases. Codon models, which explicitly model silent and replacement substitutions, are a major advance and are expected to be broadly useful for simultaneously inferring recent and ancient divergences, unlike amino-acid coding. Degeneracy coding, wherein ambiguity codes are used to eliminate silent substitutions at the individual-nucleotide level, has clear advantages over scoring amino-acid characters. Nucleotide, codon, and amino-acid models are now directly comparable with easy-to-use programs, and widely used phylogenetics programs can analyze partitioned supermatrices that incorporate all three types of model. Therefore, it should become standard practice to test among these alternative model types before conducting parametric phylogenetic analyses. An earlier study of 78 protein-coding genes from 360 green-plant plastid genomes is used as an empirical example with which to quantify the relative performance of alternative character-coding methods using five quantification measures. Codon models were selected as having the best fit to the data, yet were outperformed by nucleotide models for all five quantification measures. Third-codon positions were found to be an important source of phylogenetic signal and even outperformed analyses of first and second positions for some measures. Degeneracy coding generally performed at least as well as amino-acid coding and is an arguably more effective alternative.

11.
Phylogenetic analyses based on mitochondrial DNA have yielded widely differing relationships among members of the arthropod lineage Arachnida, depending on the nucleotide coding schemes and models of evolution used. We enhanced taxonomic coverage within the Arachnida greatly by sequencing seven new arachnid mitochondrial genomes from five orders. We then used all 13 mitochondrial protein-coding genes from these genomes to evaluate patterns of nucleotide and amino acid biases. Our data show that two of the six orders of arachnids (spiders and scorpions) have experienced shifts in both nucleotide and amino acid usage in all their protein-coding genes, and that these biases mislead phylogeny reconstruction. These biases are most striking for the hydrophobic amino acids isoleucine and valine, which appear to have evolved asymmetrical exchanges in response to shifts in nucleotide composition. To improve phylogenetic accuracy based on amino acid differences, we tested two recoding methods: (1) removing all isoleucine and valine sites and (2) recoding amino acids based on their physicochemical properties. We find that these methods yield phylogenetic trees that are consistent in their support of ancient intraordinal divergences within the major arachnid lineages. Further refinement of amino acid recoding methods may help us better delineate interordinal relationships among these diverse organisms.
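
The two recoding strategies listed above can be sketched as follows. The abstract does not specify which physicochemical grouping was used, so the six-category Dayhoff grouping below is an assumption, as are the names `drop_columns_with` and `recode`.

```python
# Six-state grouping in the style of Dayhoff categories (assumed here).
GROUPS = {"AGPST": "1", "DENQ": "2", "HKR": "3", "ILMV": "4", "FWY": "5", "C": "6"}
RECODE = {aa: code for group, code in GROUPS.items() for aa in group}


def drop_columns_with(alignment, residues="IV"):
    """Method (1): remove every column containing any of `residues`."""
    keep = [
        j
        for j in range(len(alignment[0]))
        if not any(seq[j] in residues for seq in alignment)
    ]
    return ["".join(seq[j] for j in keep) for seq in alignment]


def recode(seq, table=RECODE):
    """Method (2): map each residue to its group; gaps pass through."""
    return "".join(table.get(aa, aa) for aa in seq.upper())


print(drop_columns_with(["MIAK", "MVAK"]))
print(recode("MIVK-"))
```

Under this grouping, isoleucine and valine fall in the same category, so exchanges between them become invisible to the analysis, which is one way a recoding can dampen a composition-driven bias.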

12.

Background

The orders Ascaridida, Oxyurida, and Spirurida represent major components of zooparasitic nematode diversity, including many species of veterinary and medical importance. Phylum-wide nematode phylogenetic hypotheses have mainly been based on nuclear rDNA sequences, but more recently complete mitochondrial (mtDNA) gene sequences have provided another source of molecular information to evaluate relationships. Although there is much agreement between nuclear rDNA and mtDNA phylogenies, relationships among certain major clades are different. In this study we report that mtDNA sequences do not support the monophyly of Ascaridida, Oxyurida and Spirurida (clade III) in contrast to results for nuclear rDNA. Results from mtDNA genomes show promise as an additional independently evolving genome for developing phylogenetic hypotheses for nematodes, although substantially increased taxon sampling is needed for enhanced comparative value with nuclear rDNA. Ultimately, topological incongruence (and congruence) between nuclear rDNA and mtDNA phylogenetic hypotheses will need to be tested relative to additional independent loci that provide appropriate levels of resolution.

Results

For this comparative phylogenetic study, we determined the complete mitochondrial genome sequences of three nematode species, Cucullanus robustus (13,972 bp) representing Ascaridida, Wellcomia siamensis (14,128 bp) representing Oxyurida, and Heliconema longissimum (13,610 bp) representing Spirurida. These new sequences were used along with 33 published nematode mitochondrial genomes to investigate phylogenetic relationships among chromadorean orders. Phylogenetic analyses of both nucleotide and amino acid sequence datasets support the hypothesis that Ascaridida is nested within Rhabditida. The position of Oxyurida within Chromadorea varies among analyses; in most analyses this order is sister to the Ascaridida plus Rhabditida clade, with representative Spirurida forming a distinct clade; however, in one case Oxyurida is sister to Spirurida. Ascaridida, Oxyurida, and Spirurida (the sampled clade III taxa) do not form a monophyletic group based on complete mitochondrial DNA sequences. Tree topology tests revealed that constraining clade III taxa to be monophyletic, given the mtDNA datasets analyzed, yielded a significantly worse result.

Conclusion

The phylogenetic hypotheses from comparative analysis of the complete mitochondrial genome data (analysis of nucleotide and amino acid datasets, and nucleotide data excluding 3rd positions) indicate that nematodes representing Ascaridida, Oxyurida and Spirurida do not share an exclusive most recent common ancestor, in contrast to published results based on nuclear ribosomal DNA. Overall, mtDNA genome data provides reliable support for nematode relationships that often corroborates findings based on nuclear rDNA. It is anticipated that additional taxonomic sampling will provide a wealth of information on mitochondrial genome evolution and sequence data for developing phylogenetic hypotheses for the phylum Nematoda.

13.
ABSTRACT: BACKGROUND: In spite of a high occurrence of Hepatitis Delta in the province of Sindh in Pakistan, no genetic study of Hepatitis Delta virus (HDV) isolates from this region has been carried out. The aim of this study is to analyze the genetic proximity within local HDV strains, and their relationship with other clades of HDV, using phylogenetic analysis. RESULTS: Phylogenetic analysis of nucleotide sequences of the Hepatitis Delta Antigen (HDAg) R0 region obtained in this study showed considerable diversity among the local strains with a potential subgroup formation within clade I. The multiple sequence alignment of predicted amino acids within clade I showed many uncommon amino acid substitutions within some conserved regions that are crucial for replication and assembly of HDV. CONCLUSIONS: The studied strains showed a range of genetic diversity within HDV clade I. There is clustering of sequences into more than one group, along with formation of a potential subgroup within clade I. Clustering shows the genetic closeness of strains and indicates a common origin of spread of HDV infection. Further phylogeny-based studies may provide more information about subgroup formation within clade I and may be used as an effective tool in checking and/or preventing the spread of hepatitis D virus infection in this region.

14.
Chloromonas is distinguished from Chlamydomonas primarily by the absence of pyrenoids, which are structures that are present in the chloroplasts of most algae and are composed primarily of the CO2-fixing enzyme Rubisco. In this study we compared sequences of the rbcL (Rubisco large subunit-encoding) genes of pyrenoid-less Chloromonas species with those of closely related pyrenoid-containing Chlamydomonas species in the "Chloromonas lineage" and with those of 45 other green algae. We found that the proteins encoded by the rbcL genes had a much higher level of amino acid substitution in members of the Chloromonas lineage than they did in other algae. This kind of elevated substitution rate was not observed, however, in the deduced proteins encoded by two other chloroplast genes that we analyzed: atpB and psaB. The rates of synonymous and nonsynonymous nucleotide substitutions in the rbcL genes indicate that the rapid evolution of these genes in members of the Chloromonas lineage is not due to relaxed selection (as it presumably is in parasitic land plants). A phylogenetic tree based on rbcL nucleotide sequences nested two Chlamydomonas species as a "pyrenoid-regained" clade within a monophyletic Chloromonas "pyrenoid-lost" clade. Character-state optimization with this tree suggested that the loss and the regain of pyrenoids were accompanied by eight synapomorphic amino acid replacements in the Rubisco large subunit, four of which are positioned in the region involved in its dimerization. However, both the atpB and the psaB sequence data gave robust support for a rather different set of phylogenetic relationships in which neither the "pyrenoid-lost" nor the "pyrenoid-regained" clade was resolved. The appearance of such clades in the rbcL-based tree may be an artifact of convergent evolutionary changes that have occurred in a region of the large subunit that determines whether Rubisco molecules will aggregate to form a visible pyrenoid.

15.
Miyazawa S. PLoS ONE 2011, 6(12): e28892
BACKGROUND: A mechanistic codon substitution model, in which each codon substitution rate is proportional to the product of a codon mutation rate and the average fixation probability depending on the type of amino acid replacement, has advantages over nucleotide, amino acid, and empirical codon substitution models in evolutionary analysis of protein-coding sequences. It can approximate a wide range of codon substitution processes. If no selection pressure on amino acids is taken into account, it will become equivalent to a nucleotide substitution model. If mutation rates are assumed not to depend on the codon type, then it will become essentially equivalent to an amino acid substitution model. Mutation at the nucleotide level and selection at the amino acid level can be separately evaluated. RESULTS: The present scheme for single nucleotide mutations is equivalent to the general time-reversible model, but multiple nucleotide changes in infinitesimal time are allowed. Selective constraints on the respective types of amino acid replacements are tailored to each gene in a linear function of a given estimate of selective constraints. Their good estimates are those calculated by maximizing the respective likelihoods of empirical amino acid or codon substitution frequency matrices. Akaike and Bayesian information criteria indicate that the present model performs far better than the other substitution models for all five phylogenetic trees of highly-divergent to highly-homologous sequences of chloroplast, mitochondrial, and nuclear genes. It is also shown that multiple nucleotide changes in infinitesimal time are significant in long branches, although they may be caused by compensatory substitutions or other mechanisms. The variation of selective constraint over sites fits the datasets significantly better than variable mutation rates, except for 10 slow-evolving nuclear genes of 10 mammals. 
A critical finding for phylogenetic analysis is that assuming variable mutation rates over sites leads to the overestimation of branch lengths.

16.
Lin Q, Cui P, Ding F, Hu S, Yu J. Current Genomics 2012, 13(1): 28-36
The nucleotide composition of the light (L-) and heavy (H-) strands of animal mitochondrial genomes is known to exhibit strand-biased compositional asymmetry (SCA). One possible explanation is the existence of a replication-associated mutational pressure (RMP) that may introduce characteristic nucleotide changes among mitochondrial genomes of different animal lineages. Here, we discuss the influence of RMP on nucleotide and amino acid compositions as well as gene organization. Among animal mitochondrial genomes, RMP may represent the major force that compels the evolution of mitochondrial protein-coding genes, coupled with other process-based selective pressures, such as those on components of the translation machinery (tRNAs and their anticodons). Through comparative analyses of sequenced mitochondrial genomes among diverse animal lineages and literature reviews, we suggest a strong RMP effect, observed among invertebrate mitochondrial genes as compared to those of vertebrates, that is either a result of positive selection on the invertebrate or a relaxed selective pressure on the vertebrate mitochondrial genes.
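
Strand compositional asymmetry is commonly summarized with AT skew and GC skew. The abstract does not say how the authors quantify it, so the sketch below uses the conventional skew definitions as an assumption; the function name `strand_skews` and the example sequence are hypothetical.

```python
def strand_skews(seq):
    """Return (AT skew, GC skew) for one strand of a nucleotide sequence.

    AT skew = (A - T) / (A + T); GC skew = (G - C) / (G + C).
    Values near 0 mean the strand is compositionally symmetric.
    """
    seq = seq.upper()
    a, t, g, c = (seq.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if a + t else 0.0
    gc_skew = (g - c) / (g + c) if g + c else 0.0
    return at_skew, gc_skew


print(strand_skews("AAATGGGC"))
```

Computing the same two numbers for the complementary strand flips both signs, which makes the skews a convenient way to compare L- and H-strand composition across lineages.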

17.
We study to what degree patterns of amino acid substitution vary between genes using two models of protein-coding gene evolution. The first divides the amino acids into groups, with one substitution rate for pairs of residues in the same group and a second for those in differing groups. Unlike previous applications of this model, the groups themselves are estimated from data by simulated annealing. The second model makes substitution rates a function of the physical and chemical similarity between two residues. Because we model the evolution of coding DNA sequences as opposed to protein sequences, artifacts arising from the differing numbers of nucleotide substitutions required to bring about various amino acid substitutions are avoided. Using 10 alignments of related sequences (five of orthologous genes and five gene families), we do find differences in substitution patterns. We also find that, although patterns of amino acid substitution vary temporally within the history of a gene, variation is not greater in paralogous than in orthologous genes. Improved understanding of such gene-specific variation in substitution patterns may have implications for applications such as sequence alignment and phylogenetic inference.

18.
In order to establish the molecular basis of the pathogenicity of the attenuated RC-HL strain of rabies virus used for the production of animal vaccine in Japan, the complete genome sequence of this strain was determined and compared with that of the parental Nishigahara strain which is virulent for adult mice. The viral genome of both strains was composed of 11,926 nucleotides. The nucleotide sequences of the two genomes showed a high homology of 98.9%. The homology of the G gene was lower than those of N, P, M and L genes at both nucleotide and deduced amino acid levels, and the percentage of radical amino acid substitutions on the G protein was the highest among the five proteins. These findings raise the possibility that the structure of the G protein is the most variable among the five proteins of the two strains. Furthermore, we found two clusters of amino acid substitutions on the G and L proteins. The relevance of these clusters to the difference in the pathogenicity between the two strains is discussed.

19.
Mitochondrial genomes provide a valuable dataset for phylogenetic studies, in particular of metazoan phylogeny because of the extensive taxon sample that is available. Beyond the traditional sequence-based analysis it is possible to extract phylogenetic information from the gene order. Here we present a novel approach utilizing these data based on cyclic list alignments of the gene orders. A progressive alignment approach is used to combine pairwise list alignments into a multiple alignment of gene orders. Parsimony methods are used to reconstruct phylogenetic trees, ancestral gene orders, and consensus patterns in a straightforward approach. We apply this method to study the phylogeny of protostomes based exclusively on mitochondrial genome arrangements. We, furthermore, demonstrate that our approach is also applicable to the much larger genomes of chloroplasts.

20.
Three Markov models (Dayhoff, Proportional and Poisson models; Hasegawa et al., 1992a) for amino acid substitution during evolution were used for maximum likelihood analyses of proteins coded for in mitochondrial DNA in estimating a phylogenetic tree among human, bovine and murids (mouse and rat) with chicken as an outgroup. It turned out that the Dayhoff model is the most appropriate model among the alternatives in approximating the amino acid substitutions of proteins coded for in mitochondrial DNA. In spite of the presence of the complete sequence data of mitochondrial genomes, we could not resolve the trichotomy among human, bovine and murids, probably because the time length separating two branching events among these three lines was short and because chicken is too distant from mammals to be used as an outgroup. It was suggested that the average substitution rate of amino acids coded for in mitochondrial DNA is lower along the bovine line than those along the human or murid lines. Advantages of amino acid sequence analysis over nucleotide sequence analysis in phylogenetic study were discussed.
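
Of the three models named above, the Poisson model is the simplest: every amino acid is replaced by any of the other 19 at the same rate. A minimal sketch of the resulting distance correction is given below, assuming equal frequencies for all 20 amino acids and distance measured in expected substitutions per site. The function names are hypothetical, and this is not the maximum likelihood analysis performed in the study.

```python
import math


def expected_difference(d):
    """Expected proportion of differing sites after d substitutions per site."""
    return (19 / 20) * (1 - math.exp(-20 * d / 19))


def poisson_distance(p):
    """Invert `expected_difference`: distance from observed proportion p.

    Valid only for p < 19/20, the saturation level of the model.
    """
    return -(19 / 20) * math.log(1 - 20 * p / 19)


p = expected_difference(0.3)
print(p, poisson_distance(p))
```

The Dayhoff model replaces the single shared rate with empirically derived, pair-specific exchange rates, which is a plausible reason it approximates real mitochondrial proteins better than the equal-rates case.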
