Similar articles
 20 similar articles found (search time: 15 ms)
1.
Evolution of mitochondrial gene orders in echinoderms   Total citations: 1 (self-citations: 0, citations by others: 1)
A comprehensive analysis of the mitochondrial gene orders of all previously published complete echinoderm mitochondrial genomes, together with two novel ones from Antedon mediterranea (Crinoidea) and Ophiura albida (Ophiuroidea), shows that all major types of rearrangement operations are necessary to explain the evolution of mitochondrial genomes. In addition to protein-coding genes, we include all tRNA genes as well as the control region in our analysis. Surprisingly, 7 of the 16 genomes published in the GenBank database contain misannotations, mostly unannotated tRNAs and/or mistakes in the orientation of tRNAs, which we have corrected here. Although the gene orders of mt genomes appear very different, only 8 events are necessary to explain the evolutionary history of echinoderms with the exception of the ophiuroids. Only two of these rearrangements are inversions, while we identify three tandem-duplication-random-loss events and three transpositions.
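
The tandem-duplication-random-loss (TDRL) events mentioned above can be illustrated with a minimal sketch. It assumes the usual textbook description of the operation: a contiguous segment is duplicated in tandem, and exactly one copy of each duplicated gene is then lost. The function name `tdrl`, the `keep_first` parameter, and the toy gene order are hypothetical and are not taken from the paper.

```python
def tdrl(order, start, end, keep_first):
    """Apply one tandem-duplication-random-loss event to order[start:end].

    The segment is conceptually duplicated in tandem. Genes listed in
    keep_first survive in the first copy; all other genes of the segment
    survive in the second copy. Relative order inside each copy is kept.
    """
    segment = order[start:end]
    first_copy = [g for g in segment if g in keep_first]
    second_copy = [g for g in segment if g not in keep_first]
    return order[:start] + first_copy + second_copy + order[end:]


# Toy gene order (illustrative only, not a real genome annotation).
genes = ["cox1", "trnE", "CR", "trnT", "trnP", "nad6"]
result = tdrl(genes, 1, 5, keep_first={"CR", "trnP"})
print(result)
```

A single TDRL can interleave genes in ways that would otherwise need several transpositions, which is one reason the event type matters when counting rearrangements.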

2.
Evolution operates on whole genomes through direct rearrangements of genes, such as inversions, transpositions, and inverted transpositions, as well as through operations, such as duplications, losses, and transfers, that also affect the gene content of the genomes. Because these events are rare relative to nucleotide substitutions, gene order data offer the possibility of resolving ancient branches in the tree of life; the combination of gene order data with sequence data also has the potential to provide more robust phylogenetic reconstructions, since each can elucidate evolution at different time scales. Distance corrections greatly improve the accuracy of phylogeny reconstructions from DNA sequences, enabling distance-based methods to approach the accuracy of the more elaborate methods based on parsimony or likelihood at a fraction of the computational cost. This paper focuses on developing distance correction methods for phylogeny reconstruction from whole genomes. The main question we investigate is how to estimate evolutionary histories from whole genomes with equal gene content, and we present a technique, the empirically derived estimator (EDE), that we have developed for this purpose. We study the use of EDE on whole genomes with identical gene content, and we explore the accuracy of phylogenies inferred using EDE with the neighbor joining and minimum evolution methods under a wide range of model conditions. Our study shows that tree reconstruction under these two methods is much more accurate when based on EDE distances than when based on other distances previously suggested for whole genomes. Electronic supplementary material is available for this article and accessible to authorised users. [Reviewing Editor: Dr. Martin Kreitman]

3.
Sorting by weighted reversals, transpositions, and inverted transpositions.   Total citations: 1 (self-citations: 0, citations by others: 1)
During evolution, genomes are subject to genome rearrangements that alter the ordering and orientation of genes on the chromosomes. If a genome consists of a single chromosome (like mitochondrial, chloroplast, or bacterial genomes), the biologically relevant genome rearrangements are (1) inversions, also called reversals, where a section of the genome is excised, reversed in orientation, and reinserted and (2) transpositions, where a section of the genome is excised and reinserted at a new position in the genome; if this also involves an inversion, one speaks of an inverted transposition. To reconstruct ancient events in the evolutionary history of organisms, one is interested in finding an optimal sequence of genome rearrangements that transforms a given genome into another genome. It is well known that this problem is equivalent to the problem of "sorting" a signed permutation into the identity permutation. In this paper, we provide a 1.5-approximation algorithm for sorting by weighted reversals, transpositions and inverted transpositions for biologically realistic weights.
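
The three operations described above can be sketched on signed permutations as follows. This is only an illustration of the operations themselves, not the 1.5-approximation algorithm from the paper. The function names, the half-open index convention, and the example genome are assumptions made for this sketch.

```python
def reversal(perm, i, j):
    """Excise perm[i:j], reverse its order, flip every sign, reinsert in place."""
    return perm[:i] + [-g for g in reversed(perm[i:j])] + perm[j:]


def transposition(perm, i, j, k):
    """Move block perm[i:j] behind block perm[j:k] (exchange adjacent blocks)."""
    return perm[:i] + perm[j:k] + perm[i:j] + perm[k:]


def inverted_transposition(perm, i, j, k):
    """Like a transposition, but the moved block perm[i:j] is also inverted."""
    moved = [-g for g in reversed(perm[i:j])]
    return perm[:i] + perm[j:k] + moved + perm[k:]


# Sort a toy signed permutation into the identity in two steps.
genome = [1, 4, -3, -2, 5]
step1 = reversal(genome, 2, 4)        # fixes the inverted block -3 -2
step2 = transposition(step1, 1, 2, 4) # moves gene 4 behind genes 2 3
print(step1, step2)
```

In a weighted setting each operation type carries its own cost, so the cheapest sorting scenario is not necessarily the one with the fewest steps.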

4.
ParIS Genome Rearrangement server   Total citations: 2 (self-citations: 0, citations by others: 2)
SUMMARY: ParIS Genome Rearrangement is a web server for a Bayesian analysis of unichromosomal genome pairs. The underlying model allows inversions, transpositions and inverted transpositions. The server generates a Markov chain using a Partial Importance Sampler technique, and samples trajectories of mutations from this chain. The user can specify several marginalizations to the posterior: the posterior distribution of the number of mutations needed to transform one genome into another, the length distribution of mutations, and the number of mutations that have occurred at a given site. Both text and graphical outputs are available. We provide a limited server, a downloadable unlimited server that can be installed locally on any Linux/Unix operating system, and a database of mitochondrial gene orders.

5.
Most molecular analyses, including phylogenetic inference, are based on sequence alignments. We present an algorithm that estimates relatedness between biomolecules without the requirement of sequence alignment by using a protein frequency matrix that is reduced by singular value decomposition (SVD), in a latent semantic index information retrieval system. Two databases were used: one with 832 proteins from 13 mitochondrial gene families and another composed of 1000 sequences from nine types of proteins retrieved from GenBank. Firstly, 208 sequences from the first database and 200 from the second were randomly selected and compared using edit distance between each pair of sequences and respective cosines and Euclidean distances from SVD. Correlation between cosine and edit distance was -0.32 (P < 0.01) and between Euclidean distance and edit distance was +0.70 (P < 0.01). In order to check the ability of SVD in classifying sequences according to their categories, we used a sample of 202 sequences from the 13 gene families as queries (test set), and the other proteins (630) were used to generate the frequency matrix (training set). The classification algorithm applies a voting scheme based on the five sequences most similar to each query. With a 3-peptide frequency matrix, all 202 queries were correctly classified (accuracy = 100%). This algorithm is very attractive, because sequence alignments are neither generated nor required. In order to achieve results similar to those obtained with edit distance analysis, we recommend that Euclidean distance be used as a similarity measure for protein sequences in latent semantic indexing methods.
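
The alignment-free comparison above can be sketched in a simplified form. The sketch below builds 3-peptide count vectors and compares them by cosine and Euclidean distance. It deliberately omits the SVD reduction step the abstract describes, so it is not the authors' method, only the raw frequency comparison underneath it. The function names and the toy sequences are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_counts(seq, k=3):
    """Count every overlapping k-peptide in a protein sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def cosine(a, b):
    """Cosine similarity of two count vectors (Counter yields 0 for absent keys)."""
    dot = sum(a[key] * b[key] for key in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def euclidean(a, b):
    """Euclidean distance of two count vectors over the union of their keys."""
    keys = set(a) | set(b)
    return sqrt(sum((a[key] - b[key]) ** 2 for key in keys))


# Toy sequences (illustrative only): p and q differ in the last residue.
p = kmer_counts("MKTAYIAKQR")
q = kmer_counts("MKTAYIAKQL")
r = kmer_counts("GGGGGGGGGG")
print(cosine(p, q), euclidean(p, q), cosine(p, r))
```

No alignment is computed at any point: each sequence is reduced to a vector, and only the vectors are compared.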

6.
MOTIVATION: The evolutionary distance inferred from gene-order comparisons of related bacteria is dependent on the model. Therefore, it is highly important to establish reliable assumptions before inferring its magnitude. RESULTS: We investigate the patterns of dotplots between species of bacteria with the purpose of model selection in gene-order problems. We find several categories of data which can be explained by carefully weighing the contributions of reversals, transpositions, symmetrical reversals, single gene transpositions and single gene reversals. We also derive method-of-moments distance estimates for some previously uncomputed cases, such as symmetrical reversals, single gene reversals and their combinations, as well as the single gene transpositions edit distance.

7.
We study three classical problems of genome rearrangement (sorting, halving, and the median problem) in a restricted double cut and join (DCJ) model. In the DCJ model, introduced by Yancopoulos et al., we can represent rearrangement events that happen in multichromosomal genomes, such as inversions, translocations, fusions, and fissions. Two DCJ operations can mimic transpositions or block interchanges by first extracting an appropriate segment of a chromosome, creating a temporary circular chromosome, and then reinserting it in its proper place. In the restricted model, we are concerned with multichromosomal linear genomes and we require that each circular excision is immediately followed by its reincorporation. Existing linear-time DCJ sorting and halving algorithms ignore this reincorporation constraint. In this article, we propose a new algorithm for the restricted sorting problem running in O(n log n) time, thus improving on the known quadratic time algorithm. We solve the restricted halving problem and give an algorithm that computes a multilinear halved genome in linear time. Finally, we show that the restricted median problem is NP-hard as conjectured.

8.
The order of genes in the genomes of species can change during evolution and can provide information about their phylogenetic relationship. An interesting method to infer the phylogenetic relationship from the gene orders is to use different types of rearrangement operations and to find possible rearrangement scenarios using these operations. One of the most common rearrangement operations is the reversal, which reverses the order of a set of neighboring genes. In this paper, we study the problem of finding the ancestral gene order for three species represented by their gene orders. The rearrangement scenario should use a minimal number of reversals and no other rearrangement operations. This problem is called the Median problem and is known to be NP-complete. In this paper, we describe a heuristic algorithm for finding solutions to the Median problem that searches for rearrangement scenarios with the additional property that gene groups should not be destroyed by reversal operations. The concept of conserved intervals for signed permutations is used to describe such gene groups. We show experimentally, for different types of test problems, that the proposed algorithm produces very good results compared to other algorithms for the Median problem. We also integrate our reversal selection procedure into the well-known MGR and GRAPPA algorithms and show that they achieve a significant speedup while obtaining solutions of the same quality as the original algorithms on the test problems.

9.
Members of subclass Copepoda are abundant, diverse, and, as a result of their variety of ecological roles in marine and freshwater environments, important, but their phylogenetic interrelationships are unclear. Recent studies of arthropods have used gene arrangements in the mitochondrial (mt) genome to infer phylogenies, but for copepods, only seven complete mt genomes have been published. These data revealed several within-order and few among-order similarities. To increase the data available for comparisons, we sequenced the complete mt genome (13,831 base pairs) of Amphiascoides atopus and 10,649 base pairs of the mt genome of Schizopera knabeni (both in the family Miraciidae of the order Harpacticoida). Comparison of our data to those for Tigriopus japonicus (family Harpacticidae, order Harpacticoida) revealed similarities in gene arrangement among these three species that were consistent with those found within and among families of other copepod orders. Comparison of the mt genomes of our species with those known from other copepod orders revealed the arrangement of mt genes of our Harpacticoida species to be more similar to that of Sinergasilus polycolpus (order Poecilostomatoida) than to that of T. japonicus. The similarities between S. polycolpus and our species are the first to be noted across the boundaries of copepod orders and support the possibility that mt-gene arrangement might be used to infer copepod phylogenies. We also found that our two species had extremely truncated transfer RNAs and that gene overlaps occurred much more frequently than has been reported for other copepod mt genomes.

10.

Background

The animal mitochondrial genome is generally considered to be under selection for both compactness and gene order conservation. As more mitochondrial genomes are sequenced, mitochondrial duplications and gene rearrangements have been frequently identified among diverse animal groups. Although several mechanisms of gene rearrangement have been proposed thus far, more observational evidence from major taxa is needed to validate specific mechanisms. In the current study, the complete mitochondrial DNA of sixteen bird species from the family Ardeidae was sequenced and the evolution of mitochondrial gene rearrangements was investigated. The mitochondrial genomes were then used to review the phylogenies of these ardeid birds.

Results

The complete mitochondrial genome sequences of the sixteen ardeid birds exhibited four distinct mitochondrial gene orders, two of which, named “duplicate tRNAGlu–CR” and “duplicate tRNAThr–tRNAPro and CR”, were newly discovered. These gene rearrangements arose from an evolutionary process consistent with the tandem duplication-random loss (TDRL) model. Additionally, duplications in these gene orders were nearly identical in nucleotide sequence within each individual, suggesting that they evolved in concert. Phylogenetic analyses of the sixteen ardeid species supported the idea that Ardea ibis, Ardea modesta and Ardea intermedia should be classified as genus Ardea, and Ixobrychus flavicollis as genus Ixobrychus, and indicated that within the subfamily Ardeinae, Nycticorax nycticorax is closely related to genus Egretta and that Ardeola bacchus and Butorides striatus are closely related to the genus Ardea.

Conclusions

The duplicate tRNAThr–CR gene order is found in most ardeid lineages, suggesting this gene order is the ancestral pattern within these birds and persisted in most lineages via concerted evolution. In two independent lineages, when the concerted evolution stopped in some subsections due to the accumulation of numerous substitutions and deletions, the duplicate tRNAThr–CR gene order was transformed into three other gene orders. The phylogenetic trees produced from concatenated rRNA and protein coding genes have high support values in most nodes, indicating that the mitochondrial genome sequences are promising markers for resolving the phylogenetic issues of ardeid birds when more taxa are added.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-573) contains supplementary material, which is available to authorized users.

11.

Background

One way to estimate the evolutionary distance between two given genomes is to determine the minimum number of large-scale mutations, or genome rearrangements, that are necessary to transform one into the other. In this context, genomes can be represented as ordered sequences of genes, each gene being represented by a signed integer. If no gene is repeated, genomes are thus modeled as signed permutations of the form \(\pi = (\pi_1\ \pi_2\ \ldots\ \pi_n)\), and in that case we can consider without loss of generality that one of them is the identity permutation \(\iota_n = (1\ 2\ \ldots\ n)\), and that we just need to sort the other (i.e., transform it into \(\iota_n\)). The most studied genome rearrangement events are reversals, where a segment of the genome is reversed and reincorporated at the same location; and transpositions, where two consecutive segments are exchanged. Many variants, e.g., combining different types of (possibly constrained) rearrangements, have been proposed in the literature. One of them considers that the number of genes involved in a reversal or a transposition is never greater than two, which is known as the problem of sorting by super short operations (or SSOs).

Results and conclusions

All problems considering SSOs in permutations have been shown to be in \(\mathsf{P}\), except for one, namely sorting signed circular permutations by super short reversals and super short transpositions. Here we fill this gap by introducing a new graph structure called the cyclic permutation graph and providing a series of intermediate results, which allow us to design a polynomial algorithm for sorting signed circular permutations by super short reversals and super short transpositions.
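
To make the super short operations from the Background concrete, here is a minimal sketch of the two operations on a signed circular permutation stored as a Python list. It assumes the common convention that a reversal flips the sign of every gene it touches and that a super short transposition exchanges two adjacent genes without changing signs. The function names are hypothetical, and the sketch shows the operations only, not the polynomial sorting algorithm from the paper.

```python
def super_short_reversal(perm, i, length):
    """Reverse 1 or 2 consecutive genes starting at index i, flipping their signs.

    Indices wrap around because the permutation is circular.
    """
    if length not in (1, 2):
        raise ValueError("a super short reversal involves at most two genes")
    n = len(perm)
    idx = [(i + t) % n for t in range(length)]
    flipped = [-perm[p] for p in reversed(idx)]
    out = list(perm)
    for p, v in zip(idx, flipped):
        out[p] = v
    return out


def super_short_transposition(perm, i):
    """Exchange the adjacent genes at i and i+1 (circularly); signs are unchanged."""
    j = (i + 1) % len(perm)
    out = list(perm)
    out[i], out[j] = out[j], out[i]
    return out


perm = [1, -2, 4, 3]
a = super_short_reversal(perm, 1, 1)   # flip the sign of gene 2
b = super_short_transposition(a, 2)    # exchange genes 4 and 3
print(a, b)
```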

12.
The role played by gene transpositions during the evolution of eukaryotic genomes is still poorly understood and indeed has been analyzed in detail only in nematodes. In Drosophila, a limited number of transpositions have been detected by comparing the chromosomal location of genes between different species. The relative importance of gene transposition versus other types of chromosomal rearrangements, for example, inversions, has not yet been evaluated. Here, we use physical mapping to perform an extensive search for long-distance gene transpositions and assess their impact during the evolution of the Drosophila genome. We compare the relative order of 297 molecular markers that cover 60% of the euchromatic fraction of the genome between two related Drosophila species and conclude that the frequency of gene transpositions is very low, namely one order of magnitude lower than that of nematodes. In addition, gene transpositions seem to be events almost exclusively associated with genes of repetitive nature such as the Histone gene complex (HIS-C).

13.
Mitochondrial genomes are useful tools for inferring evolutionary history. However, many taxa are poorly represented by available data. Thus, to further understand the phylogenetic potential of complete mitochondrial genome sequence data in Annelida (segmented worms), we examined the complete mitochondrial sequence for Clymenella torquata (Maldanidae) and an estimated 80% of the sequence of Riftia pachyptila (Siboglinidae). These genomes have remarkably similar gene orders to previously published annelid genomes, suggesting that gene order is conserved across annelids. This result is interesting, given the high variation seen in the closely related Mollusca and Brachiopoda. Phylogenetic analyses of DNA sequence, amino acid sequence, and gene order all support the recent hypothesis that Sipuncula and Annelida are closely related. Our findings suggest that gene order data is of limited utility in annelids but that sequence data holds promise. Additionally, these genomes show AT bias (approximately 66%) and codon usage biases but have a typical gene complement for bilaterian mitochondrial genomes.

14.
The complete nucleotide sequences of the mitochondrial genomes were determined for the three pelagic chaetognaths, Sagitta nagae, Sagitta decipiens, and Sagitta enflata. The mitochondrial genomes of these species, which were 11,459, 11,121, and 12,631 bp in length, respectively, contained 14 genes (11 protein-coding genes, one transfer RNA gene, and two ribosomal RNA genes), and were found to have lost 23 genes that are present in the typical metazoan mitochondrial genome. The same mitochondrial genome contents have been reported from the benthic chaetognaths belonging to the family Spadellidae, Paraspadella gotoi and Spadella cephaloptera. Within the phylum Chaetognatha, Sagitta and Spadellidae are distantly related, suggesting that the gene loss occurred in the ancestral species of the phylum. The gene orders of the three Sagitta species are markedly different from those of the other non-Chaetognatha metazoans. In contrast to the region with frequent gene rearrangements, no gene rearrangements were observed in the gene cluster encoding COII–III, ND1–3, srRNA, and tRNAmet. Within this conserved gene cluster, gene rearrangements were not observed in the three Sagitta species or between the Sagitta and Spadellidae species. The gene order of this cluster was also assumed to be the ancestral state of the phylum.

15.
Tree structures are useful for describing and analyzing biological objects and processes. Consequently, there is a need to design metrics and algorithms to compare trees. A natural comparison metric is the "Tree Edit Distance," the number of simple edit (insert/delete) operations needed to transform one tree into the other. Rooted-ordered trees, where the order between the siblings is significant, can be compared in polynomial time. Rooted-unordered trees are used to describe processes or objects where the topology, rather than the order or the identity of each node, is important. For example, in immunology, rooted-unordered trees describe the process of immunoglobulin (antibody) gene diversification in the germinal center over time. Comparing such trees has been proven to be a difficult computational problem that belongs to the set of NP-Complete problems. Comparing two trees can be viewed as a search problem in graphs. A* is a search algorithm that explores the search space in an efficient order. Using a good lower bound estimation of the degree of difference between the two trees, A* can reduce search time dramatically. We have designed and implemented a variant of the A* search algorithm suitable for calculating tree edit distance. We show here that A* is able to perform an edit distance measurement in reasonable time for trees with dozens of nodes.

16.
The complete sequence of the mitochondrial genome of Tetrahymena thermophila has been determined and compared with the mitochondrial genome of Tetrahymena pyriformis. The sequence similarity clearly indicates homology of the entire T. thermophila and T. pyriformis mitochondrial genomes. The T. thermophila genome is very compact: most of the intergenic regions are short (only three are longer than 63 bp) and comprise only 3.8% of the genome. The nad9 gene is tandemly duplicated in T. thermophila. Long terminal inverted repeats and the nad9 genes are undergoing concerted evolution. There are 55 putative genes: three ribosomal RNA genes, eight transfer RNA genes, 22 proteins with putatively assigned functions and 22 additional open reading frames of unknown function. In order to extend indications of homology beyond amino acid sequence similarity we have examined a number of physico-chemical properties of the mitochondrial proteins, including theoretical pI, molecular weight and particularly the predicted transmembrane spanning regions. This approach has allowed us to identify homologs to ymf58 (nad4L), ymf62 (nad6) and ymf60 (rpl6).

17.
We review the combinatorial optimization problems in calculating edit distances between genomes and phylogenetic inference based on minimizing gene order changes. With a view to avoiding the computational cost and the "long branches attract" artifact of some tree-building methods, we explore the probabilization of genome rearrangement models prior to developing a methodology based on branch-length invariants. We characterize probabilistically the evolution of the structure of the gene adjacency set for reversals on unsigned circular genomes and, using a nontrivial recurrence relation, reversals on signed genomes. Concepts from the theory of invariants developed for the phylogenetics of homologous gene sequences can be used to derive a complete set of linear invariants for unsigned reversals, as well as for a mixed rearrangement model for signed genomes, though not for pure transposition or pure signed reversal models. The invariants are based on an extended Jukes-Cantor semigroup. We illustrate the use of these invariants to relate mitochondrial genomes from a number of invertebrate animals.
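
The gene adjacency set for unsigned circular genomes mentioned above can be sketched as follows. The code collects unordered neighbor pairs, counts how many adjacencies of one genome are missing from another, and shows that one unsigned reversal disrupts at most two adjacencies. The function names, including `breakpoints`, and the example genome are assumptions made for this sketch. The probabilistic model and the invariants from the paper are not reproduced here.

```python
def adjacencies(genome):
    """Unordered neighbor pairs of an unsigned circular genome (needs >= 3 genes)."""
    n = len(genome)
    return {frozenset((genome[i], genome[(i + 1) % n])) for i in range(n)}


def breakpoints(a, b):
    """Number of adjacencies of genome a that are absent from genome b."""
    return len(adjacencies(a) - adjacencies(b))


def unsigned_reversal(genome, i, j):
    """Reverse the order of genome[i:j]; genes carry no orientation."""
    return genome[:i] + genome[i:j][::-1] + genome[j:]


ancestor = [1, 2, 3, 4, 5, 6]
descendant = unsigned_reversal(ancestor, 1, 4)  # [1, 4, 3, 2, 5, 6]
print(breakpoints(ancestor, descendant))
```

Because the pairs are unordered and the list wraps around, rotating or mirroring a genome leaves its adjacency set unchanged, which matches the notion of an unsigned circular genome.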

18.
For their apparent morphological simplicity, the Platyhelminthes or “flatworms” are a diverse clade found in a broad range of habitats. Their body plans have however made them difficult to robustly classify. Molecular evidence is only beginning to uncover the true evolutionary history of this clade. Here we present nine novel mitochondrial genomes from the still undersampled orders Polycladida and Rhabdocoela, assembled from short Illumina reads. In particular we present for the first time in the literature the mitochondrial sequence of a Rhabdocoel, Bothromesostoma personatum (Typhloplanidae, Mesostominae). The novel mitochondrial genomes examined generally contained the 36 genes expected in the Platyhelminthes, with all possessing 12 of the 13 protein-coding genes normally found in metazoan mitochondrial genomes (ATP8 being absent from all Platyhelminth mtDNA sequenced to date), along with two ribosomal RNA genes. The majority presented possess 22 transfer RNA genes, and a single tRNA gene was absent from two of the nine assembled genomes. By comparison of mitochondrial gene order and phylogenetic analysis of the protein coding and ribosomal RNA genes contained within these sequences with those of previously sequenced species we are able to gain a firm molecular phylogeny for the inter-relationships within this clade. Our phylogenetic reconstructions, using both nucleotide and amino acid sequences under several models and both Bayesian and Maximum Likelihood methods, strongly support the monophyly of Polycladida, and the monophyly of Acotylea and Cotylea within that clade. They also allow us to speculate on the early emergence of Macrostomida, the monophyly of a “Turbellarian-like” clade, the placement of Rhabditophora, and that of Platyhelminthes relative to the Lophotrochozoa (=Spiralia). The data presented here therefore represent a significant advance in our understanding of platyhelminth phylogeny, and will form the basis of a range of future research in the still-disputed classifications within this taxon.

19.

Background

Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda.

Results

The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic.

Conclusions

The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies as suspect. As such, these data are likely inappropriate for investigating such ancient relationships.

20.
Mitochondrial genomes provide a valuable dataset for phylogenetic studies, in particular of metazoan phylogeny because of the extensive taxon sample that is available. Beyond the traditional sequence-based analysis it is possible to extract phylogenetic information from the gene order. Here we present a novel approach utilizing these data based on cyclic list alignments of the gene orders. A progressive alignment approach is used to combine pairwise list alignments into a multiple alignment of gene orders. Parsimony methods are used to reconstruct phylogenetic trees, ancestral gene orders, and consensus patterns in a straightforward approach. We apply this method to study the phylogeny of protostomes based exclusively on mitochondrial genome arrangements. We, furthermore, demonstrate that our approach is also applicable to the much larger genomes of chloroplasts.
