Similar literature
Found 20 similar articles (search time: 15 ms)
1.
Lateral gene transfer has been identified as an important mode of genome evolution within prokaryotes. Except for the special case of gene transfer from organelle genomes to the eukaryotic nucleus, only a few cases of lateral gene transfer involving eukaryotes have been described. Here we present phylogenetic and gene order analyses on the small subunit of glutamate synthase (encoded by gltD) and its homologues, including the large subunit of sulfide dehydrogenase (encoded by sudA). The scattered distribution of the sudA and sudB gene pair and the phylogenetic analysis strongly suggest that lateral gene transfer was involved in the propagation of the genes in the three domains of life. One of these transfers most likely occurred between a prokaryote and an ancestor of diplomonad protists. Furthermore, phylogenetic analyses indicate that the gene for the small subunit of glutamate synthase was transferred from a low-GC gram-positive bacterium to a common ancestor of animals, fungi, and plants. Interestingly, in both examples, the eukaryotes encode a single gene that corresponds to a conserved operon structure in prokaryotes. Our analyses, together with several recent publications, show that lateral gene transfers from prokaryotes to unicellular eukaryotes occur with appreciable frequency. In the case of the genes for sulfide dehydrogenase, the transfer affected only a limited group of eukaryotes (the diplomonads), while the transfer of the glutamate synthase gene probably happened earlier in evolution and affected a wider range of eukaryotes.

2.
3.
4.
Although Diplostomum baeri (Dubois, 1937) is one of the most widely distributed parasites of freshwater fish, no complete mitochondrial (mt) genome is currently available for it. The complicated systematics of D. baeri has hampered investigations into the distribution and infection dynamics of the species. In this study we obtained the complete mt genome sequence of D. baeri and assessed its phylogenetic relationship with other species of Digenea. The complete mitochondrial genome of D. baeri is 14,480 bp in length and contains 36 genes in total. The phylogenetic tree resulting from Bayesian inference of 12 concatenated protein-coding gene sequences placed D. baeri alongside published mt genomes of Diplostomidae, with the overall taxonomic placement of the genus being a sister lineage of the order Plagiorchiida. The characterization of further mitochondrial genomes within the family Diplostomidae will help progress phylogenetic and epidemiological investigations and provide a framework for the analysis of diagnostic markers to be used in further monitoring of the parasite worldwide.

5.
Protein domains characteristic of eukaryotic innate immunity and apoptosis have many prokaryotic counterparts of unknown function. By reconstructing interactomes computationally, we found that bacterial proteins containing these domains are part of a network that also includes other domains not hitherto associated with immunity. This network is connected to the network of prokaryotic signal transduction proteins, such as histidine kinases and chemoreceptors. The network varies considerably in domain composition and degree of paralogy, even between strains of the same species, and its repetitive domains have often been amplified recently, with individual repeats sharing up to 100% sequence identity. Both phenomena are evidence of considerable evolutionary pressure and thus compatible with a role in the “arms race” between host and pathogen. In order to investigate the relationship of this network to its eukaryotic counterparts, we performed a cluster analysis of organisms based on a census of its constituent domains across all fully sequenced genomes. We obtained a large central cluster of mainly unicellular organisms, from which multicellular organisms radiate out in two main directions. One is taken by multicellular bacteria, primarily cyanobacteria and actinomycetes, and plants form an extension of this direction, connected via the basal, unicellular cyanobacteria. The second main direction is taken by animals and fungi, which form separate branches with a common root in the α-proteobacteria of the central cluster. This analysis supports the notion that the innate immunity networks of eukaryotes originated from their endosymbionts and that increases in the complexity of these networks accompanied the emergence of multicellularity.

6.
7.
The physical and functional organizations of a genome are correlated outcomes of evolution. Inbred strains of mice provide a unique opportunity for exploring these relationships, representing as they do, diverse genomes originally separated by millions of generations that were then scrambled in the laboratory and subjected to intense selection during inbreeding to homozygosity. Here we show that the resulting pattern of chromosome organization includes regional domains of functionally related elements that promote the co-inheritance and survival of compatible sets of alleles. There are also patterns of linkage disequilibrium between domains on separate chromosomes; these are distinctly non-random and form networks with scale-free architecture. The strong conservation of gene order among mammals suggests that the domains and networks we find likely characterize all mammals, and possibly beyond.

8.
RNA-Seq techniques generate hundreds of millions of short RNA reads using next-generation sequencing (NGS). These RNA reads can be mapped to reference genomes to investigate changes of gene expression, but improved procedures for mining large RNA-Seq datasets to extract valuable biological knowledge are needed. RNAMiner, a multi-level bioinformatics protocol and pipeline, has been developed for such datasets. It includes five steps: mapping RNA-Seq reads to a reference genome, calculating gene expression values, identifying differentially expressed genes, predicting gene functions, and constructing gene regulatory networks. To demonstrate its utility, we applied RNAMiner to datasets generated from Human, Mouse, Arabidopsis thaliana, and Drosophila melanogaster cells, and successfully identified differentially expressed genes, clustered them into cohesive functional groups, and constructed novel gene regulatory networks. The RNAMiner web service is available at http://calla.rnet.missouri.edu/rnaminer/index.html.

9.
Viruses are the most numerous biological entity, existing in all environments and infecting all cellular organisms. Compared with cellular life, the evolution and origin of viruses are poorly understood; viruses are enormously diverse, and most lack sequence similarity to cellular genes. To uncover viral sequences without relying on either reference viral sequences from databases or marker genes that characterize specific viral taxa, we developed an analysis pipeline for virus inference based on clustered regularly interspaced short palindromic repeats (CRISPR). CRISPR is a prokaryotic nucleic acid restriction system that stores the memory of previous exposure. Our protocol can infer CRISPR-targeted sequences, including viruses, plasmids, and previously uncharacterized elements, and predict their hosts using unassembled short-read metagenomic sequencing data. By analyzing human gut metagenomic data, we extracted 11,391 terminally redundant CRISPR-targeted sequences, which are likely complete circular genomes. The sequences included 2,154 tailed-phage genomes, together with 257 complete crAssphage genomes, 11 genomes larger than 200 kilobases, 766 genomes of Microviridae species, 56 genomes of Inoviridae species, and 95 previously uncharacterized circular small genomes that have no reliably predicted protein-coding gene. We predicted the host(s) of approximately 70% of the discovered genomes at the taxonomic level of phylum by linking protospacers to taxonomically assigned CRISPR direct repeats. These results demonstrate that our protocol is efficient for de novo inference of CRISPR-targeted sequences and their host prediction.

10.
Next-generation sequencing technologies have allowed researchers to determine the collective genomes of microbial communities co-existing within diverse ecological environments. Varying species abundance, length and complexities within different communities, coupled with the discovery of new species, make the problem of taxonomic assignment to short DNA sequence reads extremely challenging. We have developed a new sequence composition-based taxonomic classifier using extreme learning machines, referred to as TAC-ELM, for metagenomic analysis. TAC-ELM uses the framework of extreme learning machines to quickly and accurately learn the weights for a neural network model. The input features consist of GC content and oligonucleotides. TAC-ELM is evaluated on two metagenomic benchmarks with sequence read lengths reflecting the traditional and current sequencing technologies. Our empirical results indicate the strength of the developed approach, which outperforms state-of-the-art taxonomic classifiers in terms of accuracy and implementation complexity. We also perform experiments that evaluate the pervasive case within metagenome analysis, where a species may not have been previously sequenced or discovered and will not exist in the reference genome databases. TAC-ELM was also combined with BLAST to show improved classification results. Code and Supplementary Results: http://www.cs.gmu.edu/~mlbio/TAC-ELM (BSD License).
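
The abstract above names GC content and oligonucleotides as the classifier's input features. As a rough illustration only, the sketch below computes a GC fraction plus normalized k-mer frequencies for one read. The function names, the choice of k = 2, and the handling of ambiguous bases are assumptions made for this example and are not taken from the TAC-ELM code.

```python
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / acgt


def kmer_frequencies(seq, k=2):
    """Normalized counts of every A/C/G/T k-mer, in lexicographic order.

    Windows containing other characters (for example N) are ignored.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def composition_features(seq, k=2):
    """Hypothetical feature vector: GC fraction followed by k-mer frequencies."""
    return [gc_content(seq)] + kmer_frequencies(seq, k)
```

With k = 2 this yields 17 values per read (one GC fraction and 16 dinucleotide frequencies); a vector of this kind could be fed to any classifier, an extreme learning machine being one option.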

11.
The multispecies coalescent (MSC) is a statistical framework that models how gene genealogies grow within the branches of a species tree. The field of computational phylogenetics has witnessed an explosion in the development of methods for species tree inference under MSC, owing mainly to the accumulating evidence of incomplete lineage sorting in phylogenomic analyses. However, the evolutionary history of a set of genomes, or species, could be reticulate due to the occurrence of evolutionary processes such as hybridization or horizontal gene transfer. We report on a novel method for Bayesian inference of genome and species phylogenies under the multispecies network coalescent (MSNC). This framework models gene evolution within the branches of a phylogenetic network, thus incorporating reticulate evolutionary processes, such as hybridization, in addition to incomplete lineage sorting. As phylogenetic networks with different numbers of reticulation events correspond to points of different dimensions in the space of models, we devise a reversible-jump Markov chain Monte Carlo (RJMCMC) technique for sampling the posterior distribution of phylogenetic networks under MSNC. We implemented the methods in the publicly available, open-source software package PhyloNet and studied their performance on simulated and biological data. The work extends the reach of Bayesian inference to phylogenetic networks and enables new evolutionary analyses that account for reticulation.

12.
13.
The phytoplasmas are currently named using the Candidatus category, as the inability to grow them in vitro has prevented (i) the performance of tests, such as DNA-DNA hybridization, that are regarded as necessary to establish species boundaries, and (ii) the deposition of type strains in culture collections. The recent availability of complete or nearly complete genome sequence information has opened the opportunity to apply to the uncultivable phytoplasmas the same taxonomic approaches used for other bacteria. In this work, the genomes of 14 strains, belonging to the 16SrI, 16SrIII, 16SrV and 16SrX groups, including the species “Ca. P. asteris”, “Ca. P. mali”, “Ca. P. pyri”, “Ca. P. pruni”, and “Ca. P. australiense”, were analyzed along with Acholeplasma laidlawii to determine their taxonomic relatedness. Average nucleotide identity (ANIm), tetranucleotide signature frequency correlation index (Tetra), and multilocus sequence analysis of 107 shared genes, using both phylogenetic inference of concatenated (DNA and amino acid) sequences and consensus networks, were carried out. The results were largely in agreement with the previously established 16S rDNA-based classification schemes. Moreover, the taxonomic relationships within the 16SrI, 16SrIII and 16SrX groups, which represent clusters of strains whose relatedness could not be determined by 16S rDNA analysis, could be comparatively evaluated with non-subjective criteria. “Ca. P. mali” and “Ca. P. pyri” were found to meet the genome characteristics for retention as two different, yet closely related, species; representatives of subgroups 16SrI-A and 16SrI-B were also found to meet the standards used in other bacteria to distinguish separate species; the genomes of the strains belonging to 16SrIII were found to be more closely related, suggesting that their subdivision into Candidatus species should be approached with caution.

14.
Doubly uniparental inheritance (DUI) is an exception to the typical maternal inheritance of mitochondrial (mt) DNA in Metazoa and is found only in some bivalves. In species with DUI, there are two highly divergent gender-associated mt genomes, maternal (F) and paternal (M), which are transmitted independently and show different tissue localization. Solenaia carinatus is an endangered freshwater mussel species exclusive to the Poyang Lake basin, China. Anthropogenic events in the watershed greatly threaten the survival of this species. Nevertheless, the taxonomy of S. carinatus based on shell morphology is confusing, and the subfamilial placement of the genus Solenaia remains unclear. In order to clarify the taxonomic status and discuss the phylogenetic implications of family Unionidae, the entire F and M mt genomes of S. carinatus were sequenced and compared with the mt genomes of diverse freshwater mussel species. The complete F and M mt genomes of S. carinatus are 16716 bp and 17102 bp in size, respectively. The F and M mt genomes of S. carinatus diverge by about 40% in nucleotide sequence and 48% in amino acid sequence. Compared to its F counterpart, the M genome shows a more compact structure. Different gene arrangements are found in these two gender-associated mt genomes. Among these, the F genome cox2-rrnS gene order is considered to be a genome-level synapomorphy for the female lineage of the subfamily Gonideinae. From maternal and paternal mtDNA perspectives, the phylogenetic analyses of Unionoida indicate that S. carinatus belongs to Gonideinae. The F and M clades in freshwater mussels are reciprocally monophyletic. The phylogenetic trees support the classification of sampled Unionidae species into four subfamilies: Gonideinae, Ambleminae, Anodontinae, and Unioninae, which is supported by the morphological characteristics of glochidia.

15.

Background

Kinesins, a superfamily of molecular motors, use microtubules as tracks and transport diverse cellular cargoes. All kinesins contain a highly conserved ~350 amino acid motor domain. Previous analysis of the completed genome sequence of one flowering plant (Arabidopsis) has resulted in identification of 61 kinesins. The recent completion of genome sequencing of several photosynthetic and non-photosynthetic eukaryotes that belong to divergent lineages offers a unique opportunity to conduct a comprehensive comparative analysis of kinesins in plant and non-plant systems and infer their evolutionary relationships.

Results

We used the kinesin motor domain to identify kinesins in the completed genome sequences of 19 species, including 13 newly sequenced genomes. Among the newly analyzed genomes, six represent photosynthetic eukaryotes. A total of 529 kinesins was used to perform comprehensive analysis of kinesins and to construct gene trees using the Bayesian and parsimony approaches. The previously recognized 14 families of kinesins are resolved as distinct lineages in our inferred gene tree. At least three of the 14 kinesin families are not represented in flowering plants. Chlamydomonas, a green alga that is part of the lineage that includes land plants, has at least nine of the 14 known kinesin families. Seven of ten families present in flowering plants are represented in Chlamydomonas, indicating that these families were retained in both the flowering-plant and green algae lineages.

Conclusion

The increase in the number of kinesins in flowering plants is due to vast expansion of the Kinesin-14 and Kinesin-7 families. The Kinesin-14 family, which typically contains a C-terminal motor, has many plant kinesins that have the motor domain at the N terminus, in the middle, or at the C terminus. Several domains in kinesins are present exclusively either in plant or animal lineages. Addition of novel domains to kinesins in lineage-specific groups contributed to the functional diversification of kinesins. Results from our gene-tree analyses indicate that there was tremendous lineage-specific duplication and diversification of kinesins in eukaryotes. Since the functions of only a few plant kinesins are reported in the literature, this comprehensive comparative analysis will be useful in designing functional studies with photosynthetic eukaryotes.

16.
17.

Background

The rhomboid family of polytopic membrane proteins shows a level of evolutionary conservation unique among membrane proteins. They are present in nearly all the sequenced genomes of archaea, bacteria and eukaryotes, with the exception of several species with small genomes. On the basis of experimental studies with the developmental regulator rhomboid from Drosophila and the AarA protein from the bacterium Providencia stuartii, the rhomboids are thought to be intramembrane serine proteases whose signaling function is conserved in eukaryotes and prokaryotes.

Results

Phylogenetic tree analysis carried out using several independent methods for tree constructions and the corresponding statistical tests suggests that, despite its broad distribution in all three superkingdoms, the rhomboid family was not present in the last universal common ancestor of extant life forms. Instead, we propose that rhomboids evolved in bacteria and have been acquired by archaea and eukaryotes through several independent horizontal gene transfers. In eukaryotes, two distinct, ancient acquisitions apparently gave rise to the two major subfamilies, typified by rhomboid and PARL (presenilins-associated rhomboid-like protein), respectively. Subsequent evolution of the rhomboid family in eukaryotes proceeded by multiple duplications and functional diversification through the addition of extra transmembrane helices and other domains in different orientations relative to the conserved core that harbors the protease activity.

Conclusions

Although the near-universal presence of the rhomboid family in bacteria, archaea and eukaryotes appears to suggest that this protein is part of the heritage of the last universal common ancestor, phylogenetic tree analysis indicates a likely bacterial origin with subsequent dissemination by horizontal gene transfer. This emphasizes the importance of explicit phylogenetic analysis for the reconstruction of ancestral life forms. A hypothetical scenario for the origin of intracellular membrane proteases from membrane transporters is proposed.

18.
Comparative chloroplast genome analyses are mostly carried out at lower taxonomic levels, such as the family and genus levels. At higher taxonomic levels, chloroplast genomes are generally used to reconstruct phylogenies. However, little attention has been paid to chloroplast genome evolution within orders. Here, we present the chloroplast genome of Sedum sarmentosum and take advantage of several available (or elucidated) chloroplast genomes to examine the evolution of chloroplast genomes in Saxifragales. The chloroplast genome of S. sarmentosum is 150,448 bp long and includes 82,212 bp of a large single-copy (LSC) region, 16,670 bp of a small single-copy (SSC) region, and a pair of 25,783 bp sequences of inverted repeats (IRs). The genome contains 131 unique genes, 18 of which are duplicated within the IRs. Based on a comparative analysis of chloroplast genomes from four representative Saxifragales families, we observed two gene losses and two pseudogenes in Paeonia obovata, and the loss of an intron was detected in the rps16 gene of Penthorum chinense. Comparisons among the 72 common protein-coding genes confirmed that the chloroplast genomes of S. sarmentosum and Paeonia obovata exhibit accelerated sequence evolution. Furthermore, a strong correlation was observed between the rates of genome evolution and genome size. The detected genome size variations are predominantly caused by the length of intergenic spacers, rather than losses of genes and introns, gene pseudogenization or IR expansion or contraction. The genome sizes of these species are negatively correlated with nucleotide substitution rates. Species with shorter life cycles tend to exhibit shorter chloroplast genomes than those with longer life cycles.
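
The region lengths reported above can be checked against the stated total, since a quadripartite chloroplast genome consists of one LSC region, one SSC region, and two IR copies. The short sketch below does that arithmetic, reading the SSC figure as 16,670 bp; the function name is an assumption made for this example.

```python
def chloroplast_length(lsc, ssc, ir):
    """Total length of a quadripartite chloroplast genome in bp.

    The inverted repeat is present in two copies, so it is counted twice.
    """
    return lsc + ssc + 2 * ir


# Region lengths reported for Sedum sarmentosum in the abstract above.
total = chloroplast_length(lsc=82_212, ssc=16_670, ir=25_783)
print(total)
```

The sum equals the reported genome length of 150,448 bp, which is consistent with an SSC region of 16,670 bp.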

19.
Annexins are Ca2+-binding, membrane-interacting proteins, widespread among eukaryotes, consisting usually of four structurally similar repeated domains. It is accepted that vertebrate annexins derive from a double genome duplication event. It has been postulated that a single-domain annexin, if found, might represent a molecule related to the hypothetical ancestral annexin. The recent discovery of a single-domain annexin in a bacterium, Cytophaga hutchinsonii, apparently confirmed this hypothesis. Here, we present a more complex picture. Using remote sequence similarity detection tools, a survey of bacterial genomes was performed in search of annexin-like proteins. In total, we identified about thirty annexin homologues, including single-domain and multi-domain annexins, in seventeen bacterial species. The thorough search yielded, besides the known annexin homologue from C. hutchinsonii, homologues from the Bacteroidetes/Chlorobi phylum, from Gemmatimonadetes, from beta- and delta-Proteobacteria, and from Actinobacteria. The sequences of bacterial annexins exhibited remote but statistically significant similarity to sequence profiles built from the eukaryotic ones. Some bacterial annexins are equipped with additional, different domains, for example those characteristic of toxins. The variation in bacterial annexin sequences, much wider than that observed in eukaryotes, and different domain architectures suggest that annexins found in bacteria may actually descend from an ancestral bacterial annexin, from which eukaryotic annexins also originate. The hypothesis of an ancient origin of bacterial annexins has to be reconciled with the fact that remarkably few bacterial strains possess annexin genes compared to the thousands of known bacterial genomes and with the patchy, anomalous phylogenetic distribution of bacterial annexins. Thus, a massive annexin gene loss in several bacterial lineages or very divergent evolution would appear a likely explanation. Alternative evolutionary scenarios, involving horizontal gene transfer between bacteria and protozoan eukaryotes, in either direction, appear much less likely. Altogether, current evidence does not allow unequivocal judgement as to the origin of bacterial annexins.

20.
Bacteria of the genus Shewanella can thrive in different environments and demonstrate significant variability in their metabolic and ecophysiological capabilities, including cold and salt tolerance. Genomic characteristics underlying this variability across species are largely unknown. In this study, we address the problem by a comparison of the physiological, metabolic, and genomic characteristics of 19 sequenced Shewanella species. We have employed two novel approaches based on association of a phenotypic trait with the number of the trait-specific protein families (Pfam domains) and on the conservation of synteny (order in the genome) of the trait-related genes. Our first approach is top-down and involves experimental evaluation and quantification of the species' cold tolerance followed by identification of the correlated Pfam domains and genes with a conserved synteny. The second, a bottom-up approach, predicts novel phenotypes of the species by calculating profiles of each Pfam domain among their genomes, followed by pairwise correlation of the profiles and their network clustering. Using the first approach, we find a link between cold and salt tolerance of the species and the presence in the genome of a Na+/H+ antiporter gene cluster. Other cold-tolerance-related genes include peptidases, chemotaxis sensory transducer proteins, a cysteine exporter, and helicases. Using the bottom-up approach, we found several novel phenotypes in the newly sequenced Shewanella species, including degradation of aromatic compounds by an aerobic hybrid pathway in Shewanella woodyi, degradation of ethanolamine by Shewanella benthica, and propanediol degradation by Shewanella putrefaciens CN32 and Shewanella sp. W3-18-1.
