Similar articles
 20 similar articles found (search time: 15 ms)
1.
Phylogenies involving nonmodel species are based on a few genes, mostly chosen following historical or practical criteria. Because gene trees are sometimes incongruent with species trees, the resulting phylogenies may not accurately reflect the evolutionary relationships among species. The increase in availability of genome sequences now provides large numbers of genes that could be used for building phylogenies. However, for practical reasons only a few genes can be sequenced for a wide range of species. Here we asked whether we can identify a few genes, among the single-copy genes common to most fungal genomes, that are sufficient for recovering accurate and well-supported phylogenies. Fungi represent a model group for phylogenomics because many complete fungal genomes are available. An automated procedure was developed to extract single-copy orthologous genes from complete fungal genomes using a Markov Clustering Algorithm (Tribe-MCL). Using 21 complete, publicly available fungal genomes with reliable protein predictions, 246 single-copy orthologous gene clusters were identified. We inferred the maximum likelihood trees using the individual orthologous sequences and constructed a reference tree from concatenated protein alignments. The topologies of the individual gene trees were compared to that of the reference tree using three different methods. The performance of individual genes in recovering the reference tree was highly variable. Gene size and the number of variable sites were highly correlated and significantly affected the performance of the genes, but the average substitution rate did not. Two genes recovered exactly the same topology as the reference tree, and when concatenated provided high bootstrap values. The genes typically used for fungal phylogenies did not perform well, which suggests that current fungal phylogenies based on these genes may not accurately reflect the evolutionary relationships among species. 
Analyses on subsets of species showed that the phylogenetic performance did not seem to depend strongly on the sample. We expect that the best-performing genes identified here will be very useful for phylogenetic studies of fungi, at least at a large taxonomic scale. Furthermore, we compare the method developed here for finding genes for building robust phylogenies with previous ones and we advocate that our method could be applied to other groups of organisms when more complete genomes are available.

2.
3.

Background  

The large number of completely sequenced genomes allows genomic context analysis to predict reliable functional associations between prokaryotic proteins. Major methods rely on the fact that genes encoding physically interacting partners or members of shared metabolic pathways tend to be proximate on the genome, to evolve in a correlated manner and to be fused as a single sequence in another organism.

4.
Use of whole genome sequence data to infer baculovirus phylogeny   Cited by: 18 (self-citations: 0; citations by others: 18)
Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance analysis and a novel method developed here, termed neighbor pair analysis. The third set recorded gene content by scoring gene presence or absence in each genome. All three data sets yielded phylogenies supporting the separation of the Nucleopolyhedrovirus (NPV) and Granulovirus (GV) genera, the division of the NPVs into groups I and II, and species relationships within group I NPVs. Generation of phylogenies based on the combined sequences of all 63 shared genes proved to be the most effective approach to resolving the relationships among the group II NPVs and the GVs. The history of gene acquisitions and losses that have accompanied baculovirus diversification was visualized by mapping the gene content data onto the phylogenetic tree. This analysis highlighted the fluid nature of baculovirus genomes, with evidence of frequent genome rearrangements and multiple gene content changes during their evolution. Of more than 416 genes identified in the genomes analyzed, only 63 are present in all nine genomes, and 200 genes are found only in a single genome. Despite this fluidity, the whole genome-based methods we describe are sufficiently powerful to recover the underlying phylogeny of the viruses.
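The gene content character set described in this abstract (scoring each gene as present or absent in each genome, then counting genes shared by all genomes or confined to one) can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the toy genome names, and the gene labels `g1` to `g4` are all hypothetical, and the Hamming distance is only one possible way to compare presence/absence profiles.

```python
def presence_absence(genomes):
    """Score each gene as present (1) or absent (0) in every genome.

    `genomes` maps a genome name to the set of gene labels it contains
    (hypothetical input format).
    """
    all_genes = sorted(set().union(*genomes.values()))
    matrix = {
        name: [1 if gene in genes else 0 for gene in all_genes]
        for name, genes in genomes.items()
    }
    return all_genes, matrix


def core_and_unique(genomes):
    """Return genes present in all genomes and genes found in exactly one."""
    core = set.intersection(*genomes.values())
    counts = {}
    for genes in genomes.values():
        for gene in genes:
            counts[gene] = counts.get(gene, 0) + 1
    unique = {gene for gene, n in counts.items() if n == 1}
    return core, unique


def hamming(a, b):
    """Number of positions at which two presence/absence profiles differ."""
    return sum(x != y for x, y in zip(a, b))


# Toy data, invented purely for illustration.
toy = {
    "virus_A": {"g1", "g2", "g3"},
    "virus_B": {"g1", "g2", "g4"},
    "virus_C": {"g1", "g2"},
}
genes, matrix = presence_absence(toy)
core, unique = core_and_unique(toy)
```

A distance matrix built from such profiles could then be passed to any standard distance-based tree method; which method the authors used for gene content is not stated in the abstract.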

5.
Chlorarachniophytes are amoeboflagellate cercozoans that acquired a plastid by secondary endosymbiosis. Chlorarachniophytes are the last major group of algae for which there is no completely sequenced plastid genome. Here we describe the 69.2-kbp chloroplast genome of the model chlorarachniophyte Bigelowiella natans. The genome is highly reduced in size compared with plastids of other photosynthetic algae and is closer in size to genomes of several nonphotosynthetic plastids. Unlike nonphotosynthetic plastids, however, the B. natans chloroplast genome has not sustained a massive loss of genes, and it retains nearly all of the functional photosynthesis-related genes represented in the genomes of other green algae. Instead, the genome is highly compacted and gene dense. The genes are organized with a strong strand bias, and several unusual rearrangements and inversions also characterize the genome; notably, an inversion in the small-subunit rRNA gene, a translocation of 3 genes in the major ribosomal protein operon, and the fragmentation of the cluster encoding the large photosystem proteins PsaA and PsaB. The chloroplast endosymbiont is known to be a green alga, but its evolutionary origin and relationship to other primary and secondary green plastids have been much debated. A recent hypothesis proposes that the endosymbionts of chlorarachniophytes and euglenids share a common origin (the Cabozoa hypothesis). We inferred phylogenies using individual and concatenated gene sequences for all genes in the genome. Concatenated gene phylogenies show a relationship between the B. natans plastid and the ulvophyte-trebouxiophyte-chlorophyte clade of green algae to the exclusion of Euglena. The B. natans plastid is thus not closely related to that of Euglena, which suggests that plastids originated independently in these 2 groups and the Cabozoa hypothesis is false.

6.
The celC gene encodes a cellulase that plays a very significant role in the infection of clover by Rhizobium leguminosarum. This gene is located in the celABC operon present in the chromosome of strains representing R. leguminosarum, Rhizobium etli and Rhizobium radiobacter whose genomes have been completely sequenced. Nevertheless, the existence of this gene in other species of the genus Rhizobium had not been investigated to date. In this study, the celC gene was analysed for the first time in several species of this genus isolated from legume nodules and plant tumours, in order to compare the celC phylogeny to those of other chromosomal and plasmid-borne genes. The results obtained showed that phylogenies of celC and chromosomal genes, such as rrs, recA and atpD, were completely congruent, whereas no relation was found with symbiotic or virulence genes. Therefore, the suitability and usefulness of the celC gene to differentiate species of the genus Rhizobium, especially those with closely related rrs genes, was highlighted. Consequently, the taxonomic status of several strains of the genus Rhizobium with completely sequenced genomes is also discussed.

7.
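Liu QH, Guo ZG, Ren JH. 《遗传》, 2012, 34(7): 907-918
Multigene phylogenetic analysis is an important approach in phylogenetics, and conflict among gene trees has become an increasingly prominent problem in molecular phylogenetic studies. The enolase gene (eno) and the protein it encodes are widely distributed across the five-kingdom system, and enolase is an important enzyme of the glycolytic pathway. In this study, annotated eno gene sequences from prokaryotes were selected for phylogenetic analysis. Phylogenetic analysis and homology searches of the eno gene sequences of 138 type strains showed that the eno genes of 19 type strains had been acquired by horizontal transfer; analyses of gene features, including nucleotide composition, codon usage bias and gene arrangement, further confirmed the foreign origin of the horizontally transferred genes. The results indicate that prokaryotic eno sequences are highly conserved and of moderate length, making them good material for studying prokaryotic phylogeny. The study provides a valuable reference for research on the lifestyles and evolutionary histories of the donor and recipient strains of horizontal gene transfer, and on the structure and function of enolase.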

8.
Phages play a key role in the marine environment by regulating the transfer of energy between trophic levels and influencing global carbon and nutrient cycles. The diversity of marine phage communities remains difficult to characterize because of the lack of a signature gene common to all phages. Recent studies have demonstrated the presence of host-derived auxiliary metabolic genes in phage genomes, such as those belonging to the Pho regulon, which regulates phosphate uptake and metabolism under low-phosphate conditions. Among the completely sequenced phage genomes in GenBank, this study identified Pho regulon genes in nearly 40% of the marine phage genomes, while only 4% of nonmarine phage genomes contained these genes. While several Pho regulon genes were identified, phoH was the most prevalent, appearing in 42 out of 602 completely sequenced phage genomes. Phylogenetic analysis demonstrated that phage phoH sequences formed a cluster distinct from those of their bacterial hosts. PCR primers designed to amplify a region of the phoH gene were used to determine the diversity of phage phoH sequences throughout a depth profile in the Sargasso Sea and at six locations worldwide. phoH was present at all sites examined, and a high diversity of phoH sequences was recovered. Most phoH sequences belonged to clusters without any cultured representatives. Each depth and geographic location had a distinct phoH composition, although most phoH clusters were recovered from multiple sites. Overall, phoH is an effective signature gene for examining phage diversity in the marine environment.

9.
The whole mitochondrial genome (14,915 nt) of Pollicipes mitella (Crustacea, Maxillopoda, Cirripedia, Thoracica) was sequenced and characterized. It is the shortest of the 31 completely sequenced crustacean mitochondrial genomes, with the exception of a copepod Tigriopus japonicus (14,628 nt). It consists of the usual 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes, and 1 relatively short non-coding region (294 nt). The thoracican cirripeds apart from Megabalanus volcano have the same arrangement of protein-coding genes as Limulus polyphemus, but there are frequent tRNA gene translocations (at least 8). Some interesting translocation features that may be specific to the thoracican cirriped lineage are as follows: 1) trnK-trnQ lies between the control region and trnI, 2) trnA-trnE lies between trnN and trnS1, 3) trnP lies between ND4L and trnT, and 4) trnY-trnC lies between trnS2 and ND1. In P. mitella there are two trnL genes (L1 and L2) in the typical crustacean positions (ND1-L1-LrRNA and CO1-L2-CO2). The present result is compared and discussed with the other three cirriped mitochondrial genomes from one pedunculate (Pollicipes polymerus) and two sessiles (Tetraclita japonica and M. volcano) published so far. Mitochondrial protein phylogenies reconstructed by the BI and ML algorithms show that the thoracican Cirripedia is monophyletic (BPP 100/BP 100) and associated with Remipedia (BPP 98/BP 35). In addition, Oligostraca, including Ostracoda, Branchiura, and Pentastomida, is a monophyletic group (BPP 99/BP 68), and is basal to all the other examined arthropods. Remipedia + Cirripedia appears as an independent lineage within Arthropoda, apart from Thoracopoda (Malacostraca, Branchiopoda, and Cephalocarida). The Thoracopoda is paraphyletic to Hexapoda. The present result suggests that the monophylies of Crustacea and Maxillopoda should be reconsidered.

10.
11.
The rapidly emerging field of comparative genomics has yielded dramatic results. Comparative genome analysis has become feasible with the availability of a number of completely sequenced genomes. Comparison of complete genomes between organisms allows for global views on genome evolution, and the availability of many completely sequenced genomes increases the predictive power in deciphering the hidden information in genome design, function and evolution. Thus, comparison of human genes with genes from other genomes in a genomic landscape could help assign novel functions for un-annotated genes. Here, we discuss the recently used techniques for comparative genomics and their derived inferences in genome biology.

12.
Gene content has been shown to contain a strong phylogenetic signal, yet its usage for phylogenetic questions is hampered by horizontal gene transfer and parallel gene loss and until now required completely sequenced genomes. Here, we introduce an approach that allows the phylogenetic signal in gene content to be applied to any set of sequences, using signature genes for phylogenetic classification. The hundreds of publicly available genomes allow us to identify signature genes at various taxonomic depths, and we show how the presence of signature genes in an unspecified sample can be used to characterize its taxonomic composition. We identify 8,362 signature genes specific for 112 prokaryotic taxa. We show that these signature genes can be used to address phylogenetic questions on the basis of gene content in cases where classic gene content or sequence analyses provide an ambiguous answer, such as for Nanoarchaeum equitans, and even in cases where complete genomes are not available, such as for metagenomics data. Cross-validation experiments leaving out up to 30% of the species show that approximately 92% of the signature genes correctly place the species in a related clade. Analyses of metagenomics data sets with the signature gene approach are in good agreement with the previously reported species distributions based on phylogenetic analysis of marker genes. Summarizing, signature genes can complement traditional sequence-based methods in addressing taxonomic questions.
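The idea of a taxon-specific signature gene can be illustrated with a small sketch. The abstract does not give the selection criterion, so the strict rule used here (present in every member of a taxon and in no genome outside it) is an assumption, as are the function names, the genome and taxon names, and the gene labels.

```python
def signature_genes(gene_sets, clade):
    """Genes present in every genome of `clade` and in no genome outside it.

    `gene_sets` maps genome name -> set of gene-family labels; `clade` is a
    non-empty collection of genome names. This strict criterion is an
    assumption made for illustration only.
    """
    inside = [gene_sets[name] for name in clade]
    outside = [genes for name, genes in gene_sets.items() if name not in clade]
    shared = set.intersection(*inside)
    elsewhere = set().union(*outside)
    return shared - elsewhere


def signature_hits(sample_genes, signatures):
    """Count, per taxon, how many of its signature genes occur in a sample."""
    return {taxon: len(sig & sample_genes) for taxon, sig in signatures.items()}


# Toy genomes, invented purely for illustration.
gene_sets = {
    "genome_1": {"fam_a", "fam_b", "fam_c"},
    "genome_2": {"fam_a", "fam_b", "fam_d"},
    "genome_3": {"fam_a", "fam_e"},
    "genome_4": {"fam_a", "fam_e", "fam_f"},
}
signatures = {
    "taxon_X": signature_genes(gene_sets, ["genome_1", "genome_2"]),
    "taxon_Y": signature_genes(gene_sets, ["genome_3", "genome_4"]),
}
hits = signature_hits({"fam_b", "fam_z"}, signatures)
```

In this toy case a sample containing `fam_b` scores one hit for `taxon_X` and none for `taxon_Y`, which is the kind of evidence the abstract describes using to characterize the taxonomic composition of an unspecified sample.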

13.
Lactic acid bacteria (LAB) have been used in fermentation processes for centuries. More recent applications including the use of LAB as probiotics have significantly increased industrial interest. Here we present a comparative genomic analysis of four completely sequenced Lactobacillus strains, isolated from the human gastrointestinal tract, versus 25 lactic acid bacterial genomes present in the public database at the time of analysis. Lactobacillus acidophilus NCFM, Lactobacillus johnsonii NCC533, Lactobacillus gasseri ATCC33323, and Lactobacillus plantarum WCFS1 are all considered probiotic and widely used in industrial applications. Using Differential Blast Analysis (DBA), each genome was compared to the respective remaining three other Lactobacillus and 25 other LAB genomes. DBA highlighted strain-specific genes that were not represented in any other LAB used in this analysis and also identified group-specific genes shared within lactobacilli. Initial comparative analyses highlighted a significant number of genes involved in cell adhesion, stress responses, DNA repair and modification, and metabolic capabilities. Furthermore, the range of the recently identified potential autonomous units (PAUs) was broadened significantly, indicating the possibility of distinct families within this genetic element. Based on in silico results obtained for the model organism L. acidophilus NCFM, DBA proved to be a valuable tool to identify new key genetic regions for functional genomics and also suggested re-classification of previously annotated genes.

14.
There has been a dramatic increase in the number of completely sequenced bacterial genomes during the past two years as a result of the efforts both of public genome agencies and the pharmaceutical industry. The availability of completely sequenced genomes permits more systematic analyses of genes, evolution and genome function than was otherwise possible. Using computational methods - which are used to identify genes and their functions including statistics, sequence similarity, motifs, profiles, protein folds and probabilistic models - it is possible to develop characteristic genome signatures, assign functions to genes, identify pathogenic genes, identify metabolic pathways, develop diagnostic probes and discover potential drug-binding sites. All of these directions are critical to understanding bacterial growth, pathogenicity and host-pathogen interactions.

15.
Qi M, Wang D, Bradley CA, Zhao Y. PLoS ONE, 2011, 6(1): e16451
Bacterial blight, caused by Pseudomonas savastanoi pv. glycinea (Psg), is a common disease of soybean. In an effort to compare a current field isolate with one isolated in the early 1960s, the genomes of two Psg strains, race 4 and B076, were sequenced using 454 pyrosequencing. The genomes of both Psg strains share more than 4,900 highly conserved genes, indicating very low genetic diversity between Psg genomes. Though conserved, genome rearrangements and recombination events occur commonly within the two Psg genomes. When compared to each other, 437 and 163 specific genes were identified in B076 and race 4, respectively. Most specific genes are plasmid-borne, indicating that acquisition and maintenance of plasmids may represent a major mechanism to change the genetic composition of the genome and even acquire new virulence factors. Type three secretion gene clusters of Psg strains are nearly identical to that of P. savastanoi pv. phaseolicola (Pph) strain 1448A and they shared 20 common effector genes. Furthermore, the coronatine biosynthetic cluster is present on a large plasmid in strain B076, but not in race 4. In silico subtractive hybridization-based comparative genomic analyses with nine sequenced phytopathogenic pseudomonads identified dozens of specific islands (SIs), and revealed that the genomes of Psg strains are more similar to those belonging to the same genomospecies such as Pph 1448A than to other phytopathogenic pseudomonads. The number of highly conserved genes (core genome) among them decreased dramatically when more genomes were included in the subtraction, suggesting the diversification of pseudomonads, and further indicating the genome heterogeneity among pseudomonads. However, the number of specific genes did not change significantly, suggesting these genes are indeed specific in Psg genomes. These results reinforce the idea of a species complex of P. syringae and support the reclassification of P. syringae into different species.

16.
In this study, a set of 80 completely sequenced prokaryotic genomes has been analysed by an alignment-free method, namely the expectancy-rectified frequency of bigrams or 2-tuples, representing the 16 combinations of A, T, G, C. It demonstrates that all genomes exhibit periodic oscillations of their nucleotide sequence, with a period close to 11 phosphodiester bonds, and resembling in shape an exponentially dampened sinusoid at the distance from 5 to 49 bonds. Interestingly, the amplitude of nucleotide oscillation (but not the period) can differ drastically from one species to another. I show that these differences are due neither to the (G + C) content, nor to the size of the genome. They are not directly related to phylogeny, since specific genomes from Archaea and Bacteria can display large as well as small amplitudes. I have compared also a set of genes coding for proteins rich in alpha helical structures (as determined by X-ray diffraction) with a set of genes coding for proteins devoid of alpha helices. The first set has periodic oscillations of large amplitude, with an 11-bond period, while the second has none. Furthermore, I analysed a large number of sets of homologous genes from several different species. They exhibit very different amplitudes of oscillations. Altogether, the data with their statistical analyses strongly suggest that the nucleotide oscillations are due to the 'genomic style of proteins', which means that homologous proteins, having the same biochemical function in different organisms, may have different secondary structures or may use different ways to be constructed. I realize that this idea is a heterodox one, but I believe that it can shed a new light both on phylogenies and on constraints between proteins and their coding sequences.
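One plausible reading of an "expectancy-rectified" bigram frequency is the observed frequency of a nucleotide pair separated by a given number of bonds, divided by the frequency expected if the two positions were independent. The sketch below implements that reading only; the abstract does not give the exact formula, so the normalization, the function name and the toy sequence are assumptions.

```python
from collections import Counter


def rectified_bigram_freq(seq, d):
    """Observed/expected ratio for each ordered nucleotide pair at separation d.

    Observed: fraction of positions i where (seq[i], seq[i + d]) is the pair.
    Expected: product of the single-nucleotide frequencies of the two bases.
    This normalization is an assumed reading of 'expectancy-rectified'.
    """
    n_pairs = len(seq) - d
    if d < 1 or n_pairs <= 0:
        raise ValueError("d must be at least 1 and shorter than the sequence")
    base_freq = {base: count / len(seq) for base, count in Counter(seq).items()}
    pair_counts = Counter((seq[i], seq[i + d]) for i in range(n_pairs))
    return {
        x + y: (count / n_pairs) / (base_freq[x] * base_freq[y])
        for (x, y), count in pair_counts.items()
    }


# Toy sequence with an obvious period of 2, invented for illustration.
ratios = rectified_bigram_freq("ATATATAT", 2)
```

Plotting such a ratio against the separation `d` for a real genome is how a periodic signal of the kind described in the abstract would become visible; a ratio of 1 means the pair occurs exactly as often as expected by chance.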

17.

Background  

Plastid-bearing cryptophytes like Cryptomonas contain four genomes in a cell: the nuclear genome, the nucleomorph genome, the plastid genome and the mitochondrial genome. Comparative phylogenetic analyses encompassing DNA sequences from three different genomes were performed on nineteen photosynthetic and four colorless Cryptomonas strains. Twenty-three rbcL genes and fourteen nuclear SSU rDNA sequences were newly sequenced to examine the impact of photosynthesis loss on codon usage in the rbcL genes, and to compare the rbcL gene phylogeny in terms of tree topology and evolutionary rates with phylogenies inferred from nuclear ribosomal DNA (concatenated SSU rDNA, ITS2 and partial LSU rDNA), and nucleomorph SSU rDNA.

18.
Until the recent discovery of pRF in Rickettsia felis, the obligate intracellular bacteria of the genus Rickettsia (Rickettsiales: Rickettsiaceae) were thought not to possess plasmids. We describe pRM, a plasmid from Rickettsia monacensis, which was detected by pulsed-field gel electrophoresis and Southern blot analyses of DNA from two independent R. monacensis populations transformed by transposon-mediated insertion of coupled green fluorescent protein and chloramphenicol acetyltransferase marker genes into pRM. Two-dimensional electrophoresis showed that pRM was present in rickettsial cells as circular and linear isomers. The 23,486-nucleotide (31.8% G/C) pRM plasmid was cloned from the transformant populations by chloramphenicol marker rescue of restriction enzyme-digested transformant DNA fragments and PCR using primers derived from sequences of overlapping restriction fragments. The plasmid was sequenced. Based on BLAST searches of the GenBank database, pRM contained 23 predicted genes or pseudogenes and was remarkably similar to the larger pRF plasmid. Two of the 23 genes were unique to pRM and pRF among sequenced rickettsial genomes, and 4 of the genes shared by pRM and pRF were otherwise found only on chromosomes of R. felis or the ancestral group rickettsiae R. bellii and R. canadensis. We obtained pulsed-field gel electrophoresis and Southern blot evidence for a plasmid in R. amblyommii isolate WB-8-2 that contained genes conserved between pRM and pRF. The pRM plasmid may provide a basis for the development of a rickettsial transformation vector.

19.
SUMMARY: We make available a large cross-comparison for 16 of the completely sequenced genomes and additional eukaryotic genes. The alignments were performed at the protein level using liberal similarity bounds in order to capture as many significant alignments as possible. This dataset will be updated as new genomes become available.

20.
Historically, fungal multigene phylogenies have been reconstructed based on a small number of commonly used genes. The availability of complete fungal genomes has given rise to a new wave of model organisms that provide a large number of genes potentially useful for building robust gene genealogies. Unfortunately, cross-utilization of these resources to study phylogenetic relationships in the vast majority of non-model fungi (i.e. "orphan" species) remains an unexamined question. To address this problem, we developed a method coupled with a program named "PHYLORPH" (PHYLogenetic markers for ORPHans). The method screens fungal genomic databases (107 fungal genomes fully sequenced) for single copy genes that might be easily transferable and well suited for studies at low taxonomic levels (for example, in species complexes) in non-model fungal species. To maximize the chance to target genes with informative regions, PHYLORPH displays a graphical evaluation system based on the estimation of nucleotide divergence relative to substitution type. The usefulness of this approach was tested by developing markers in four non-model groups of fungal pathogens. For each pathogen considered, 7 to 40% of the 10-15 best candidate genes proposed by PHYLORPH yielded sequencing success. Levels of polymorphism of these genes were compared with those obtained for some genes traditionally used to build fungal phylogenies (e.g. nuclear rDNA, β-tubulin, γ-actin, Elongation factor EF-1α). These genes were ranked among the best-performing ones and accurately resolved taxon relationships in each of the four non-model groups of fungi considered. We envision that PHYLORPH will constitute a useful tool for obtaining new and accurate phylogenetic markers to resolve relationships between closely related non-model fungal species.
