Similar articles
1.
The rapid increase in published genomic sequences for bacteria presents the first opportunity to reconstruct evolutionary events on the scale of entire genomes. However, extensive lateral gene transfer (LGT) may thwart this goal by preventing the establishment of organismal relationships based on individual gene phylogenies. The group for which cases of LGT are most frequently documented and for which the greatest density of complete genome sequences is available is the γ-Proteobacteria, an ecologically diverse and ancient group including free-living species as well as pathogens and intracellular symbionts of plants and animals. We propose an approach to multigene phylogeny using complete genomes and apply it to the case of the γ-Proteobacteria. We first applied stringent criteria to identify a set of likely gene orthologs and then tested the compatibilities of the resulting protein alignments with several phylogenetic hypotheses. Our results demonstrate phylogenetic concordance among virtually all (203 of 205) of the selected gene families, with each of the exceptions consistent with a single LGT event. The concatenated sequences of the concordant families yield a fully resolved phylogeny. This topology also received strong support in analyses aimed at excluding effects of heterogeneity in nucleotide base composition across lineages. Our analysis indicates that single-copy orthologous genes are resistant to horizontal transfer, even in ancient bacterial groups subject to high rates of LGT. This gene set can be identified and used to yield robust hypotheses for organismal phylogenies, thus establishing a foundation for reconstructing the evolutionary transitions, such as gene transfer, that underlie diversity in genome content and organization.

4.
In phylogenetics and comparative genomics, it is important to establish orthologous relationships when comparing homologous sequences. Because orthologs and paralogs often differ only slightly in sequence, paralogs are easily mistaken for orthologs. For this reason, several methods based on evolutionary distance, phylogeny and BLAST have tried to detect orthologs with more precision. Depending on their algorithmic implementations, each of these methods sometimes has increased false negative or false positive rates. Here, we developed a novel algorithm for orthology detection that uses a distance method based on the phylogenetic criterion of minimum evolution. Our algorithm assumes that sets of sequences exhibiting orthologous relationships are evolutionarily less costly than sets that include one or more paralogous relationships. Calculation of evolutionary cost requires the reconstruction of a neighbor-joining (NJ) tree, but calculations are unaffected by the topology of any given NJ tree. Unlike tree reconciliation, our algorithm appears free from the problem of incorrect topologies of species and gene trees. The reliability of the algorithm was tested in a comparative analysis with two other orthology detection methods using 95 manually curated KOG datasets and 21 experimentally verified EXProt datasets. Sensitivity and specificity estimates indicate that the concept of minimum evolution could be valuable for the detection of orthologs.

5.
SUMMARY: ReMark is a fully automatic tool for clustering orthologs that combines a recursive clustering algorithm with the Markov clustering (MCL) algorithm. In the first step, ReMark detects ortholog pairs through reciprocal BLAST best hits between multiple genomes and clusters them recursively (RecursiveClustering.java). It then applies the MCL algorithm to the score matrices generated in the first step and refines the resulting clusters by adjusting an inflation factor (MarkovClustering.java). The method has two key features. First, to obtain more reliable results, it uses the diagonal scores in the matrix of the initial ortholog clusters. Second, it clusters orthologs flexibly, because MCL is controlled by a selectable inflation factor. Users can therefore tune the granularity of the orthologous protein clusters to their research interests by adjusting the inflation factor. AVAILABILITY AND IMPLEMENTATION: Source code for the orthologous protein clustering software is freely available for non-commercial use at http://dasan.sejong.ac.kr/~wikim/notice.html, implemented in Java 1.6 and supported on Windows and Linux.
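
The first step described in this abstract, pairing genes by reciprocal best BLAST hits, can be sketched as follows. This is a minimal illustration and not the code of RecursiveClustering.java: the function names, the `(query, subject, score)` tuple layout and the toy scores are all assumptions made for the example.

```python
def best_hits(hits):
    """Return {query: best-scoring subject} from (query, subject, score) tuples."""
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit in genome B and a is b's best hit in genome A."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Toy, made-up BLAST scores for two hypothetical genomes A and B.
a_vs_b = [("a1", "b1", 200.0), ("a1", "b2", 150.0), ("a2", "b2", 90.0), ("a3", "b1", 120.0)]
b_vs_a = [("b1", "a1", 198.0), ("b2", "a1", 140.0), ("b2", "a2", 95.0)]

print(reciprocal_best_hits(a_vs_b, b_vs_a))  # [('a1', 'b1')]
```

Only `a1`/`b1` survive here: `a2` picks `b2`, but `b2` prefers `a1`, so that pair is not reciprocal.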

6.
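Dai RH  Chen XX  Li ZZ 《Acta Entomologica Sinica》2008,51(10):1055-1064
For the first time in China, sequences of the 28S rDNA D2 region and the 16S rDNA gene were combined with 50 morphological characters in a phylogenetic analysis of 19 genera of the subfamily Deltocephalinae (Hemiptera: Cicadellidae). Genomic DNA was extracted from specimens preserved in absolute ethanol. The 28S rDNA D2 fragment was amplified and sequenced for 19 ingroup species and one outgroup species of Typhlocybinae (Hemiptera: Cicadellidae); 16S rDNA fragments were also amplified, yielding 11 sequences, and the homologous 16S rDNA sequence of one further species was taken from GenBank. The homologous 28S rDNA D2 and 16S rDNA sequences were analyzed together with the morphological characters using PAUP* 4.0 and MrBayes 3.0 and three tree-building methods. The results indicate that the tribe Macrostelini is monophyletic and occupies a basal position in the phylogeny of Deltocephalinae, making it the most primitive tribe in the ingroup. Within the tribe Deltocephalini, all genera except Nakaharanus form a monophyletic group. The placement of genera within the tribe Euscelini is unsettled; the tribe may be paraphyletic, and the differences among its genera need further study. The tribes Paralimnini and Athysanini are sister groups. The genera Scaphoideus and Nakaharanus are sister groups and are most closely related to Phlogotettix; the three form a monophyletic group, and we suggest placing them in the tribe Scaphoideini. The results also show that the phylogenetic positions of the tribes Xestocephalini and Balcluthini remain unclear and need further study.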

7.
MOTIVATION: The complete sequencing of many genomes has made it possible to identify orthologous genes descending from a common ancestor. However, reconstruction of evolutionary history over long time periods faces many challenges due to gene duplications and losses. Identification of orthologous groups shared by multiple proteomes therefore becomes a clustering problem in which an optimal compromise between conflicting evidence needs to be found. RESULTS: Here we present a new proteome-scale analysis program called MultiParanoid that can automatically find orthology relationships between proteins in multiple proteomes. The software is an extension of the InParanoid program that identifies orthologs and inparalogs in pairwise proteome comparisons. MultiParanoid applies a clustering algorithm to merge multiple pairwise ortholog groups from InParanoid into multi-species ortholog groups. To avoid outparalogs in the same cluster, MultiParanoid only combines species that share the same last ancestor. To validate the clustering technique, we compared the results to a reference set obtained by manual phylogenetic analysis. We further compared the results to ortholog groups in KOGs and OrthoMCL, which revealed that MultiParanoid produces substantially fewer outparalogs than these resources. AVAILABILITY: MultiParanoid is a freely available standalone program that enables efficient orthology analysis much needed in the post-genomic era. A web-based service providing access to the original datasets, the resulting groups of orthologs, and the source code of the program can be found at http://multiparanoid.cgb.ki.se.

8.
Based on an overview of progress in molecular systematics of the true fungi (Fungi/Eumycota) since 1990, little overlap was found among single-locus data matrices, which explains why no large-scale multilocus phylogenetic analysis had been undertaken to reveal deep relationships among fungi. As part of the project "Assembling the Fungal Tree of Life" (AFTOL), results of four Bayesian analyses are reported with complementary bootstrap assessment of phylogenetic confidence based on (1) a combined two-locus data set (nucSSU and nucLSU rDNA) with 558 species representing all traditionally recognized fungal phyla (Ascomycota, Basidiomycota, Chytridiomycota, Zygomycota) and the Glomeromycota, (2) a combined three-locus data set (nucSSU, nucLSU, and mitSSU rDNA) with 236 species, (3) a combined three-locus data set (nucSSU, nucLSU rDNA, and RPB2) with 157 species, and (4) a combined four-locus data set (nucSSU, nucLSU, mitSSU rDNA, and RPB2) with 103 species. Because of the lack of complementarity among single-locus data sets, the last three analyses included only members of the Ascomycota and Basidiomycota. The four-locus analysis resolved multiple deep relationships within the Ascomycota and Basidiomycota that were not revealed previously or that received only weak support in previous studies. The impact of this newly discovered phylogenetic structure on supraordinal classifications is discussed. Based on these results and reanalysis of subcellular data, current knowledge of the evolution of septal features of fungal hyphae is synthesized, and a preliminary reassessment of ascomal evolution is presented. Based on previously unpublished data and sequences from GenBank, this study provides a phylogenetic synthesis for the Fungi and a framework for future phylogenetic studies on fungi.

9.
Kuramae EE  Robert V  Snel B  Boekhout T 《Genomics》2006,88(4):387-393
The phylogenetic position of the fission yeast Schizosaccharomyces pombe in the fungal Tree of Life is still controversial. Three alternative phylogenetic positions have been proposed in the literature, namely (1) a position basal to the Hemiascomycetes and Euascomycetes, (2) a position as a sister group to the Euascomycetes with the Hemiascomycetes as a basal branch, or (3) a sister group to the Hemiascomycetes with Euascomycetes as a basal branch. Here we compared 91 clusters of orthologous proteins containing a single orthologue that are shared by 19 eukaryote genomes. The major part of these 91 orthologues supports a phylogenetic position of S. pombe as a basal lineage among the Ascomycota, thus supporting the second proposition. Interestingly, part of the orthologous proteins supported a fourth, not yet described alternative, in which S. pombe is basal to both Basidiomycota and Ascomycota. Both topologies of phylogenetic trees are well supported. We believe that both correctly reflect the phylogenetic history of the species concerned. This apparent paradox may point to a heterogeneous nuclear genome of the fungi. Importantly, this needs to be taken into consideration for a correct understanding of the fungal Tree of Life.

10.
Ge F  Wang LS  Kim J 《PLoS biology》2005,3(10):e316
With the availability of increasing amounts of genomic sequences, it is becoming clear that genomes experience horizontal transfer and incorporation of genetic information. However, to what extent such horizontal gene transfer (HGT) affects the core genealogical history of organisms remains controversial. Based on initial analyses of complete genomic sequences, HGT has been suggested to be so widespread that it might be the “essence of phylogeny” and might leave the treelike form of genealogy in doubt. On the other hand, possible biased estimation of HGT extent and the findings of coherent phylogenetic patterns indicate that phylogeny of life is well represented by tree graphs. Here, we reexamine this question by assessing the extent of HGT among core orthologous genes using a novel statistical method based on statistical comparisons of tree topology. We apply the method to 40 microbial genomes in the Clusters of Orthologous Groups database over a curated set of 297 orthologous gene clusters, and we detect significant HGT events in 33 out of 297 clusters over a wide range of functional categories. Estimates of positions of HGT events suggest a low mean genome-specific rate of HGT (2.0%) among the orthologous genes, which is in general agreement with other quantitative estimates of HGT. We propose that HGT events, even when relatively common, still leave the treelike history of phylogenies intact, much like cobwebs hanging from tree branches.

12.
The mycobactin siderophore system is present in many Mycobacterium species, including M. tuberculosis and other clinically relevant mycobacteria. This siderophore system is believed to be utilized by pathogenic and nonpathogenic mycobacteria for iron acquisition in in vivo and ex vivo iron-limiting environments, respectively. Several M. tuberculosis genes located in a so-called mbt gene cluster have been predicted to be required for the biosynthesis of the core scaffold of mycobactin based on sequence analysis. A systematic and controlled mutational analysis probing the hypothesized essential nature of each of these genes for mycobactin production has been lacking. The degree of conservation of mbt gene cluster orthologs remains to be investigated as well. In this study, we sought to conclusively establish whether each of nine mbt genes was required for mycobactin production and to examine the conservation of gene clusters orthologous to the M. tuberculosis mbt gene cluster in other bacteria. We report a systematic mutational analysis of the mbt gene cluster ortholog found in Mycobacterium smegmatis. This mutational analysis demonstrates that eight of the nine mbt genes investigated are essential for mycobactin production. Our genome mining and phylogenetic analyses reveal the presence of orthologous mbt gene clusters in several bacterial species. These gene clusters display significant organizational differences originating from an intricate evolutionary path that might have included horizontal gene transfers. Altogether, the findings reported herein advance our understanding of the genetic requirements for the biosynthesis of an important mycobacterial secondary metabolite with relevance to virulence.

14.
Kim S  Kang J  Chung YJ  Li J  Ryu KH 《Proteins》2008,71(3):1113-1122
The quality of orthologous protein clusters (OPCs) is largely dependent on the results of the reciprocal BLAST (basic local alignment search tool) hits among genomes. The BLAST algorithm is very efficient and fast, but it is very difficult to obtain an optimal solution among phylogenetically distant species, because genomes separated by a large evolutionary distance typically have low similarity in their protein sequences. To reduce the false positives in the OPCs, thresholding is often employed on the BLAST scores. However, the thresholding also eliminates large numbers of true positives, as orthologs from distant species are likely to have low BLAST scores. To rectify this problem, we introduce a new hybrid method combining the Recursive and the Markov CLuster (MCL) algorithms without using the BLAST thresholding. In the first step, we use InParanoid to produce n(n-1)/2 ortholog tables from n genomes. After combining all the tables into one, our clustering algorithm clusters ortholog pairs recursively in the table. Then, our method employs the MCL algorithm to compute the clusters and refines the clusters by adjusting the inflation factor. We tested our method using six different genomes and evaluated the results by comparing against KEGG Orthology (KO) OPCs, which are generated from manually curated pathways. To quantify the accuracy of the results, we introduced a new intuitive similarity measure based on our Least-move algorithm that computes the consistency between two OPCs. We compared the resulting OPCs with the KO OPCs using this measure. We also evaluated the performance of our method using InParanoid as the baseline approach. The experimental results show that, at an inflation factor of 1.3, we produced 54% more orthologs than InParanoid at the cost of slightly lower accuracy (1.7% less), and at an inflation factor of 1.4 we produced not only 15% more orthologs than InParanoid but also a higher accuracy (1.4% more).
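
The role of the inflation factor mentioned in this abstract may be easier to see in code. The following is a minimal pure-Python sketch of the generic MCL iteration (expansion by matrix squaring, then inflation by element-wise powering and column renormalization). It is not the authors' implementation; the function names, the self-loop weight of 1.0, the convergence tolerance and the toy graph are assumptions chosen for illustration.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)] for i in range(n)]


def expand(m):
    """Expansion step: square the matrix (two-step random walks)."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def inflate(m, r):
    """Inflation step: raise entries to the power r, then renormalize columns."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    """Cluster a weighted undirected graph; returns sorted lists of node indices."""
    n = len(adjacency)
    # Add self-loops (assumed weight 1.0), a common MCL preprocessing step.
    m = normalize_columns(
        [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    )
    for _ in range(iterations):
        nxt = inflate(expand(m), inflation)
        done = all(abs(nxt[i][j] - m[i][j]) < tol for i in range(n) for j in range(n))
        m = nxt
        if done:
            break
    # Each row with non-negligible entries lists the members of one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6) for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)


# Toy similarity graph: a triangle (0, 1, 2) and a separate pair (3, 4).
toy = [
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
]
print(mcl(toy, inflation=1.4))  # [[0, 1, 2], [3, 4]]
```

Higher inflation values sharpen each column more aggressively and tend to split the graph into more, smaller clusters, which is why the abstract reports different ortholog counts at 1.3 and 1.4.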

15.
Specific plant cellulose synthases (CesA), encoded by a multigene family, are necessary for secondary wall synthesis in vascular tissues and are critical to wood production. We obtained full-length clones for the three CesAs that are highly expressed in developing xylem and examined their phylogenetic relationships and expression patterns in loblolly pine tissues. Full-length CesA clones were isolated from cDNA of developing loblolly pine (Pinus taeda) xylem and phylogenetic inferences made from plant CesA protein sequences. Expression of the three genes was examined by Northern blot analysis and semiquantitative RT-PCR. Each of three PtCesA genes is orthologous to one of the three angiosperm secondary cell wall CesAs. The PtCesAs are coexpressed in tissues of loblolly pine with tissues undergoing secondary cell wall biosynthesis showing the highest levels of expression. Phylogenetic and expression analyses suggest that functional roles for these loblolly pine CesAs are analogous to those of orthologs in angiosperm taxa. Based upon evidence from this and other studies, we suggest division of seed plant CesA genes into six major paralogous groups, each containing orthologs from various taxa. Available evidence suggests that paralogous CesA genes and their distinct functional roles evolved before the divergence of gymnosperm and angiosperm lineages.

16.
Orthologs generally are under selective pressure against loss of function, while paralogs usually accumulate mutations and finally die or deviate in terms of function or regulation. Most ortholog detection methods contaminate the resulting datasets with a substantial number of paralogs. Therefore we aimed to implement a straightforward method that allows the detection of ortholog clusters with a reduced number of paralogs from completely sequenced genomes. The described cross-species expansion of the reciprocal best BLAST hit method is a time-effective method for ortholog detection, which results in 68% truly orthologous clusters, and the procedure specifically enriches single-copy orthologs. The detection of true orthologs can provide a phylogenetic toolkit to better understand evolutionary processes. In a study across six photosynthetic eukaryotes, nuclear genes of putative mitochondrial origin were shown to be over-represented among single-copy orthologs. These orthologs are involved in fundamental biological processes like amino acid metabolism or translation. Molecular clock analyses based on this dataset yielded divergence time estimates for the red/green algae (1,142 MYA), green algae/land plant (725 MYA), mosses/seed plant (496 MYA), gymno-/angiosperm (385 MYA) and monocotyledons/core eudicotyledons (301 MYA) splits.

18.
PhyloGenie: automated phylome generation and analysis   Cited by: 12 (self-citations: 1, citations by others: 11)
Phylogenetic reconstruction is the method of choice to determine the homologous relationships between sequences. Difficulties in producing high-quality alignments, which are the basis of good trees, and in automating the analysis of trees have unfortunately limited the use of phylogenetic reconstruction methods to individual genes or gene families. Due to the large number of sequences involved, phylogenetic analyses of proteomes preclude manual steps and therefore require a high degree of automation in sequence selection, alignment, phylogenetic inference and analysis of the resulting set of trees. We present a set of programs that automates the steps from seed sequence to phylogeny and a utility to extract all phylogenies that match specific topological constraints from a database of trees. Two example applications that show the type of questions that can be answered by phylome analysis are provided. The generation and analysis of the Thermoplasma acidophilum phylome with regard to lateral gene transfer between Thermoplasmata and Sulfolobus showed best BLAST hits to be far less reliable indicators of lateral transfer than the corresponding protein phylogenies. The generation and analysis of the Danio rerio phylome provided more than twice as many proteins as described previously, supporting the hypothesis of an additional round of genome duplication in the actinopterygian lineage.

19.
Sugiyama J  Hosaka K  Suh SO 《Mycologia》2006,98(6):996-1005
The early diverging Ascomycota lineage, detected primarily from nSSU rDNA sequence-based phylogenetic analyses, includes enigmatic key taxa important to an understanding of the phylogeny and evolution of higher fungi. At the moment six representative genera of early diverging ascomycetes (i.e. Taphrina, Protomyces, Saitoella, Schizosaccharomyces, Pneumocystis and Neolecta) have been assigned to "Archiascomycetes" sensu Nishida and Sugiyama (1994) or the subphylum "Taphrinomycotina" sensu Eriksson and Winka (1997). The group includes fungi that are ecologically and morphologically diverse, and it is therefore difficult to define the group based on common phenotypic characters. Bayesian analyses of nSSU rDNA or combined nSSU and nLSU rDNA sequences supported previously published Ascomycota frameworks that consist of three major lineages (i.e. a group of early diverging Ascomycota [Taphrinomycotina], Saccharomycotina and Pezizomycotina); Taphrinomycotina is the sister group of Saccharomycotina and Pezizomycotina. The 50% majority rule consensus of 18000 Bayesian MCMCMC-generated trees from multilocus gene sequences of nSSU rDNA, nLSU rDNA (D1/D2), RPB2 and beta-tubulin also showed the monophyly of the three subphyla and the basal position of Taphrinomycotina in Ascomycota with significantly higher statistical support. However, to answer controversial questions on the origin, monophyly and evolution of the Taphrinomycotina, additional integrated phylogenetic analyses might be necessary using sequences of more genes with broader taxon sampling from the early diverging Ascomycota.
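
The "50% majority rule consensus" mentioned in this abstract keeps only those clades that occur in more than half of the sampled trees. A minimal sketch of that counting step is given below, assuming each tree has already been reduced to a set of clades (each clade a frozenset of taxon names). The function name and the three toy trees are invented for illustration and are not data from the study.

```python
from collections import Counter


def majority_rule_clades(trees, threshold=0.5):
    """Return {clade: frequency} for clades found in more than `threshold` of the trees.

    `trees` is a list of collections of clades; each clade is a frozenset of taxa.
    """
    counts = Counter(clade for tree in trees for clade in set(tree))
    total = len(trees)
    return {clade: n / total for clade, n in counts.items() if n / total > threshold}


# Three made-up sampled trees over the three subphyla named in the abstract.
sampled = [
    {frozenset({"Saccharomycotina", "Pezizomycotina"})},
    {frozenset({"Saccharomycotina", "Pezizomycotina"})},
    {frozenset({"Taphrinomycotina", "Saccharomycotina"})},
]

for clade, freq in majority_rule_clades(sampled).items():
    print(sorted(clade), round(freq, 2))
```

In Bayesian analyses the retained frequencies are usually read as posterior probabilities of the clades; building the consensus tree itself from the retained clades is a further step not shown here.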
