Similar literature
1.
The phenotypic behavior of a group of organisms can be studied using a range of molecular evolutionary tools that help to determine evolutionary relationships. Traditionally, a gene or a set of gene sequences was used to generate phylogenetic trees. Incomplete evolutionary information in a few selected genes causes problems in phylogenetic tree construction, and whole genomes are used as a remedy. The task is then to identify suitable parameters for extracting the hidden information from whole genome sequences that truly represents evolutionary history. In this study we explored a random anchor (a stretch of 100 nucleotides) based approach (ABWGP) for finding the distance between any two genomes, and used the distance estimates to compute evolutionary trees. A number of strains and species of Mycobacteria were used for this study. Anchor-derived parameters, such as cumulative normalized score, anchor order and indels, were computed in a pair-wise manner, and the scores were used to compute distance/phylogenetic trees. The strength of branching was determined by bootstrap analysis. The terminal branches are clearly discernible using the distance estimates described here. In general, different measures gave similar trees, except for the trees based on indels. Overall, the tree topology reflected the known biology of the organisms. This was also true for different strains of Escherichia coli. A new whole genome-based approach has been described here for studying evolutionary relationships among bacterial strains and species.

2.
Phylogenies involving nonmodel species are based on a few genes, mostly chosen following historical or practical criteria. Because gene trees are sometimes incongruent with species trees, the resulting phylogenies may not accurately reflect the evolutionary relationships among species. The increase in availability of genome sequences now provides large numbers of genes that could be used for building phylogenies. However, for practical reasons only a few genes can be sequenced for a wide range of species. Here we asked whether we can identify a few genes, among the single-copy genes common to most fungal genomes, that are sufficient for recovering accurate and well-supported phylogenies. Fungi represent a model group for phylogenomics because many complete fungal genomes are available. An automated procedure was developed to extract single-copy orthologous genes from complete fungal genomes using a Markov Clustering Algorithm (Tribe-MCL). Using 21 complete, publicly available fungal genomes with reliable protein predictions, 246 single-copy orthologous gene clusters were identified. We inferred the maximum likelihood trees using the individual orthologous sequences and constructed a reference tree from concatenated protein alignments. The topologies of the individual gene trees were compared to that of the reference tree using three different methods. The performance of individual genes in recovering the reference tree was highly variable. Gene size and the number of variable sites were highly correlated and significantly affected the performance of the genes, but the average substitution rate did not. Two genes recovered exactly the same topology as the reference tree, and when concatenated provided high bootstrap values. The genes typically used for fungal phylogenies did not perform well, which suggests that current fungal phylogenies based on these genes may not accurately reflect the evolutionary relationships among species. 
Analyses on subsets of species showed that the phylogenetic performance did not seem to depend strongly on the sample. We expect that the best-performing genes identified here will be very useful for phylogenetic studies of fungi, at least at a large taxonomic scale. Furthermore, we compare the method developed here for finding genes for building robust phylogenies with previous ones and we advocate that our method could be applied to other groups of organisms when more complete genomes are available.

3.
Individual genes or regions are still commonly used to estimate the phylogenetic relationships among viral isolates. Genomic regions that faithfully provide assessments consistent with those obtained from full-length genome sequences would be good candidate phylogenetic markers for molecular epidemiological studies of many viruses. Here we employed a statistical method to evaluate the evolutionary relationships between individual viral genes and full-length genomes without tree construction, as a way to determine which gene matches the genome well in phylogenetic analyses. The method calculates linear correlations between the genetic distance matrices of aligned individual gene sequences and of aligned genome sequences. We applied this method to the phylogenetic analyses of porcine circovirus 2 (PCV2), measles virus (MV), hepatitis E virus (HEV) and Japanese encephalitis virus (JEV). Phylogenetic trees were constructed for comparison, and the possible factors affecting the accuracy of the method were also discussed. The method produced results consistent with those of previous studies regarding the consensus sequences that could be successfully used as phylogenetic markers. Our results also suggested that these evolutionary correlations could provide useful information for identifying genes that can be used effectively to infer genetic relationships.
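The correlation step described in this abstract can be sketched as follows. This is only a minimal illustration, assuming a Pearson correlation over the off-diagonal entries of two pairwise distance matrices; the abstract does not state which correlation statistic or distance model the authors used, and the function names and toy matrices below are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the entries above the diagonal of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def matrix_correlation(gene_dist, genome_dist):
    """Correlate gene-based and genome-based pairwise distances (assumed Pearson)."""
    return pearson(upper_triangle(gene_dist), upper_triangle(genome_dist))


# Hypothetical toy data: three isolates, gene distances exactly twice the genome distances.
genome_dist = [[0.0, 0.1, 0.4], [0.1, 0.0, 0.5], [0.4, 0.5, 0.0]]
gene_dist = [[0.0, 0.2, 0.8], [0.2, 0.0, 1.0], [0.8, 1.0, 0.0]]
r = matrix_correlation(gene_dist, genome_dist)
```

A gene whose distance matrix correlates strongly with the genome matrix would, under this reading, be a candidate marker. Because pairwise distances are not independent observations, a permutation-based test such as a Mantel test may be more appropriate than an ordinary significance test for judging the correlation.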

4.
The taxonomy of Cyanobacteria, the oldest phototrophic prokaryotes, has been problematic for many years due to their simple morphology, high variability and adaptability to diverse ecological niches. Since the introduction of the polyphasic approach, which combines several criteria (molecular sequencing, morphological and ecological), the whole classification system of these organisms has been subject to reorganization. The aim of this study was to evaluate whether outer membrane efflux protein (OMEP) sequences can be used as a molecular marker for resolving the phylogeny and taxonomic status of closely related cyanobacteria. We performed phylogenetic analyses based on the amino acid sequences of the OMEP and the DNA sequences of the 16S rRNA gene from 86 cyanobacterial species/strains with completely sequenced genomes. Phylogenetic trees based on the OMEP showed that most of the cyanobacterial species/strains belonging to different genera cluster in separate clades supported by high bootstrap values. Comparing the OMEP trees with the 16S rDNA tree clearly showed that the OMEP is a more suitable marker for resolving phylogenetic relationships within Cyanobacteria at the generic and species level.

5.
With the development of genome sequencing, more whole genomes of microorganisms have been completed, and many methods have been introduced to reconstruct the phylogenetic tree of those microorganisms using information extracted from the whole genomes, through various ways of transforming or mapping the whole genome sequences into other forms that describe evolutionary distance in a new way. We think it possible that there is information buried in the whole genome and transferred along lineages which remains stable and is more essential than the sequence conservation of individual genes or the arrangement of some genes of a selected set. We need to find one measurement that can involve as many phylogenetic features as possible beyond the genome sequence itself. We converted each genome sequence of the microorganisms into another linear sequence that represents the functional structure of the sequence, used a new information function to calculate the discrepancy between sequences and obtain a distance matrix of the genomes, and built a phylogenetic tree with a neighbor-joining method. The resulting tree shows that the major lineages are consistent with the result based on their 16S rRNA sequences. Our method discovered one phylogenetic feature, derived from the genome sequences and the encoded genes, that can rebuild the phylogenetic tree correctly. The mapping of a genome sequence to a new form representing the relative positions of the functional genes provides a new way to measure phylogenetic relationships, and with a more specific classification of gene functions the result could be more sensitive.

6.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of data points and enable comparisons across diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined, and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced, and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.

7.
MOTIVATION: Comparative analysis of metabolic pathways in different genomes can give insights into the evolutionary and organizational relationships among species. This type of analysis allows one to measure the evolution of complete processes (with different functional roles) rather than the individual elements of a conventional analysis. We present a new technique for the phylogenetic analysis of metabolic pathways based on the topology of the underlying graphs. A distance measure between graphs is defined using the similarity between nodes of the graphs and the structural relationship between them. This distance measure is applied to the enzyme-enzyme relational graphs derived from metabolic pathways. Using this approach, pathways and groups of pathways of different organisms are compared to each other, and the resulting distance matrix is used to obtain a phylogenetic tree. RESULTS: We apply the method to the Citric Acid Cycle and the Glycolysis pathways of different groups of organisms, as well as to the Carbohydrate metabolic networks. Phylogenetic trees obtained from the experiments were close to existing phylogenies and revealed interesting relationships among organisms.

8.
Traditional phylogenetic analysis is based on multiple sequence alignment. With the development of worldwide genome sequencing projects, more and more completely sequenced genomes are becoming available. However, traditional sequence alignment tools cannot handle large-scale genome sequences. The development of new algorithms that infer phylogenetic relationships from whole genome information without alignment therefore represents a new direction of phylogenetic study in the post-genome era. In the present study, a novel algorithm based on BBC (base-base correlation) is proposed to analyze the phylogenetic relationships of HEV (hepatitis E virus). When 48 HEV genome sequences are analyzed, the phylogenetic tree constructed with the BBC algorithm is consistent with that of a previous study. Compared with sequence alignment methods, the BBC algorithm calculates evolutionary distances between whole genome sequences more rapidly and does not require human intervention such as gene identification or parameter selection. The BBC algorithm can serve as an alternative for rapidly constructing phylogenetic trees and inferring evolutionary relationships.

9.
Mitochondrial DNA sequences can be used to estimate phylogenetic relationships among animal taxa and for molecular phylogenetic evolution analysis. With the development of sequencing technology, more and more mitochondrial sequences have been made available in public databases, including whole mitochondrial DNA sequences. These data have been used for phylogenetic analysis of animal species and for studies of evolutionary processes. We performed phylogenetic analyses of 19 species of Cervidae, with Bos taurus as the outgroup. We used neighbor joining, maximum likelihood, maximum parsimony, and Bayesian inference methods on whole mitochondrial genome sequences. The consensus phylogenetic trees supported monophyly of the family Cervidae; it was divided into two subfamilies, Plesiometacarpalia and Telemetacarpalia, and four tribes, Cervinae, Muntiacinae, Hydropotinae, and Odocoileinae. The divergence times in these families were estimated by phylogenetic analysis using the Bayesian method with a relaxed molecular clock; the results were consistent with those of previous studies. We concluded that the evolutionary structure of the family Cervidae can be reconstructed by phylogenetic analysis based on whole mitochondrial genomes; this method could be used broadly in phylogenetic evolutionary analysis of animal taxa.

10.
Species belonging to the phylum Synergistetes are poorly characterized. Though the known species display Gram-negative characteristics and the ability to ferment amino acids, no single characteristic is known which can define this group. For eight Synergistetes species, complete genome sequences or draft genomes have become available. We have used these genomes to construct detailed phylogenetic trees for the Synergistetes species and carried out comprehensive analysis to identify molecular markers consisting of conserved signature indels (CSIs) in protein sequences that are specific for either all Synergistetes or some of their sub-groups. We report here identification of 32 CSIs in widely distributed proteins such as RpoB, RpoC, UvrD, GyrA, PolA, PolC, MraW, NadD, PyrE, RpsA, RpsH, FtsA, RadA, etc., including a large >300 aa insert within the RpoC protein, that are present in various Synergistetes species, but except for isolated bacteria, these CSIs are not found in the protein homologues from any other organisms. These CSIs provide novel molecular markers that distinguish the species of the phylum Synergistetes from all other bacteria. The large numbers of other CSIs discovered in this work provide valuable information that supports and consolidates evolutionary relationships amongst the sequenced Synergistetes species. Of these CSIs, seven are specifically present in Jonquetella, Pyramidobacter and Dethiosulfovibrio species indicating a cladal relationship among them, which is also strongly supported by phylogenetic trees. A further 15 CSIs that are only present in Jonquetella and Pyramidobacter indicate a close association between these two species. Additionally, a previously described phylogenetic relationship between the Aminomonas and Thermanaerovibrio species was also supported by 9 CSIs. 
The strong relationships indicated by the indel analysis provide incentives for the grouping of species from these clades into higher taxonomic groups such as families or orders. The identified molecular markers, due to their specificity for Synergistetes and presence in highly conserved regions of important proteins, suggest novel targets for evolutionary, genetic and biochemical studies on these bacteria, as well as for the identification of additional species belonging to this phylum in different environments.

11.
Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for constructing genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for the construction of phylogenetic trees has not been established. We have developed a method for constructing phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

12.
Phylogenetic tree reconstruction requires construction of a multiple sequence alignment (MSA) from sequences. Computationally, it is difficult to achieve an optimal MSA for many sequences. Moreover, even if an optimal MSA is obtained, it may not be the true MSA that reflects the evolutionary history of the underlying sequences. Errors can therefore be introduced during MSA construction, which in turn affect the subsequent phylogenetic tree construction. To circumvent this issue, we extend the application of the k-tuple distance to phylogenetic tree reconstruction. The k-tuple distance between two sequences is the sum of the differences in frequency, over all possible tuples of length k, between the sequences, and it can be estimated without MSAs. It has traditionally been used to build a fast ‘guide tree’ to assist the construction of MSAs. Using 1470 simulated sets of sequences generated under different evolutionary scenarios, together with neighbor-joining trees and BioNJ trees, we compared the performance of the k-tuple distance with four commonly used distance estimators: Jukes–Cantor, Kimura, F84 and Tamura–Nei. These four estimators are model-based, as each of them assumes a specific substitution model in order to compute the distance between a pair of already aligned sequences. Results show that trees constructed from the k-tuple distance are more accurate than those from the other distances most of the time; when the divergence between the underlying sequences is high, tree accuracy can be twice as high or more with the k-tuple distance than with the other estimators. Furthermore, as the k-tuple distance avoids the need for constructing an MSA, it can save a tremendous amount of time in phylogenetic tree reconstruction when the data include a large number of sequences.
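The k-tuple distance defined in this abstract can be illustrated with a short sketch. It assumes relative frequencies and absolute differences, which is one plausible reading of "the sum of the differences in frequency"; the exact normalization used by the authors is not given in the abstract, and the function names are hypothetical.

```python
from collections import Counter


def ktuple_frequencies(seq, k):
    """Relative frequency of every overlapping k-tuple that occurs in seq."""
    if len(seq) < k:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {tup: count / total for tup, count in counts.items()}


def ktuple_distance(seq_a, seq_b, k=3):
    """Sum of absolute frequency differences over all k-tuples (assumed form).

    Tuples absent from both sequences contribute zero, so iterating over the
    union of observed tuples is equivalent to iterating over all 4**k tuples.
    """
    freq_a = ktuple_frequencies(seq_a, k)
    freq_b = ktuple_frequencies(seq_b, k)
    tuples = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(t, 0.0) - freq_b.get(t, 0.0)) for t in tuples)


# Identical sequences are at distance 0; sequences sharing no k-tuple are at distance 2.
same = ktuple_distance("ACGTACGT", "ACGTACGT", k=2)
disjoint = ktuple_distance("AAAA", "CCCC", k=2)
```

No alignment is needed at any point: each sequence is summarized independently, so the pairwise distance matrix that feeds a method such as neighbor joining can be filled in from per-sequence frequency tables.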

13.
In silico genomic fingerprints were produced by virtual hybridization of 191 fully sequenced bacterial genomes using a set of 15,264 13-mer probes specially designed to produce universal whole genome fingerprints. A novel approach for constructing phylogenetic trees, based on comparative analysis of genomic fingerprints, was developed. The resultant bacterial phylogenetic tree had strong similarities to those produced from the alignment of conserved sequences. Notably, the trees derived from the alignment of other conserved COG genes divided the Bacillus and Corynebacterium genera into the same subgroups produced by the novel bacterial tree. A number of discrepancies between the two techniques were observed for the grouping of some Lactobacillus species. However, a detailed analysis of the alignment of these genomes using other bioinformatics tools revealed that the grouping of these organisms in the novel tree was more satisfactory than the groupings from previous classifications, which used only a few conserved genes. All these data suggest that the bacterial taxonomy produced by genomic fingerprints is satisfactory, but sometimes different from classical taxonomies. Discrepancies probably arise because the fingerprinting technique analyzes genomic sequences and reveals more information than previously used approaches.

14.
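Yi Zhenzhen, Chen Zigui, Gao Shan, Song Weibo. Acta Zoologica Sinica, 2007, 53(6): 1031-1040
Using the small subunit ribosomal RNA (SS rRNA) gene sequences of 36 species of higher spirotrich ciliates, we compared how different conditions (including the choice of outgroup and ingroup, combinations of different sequence lengths of the same gene, different tree-building methods, and different analysis software) affect the construction of molecular phylogenetic trees for ciliates. The results show that all of these factors can affect the topology to varying degrees. The results also suggest that, when working with limited data, and especially when analyzing the systematic relationships of unresolved groups, the influence of differing tree-building conditions must be fully taken into account. The authors further recommend that, given that the molecular information currently available is insufficient, molecular systematic studies of any ciliate group should draw conclusions cautiously and, as far as possible, combine and refer to morphological, ontogenetic and other evidence; this remains the working approach that deserves priority.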

16.
Traditional textbook knowledge held that cephalochordates were the closest relatives of vertebrates among all extant organisms. However, this opinion was challenged by several recent phylogenetic studies using hundreds of nuclear genes, which suggested that urochordates, not cephalochordates, are the closest living relatives of vertebrates. In the present study, using data generated from hundreds of mtDNA sequences, we re-evaluate the deuterostome phylogeny in terms of whole mitochondrial genomes (mitogenomes). Our results firmly demonstrate that each of the extant deuterostome phyla and chordate subphyla is monophyletic. However, the results yield several alternative phylogenetic trees depending on the sequence datasets used in the analysis. Although no clear phylogenetic relationships are obtained, those trees indicate that the ancient common ancestor diversified rapidly soon after its appearance in the early Cambrian and generated all major deuterostome lineages during a short historical period, which is consistent with the "Cambrian explosion" revealed by paleontologists. It was the subsequent 520 million years of evolution that obscured the phylogenetic relationships of extant deuterostomes. Thus, we conclude that an integrative analysis approach, rather than simply using more DNA sequences, should be employed to address distant evolutionary relationships.

17.
Comparative genomic approaches are useful in identifying molecular differences between organisms. Currently available methods fail to identify small changes in genomes, such as the expansion of short repetitive motifs, and to analyse divergent sequences. In this report, we describe an anchor-based whole genome comparison (ABWGC) method. ABWGC is based on random sampling of anchor sequences from one genome, followed by analysis of the sampled regions and the homologous regions from the target genome. The method was applied to compare two strains of Mycobacterium tuberculosis, CDC1551 and H37Rv. ABWGC identified a total of 104 indels, including 20 expansions of short repetitive sequences and five recombination events; 18 of these genomic differences were previously unidentified. ABWGC also identified 188 SNPs, including eight new ones. The method was also used to compare the M. tuberculosis H37Rv and M. avium genomes. ABWGC correctly picked 1002 additional indels (size >100 nt) between the two organisms compared with MUMmer, a popular tool for comparative genomics. ABWGC also correctly identified repeat expansions and indels in a set of simulated sequences. The study also revealed an important role of small repeat expansions in the evolution of M. tuberculosis strains.

18.
An evolutionary distance is introduced in order to propose an efficient and feasible procedure for phylogeny studies. Our analysis is based on the strand asymmetry property of mitochondrial DNA, but can be applied to other genomes. Comparison of our results with those reported in conventional phylogenetic trees gives confidence in our approximation. Our findings support the hypotheses about the origin of the skew and its dependence upon evolutionary pressures, and improve on previous efforts to use the strand asymmetry property of genomes for phylogeny inference. For the evolutionary distance introduced here, we observe that the most adequate technique for tree reconstruction is an average link method which employs a sequential clustering algorithm.
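Strand asymmetry is commonly quantified with compositional skews measured on one strand. The sketch below shows the standard GC and AT skew formulas only; it is not the evolutionary distance introduced in this abstract, whose definition is not given here, and the function names are hypothetical.

```python
def skew(seq, plus, minus):
    """Compositional skew (plus - minus) / (plus + minus) on one strand."""
    seq = seq.upper()
    n_plus = seq.count(plus)
    n_minus = seq.count(minus)
    if n_plus + n_minus == 0:
        return 0.0  # neither base present: treat the skew as zero
    return (n_plus - n_minus) / (n_plus + n_minus)


def gc_skew(seq):
    """Standard GC skew: (G - C) / (G + C)."""
    return skew(seq, "G", "C")


def at_skew(seq):
    """Standard AT skew: (A - T) / (A + T)."""
    return skew(seq, "A", "T")


# Toy fragment with three G and one C on the reported strand.
example_gc = gc_skew("GGGC")
example_at = at_skew("ATAT")
```

A skew-based distance between two genomes would then compare such values, for instance per gene or per sliding window, but that step is left out because the abstract does not specify it.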

19.
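As more and more genomes are sequenced, alignment-free phylogenetic analysis based on whole genomes has become an active research topic. The nucleotide composition of genomes differs among species and among individuals. The information in DNA sequences, the language of heredity, is largely reflected in their k-mer frequencies. Phylogenetic trees based on the k-mer frequencies of genome sequences therefore offer a new perspective on the relationships among species. In this paper we define an information parameter based on k-mer frequencies, use it to characterize genome sequences, compute the distances between the information parameters of different genomes, and construct a phylogenetic tree for 84 viruses with the neighbor-joining method. The resulting tree agrees to a large extent with existing phylogenetic trees.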

20.
Multilocus sequence analysis (MLSA) was used to refine the phylogenetic analysis of the genus Kribbella, which currently contains 17 species with validly published names. Sequences were obtained for the 16S rRNA, gyrB, rpoB, recA, relA and atpD genes for 16 of the 17 type strains of the genus plus seven non-type strains. A five-gene concatenated sequence of 4099 nt was used to examine the phylogenetic relationships between the species of the genus Kribbella. Using the concatenated sequence of the gyrB, rpoB, recA, relA and atpD genes, most Kribbella type strains can be distinguished by a genetic distance of >0.04. Each single-gene tree had an overall topology similar to that of the concatenated sequence tree. The single-gene relA tree, used here for the first time in MLSA of actinobacteria, had good bootstrap support, comparable to the rpoB and atpD gene trees, which had topologies closest to that of the concatenated sequence tree. This illustrates that relA is a useful addition in MLSA studies of the genus Kribbella. We propose that concatenated gyrB-rpoB-recA-relA-atpD gene sequences be used for examining the phylogenetic relationships within the genus Kribbella and for determining the closest phylogenetic relatives to be used for taxonomic comparisons.
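The concatenation and distance threshold mentioned in this abstract can be illustrated as follows. The sketch assumes an uncorrected p-distance (proportion of differing sites, ignoring gapped columns); the abstract does not say which distance model produced the 0.04 threshold, and the toy alignments and function names are hypothetical.

```python
def concatenate(alignments, gene_order, strain):
    """Join one strain's aligned gene sequences in a fixed gene order."""
    return "".join(alignments[gene][strain] for gene in gene_order)


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped. This uncorrected
    distance is an assumption, not necessarily the model used in the study.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Hypothetical two-gene toy alignment for two strains, A and B.
toy_alignments = {
    "gyrB": {"A": "ACGTACGTAC", "B": "ACGTACGTAA"},
    "rpoB": {"A": "GGGGGCCCCC", "B": "GGGG-CCCCC"},
}
order = ["gyrB", "rpoB"]
d = p_distance(
    concatenate(toy_alignments, order, "A"),
    concatenate(toy_alignments, order, "B"),
)
distinguishable = d > 0.04  # threshold quoted in the abstract
```

Fixing the gene order before concatenation matters only for bookkeeping here, since the p-distance is a per-site count, but it keeps the concatenated sequences comparable across strains.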
