Similar literature
1.
Traditional phylogenetic analysis is based on multiple sequence alignment. As worldwide genome sequencing projects progress, more and more completely sequenced genomes are becoming available. However, traditional sequence alignment tools cannot handle large-scale genome sequences. The development of new algorithms that infer phylogenetic relationships from whole-genome information without alignment therefore represents a new direction for phylogenetic study in the post-genome era. In the present study, a novel algorithm based on BBC (base-base correlation) is proposed to analyze the phylogenetic relationships of HEV (hepatitis E virus). When 48 HEV genome sequences are analyzed, the phylogenetic tree constructed with the BBC algorithm is consistent with that of a previous study. Compared with sequence alignment methods, the BBC algorithm calculates evolutionary distances between whole-genome sequences more rapidly and does not require human intervention such as gene identification or parameter selection. The BBC algorithm can serve as an alternative for rapidly constructing phylogenetic trees and inferring evolutionary relationships.
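
The abstract does not state the BBC formula, so the following is only a minimal sketch under a stated assumption: each ordered base pair (i, j) gets a mutual-information-like score, summed over gaps k = 1..K, of P_ij(k) * log2(P_ij(k) / (P_i * P_j)), which yields a 16-element profile per genome. The names `bbc_vector`, `euclidean` and the parameter `max_gap` are hypothetical, and the Euclidean distance between profiles is likewise an assumption.

```python
import math
from itertools import product

BASES = "ACGT"

def bbc_vector(seq, max_gap=3):
    """Return a 16-element base-base correlation profile (assumed formulation)."""
    seq = [b for b in seq.upper() if b in BASES]  # ignore ambiguous symbols
    n = len(seq)
    if n == 0:
        raise ValueError("sequence contains no A/C/G/T symbols")
    freq = {b: seq.count(b) / n for b in BASES}
    profile = {pair: 0.0 for pair in product(BASES, repeat=2)}
    for gap in range(1, max_gap + 1):
        # Count ordered pairs of bases that sit `gap` positions apart.
        counts = {}
        for a, b in zip(seq, seq[gap:]):
            counts[(a, b)] = counts.get((a, b), 0) + 1
        total = n - gap
        for (a, b), c in counts.items():
            p_ab = c / total
            profile[(a, b)] += p_ab * math.log2(p_ab / (freq[a] * freq[b]))
    return [profile[pair] for pair in product(BASES, repeat=2)]

def euclidean(u, v):
    """Distance between two profiles; the choice of metric is an assumption."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

A pairwise distance matrix built this way could then be passed to any distance-based tree method; no alignment step is involved.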

2.
This research focuses on the analysis of genome sequences. Based on the inter-nucleotide distance sequence, we propose the conditional multinomial distribution profile for the complete genomic sequence. These profiles can be used to define a very simple, computationally efficient, alignment-free distance measure that reflects the evolutionary relationships between genomic sequences. We use this distance measure to classify chromosomes according to species of origin and to build the phylogenetic tree of 24 complete coronavirus genome sequences. Our results demonstrate that the new method is powerful and efficient.
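
To make the idea of an inter-nucleotide distance sequence concrete, here is a small sketch. It assumes the distance is the gap between consecutive occurrences of the same nucleotide and that a profile is the per-nucleotide relative frequency of those gaps, with long gaps pooled at a cutoff. The exact conditional multinomial profile is not defined in the abstract, so `distance_profile` and `max_distance` are hypothetical.

```python
from collections import Counter

def inter_nucleotide_distances(seq):
    """For each base, list the gaps between consecutive occurrences of that base."""
    last_seen = {}
    distances = {b: [] for b in "ACGT"}
    for pos, base in enumerate(seq.upper()):
        if base not in distances:
            continue  # skip ambiguous symbols such as N
        if base in last_seen:
            distances[base].append(pos - last_seen[base])
        last_seen[base] = pos
    return distances

def distance_profile(seq, max_distance=10):
    """Relative frequency of gaps 1..max_distance per base (larger gaps are pooled)."""
    profile = {}
    for base, dists in inter_nucleotide_distances(seq).items():
        counts = Counter(min(d, max_distance) for d in dists)
        total = len(dists)
        profile[base] = [counts[d] / total if total else 0.0
                         for d in range(1, max_distance + 1)]
    return profile
```

Two genomes could then be compared by any distance between their profiles, which is where the alignment-free property comes from.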

3.
With the development of genome sequencing, more whole genomes of microorganisms have been completed, and many methods have been introduced to reconstruct the phylogenetic tree of those microorganisms from information extracted from the whole genomes, through various ways of transforming or mapping the whole genome sequences into other forms that describe evolutionary distance in a new way. We think it possible that information buried in the whole genome is transferred along lineages, remains stable, and is more essential than the sequence conservation of individual genes or the arrangement of a selected set of genes. We need to find one measurement that can involve as many phylogenetic features as possible that are beyond the genome sequence itself. We converted each genome sequence of the microorganisms into another linear sequence representing the functional structure of the sequence, used a new information function to calculate the discrepancy between sequences and obtain a distance matrix of the genomes, and built a phylogenetic tree with the neighbor-joining method. The resulting tree shows that the major lineages are consistent with the result based on their 16S rRNA sequences. Our method discovered one phylogenetic feature, derived from the genome sequences and the encoded genes, that can rebuild the phylogenetic tree correctly. Mapping a genome sequence to a new form that represents the relative positions of its functional genes provides a new way to measure phylogenetic relationships, and with a more specific classification of gene functions the result could be more sensitive.

4.
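This paper analyzes the evolutionary origin of the novel coronavirus (SARS-CoV-2) and mutations in the spike protein (S) gene. Whole-genome sequences and S gene sequences of related viruses were downloaded from the GenBank database, and bioinformatics software such as DNAMAN 9.0 and MEGA X was used to perform multiple sequence alignment, construct phylogenetic trees, and tally mutations at S gene sites. The results suggest that SARS-CoV-2...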

5.
DoriC: a database of oriC regions in bacterial genomes (total citations: 1; self-citations: 0; citations by others: 1)
Replication origins (oriCs) of bacterial genomes currently available in GenBank have been predicted by using a systematic method comprising the Z-curve analysis for nucleotide distribution asymmetry, DnaA box distribution, genes adjacent to candidate oriCs and phylogenetic relationships. These oriCs are organized into a MySQL database, DoriC, which provides extensive information and graphical views of the oriC regions. In addition, users can BLAST a query sequence or even a whole genome against DoriC to find a homologous one. DoriC will be updated regularly; the latest version is DoriC 1.8, in which oriCs of 425 genomes (468 chromosomes) are identified. AVAILABILITY: DoriC can be accessed from http://tubic.tju.edu.cn/doric/. SUPPLEMENTARY INFORMATION: Supplementary data are available at http://tubic.tju.edu.cn/doric/supplementary.htm.

6.
The E protein is a multifunctional membrane protein of SARS-CoV (total citations: 1; self-citations: 0; citations by others: 1)
The E (envelope) protein is the smallest structural protein in all coronaviruses and is the only viral structural protein in which no variation has been detected. We conducted genome sequencing and phylogenetic analyses of SARS-CoV. Based on genome sequencing, we predicted that the E protein is a transmembrane (TM) protein characterized by a TM region with strong hydrophobicity and α-helix conformation. We identified a segment (NH2-L-Cys-A-Y-Cys-Cys-N-COOH) in the carboxyl-terminal region of the E protein that appears to form three disulfide bonds with another segment of corresponding cysteines in the carboxyl-terminus of the S (spike) protein. These bonds point to a possible structural association between the E and S proteins. Our phylogenetic analyses of the E protein sequences in all published coronaviruses place SARS-CoV in an independent group in Coronaviridae and suggest a non-human animal origin.

7.
The Z-curve is a three-dimensional curve that constitutes a unique representation of a DNA sequence, i.e., both the Z-curve and the given DNA sequence can be uniquely reconstructed from the other. We employed Z-curve analysis to identify one replication origin in the Methanocaldococcus jannaschii genome, two replication origins in the Halobacterium species NRC-1 genome and one replication origin in the Methanosarcina mazei genome. One of the predicted replication origins of Halobacterium species NRC-1 is the same as a replication origin later identified by in vivo experiments. The Z-curve analysis of the Sulfolobus solfataricus P2 genome suggested the existence of three replication origins, which is also consistent with later experimental results. This review aims to summarize applications of the Z-curve in identifying replication origins of archaeal genomes, and to provide clues about the locations of as yet unidentified replication origins of the Aeropyrum pernix K1, Methanococcus maripaludis S2, Picrophilus torridus DSM 9790 and Pyrobaculum aerophilum str. IM2 genomes.
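
The uniqueness claim above can be illustrated with a short sketch. It uses the commonly cited Z-curve coordinates, where at position n the components are x = (A + G) - (C + T), y = (A + C) - (G + T) and z = (A + T) - (C + G), computed from cumulative base counts. The function names are hypothetical, and the abstract itself does not spell out these formulas.

```python
# Per-base step in (x, y, z): purine/pyrimidine, amino/keto, weak/strong H-bond.
STEP = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
BASE_OF_STEP = {step: base for base, step in STEP.items()}

def z_curve(seq):
    """Return the list of cumulative (x, y, z) points for a DNA sequence."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEP[base]
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points

def sequence_from_z_curve(points):
    """Invert z_curve: each step between consecutive points identifies one base."""
    previous = (0, 0, 0)
    bases = []
    for point in points:
        step = tuple(c - p for c, p in zip(point, previous))
        bases.append(BASE_OF_STEP[step])
        previous = point
    return "".join(bases)
```

Because the four step vectors are distinct, the round trip recovers the original sequence exactly, which is the sense in which the curve and the sequence determine each other.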

8.
The sudden appearance and potential lethality of severe acute respiratory syndrome (SARS)-associated coronavirus (SARS-CoV) in humans has resulted in a focusing of new attention on the determination of both its origins and evolution. The relationship existing between SARS-CoV and other groups of coronaviruses was determined via analyses of phylogenetic trees and comparative genomic analyses of the coronavirus genes: polymerase (Orf1ab), spike (S), envelope (E), membrane (M) and nucleocapsid (N). Although the coronaviruses are traditionally classed into 3 groups, with SARS-CoV forming a 4th group, the phylogenetic position and origins of SARS-CoV remain a matter of some controversy. Thus, we conducted extensive phylogenetic analyses of the genes common to all coronavirus groups, using the Neighbor-joining, Maximum-likelihood, and Bayesian methods. Our data evidenced largely identical topology for all of the obtained phylogenetic trees, thus supporting the hypothesis that the relationship existing between SARS-CoV and group 2 coronavirus is a monophyletic one. Additional comparative genomic studies, including sequence similarity and protein secondary structure analyses, suggested that SARS-CoV may bear a closer relationship with group 2 than with the other coronavirus groups. Although our data strongly suggest that group 2 coronaviruses are most closely related with SARS-CoV, further and more detailed analyses may provide us with an increased amount of information regarding the origins and evolution of the coronaviruses, most notably SARS-CoV.

9.
Use of whole genome sequence data to infer baculovirus phylogeny (total citations: 18; self-citations: 0; citations by others: 18)
Several phylogenetic methods based on whole genome sequence data were evaluated using data from nine complete baculovirus genomes. The utility of three independent character sets was assessed. The first data set comprised the sequences of the 63 genes common to these viruses. The second set of characters was based on gene order, and phylogenies were inferred using both breakpoint distance analysis and a novel method developed here, termed neighbor pair analysis. The third set recorded gene content by scoring gene presence or absence in each genome. All three data sets yielded phylogenies supporting the separation of the Nucleopolyhedrovirus (NPV) and Granulovirus (GV) genera, the division of the NPVs into groups I and II, and species relationships within group I NPVs. Generation of phylogenies based on the combined sequences of all 63 shared genes proved to be the most effective approach to resolving the relationships among the group II NPVs and the GVs. The history of gene acquisitions and losses that have accompanied baculovirus diversification was visualized by mapping the gene content data onto the phylogenetic tree. This analysis highlighted the fluid nature of baculovirus genomes, with evidence of frequent genome rearrangements and multiple gene content changes during their evolution. Of more than 416 genes identified in the genomes analyzed, only 63 are present in all nine genomes, and 200 genes are found only in a single genome. Despite this fluidity, the whole genome-based methods we describe are sufficiently powerful to recover the underlying phylogeny of the viruses.
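
The gene content character set described above (presence or absence of each gene per genome) can be sketched as follows. The genome and gene names are toy placeholders rather than data from the study, and the helper names are hypothetical.

```python
def presence_matrix(genomes):
    """Score each gene as 1 (present) or 0 (absent) in every genome."""
    all_genes = sorted(set().union(*genomes.values()))
    matrix = {name: [int(gene in genes) for gene in all_genes]
              for name, genes in genomes.items()}
    return all_genes, matrix

def core_genes(genomes):
    """Genes present in every genome."""
    return set.intersection(*(set(genes) for genes in genomes.values()))

def singleton_genes(genomes):
    """Genes found in exactly one genome."""
    all_genes, matrix = presence_matrix(genomes)
    totals = [sum(column) for column in zip(*matrix.values())]
    return {gene for gene, total in zip(all_genes, totals) if total == 1}

# Toy input: three hypothetical genomes with hypothetical gene identifiers.
toy = {
    "virus1": {"g1", "g2", "g3"},
    "virus2": {"g1", "g2", "g4"},
    "virus3": {"g1", "g3"},
}
```

On the real data set, the same bookkeeping would correspond to the 63 genes shared by all nine genomes and the 200 genes found in only one genome.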

10.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of datapoints and enable comparisons from diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.

11.
Comparative analysis of the genomes of SARS-CoV and other coronaviruses (total citations: 7; self-citations: 0; citations by others: 7)
Abstract: Comparative analysis of genomes within and between virus species can yield much information about the origin and evolution of viruses. Analysis of intraspecific genomic variation in 17 SARS-CoV strains found 137 variable sites in total, and the mutation rate of SARS-CoV was estimated at 8.04×10⁻³ nucleotide substitutions/site/year. The variable sites are unevenly distributed along the genome: the region encoding the S1 protein has the most, whereas the region encoding the RNA-dependent RNA polymerase has almost none. The bias in nucleotide and amino acid substitutions suggests that the variation may not arise from random drift alone. Comparison of genome structures among coronavirus species shows that the genome structure of SARS-CoV is very similar to that of IBV, while phylogenetic analysis of conserved genes indicates that SARS-CoV is a new branch of the coronaviruses and is evolutionarily closer to the serogroup 2 coronaviruses. Analysis of certain other molecular features shows that SARS-CoV resembles different coronavirus groups in different respects. Further analysis of motifs and transmembrane regions in non-conserved open reading frames (ORFs) shows that the non-conserved ORFs located between the S and E genes in each coronavirus group may be homologous but are not strictly essential, whereas the ORFs located between the M and N genes in the IBV and SARS-CoV genomes may not be homologous. Taking together the evolutionary relationships between SARS-CoV and the three coronavirus serogroups, host distribution, and the evolutionary relationship of the s2m element in SARS-CoV and IBV, we infer that SARS-CoV may have originated in birds.

12.
The coronavirus replicase gene encodes one or two papain-like proteases (termed PL1pro and PL2pro) implicated in the N-terminal processing of the replicase polyprotein and thus contributing to the formation of the viral replicase complex that mediates genome replication. Using consensus fold recognition with the 3D-JURY meta-predictor followed by model building and refinement, we developed a structural model for the single PLpro present in the severe acute respiratory syndrome coronavirus (SCoV) genome, based on significant structural relationships to the catalytic core domain of HAUSP, a ubiquitin-specific protease (USP). By combining the SCoV PLpro model with comparative sequence analyses we show that all currently known coronaviral PLpros can be classified into two groups according to their binding site architectures. One group includes all PL2pros and some of the PL1pros, which are characterized by a restricted USP-like binding site. This group is designated the R-group. The remaining PL1pros from some of the coronaviruses form the other group, featuring a more open papain-like binding site, and is referred to as the O-group. This two-group, binding site-based classification is consistent with experimental data accumulated to date for the specificity of PLpro-mediated polyprotein processing and PLpro inhibition. It also provides an independent evaluation of the similarity-based annotation of PLpro-mediated cleavage sites, as well as a basis for comparison with previous groupings based on phylogenetic analyses.

13.
Zhang YJ, Ma PF, Li DZ. PLoS ONE, 2011, 6(5): e20596

Background

Bambusoideae is the only subfamily that contains woody members in the grass family, Poaceae. In phylogenetic analyses, Bambusoideae, Pooideae and Ehrhartoideae formed the BEP clade, yet the internal relationships of this clade are controversial. The distinctive life history (infrequent flowering and predominance of asexual reproduction) of woody bamboos makes them an interesting but taxonomically difficult group. Phylogenetic analyses based on large DNA fragments could only provide a moderate resolution of woody bamboo relationships, although a robust phylogenetic tree is needed to elucidate their evolutionary history. Phylogenomics is an alternative choice for resolving difficult phylogenies.

Methodology/Principal Findings

Here we present the complete nucleotide sequences of six woody bamboo chloroplast (cp) genomes using Illumina sequencing. These genomes are similar to those of other grasses and rather conservative in evolution. We constructed a phylogeny of Poaceae from 24 complete cp genomes including 21 grass species. Within the BEP clade, we found strong support for a sister relationship between Bambusoideae and Pooideae. In a substantial improvement over prior studies, all six nodes within Bambusoideae were supported with ≥0.95 posterior probability from Bayesian inference and 5/6 nodes resolved with 100% bootstrap support in maximum parsimony and maximum likelihood analyses. We found that repeats in the cp genome could provide phylogenetic information, while caution is needed when using indels in phylogenetic analyses based on few selected genes. We also identified relatively rapidly evolving cp genome regions that have the potential to be used for further phylogenetic study in Bambusoideae.

Conclusions/Significance

The cp genome of Bambusoideae evolved slowly, and phylogenomics based on whole cp genome could be used to resolve major relationships within the subfamily. The difficulty in resolving the diversification among three clades of temperate woody bamboos, even with complete cp genome sequences, suggests that these lineages may have diverged very rapidly.

14.
We determined the complete mitochondrial genome sequence of Rhigonema thysanophora, the first representative of Rhigonematomorpha, and used this sequence along with 57 other nematode species for phylogenetic analyses. The R. thysanophora mtDNA is 15 015 bp and identical to all other chromadorean nematode mtDNAs published to date in that it contains 36 genes (lacking atp8) encoded in the same direction. Phylogenetic analyses of nucleotide and amino acid sequence data for the 12 protein‐coding genes recovered Rhigonematomorpha as the sister group to the heterakoid species, Ascaridia columbae (Ascaridomorpha). The organization of R. thysanophora mtDNA resembles the most common pattern for the Rhabditomorpha+Ascaridomorpha+Diplogasteromorpha clade in gene order, but with some substantial gene rearrangements. This similarity in gene order is in agreement with the sequence‐based analyses that indicate a close relationship between Rhigonematomorpha and Rhabditomorpha+Ascaridomorpha+Diplogasteromorpha. These results are consistent with certain analyses of nuclear SSU rDNA for R. thysanophora and some earlier classification systems that asserted phylogenetic affinity between Rhigonematomorpha and Ascaridomorpha, but inconsistent with morphology‐based phylogenetic hypotheses that suggested a close (taxonomic) relationship between rhigonematomorphs and oxyuridomorphs (pinworms). These observations must be tempered by noting that few rhigonematomorph species have been sequenced and included in phylogenetic analyses, and preliminary studies based on SSU rDNA suggest the group is not monophyletic. Additional mitochondrial genome sequences of rhigonematids are needed to characterize their phylogenetic relationships within Chromadorea, and to increase understanding of mitochondrial genome evolution.

15.
Deng M, Yu C, Liang Q, He RL, Yau SS. PLoS ONE, 2011, 6(3): e17293

Background

Most existing methods for phylogenetic analysis involve developing an evolutionary model and then using some type of computational algorithm to perform multiple sequence alignment. There are two problems with this approach: (1) different evolutionary models can lead to different results, and (2) the computation time required for multiple alignments makes it impossible to analyse the phylogeny of a whole genome. This motivates us to create a new approach to characterize genetic sequences.

Methodology

To each DNA sequence, we associate a natural vector based on the distributions of nucleotides. This produces a one-to-one correspondence between the DNA sequence and its natural vector. We define the distance between two DNA sequences to be the distance between their associated natural vectors. This creates a genome space with a biological distance, which makes global comparison of genomes with the same topology possible. We use our proposed method to analyze the genomes of the new influenza A (H1N1) virus, human rhinoviruses (HRV) and mammalian mitochondria. The results show that a triple-reassortant swine virus circulating in North America and the Eurasian swine virus belong to the lineage of the influenza A (H1N1) virus. For the HRV and mammalian mitochondrial genomes, the results coincide with biologists' analyses.
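
A rough sketch of a natural vector may help here. The abstract does not give the formula, so this assumes a common 12-component form: for each of A, C, G, T, the count, the mean 1-based position, and a second central moment of the positions scaled by (count × sequence length). That scaling, the component order, and the function names are assumptions, and a vector truncated at the second moment is not guaranteed to be one-to-one.

```python
import math

def natural_vector(seq):
    """Return [counts..., mean positions..., scaled second moments...] for A, C, G, T."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, b in enumerate(seq) if b == base]  # 1-based
        n_k = len(positions)
        mu = sum(positions) / n_k if n_k else 0.0
        # Assumed normalization: divide by (n_k * n) so values stay comparable
        # across sequences of different lengths.
        d2 = sum((p - mu) ** 2 for p in positions) / (n_k * n) if n_k else 0.0
        counts.append(n_k)
        means.append(mu)
        moments.append(d2)
    return counts + means + moments

def vector_distance(u, v):
    """Euclidean distance between two natural vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))
```

Since each genome maps to a fixed-length vector independently of the others, adding a new genome only requires computing one more vector, with no realignment.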

Conclusions

Our approach provides a powerful new tool for analyzing and annotating genomes and their phylogenetic relationships. Whole or partial genomes can be handled more easily and more quickly than using multiple alignment methods. Once a genome space has been constructed, it can be stored in a database. There is no need to reconstruct the genome space for subsequent applications, whereas in multiple alignment methods, realignment is needed to add new sequences. Furthermore, one can make a global comparison of all genomes simultaneously, which no other existing method can achieve.

16.
Comparative studies of chondrocranial morphology in larval anurans are typically qualitative in nature, focusing primarily on discrete variation or gross differences in the size or shape of individual structures. Detailed data on chondrocranial allometry are currently limited to only two species, Rana sylvatica and Bufo americanus. This study uses geometric morphometric and multivariate statistical analyses to examine interspecific variation in both larval chondrocranial shape and patterns of ontogenetic allometry among six species of Rana. Variation is interpreted within the context of hypothesized phylogenetic relationships among these species. Canonical variates analyses of geometric morphometric datasets indicate that species can be clearly discriminated based on chondrocranial shape, even when whole ontogenies are included in the analysis. Ordinations and cluster analyses based on chondrocranial shape data indicate the presence of three primary groupings (R. sylvatica; R. catesbeiana + R. clamitans; and R. palustris + R. pipiens + R. sphenocephala), and patterns of similarity closely reflect phylogenetic relationships. Analysis of chondrocranial allometry reveals that some patterns are conserved across all species (e.g., most measurements scale with negative allometry, those associated with the posterior palatoquadrate tend to scale with isometry or positive allometry). Ontogenetic scaling along similar allometric trajectories, lateral transpositions of individual trajectories, and variable allometric relationships all contribute to shape differences among species. Overall patterns of similarity among ontogenetic trajectories also strongly reflect phylogenetic relationships. Thus, this study demonstrates a tight link between ontogeny, phylogeny, and morphology, and highlights the importance of including both ontogenetic and phylogenetic data in studies of chondrocranial evolution in larval anurans.

17.
Phylogenetic trees have been constructed for a wide range of organisms using gene sequence information, especially through the identification of orthologous genes that have been vertically inherited. The number of available complete genome sequences is rapidly increasing, and many tools for construction of genome trees based on whole genome sequences have been proposed. However, a reasonable method of using complete genome sequences for construction of phylogenetic trees has not yet been established. We have developed a method for construction of phylogenetic trees based on the average sequence similarities of whole genome sequences. We used this method to examine the phylogeny of 115 photosynthetic prokaryotes, i.e., cyanobacteria, Chlorobi, proteobacteria, Chloroflexi, Firmicutes and nonphotosynthetic organisms including Archaea. Although the bootstrap values for the branching order of phyla were low, probably due to lateral gene transfer and saturated mutation, the obtained tree was largely consistent with the previously reported phylogenetic trees, indicating that this method is a robust alternative to traditional phylogenetic methods.

18.
It is at present difficult to accurately position gaps in sequence alignment and to determine substructural homology in structure alignment when reconstructing phylogenies based on highly divergent sequences. Therefore, we have developed a new strategy for inferring phylogenies based on highly divergent sequences. In this new strategy, the whole secondary structure presented as a string in bracket notation is used as phylogenetic characters to infer phylogenetic relationships. It is no longer necessary to decompose the secondary structure into homologous substructural components. In this study, reliable phylogenetic relationships of eight species in Pectinidae were inferred from the structure alignment, but not from sequence alignment, even with the aid of structural information. The results suggest that this new strategy should be useful for inferring phylogenetic relationships based on highly divergent sequences. Moreover, the structural evolution of ITS1 in Pectinidae was also investigated. The whole ITS1 structure could be divided into four structural domains. Compensatory changes were found in all four structural domains. Structural motifs in these domains were identified further. These motifs, especially those in D2 and D3, may have important functions in the maturation of rRNAs.
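
As a small illustration of treating a bracket-notation structure string as character data, the sketch below parses a dot-bracket string into base pairs and counts column-wise differences between two already aligned structure strings. The mismatch count is only a stand-in, since the abstract does not say how the structure characters were scored, and both function names are hypothetical.

```python
def base_pairs(structure):
    """Parse dot-bracket notation into a sorted list of (open, close) index pairs."""
    stack = []
    pairs = []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {i}")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected symbol {ch!r} at position {i}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)

def structure_mismatches(a, b):
    """Count columns where two aligned, equal-length structure strings differ."""
    if len(a) != len(b):
        raise ValueError("structure strings must be aligned to equal length")
    return sum(1 for x, y in zip(a, b) if x != y)
```

Each column of the aligned strings then plays the role that an aligned nucleotide column plays in a sequence-based analysis.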

19.
Simple sequence repeats (SSR) and their flanking regions in the mitochondrial and chloroplast genomes were sequenced in order to reveal DNA sequence variation. This information was used to gain new insights into phylogenetic relationships among species in the genus Oryza. Seven mitochondrial and five chloroplast SSR loci equal to or longer than ten mononucleotide repeats were chosen from known rice mitochondrial and chloroplast genome sequences. A total of 50 accessions of Oryza that represented six different diploid genomes and three different allopolyploid genomes of Oryza species were analyzed. Many base substitutions and deletions/insertions were identified in the SSR loci as well as their flanking regions. Of mononucleotide SSR, G (or C) repeats were more variable than A (or T) repeats. Results obtained by chloroplast and mitochondrial SSR analyses showed similar phylogenetic relationships among species, although chloroplast SSR were more informative because of their higher sequence diversity. The CC genome is suggested to be the maternal parent for the two BBCC genome species (O. punctata and O. minuta) and the CCDD species O. latifolia, based on the high level of sequence conservation between the diploid CC genome species and these allotetraploid species. This is the first report of phylogenetic analysis among plant species, based on mitochondrial and chloroplast SSR and their flanking sequences.

20.
Molecular phylogenetics has benefited tremendously from the advent of next‐generation sequencing, enabling quick and cost‐effective recovery of whole mitogenomes via an approach referred to as ‘genome skimming’. Recently, genome skimming has been utilised to recover highly repetitive nuclear genes such as 18S and 28S ribosomal RNA genes that are useful for inferring deeper evolutionary relationships. To address some outstanding issues in the relationships among Northern Hemisphere freshwater crayfish (Astacoidea), we sequenced the partial genome of crayfish species from Asian, North American and European genera and report the successful recovery of whole mitogenome sequences in addition to three highly repetitive nuclear genes, namely histone H3, 18S and 28S ribosomal RNA. Consistent with some previous studies using short mtDNA and nuclear gene fragments, phylogenetic analyses based on the concatenation of recovered mitochondrial and/or nuclear sequences recovered the Asian cambarid lineage as basal to all astacids and North American cambarids, which conflicts with the current taxonomic classification based on morphological and reproduction‐related characters. Lastly, we show that complete H3, 18S and 28S ribosomal RNA genes can also be consistently recovered from a diverse range of animal taxa, demonstrating the potential wide utility of genome skimming for nuclear markers.
