Similar articles
20 similar articles found
1.
Abstract

Molecular sequence data have become prominent tools for inferring phylogenetic relationships, and are particularly useful in the analysis of highly diverse taxonomic orders. Ribosomal RNA sequences provide markers that can be used in the study of phylogeny, because their function and structure have been conserved to a large extent throughout the evolutionary history of organisms. These sequences are inferred from cloned or enzymatically amplified gene sequences, or determined by direct RNA sequencing. The first step in the phylogenetic interpretation of nucleic acid sequence variations is the proper alignment of corresponding sequences from various organisms. In the case of ribosomal RNAs, the best alignment based on similarity criteria is greatly reinforced by secondary structure homologies. Distance matrix methods for inferring evolutionary trees are based on the assumption that the phylogenetic distance between each pair of organisms is proportional to the number of nucleotide substitution events. Computed tree inference methods usually take into consideration the possibility of unequal mutation rates among lineages. Divergence times can be estimated on the tree, provided that at least one lineage has been dated by fossil records. We have used this approach, based on ribosomal RNA sequence comparison, to investigate the phylogenetic relationship between dinoflagellates and other eukaryotic protists, and to refine controversial phylogenies of the class Dinophyceae.

2.
Despite the ecological importance of marine pico-size eukaryotes, the study of their in situ diversity using molecular tools started just a few years ago. These studies have revealed that marine picoeukaryotes are very diverse and include many novel taxa. However, the amount and structure of their phylogenetic diversity and the extent of their sequence novelty remain poorly known, as a systematic analysis has seldom been attempted. In this study, we use a coherent and carefully curated data set of 500 published 18S ribosomal DNA sequences to quantify the diversity and novelty patterns of picoeukaryotes in the Indian Ocean. Our phylogenetic tree showed many distant lineages. We grouped sequences into OTUs (operational taxonomic units) at discrete values delineated by pair-wise Jukes–Cantor (JC) distances and tree patristic distances. At a distance of 0.01, the number of OTUs observed (237/242; using JC or patristic distances, respectively) was half the number of sequences analyzed, indicating the existence of microdiverse clusters of highly related sequences. At this distance level, we estimated 600–800 OTUs using several statistical methods. The number of OTUs observed was still substantial at higher distances (39/82 at 0.20 distance), suggesting a large diversity at high taxonomic ranks. Most sequences were related to marine clones from other sites and many were distant from cultured organisms, highlighting the huge culturing gap within protists. The novelty analysis indicated the putative presence of pseudogenes and of truly novel high-rank phylogenetic lineages. The identified diversity and novelty patterns among marine picoeukaryotes are of great importance for understanding and interpreting their ecology and evolution.
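The pair-wise Jukes–Cantor distance used above to delineate OTUs can be illustrated with a minimal sketch. It assumes the standard JC69 correction, d = -(3/4) ln(1 - 4p/3), where p is the proportion of differing sites between two aligned sequences; the function name and the choice to skip sites with gaps or ambiguous bases are illustrative assumptions, not details taken from the study.

```python
import math


def jukes_cantor_distance(seq_a, seq_b):
    """JC69 distance between two aligned nucleotide sequences.

    Assumption: sites where either sequence has a gap or an ambiguous
    base are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid
    ]
    if not pairs:
        raise ValueError("no comparable sites")
    # p is the observed proportion of differing sites.
    p = sum(a != b for a, b in pairs) / len(pairs)
    if p >= 0.75:
        # The correction is undefined at saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

Under this sketch, two sequences would fall into the same OTU at the 0.01 level when their distance does not exceed 0.01; how ties and linkage were handled in the study is not stated in the abstract.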

3.
Phylogenetic tree reconstruction requires construction of a multiple sequence alignment (MSA) from sequences. Computationally, it is difficult to achieve an optimal MSA for many sequences. Moreover, even if an optimal MSA is obtained, it may not be the true MSA that reflects the evolutionary history of the underlying sequences. Therefore, errors can be introduced during MSA construction, which in turn affect the subsequent phylogenetic tree construction. In order to circumvent this issue, we extend the application of the k-tuple distance to phylogenetic tree reconstruction. The k-tuple distance between two sequences is the sum of the differences in frequency, over all possible tuples of length k, between the sequences and can be estimated without MSAs. It has traditionally been used to build a fast ‘guide tree’ to assist the construction of MSAs. Using 1470 simulated sets of sequences generated under different evolutionary scenarios, together with neighbor-joining and BioNJ trees, we compared the performance of the k-tuple distance with four commonly used distance estimators: Jukes–Cantor, Kimura, F84 and Tamura–Nei. These four distance estimators fall into the category of model-based distance estimators, as each of them takes account of a specific substitution model in order to compute the distance between a pair of already aligned sequences. Results show that trees constructed from the k-tuple distance are more accurate than those from the other distances most of the time; when the divergence between the underlying sequences is high, the tree accuracy with the k-tuple distance can be twice as high or more compared with the other estimators. Furthermore, as the k-tuple distance avoids the need to construct an MSA, it can save a tremendous amount of time in phylogenetic tree reconstruction when the data include a large number of sequences.
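The k-tuple distance described above can be sketched in a few lines. This is a minimal illustration, assuming that "difference in frequency" means the absolute difference of relative k-tuple frequencies; the abstract does not say whether absolute or squared differences, or raw counts, are used, so that choice and the function names are assumptions.

```python
from collections import Counter


def ktuple_frequencies(seq, k):
    """Relative frequency of every overlapping k-tuple in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        raise ValueError("sequence is shorter than k")
    counts = Counter(seq[i:i + k] for i in range(total))
    return {tup: n / total for tup, n in counts.items()}


def ktuple_distance(seq_a, seq_b, k=3):
    """Sum over k-tuples of the absolute frequency difference.

    No alignment is needed: the two sequences may differ in length.
    """
    freq_a = ktuple_frequencies(seq_a, k)
    freq_b = ktuple_frequencies(seq_b, k)
    tuples = set(freq_a) | set(freq_b)
    return sum(abs(freq_a.get(t, 0.0) - freq_b.get(t, 0.0)) for t in tuples)
```

A matrix of such distances over all sequence pairs could then be passed to any neighbor-joining implementation in place of alignment-based distances.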

4.
5.
From the DNA sequences for N taxa, the (generally unknown) phylogenetic tree T that gave rise to them is to be reconstructed. Various methods give rise, for each quartet J consisting of exactly four taxa, to a predicted tree L(J) based only on the sequences in J, and these are then used to reconstruct T. The author defines an "error-correcting map" (Ec), which replaces each L(J) with a new tree, Ec(L)(J), which has been corrected using other trees, L(K), in the list L. The "quartet distance" between two trees is defined as the number of quartets J on which the two trees differ, and two distinct trees are shown to always have a quartet distance of at least N - 3. If L has quartet distance at most (N - 4)/2 from T, then Ec(L) will coincide with the correct list for T; and this result cannot be improved. In general, Ec can correct many more errors in L. Iteration of the map Ec may produce still more accurate lists. Simulations are reported which often show improvement even when the quartet distance considerably exceeds (N - 4)/2. Moreover, the Buneman tree for Ec(L) is shown to refine the Buneman tree for L, so that strongly supported edges for L remain strongly supported for Ec(L). Simulations show that if methods such as the C-tree or hypercleaning are applied to Ec(L), the resulting trees often have more resolution than when the methods are applied only to L.
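The quartet distance defined above can be illustrated with a small sketch. It assumes that each quartet tree L(J) is predicted from pairwise distances with the four-point rule (the pairing with the smallest sum of within-pair distances is taken as the split); this is only one of the "various methods" the abstract alludes to, and the error-correcting map Ec itself is not implemented here. The dictionary layout and function names are hypothetical.

```python
from itertools import combinations


def quartet_topology(dist, quartet):
    """Predict the split of a four-taxon set by the four-point rule.

    dist maps a sorted pair of taxon names (x, y), x < y, to a distance.
    Ties are broken arbitrarily in favor of the first pairing.
    """
    a, b, c, d = sorted(quartet)
    pairings = [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]
    best = min(pairings, key=lambda p: dist[p[0]] + dist[p[1]])
    return frozenset(frozenset(pair) for pair in best)


def quartet_list(dist, taxa):
    """The list L: one predicted split for every four-taxon subset."""
    return {
        frozenset(q): quartet_topology(dist, q)
        for q in combinations(sorted(taxa), 4)
    }


def quartet_distance(list_a, list_b):
    """Number of quartets on which the two lists disagree."""
    return sum(list_a[q] != list_b[q] for q in list_a)
```

For N = 5 taxa, two distinct fully resolved trees that differ by exchanging two neighboring leaves disagree on exactly two quartets, which matches the lower bound N - 3 quoted in the abstract.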

6.
Comparative and phylogenetic analysis of developmental sequences (total citations: 3; self-citations: 0; citations by others: 3)
Event pairing has been proposed for the optimization of developmental sequences (event sequences) on a given phylogenetic hypothesis (cladogram) to determine instances of sequence heterochrony. Here, we show that event pairing is faulty, leading to the optimization of impossible hypothetical ancestors, the underestimation of the lengths of the developmental sequences on the tree, and the proposition of synapomorphies that are not supported by the data. When used for phylogenetic analysis, event pairing can even produce cladograms that are inconsistent with the data. These errors are caused by the fact that event pairing treats dependent features as if they were independent. We present a new method for comparative and phylogenetic analysis of developmental sequences that does not exhibit these errors. Our method applies Search-based character optimization and treats the entire developmental sequence as a single character that is then analyzed by using an edit cost function, which specifies the transformation cost between pairs of observed and unobserved character states, and dynamic programming. In other words, the developmental sequence is directly optimized on the tree. We used event pairing as an edit cost function, but others are possible.

7.
We address the phylogenetic relationships of the drongos (Dicruridae) at the species level using sequences from two nuclear (myoglobin intron-2 and c-mos) and two mitochondrial (ND2 and cytochrome b) loci. The resulting phylogenetic tree shows that the most basal species is D. aeneus, followed in the tree by a trichotomy including (1) the Asian D. remifer, (2) a clade of all African and Indian Ocean island species as well as two Asian species (D. macrocercus and D. leucophaeus) and (3) a clade that includes all other Asian species as well as two Australasian species (D. megarhynchus and D. bracteatus). Our phylogenetic hypotheses are compared with the hypothetical family "tree" of Mayr and Vaurie [Mayr, E., Vaurie, C., 1948. Evolution of the family Dicruridae (Birds). Evolution 2, 238-265.], which is based on traditional phenotypic analysis and biogeography. We point out a general discrepancy between the so-called "primitive" or "unspecialized" species and their position in the phylogenetic tree, although our results for other species are congruent with previous hypotheses. We conduct dating analyses using a relaxed-clock method, and propose a chronology of clade formation. Particular attention is given to the drongo radiation in the Indian Ocean islands and to the extinction-invasion processes involved. The first large diversification of the family took place both in Asia and Africa at 11.9 and 13.3 Myr, respectively, followed by a dispersal event from Africa to Asia at ca. 10.6 Myr; dispersal across Wallace's Line occurred later, at ca. 6 Myr. At 5 Myr, Principe and the Indian Ocean islands were colonized from an African ancestor; the most recent colonization event concerned Anjouan, by an immigrant population from Madagascar.

8.
Phylogenetic structure analysis is a novel way to address the relative importance of stochastic and deterministic processes governing species assemblages. Here we investigate the phylogenetic structure of the vegetation of inselbergs located in the African rain forest. Inselbergs combine strong ecological gradients at the local scale, due to soil depth variation, and insular properties at the regional scale. They are therefore ideal models to assess the influence of ecological sorting and dispersal limitation on the phylogenetic structure of plant communities. On 21 inselbergs separated by up to 200 km, where five microhabitat types were recognized, 311 vegetation plots were inventoried. We found that floristic similarity between plots depended on both microhabitat differentiation and spatial distance, while phylogenetic clustering (i.e. an excess of phylogenetic similarity between species from the same plot) only appeared between plots from differentiated microhabitats and increased with ecological distance. Within a microhabitat type, the absence of phylogenetic structure between inselbergs indicates that species turnover is probably due to dispersal limitation rather than to regional-scale variations in environmental factors. Hence, phylogenetic structure analysis can help disentangle the effects of ecological sorting and dispersal limitation on species assemblages. To estimate the time-scale of the processes generating the phylogenetic structure, we investigated how lineage similarity changes with increasing age in the phylogenetic tree. High lineage similarity levels between ecologically very differentiated plots were only reached near the root of the phylogenetic tree. This was observed even when considering only plots sharing no species, and indicates that phylogenetic niche conservatism has been important for generating the observed phylogenetic structure. Hence, ancient diversification exerts an impact on the assembly of current plant communities.

9.
DNA sequences can be treated as finite-length symbol strings over a four-letter alphabet (A, C, T, G). As a universal and computable complexity measure, LZ complexity is a valid way to describe the complexity of DNA sequences. In this study, a concept of conditional LZ complexity between two sequences is proposed according to the principle of the LZ complexity measure. An LZ complexity distance metric between two nonnull sequences is defined by using conditional LZ complexity. Based on the LZ complexity distance, a phylogenetic tree of 26 species of placental mammals (Eutheria) with three outgroup species was reconstructed from their complete mitochondrial genomes. In the debate over which two of the three main groups of placental mammals, namely Primates, Ferungulates, and Rodents, are more closely related, the phylogenetic tree reconstructed from the LZ complexity distance supports the suggestion that Primates and Ferungulates are more closely related.
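The abstract above does not state its formulas, so the following is only a hedged sketch of how an LZ-complexity-based distance might be computed. It assumes the Lempel–Ziv (1976) exhaustive-history parsing for the complexity c(S), takes the conditional complexity to be c(S|Q) = c(QS) - c(Q), and normalizes with max(c(S|Q), c(Q|S)) / max(c(S), c(Q)); these definitions and all function names are assumptions made for illustration and may differ from those in the study.

```python
def lz_complexity(s):
    """Number of components in the LZ76 exhaustive-history parsing of s."""
    i, components, n = 0, 0, len(s)
    while i < n:
        length = 1
        # Extend the component while it can be copied from earlier text
        # (the copy source may overlap the component itself).
        while i + length <= n and s[i:i + length] in s[:i + length - 1]:
            length += 1
        components += 1
        i += length
    return components


def conditional_lz(s, q):
    """Assumed definition: extra components needed for s once q is known."""
    return lz_complexity(q + s) - lz_complexity(q)


def lz_distance(s, q):
    """Assumed normalized distance between two non-empty sequences."""
    if not s or not q:
        raise ValueError("sequences must be non-empty")
    numerator = max(conditional_lz(s, q), conditional_lz(q, s))
    return numerator / max(lz_complexity(s), lz_complexity(q))
```

The substring search makes this sketch quadratic or worse in sequence length, so it is suitable for short examples only; whole mitochondrial genomes would call for a suffix-structure-based parser.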

10.
The surprising fact that global statistical properties computed on a genome-wide scale may reveal species information was first observed in studies of dinucleotide frequencies. Here we look at the same phenomenon with a totally different statistical approach. We show that patterns in the short-range statistical correlations in DNA sequences serve as evolutionary fingerprints of eukaryotes. All chromosomes of a species display the same characteristic pattern, markedly different from those of other species. The chromosomes of a species are sorted onto the same branch of a phylogenetic tree due to this correlation pattern. The average correlation between nucleotides at a distance k is quantified in two independent ways: (i) by estimating it from a higher-order Markov process and (ii) by computing the mutual information function at a distance k. We show how the quality of phylogenetic reconstruction depends on the range of correlation strengths and on the length of the underlying sequence segment. This concept of the correlation pattern as a phylogenetic signature of eukaryote species combines two rather distant domains of research, namely phylogenetic analysis based on molecular observation and the study of the correlation structure of DNA sequences.
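The mutual information function at a distance k, mentioned above as approach (ii), can be sketched as follows. The sketch assumes the usual definition I(k) = sum over (a, b) of p_ab(k) log2[p_ab(k) / (p_a p_b)], with probabilities estimated by simple counting and no finite-sample bias correction; the function name and these estimation choices are illustrative assumptions rather than details from the study.

```python
import math
from collections import Counter


def mutual_information(seq, k):
    """Mutual information (in bits) between symbols k positions apart."""
    n = len(seq) - k
    if k < 1 or n <= 0:
        raise ValueError("k must be >= 1 and smaller than the sequence length")
    # Joint counts of (symbol at i, symbol at i + k) and their marginals.
    pair_counts = Counter((seq[i], seq[i + k]) for i in range(n))
    left_counts = Counter(seq[i] for i in range(n))
    right_counts = Counter(seq[i + k] for i in range(n))
    info = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = left_counts[a] / n
        p_b = right_counts[b] / n
        info += p_ab * math.log2(p_ab / (p_a * p_b))
    return info
```

Evaluating this function for k = 1, 2, ..., K yields a profile that could serve as the kind of correlation pattern compared between chromosomes; the range K used in the study is not given in the abstract.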

11.
Two leucine tRNAs from the cyanophyte Anacystis nidulans have been isolated, and their complete nucleotide sequences have been determined by combining data from oligonucleotide fingerprints and sequencing gels. The two sequences are 87 nucleotides long, have the anticodons CAA and CAG, and differ from each other at a total of 28 positions. They have been compared to other known tRNA Leu sequences and incorporated into a phylogenetic tree comprising prokaryotic and chloroplastic tRNA Leu sequences. Mutations inferred from the tree show that some parts of the tRNA molecule are highly variable (the extra arm and the acceptor stem) while others are much more conserved (the D and T arms). The topology of the tree supports the idea that blue-green algae and chloroplasts share a common prokaryotic ancestor and shows a basic divergence between XAA and XAG anticodon-containing tRNAs, suggesting that these two subfamilies result from an ancient gene duplication. Finally, comparison of this phylogenetic tree with those of other multi-isoacceptor tRNA families shows no common scheme, which may be due to independent refinement of codon-reading patterns in different tRNA families.

12.
With the development of genome sequencing, the whole genomes of more microorganisms have been completed, and many methods have been introduced to reconstruct the phylogenetic tree of those microorganisms from information extracted from the whole genomes, through various ways of transforming or mapping the whole genome sequences into other forms that can describe evolutionary distance in a new way. We think it possible that information buried in the whole genome is transferred along lineages, remains stable, and is more essential than the sequence conservation of individual genes or the arrangement of some genes of a selected set. We need to find a measurement that can involve as many phylogenetic features as possible beyond the genome sequence itself. We converted each genome sequence of the microorganisms into another linear sequence that represents the functional structure of the sequence, used a new information function to calculate the discrepancy between sequences and obtain a distance matrix of the genomes, and built a phylogenetic tree with a neighbor-joining method. The resulting tree shows that the major lineages are consistent with the result based on their 16S rRNA sequences. Our method discovered a phylogenetic feature, derived from the genome sequences and the encoded genes, that can rebuild the phylogenetic tree correctly. The mapping of a genome sequence to a new form representing the relative positions of the functional genes provides a new way to measure phylogenetic relationships, and with a more specific classification of gene functions the result could be more sensitive.

13.
Erroneous estimates of ingroup relationships can be caused by attributes of the outgroup chosen to root the tree. Phylogenetic analyses of DNA sequences frequently yield incorrect estimates of ingroup relationships when the outgroup used to "root" the tree is highly divergent from the ingroup. This is especially the case when the outgroup has a different base composition than the ingroup. Unfortunately, in many instances, alternative, less divergent outgroups are not available. In such cases, investigators must either target genes with attributes that minimize the problem (slowly evolving genes with stationary base compositions, which are often not ideal for estimating relationships among the more closely related ingroup taxa) or use inference models that are explicitly tailored to deal with an attenuated historical signal with a superimposed non-stationary base composition. In this paper we explore the problem both empirically and through simulation. For the empirical component we looked at the phylogenetic relationships among elasmobranch fishes (sharks and rays), a group whose closest living outgroup, the holocephalan ghost fishes, is separated from the elasmobranchs by more than 100 million years of evolution. We compiled a data set for analysis comprising 10 single-copy nuclear protein-coding genes (12,096 bp) for representatives of the major lineages within elasmobranchs and holocephalans. For the simulation, we used an evolutionary model on a fixed tree topology to generate DNA sequence data sets that varied both in their distance to the outgroup and in the base compositional difference between ingroup and outgroup. Results from both the empirical data set and the simulation support the idea that deviation from base compositional stationarity, in conjunction with distance from the root, can compromise the accuracy of estimated relationships within the ingroup. We tested several approaches to mitigate such problems. We found that excluding genes with overall faster rates and heterogeneous base compositions, while the least sophisticated of the methods evaluated, seemed to be the most effective.

14.
Two phylogenetic comparative methods, independent contrasts and generalized least squares models, can be used to determine the statistical relationship between two or more traits. We show that the two approaches are functionally identical and that either can be used to make statistical inferences about values at internal nodes of a phylogenetic tree (hypothetical ancestors), to estimate relationships between characters, and to predict values for unmeasured species. Regression equations derived from independent contrasts can be placed back onto the original data space, including computation of both confidence intervals and prediction intervals for new observations. Predictions for unmeasured species (including extinct forms) can be made increasingly accurate and precise as the specificity of their placement on a phylogenetic tree increases, which can greatly increase statistical power to detect, for example, deviation of a single species from an allometric prediction. We reexamine published data for basal metabolic rates (BMR) of birds and show that conventional and phylogenetic allometric equations differ significantly. In new results, we show that, as compared with nonpasserines, passerines exhibit a lower rate of evolution in both body mass and mass-corrected BMR; passerines also have significantly smaller body masses than their sister clade. These differences may justify separate, clade-specific allometric equations for prediction of avian basal metabolic rates.

15.
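Objective: To determine the complete sequence of the mitochondrial DNA control region of the Mongolian gerbil, and to characterize it and analyze its evolution. Methods: Primers were designed from known Mongolian gerbil gene sequences, and the resulting fragments were sequenced and identified by PCR-product sequencing. Together with published D-loop sequences of rodents, base composition and genetic distances were analyzed, and phylogenetic trees were constructed with the minimum-evolution and UPGMA methods. Results: The D-loop sequence of the Mongolian gerbil was obtained; its average homology with the rat, house mouse and hamster was 58%. Base composition analysis showed that the Mongolian gerbil has a base composition and base skew similar to those of other rodents, with an A-skew and G-skew of 0.0047 and -0.28, respectively. Evolutionary analysis showed that the Mongolian gerbil is genetically close to the rat (0.35), black rat (0.38) and hamster (0.39), and that the order of divergence was jerboa, cane rat, Mongolian gerbil, hamster, rat and house mouse. Conclusion: This study obtained the complete D-loop sequence of the Mongolian gerbil and established its evolutionary relationships with the hamster, rat, house mouse and other rodents, laying a foundation for research on the evolution of the Mongolian gerbil and on mitochondrial structure and function.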
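The abstract reports A-skew and G-skew values without defining them. As a hedged illustration only, the sketch below assumes they correspond to the commonly used AT-skew, (A - T)/(A + T), and GC-skew, (G - C)/(G + C), computed on one strand; this correspondence and the function name are assumptions, not statements from the study.

```python
from collections import Counter


def base_skews(seq):
    """Return (AT-skew, GC-skew) for a nucleotide sequence.

    Assumed definitions: AT-skew = (A - T) / (A + T) and
    GC-skew = (G - C) / (G + C); other symbols are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = counts["A"], counts["T"], counts["G"], counts["C"]
    if a + t == 0 or g + c == 0:
        raise ValueError("sequence needs at least one A/T and one G/C base")
    return (a - t) / (a + t), (g - c) / (g + c)
```

Under these assumed definitions, a value near zero (such as the reported 0.0047) would indicate nearly equal A and T counts, and a negative value (such as -0.28) an excess of C over G.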

16.
Members of the genus Bdellovibrio possess the unifying phenotypic trait of attacking and preying upon other Gram-negative bacteria. It has been suggested that this common lifestyle arose by convergent evolution. Physiological and G + C studies have led to the notion that bdellovibrios are a heterogeneous group of loosely related bacteria. We have inferred the phylogenetic relatedness of 12 strains of Bdellovibrio through the analysis of partial 16S ribosomal RNA sequences. Similarity and degree of homology were assessed, and a phylogenetic tree was constructed by the distance matrix method. One branch of the two-branched tree consisted of B. bacteriovorus and related strains (W, 6-5-S, 109, 109D, 109J, 114, HI Ox9-2, and HI Ox9-3). The other branch was itself branched, with B. starrii, B. stolpii, and marine strain BM4 in separate sub-branches. All Bdellovibrio strains in turn clustered with representatives of the delta division of the Proteobacteria. The results indicate that there are at least two subdivisions of the genus Bdellovibrio and that present-day bdellovibrios arose from a common ancestor. The placement of the genus Bdellovibrio within the delta division of the Proteobacteria was confirmed.

17.
This paper describes the inferential method, an approach for reconstructing protein and nucleotide sequences of ancestral species, starting from known, homologous, contemporary sequences. The method requires knowledge of the topology of the phylogenetic tree, whose nodes are the species to which the reconstructed sequences belong. The method has been tested by computer simulation of speciation and nucleotide substitutions, starting from a single ancestral sequence, and by subsequent reconstruction of nodal sequences. Results have shown that reconstructions obtained by the inferential method are affected by limited error frequencies, which (1) are proportional to the squares of nucleotide substitution rates and of internodal distances, and (2) are little influenced by non-uniformity of transformation rates of nucleotides. Furthermore, good agreement has been obtained by comparing protein-sequence reconstructions carried out with the inferential method with those obtained using the maximum parsimony method in two different cases: a reconstruction of simulated sequences and a reconstruction of mammalian ribonuclease sequences. Abbreviations used: MP, maximum parsimony method; ML, maximum likelihood method; IM, inferential method; MY, millions of years; N-tree, natural-like phylogenetic tree; E-tree, equibranched phylogenetic tree; EA, percentage number of erroneous amino acids in a reconstructed sequence; EC, percentage number of erroneous codons in a reconstructed sequence; t_n, time interval between a P- and its F-sequence. Nucleotides and amino acids are indicated by their I.U.B. codes (N.C.-I.U.B., 1985). Correspondence to: A. Di Donato

18.
Armillaria root rot is a serious disease, chiefly of woody plants, caused by many species of Armillaria that occur in temperate, tropical and subtropical regions of the world. Very little is known about Armillaria in South America and Southeast Asia, although Armillaria root rot is well known in these areas. In this study, we consider previously unidentified isolates collected from trees with symptoms of Armillaria root rot in Chile, Indonesia and Malaysia. In addition, isolates from basidiocarps resembling A. novae-zelandiae and A. limonea, originating from Chile and Argentina, respectively, were included in this study because their true identity has been uncertain. All isolates in this study were compared, on the basis of their ITS sequence similarity, with previously sequenced Armillaria species, and their phylogenetic relationship with species from the Southern Hemisphere was considered. ITS sequence data for Armillaria were also compared with those available in GenBank. Parsimony and distance analyses were conducted to determine the phylogenetic relationships between the unknown isolates and the species that showed high ITS sequence similarity. In addition, IGS-1 sequence data were obtained for some of the species to validate the trees obtained from the ITS data set. Results of this study showed that the ITS sequences of the isolates obtained from basidiocarps resembling A. novae-zelandiae are most similar to those for this species. ITS sequences for isolates from Indonesia and Malaysia had the highest similarity to A. novae-zelandiae but were phylogenetically separated from this species. Isolates from Chile, for which basidiocarps were not found, were similar in their ITS and IGS-1 sequences to the isolate from Argentina that resembled A. limonea. These isolates, however, had the highest ITS and IGS-1 sequence similarity to authentic isolates of A. luteobubalina and were phylogenetically more closely related to this species than to A. limonea.

19.
The nucleotide sequences of 280-360-bp domains of lectin genes from 20 legume species belonging to 17 genera have been determined. A computer analysis of the sequences has been performed with the LASERGENE package. Based on this analysis, we constructed the phylogenetic tree of the lectins, which reflects their phylogenetic and evolutionary relationships, and predicted the amino-acid sequences of the corresponding protein domains. Features of the structure of the carbohydrate-binding lectin domains were elucidated in some species of legume genera from the temperate climatic zone. The domains were highly variable and contained the consensus sequence AspTrePheXxxAsxXxxXxxTrpAspProXxxXxxIns/DelArgHis bearing the bulk of amino acid replacements, insertions, and deletions. An association was found between legume groups (including species from different genera and tribes) symbiotic with the same rhizobium species and the similarity between the carbohydrate-binding domains of lectins from these plants.

20.
The phenotypic behavior of a group of organisms can be studied using a range of molecular evolutionary tools that help to determine evolutionary relationships. Traditionally, a gene or a set of gene sequences was used for generating phylogenetic trees. Incomplete evolutionary information in a few selected genes causes problems in phylogenetic tree construction. Whole genomes are used as a remedy. The task now is to identify suitable parameters for extracting the hidden information from whole genome sequences that truly represents evolutionary information. In this study we explored a random anchor (a stretch of 100 nucleotides) based approach (ABWGP) for finding the distance between any two genomes, and used the distance estimates to compute evolutionary trees. A number of strains and species of Mycobacteria were used for this study. Anchor-derived parameters, such as cumulative normalized score, anchor order and indels, were computed in a pair-wise manner, and the scores were used to compute distance/phylogenetic trees. The strength of branching was determined by bootstrap analysis. The terminal branches are clearly discernible using the distance estimates described here. In general, different measures gave similar trees, except for the trees based on indels. Overall, the tree topology reflected the known biology of the organisms. This was also true for different strains of Escherichia coli. A new whole genome-based approach has been described here for studying evolutionary relationships among bacterial strains and species.
