Similar articles
 20 similar articles found
1.
A rooted tree of life provides a framework to answer central questions about the evolution of life. Here we review progress on rooting the tree of life and introduce a new root of life obtained through the analysis of indels, insertions and deletions, found within paralogous gene sets. Through the analysis of indels in eight paralogous gene sets, the root is localized to the branch between the clade consisting of the Actinobacteria and the double-membrane (Gram-negative) prokaryotes and one consisting of the archaebacteria and the firmicutes. This root provides a new perspective on the habitats of early life, including the evolution of methanogenesis, membranes and hyperthermophily, and the speciation of major prokaryotic taxa. Our analyses exclude methanogenesis as a primitive metabolism, in contrast to previous findings. They parsimoniously imply that the ether archaebacterial lipids are not primitive and that the cenancestral prokaryotic population consisted of organisms enclosed by a single, ester-linked lipid membrane, covered by a peptidoglycan layer. These results explain the similarities previously noted by others between the lipid synthesis pathways in eubacteria and archaebacteria. The new root also implies that the last common ancestor was not hyperthermophilic, although moderate thermophily cannot be excluded.

2.
Despite the availability of large molecular data sets, the position of the root of the eutherian tree remains a controversial issue. Depending on source data, taxon sampling and analytical approach, the root can be placed at either Afrotheria, Xenarthra, Afrotheria+Xenarthra, or murid rodents. We explored the phylogenetic potential of indels in four nuclear protein-coding genes (SCA1, PRNP, TNFalpha, and HspB3) with regard to a possible rooting at the murid branch. According to parsimony principles, five indels were interpreted to contradict such a rooting, and one indel to support it. The results illustrate that indels, despite the occurrence of homoplasy, can be convincing sources of independent molecular evidence to distinguish between alternative phylogenetic hypotheses.

3.
The Rooting of the Universal Tree of Life Is Not Reliable   Total citations: 19, self-citations: 0, citations by others: 19
Several composite universal trees connected by an ancestral gene duplication have been used to root the universal tree of life. In all cases, this root turned out to be in the eubacterial branch. However, the validity of results obtained from comparative sequence analysis has recently been questioned, in particular, in the case of ancient phylogenies. For example, it has been shown that several eukaryotic groups are misplaced in ribosomal RNA or elongation factor trees because of unequal rates of evolution and mutational saturation. Furthermore, the addition of new sequences to data sets has often turned apparently reasonable phylogenies into confused ones. We have thus revisited all composite protein trees that have been used to root the universal tree of life up to now (elongation factors, ATPases, tRNA synthetases, carbamoyl phosphate synthetases, signal recognition particle proteins) with updated data sets. In general, the two prokaryotic domains were not monophyletic, with several aberrant groupings at different levels of the tree. Furthermore, the respective phylogenies contradicted each other, so that various ad hoc scenarios (paralogy or lateral gene transfer) must be proposed in order to obtain the traditional Archaebacteria–Eukaryota sisterhood. More importantly, all of the markers are heavily saturated with respect to amino acid substitutions. As phylogenies inferred from saturated data sets are extremely sensitive to differences in evolutionary rates, present phylogenies used to root the universal tree of life could be biased by the phenomenon of long branch attraction. Since the eubacterial branch was always the longest one, the eubacterial rooting could be explained by an attraction between this branch and the long branch of the outgroup. 
Finally, we suggested that a eukaryotic rooting could be a more fruitful working hypothesis, as it provides, for example, a simple explanation for the high genetic similarity of Archaebacteria and Eubacteria inferred from complete genome analysis.

4.
Insertions and deletions (indels) in chloroplast noncoding regions are common genetic markers to estimate population structure and gene flow, although relatively little is known about indel evolution among recently diverged lineages such as within plant families. Because indel events tend to occur nonrandomly along DNA sequences, recurrent mutations may generate homoplasy for indel haplotypes. This is a potential problem for population studies, because indel haplotypes may be shared among populations after recurrent mutation as well as gene flow. Furthermore, indel haplotypes may differ in fitness and therefore be subject to natural selection detectable as rate heterogeneity among lineages. Such selection could contribute to the spatial patterning of cpDNA haplotypes, greatly complicating the interpretation of cpDNA population structure. This study examined both nucleotide and indel cpDNA variation and divergence at six noncoding regions (psbB-psbH, atpB-rbcL, trnL-trnH, rpl20-5'rps12, trnS-trnG, and trnH-psbA) in 16 individuals from eight species in the Lecythidaceae and a Sapotaceae outgroup. We described patterns of cpDNA changes, assessed the level of indel homoplasy, and tested for rate heterogeneity among lineages and regions. Although regression analysis of branch lengths suggested some degree of indel homoplasy among the most divergent lineages, there was little evidence for indel homoplasy within the Lecythidaceae. Likelihood ratio tests applied to the entire phylogenetic tree revealed a consistent pattern rejecting a molecular clock. Tajima's 1D and 2D tests revealed two taxa with consistent rate heterogeneity, one showing relatively more and one relatively fewer changes than other taxa. In general, nucleotide changes showed more evidence of rate heterogeneity than did indel changes. 
The rate of evolution was highly variable among the six cpDNA regions examined, with the trnS-trnG and trnH-psbA regions showing as much as 10% and 15% divergence within the Lecythidaceae. Deviations from rate homogeneity in the two taxa were constant across cpDNA regions, consistent with lineage-specific rates of evolution rather than cpDNA region-specific natural selection. There is no evidence that indels are more likely than nucleotide changes to experience homoplasy within the Lecythidaceae. These results support a neutral interpretation of cpDNA indel and nucleotide variation in population studies within species such as Corythophora alta.

5.
Insertion and deletion (indel)-based analyses have great potential for rooting the tree of life, but their use has been limited because they require ubiquitous sequences that have not been horizontally/laterally transferred. Very few such sequences exist. Here we describe and demonstrate a new algorithm that can use nonubiquitous sequences for rooting. This algorithm, top-down indel rooting, uses the traditional logical framework of indel rooting, but by considering gene gains and losses in addition to indel gains and losses, it is able to analyze incomplete data sets. The method is demonstrated using theoretical examples and incomplete gene sets. In particular, it is applied to the well-studied Hsp70/MreB indel, a sequence set thought to have been compromised by gene transfers from Firmicutes to archaebacteria. By sequentially assigning all observable character states, including gene absences, to the questionable archaebacterial Hsp70 and MreB sequences, we demonstrate that this gene set robustly excludes the root of the tree of life from the Gram-negative, double-membrane prokaryotes independently of the archaeal character states. There are very few ubiquitous paralog gene sets, and most of them contain compromised data. The ability of top-down rooting to use incomplete and/or compromised gene sets promises to make rooting analyses more robust and to greatly increase the number of useful indel sets.

6.
Insertion and deletion events (indels) provide a suite of markers with enormous potential for molecular phylogenetics. Using many more indel characters than those in previous studies, we here for the first time address the impact of indel inclusion on the phylogenetic inferences of Arctoidea (Mammalia: Carnivora). Based on 6843 indel characters from 22 nuclear intron loci of 16 species of Arctoidea, our analyses demonstrate that when indels were not taken into consideration, both a tree supporting the monophyly of Ursidae and Pinnipedia and a tree supporting the monophyly of Pinnipedia and Musteloidea were recovered, whereas including indels under three different indel coding schemes gave identical tree topologies supporting the monophyly of Ursidae and Pinnipedia. Our work brings new perspectives on the previously controversial placements among Arctoidea families, and provides another example demonstrating the importance of identifying and incorporating indels in the phylogenetic analyses of introns. In addition, comparison of indel incorporation methods revealed that the three indel coding methods are all advantageous over treating indels as missing data, given that incorporating indels produces consistent results across methods. This is the first report of the impact of different indel coding schemes on phylogenetic reconstruction at the family level in Carnivora, which indicates that indels should be taken into account in future phylogenetic analyses.

7.
8.
We are interested in detecting homologous genomic DNA sequences with the goal of locating approximate inverted, interspersed, and tandem repeats. Standard search techniques start by detecting small matching parts, called seeds, between a query sequence and database sequences. Contiguous seed models have existed for many years. Recently, spaced seeds were shown to be more sensitive than contiguous seeds without increasing the random hit rate. To determine the superiority of one seed model over another, a model of homologous sequence alignment must be chosen. Previous studies evaluating spaced and contiguous seeds have assumed that matches and mismatches occur within these alignments, but not insertions and deletions (indels). This is perhaps appropriate when searching for protein coding sequences (<5% of the human genome), but is inappropriate when looking for repeats in the majority of genomic sequence where indels are common. In this paper, we assume a model of homologous sequence alignment which includes indels and we describe a new seed model, called indel seeds, which explicitly allows indels. We present a waiting time formula for computing the sensitivity of an indel seed and show that indel seeds significantly outperform contiguous and spaced seeds when homologies include indels. We discuss the practical aspect of using indel seeds and finally we present results from a search for inverted repeats in the dog genome using both indel and spaced seeds.
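The difference between contiguous and spaced seeds mentioned in this abstract can be illustrated with a minimal Python sketch. It is not the authors' indel seed model, which the abstract does not specify in detail; the function name `seed_hits` and the "1 = must match, 0 = don't care" pattern notation are assumptions made for illustration.

```python
def seed_hits(query: str, target: str, seed: str) -> list[tuple[int, int]]:
    """Return (query_pos, target_pos) pairs where the seed pattern matches.

    Assumed notation: '1' marks a position that must match,
    '0' marks a "don't care" position. A seed made only of '1's
    is a contiguous seed; any '0' makes it a spaced seed.
    """
    span = len(seed)
    care = [k for k, c in enumerate(seed) if c == "1"]
    hits = []
    for i in range(len(query) - span + 1):
        for j in range(len(target) - span + 1):
            # Compare only the positions the seed cares about.
            if all(query[i + k] == target[j + k] for k in care):
                hits.append((i, j))
    return hits


# One mismatch at the third base: no contiguous 3-mer matches,
# but a spaced seed that skips that base still produces a hit.
contiguous = seed_hits("ACGT", "ACTT", "111")
spaced = seed_hits("ACGT", "ACTT", "1101")
```

Neither seed type in this sketch tolerates a length change between the two sequences, which is the limitation the abstract's indel seeds are said to address.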

9.
We use a multigene data set (the mitochondrial locus and nine nuclear gene regions) to test phylogenetic relationships in the South American "lava lizards" (genus Microlophus) and describe a strategy for aligning noncoding sequences that accounts for differences in tempo and class of mutational events. We focus on seven nuclear introns that vary in size and frequency of multibase length mutations (i.e., indels) and present a manual alignment strategy that incorporates insertions and deletions for each intron. Our method is based on mechanistic explanations of intron evolution and does not require a guide tree. We also use a progressive alignment algorithm (Probabilistic Alignment Kit; PRANK) that distinguishes insertions from deletions and avoids the "gapcost" conundrum. We describe an approach to selecting a guide tree purged of ambiguously aligned regions and use this to refine PRANK performance. We show that although manual alignment is successful in finding repeat motifs and the most obvious indels, some regions can only be subjectively aligned, and there are limits to the size and complexity of a data matrix for which this approach can be taken. PRANK alignments identified more parsimony-informative indels while simultaneously increasing nucleotide identity in conserved sequence blocks flanking the indel regions. When comparing manual and PRANK with two widely used methods (CLUSTAL, MUSCLE) for the alignment of the most length-variable intron, only PRANK recovered a tree congruent at deeper nodes with the combined data tree inferred from all nuclear gene regions. We take this concordance as an objective function of alignment quality and present a strongly supported phylogenetic hypothesis for Microlophus relationships. 
From this hypothesis we show that (1) a coded indel data partition derived from the PRANK alignment contributed significantly to nodal support and (2) the indel data set permitted detection of significant conflict between mitochondrial and nuclear data partitions, which we hypothesize arose from secondary contact of distantly related taxa, followed by hybridization and mtDNA introgression.

10.
Nuclease-based gene editing tools are rapidly transforming capabilities for altering the genome of cells and organisms with great precision and in high-throughput studies. A major limitation in the application of precise gene editing lies in the lack of sensitive and fast methods to detect and characterize the induced DNA changes. Precise gene editing induces double-stranded DNA breaks that are repaired by error-prone non-homologous end joining, leading to the introduction of insertions and deletions (indels) at the target site. These indels are often small, and detecting them by traditional methods is difficult and laborious. Here we present a method for fast, sensitive and simple indel detection that accurately defines indel sizes down to ±1 bp. The method, coined IDAA for Indel Detection by Amplicon Analysis, is based on tri-primer amplicon labelling and DNA capillary electrophoresis detection, and IDAA is amenable to high-throughput analysis.
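The idea of sizing indels from amplicon length can be sketched in a few lines of Python. This is only an illustration of the arithmetic, assuming fragment lengths have already been called from the electrophoresis trace; the function name `indel_sizes` and its inputs are hypothetical and are not part of the IDAA method as published.

```python
def indel_sizes(fragment_lengths_bp: list[float], wild_type_bp: int) -> list[int]:
    """Estimate indel size for each detected amplicon.

    Assumption: each fragment length (in bp) comes from a sized
    electrophoresis peak and is rounded to the nearest base pair.
    Positive values are insertions, negative values are deletions,
    and 0 means the amplicon is indistinguishable from wild type.
    """
    return [round(length) - wild_type_bp for length in fragment_lengths_bp]


# Hypothetical peaks around a 300 bp wild-type amplicon.
sizes = indel_sizes([300.0, 299.1, 307.2], wild_type_bp=300)
```

A length-based call like this cannot distinguish a substitution, or an insertion and deletion of equal size, from the unedited amplicon.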

11.
The ALMT1 gene, which encodes a membrane protein that facilitates an aluminium-stimulated malate efflux, has been characterised and mapped in wheat (Triticum aestivum L.). Here, we have identified molecular markers targeting insertion/deletion (indel) and SSR repeats within the intron 3 region of the ALMT1 gene. Both markers, ALMT1-SSR3a and ALMT1-SSR3b, which are based on repetitive indels, exhibited complete cosegregation with Al tolerance, malate efflux, and a CAPS marker discriminating ALMT1-1 and ALMT1-2 alleles, in a doubled haploid population derived from Diamondbird (Al-tolerant)/Janz (Al-sensitive). A parental screen of 20 diverse wheat genotypes with repetitive indel markers indicated that six allele variants exist at the ALMT1SSR3 locus. Sequence analysis confirmed that these variations were due to indels, copy number of SSR repeats, and base substitution within SSR repeats. The higher level of variation in intron 3 suggests that this genomic region has been constrained by indels, SSR and single nucleotide polymorphisms. Results have proven that repetitive indel markers cosegregating with the Al tolerance locus will be useful for marker assisted selection and population and evolution studies.

12.
Evidence excluding the root of the tree of life from the actinobacteria   Total citations: 1, self-citations: 0, citations by others: 1
The Actinobacteria are found in aquatic and terrestrial habitats throughout the world and are among the most morphologically varied prokaryotes. They manufacture unusual compounds, utilize novel metabolic pathways, and contain unique genes. This diversity may suggest that the root of the tree of life could be within the Actinobacteria, although there is little or no convincing evidence for such a root. Here, using gene insertions and deletions found in the DNA gyrase, GyrA, and in the paralogous DNA topoisomerase, ParC, we present evidence that the root of life is outside the Actinobacteria.

13.
Insertions and deletions (indels) in protein-coding genes are important sources of genetic variation. Their role in creating new proteins may be especially important after gene duplication. However, little is known about how indels affect the divergence of duplicate genes. We here study thousands of duplicate genes in five fish (teleost) species with completely sequenced genomes. The ancestor of these species has been subject to a fish-specific genome duplication (FSGD) event that occurred approximately 350 Ma. We find that duplicate genes contain at least 25% more indels than single-copy genes. These indels accumulated preferentially in the first 40 my after the FSGD. A lack of widespread asymmetric indel accumulation indicates that both members of a duplicate gene pair typically experience relaxed selection. Strikingly, we observe a 30-80% excess of deletions over insertions that is consistent for indels of various lengths and across the five genomes. We also find that indels preferentially accumulate inside loop regions of protein secondary structure and in regions where amino acids are exposed to solvent. We show that duplicate genes with high indel density also show high DNA sequence divergence. Indel density, but not amino acid divergence, can explain a large proportion of the tertiary structure divergence between proteins encoded by duplicate genes. Our observations are consistent across all five fish species. Taken together, they suggest a general pattern of duplicate gene evolution in which indels are important driving forces of evolutionary change.

14.
Indels are increasingly used in phylogenetics and play a major role in genome size evolution, and yet both the phylogenetic information content of indels and their evolutionary significance remain to be better assessed. Using three presumably independently evolving nuclear gene fragments (28S rDNA, β-fibrinogen, ornithine decarboxylase) from 29 families of neognathous birds, we have obtained a topology that is in general agreement with the current molecular consensus tree, supports the monophyly of Metaves, and provides evidence for the unresolved relationships within the Charadriiformes. Based on the retrieved topology, we assess the relative impact of indels and nucleotide substitutions and demonstrate that the superposition of the two kinds of data yields a topology that could not be obtained from either data set alone. Although only two out of three gene fragments reveal the deletion bias, the combined nucleotide insertion-to-deletion ratio is 0.22, indicating a rapid decrease of intron length. The average indel fixation rate in the neognaths is 2.5 times faster than that in therian (placental) mammals of similar geologic age. As in mammals, there is a considerable variation of indel fixation rate that is 1.5 times higher in Galloanseres compared to Neoaves, and 2.4 times higher in the Rallidae compared to the average for Neoaves (8.2 times higher compared to the related Gruidae). Our results add to the evidence that indel fixation rates correlate with lineage-specific evolutionary rates.

15.
Indels in DNA sequences frequently affect more than a single nucleotide, creating problems for alignment, character coding and phylogenetic analysis. However, the size and frequency of multiple‐residue indels are not usually tested, and with popular alignment packages their reconstruction is indirectly achieved by reducing the affine (gap extension) cost. We explored the length distribution of indels in intron sequences of the gene Mp20 by modifying the gap opening and gap extension costs. Given a “known” tree for the study group, global homology levels were greatest under low gap cost, with gap extension costs of roughly 0.4‐fold the opening cost. Different approaches to gap coding and weighting suggested that taxonomic congruence was correlated with high frequencies of multiple‐position indels, with a maximum indel length of 2–5 bp and few indels above 15 bp, but also including a proportion of indels > 100 bp. Only a small minority of indels could be reconstructed as single‐position indels. Consequently, tree topologies improved when homologous multinucleotide indels were recoded as binary characters which are otherwise highly homoplastic and weighted characters in single‐position coding. In tree‐generating alignment procedures as implemented in POY, where gap penalty determines the character weight during tree search, the problem of assigning inappropriately high weight to multiple‐residue indels could partly be overcome by setting the extension costs to about 0.4‐fold lower than gap opening costs. We conclude that multiple consecutive gap positions are not independent characters and hence methods for parsimony reconstruction of long indels are required. Finally, we also observed a general lack of correlation between taxonomic and character congruence, demonstrating the difficulties of applying congruence criteria to decide among competing alignments. 
This highlights the value of recent model‐based alignment procedures which can implement the statistical distributions of indel size classes, and do not rely on potentially circular strategies for optimizing overall congruence. © The Willi Hennig Society 2006.
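The effect of an affine gap cost, as opposed to charging every gap position independently, can be shown with a small Python sketch. It assumes one common convention (the first gap position pays the opening cost, each further position pays the extension cost) and reads the abstract's "roughly 0.4-fold" as an extension cost of 0.4 times the opening cost; the function names are hypothetical, and alignment packages differ in their exact conventions.

```python
def affine_gap_cost(length: int, gap_open: float, ext_ratio: float = 0.4) -> float:
    """Cost of one gap of `length` positions under an affine scheme.

    Assumed convention: the first position costs `gap_open`, each
    additional position costs `gap_open * ext_ratio`.
    """
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return gap_open + (length - 1) * gap_open * ext_ratio


def single_position_cost(length: int, gap_open: float) -> float:
    """Cost when every gap position is treated as an independent event."""
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return length * gap_open


# A 5 bp indel under both schemes, with an opening cost of 10.
affine = affine_gap_cost(5, gap_open=10.0)
independent = single_position_cost(5, gap_open=10.0)
```

Under these assumptions the 5 bp indel costs about half as much with the affine scheme, which reflects the abstract's point that consecutive gap positions are not independent characters.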

16.
Little is known about variation of nucleotide insertion/deletions (indels) within species. In Arabidopsis thaliana, we investigated indel polymorphism patterns between two genome sequences and among 96 accessions at 1215 loci. Our study identified patterns in the variation of indel density, size, GC content and distribution, and a correlation between indels and substitutions. We found that the GC content in indel sequences was lower than that in non-indel sequences and that indels typically occur in regions with lower GC content. Patterns of indel frequency distribution among populations were more consistent with neutral expectation than substitution patterns. We also found that the local level of substitutions is positively correlated with indel density and negatively correlated with their distance to the closest indel, suggesting that indels play an important role in nucleotide variation.

17.
Zhu XY, Feng FY, Xue SY, Hou T, Liu HR. Génome, 2011, 54(10): 805-811
Two insertion/deletion (indel) polymorphisms of the prion protein gene (PRNP), a 23-bp indel in the putative promoter region and a 12-bp indel within intron I, are associated with the susceptibility to bovine spongiform encephalopathy (BSE) in cattle. In the present study, the polymorphism frequencies of the two indels in four main beef cattle breeds (Hereford, Simmental, Black Angus, and Mongolian) from North China were studied. The results showed that the frequencies of deletion genotypes and alleles of 23- and 12-bp indels were lower, whereas the frequencies of insertion genotypes and alleles of the two indels were higher in Mongolian cattle than in the other three cattle breeds. In Mongolian cattle, the 23-bp insertion / 12-bp insertion was the major haplotype, whereas in Hereford, Simmental, and Black Angus cattle, the 23-bp deletion / 12-bp deletion was the major haplotype. These results demonstrated that Mongolian cattle could be more resistant to BSE, compared with the other three cattle breeds, because of their relatively low frequencies of deletion genotypes and alleles of 23- and 12-bp indel polymorphisms. Thus, this breed could be important for selective breeding to improve resistance against BSE in this area.

18.
The Sepsidae is, with approximately 300 described species, a relatively small family of cyclorrhaphan flies whose behaviour, morphology, and development have been extensively studied. However, currently the only available tree for Sepsidae is more than 10 years old and was based entirely on morphological characters. Here, we present the results of parsimony and Bayesian analyses based on 75 species, ten genes, and morphology. Parsimony and Bayesian analyses produce largely congruent and well‐supported topologies regardless of whether indels are coded as 5th character states, as missing values, or all sites with indels are removed. The tree confirms the monophyly of Sepsidae and identifies the Ropalomeridae as its sister group. With regard to higher‐level relationships, we identify widespread conflict between the morphological and the DNA sequence data. The proposed hypothesis based on both partitions largely reflects the signal in the molecular data. Particularly surprising is the rejection of two relationship hypotheses with strong morphological support, namely the sister group relationship between Orygma and the remaining Sepsidae and the monophyly of the Sepsis species group. Our partitioned Bremer support (PBS) analyses imply that indel coding has a stronger effect on the relative performance of individual gene partitions than the exclusion of alignment‐ambiguous sequences or the location of a gene on the mitochondrial or nuclear genome. However, these analyses also reveal unexpectedly strong fluctuations in PBS values given that indel treatment has only a minor effect on tree topology and jackknife support. These unexpected fluctuations highlight the need for a comparative study across multiple data sets that investigates the influence of conflict and indel treatment on PBS values. © The Willi Hennig Society 2008.

19.
Insertions, deletions, and inversions in the chloroplast genome of higher plants have been shown to be extremely useful for resolving phylogenetic relationships both between closely related taxa and among more basal lineages. Introns and intergenic spacers from the chloroplast genome are now increasingly used for phylogenetic and population genetic studies of populations from a single species, and it is therefore interesting to know whether indels can provide useful data and hence increase the power of intraspecific studies. Here, we show that indels in three cpDNA intergenic spacers and one cpDNA intron for two species of Silene evolve at slightly higher rates than base pair substitutions. Repeat indels appear to have the highest rate of evolution and are thus more prone to homoplasy. We show that coded indel data have high information content for phylogenetic analysis, and indels thus provide useful information to infer phylogenetic relationships at the intraspecific level.

20.
Hu J, Ng PC. Genome biology, 2012, 13(2): R9-11
Each human has approximately 50 to 280 frameshifting indels, yet their implications are unknown. We created SIFT Indel, a prediction method for frameshifting indels that has 84% accuracy. The percentage of human frameshifting indels predicted to be gene-damaging is negatively correlated with allele frequency. We also show that although the first frameshifting indel in a gene causes loss of function, there is a tendency for the second frameshifting indel to compensate and restore protein function. SIFT Indel is available at http://sift-dna.org/www/SIFT_indels2.html.
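What makes a coding indel "frameshifting", and how a second indel can put the reading frame back, comes down to arithmetic modulo 3. The Python sketch below illustrates only that arithmetic and is not the SIFT Indel predictor; the function names and the VCF-style reference/alternate allele strings are assumptions made for the example.

```python
def is_frameshifting(ref_allele: str, alt_allele: str) -> bool:
    """True if the net length change is not a multiple of 3.

    Assumption: both alleles lie within coding sequence and are
    written VCF-style, so the length difference is the indel size.
    """
    net_change = len(alt_allele) - len(ref_allele)
    return net_change % 3 != 0


def pair_restores_frame(first_shift: int, second_shift: int) -> bool:
    """True if a second indel brings a frameshifted gene back into frame.

    Shifts are signed sizes in bp (insertions positive, deletions
    negative). Restoring the frame is necessary but not sufficient
    for restoring protein function.
    """
    return first_shift % 3 != 0 and (first_shift + second_shift) % 3 == 0


one_bp_insertion = is_frameshifting("A", "AT")       # net +1
three_bp_insertion = is_frameshifting("A", "ATGC")   # net +3, in frame
compensated = pair_restores_frame(1, -1)             # +1 then -1
```

The residues between the two indels are still translated in the wrong frame, so this check says nothing on its own about whether function is actually recovered.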

Copyright © Beijing Qinyun Technology Development Co., Ltd.  京ICP备09084417号