Similar articles
 Found 20 similar articles (search time: 46 ms)
1.
Recently, the study of ancient DNA (aDNA) has been greatly enhanced by the development of second-generation DNA sequencing technologies and targeted enrichment strategies. These developments have allowed the recovery of several complete ancient genomes, a result that would have been considered virtually impossible only a decade ago. Prior to these developments, aDNA research was largely focused on the recovery of short DNA sequences and their use in the study of phylogenetic relationships, molecular rates, species identification and population structure. However, it is now possible to sequence a large number of modern and ancient complete genomes from a single species and thereby study the genomic patterns of evolutionary change over time. Such a study would herald the beginnings of ancient population genomics and its use in the study of evolution. Species that are amenable to such large-scale studies warrant increased research effort. We report here progress on a population genomic study of the Adélie penguin (Pygoscelis adeliae). This species is ideally suited to ancient population genomic research because both modern and ancient samples are abundant in the permafrost conditions of Antarctica. This species will enable us to directly address many of the fundamental questions in ecology and evolution.

2.
Bacteriophage genomes show pervasive mosaicism, indicating the importance of horizontal gene exchange in their evolution. Phage genomes represent unique combinations of modules, each of them with a different phylogenetic history. The traditional classification, based on a variety of criteria such as nucleic acid type (single/double-stranded DNA/RNA), morphology, and host range, appeared inconsistent with sequence analyses. With the genomic era, an ever increasing number of sequenced phages cannot be classified, in part due to a lack of morphological information and in part to the intrinsic incapability of tree-based methods to efficiently deal with mosaicism. This problem led some virologists to call for a moratorium on the creation of additional taxa in the order Caudovirales, in order to let virologists discuss classification schemes that might better suit phage evolution. In this context, we propose a framework for a reticulate classification of phages based on gene content. Starting from gene families, we built a weighted graph, where nodes represent phages and edges represent phage-phage similarities in terms of shared genes. We then apply various measures of graph topology to analyze the resulting graph. Most double-stranded DNA phages are found in a single component. The values of the clustering coefficient and closeness distinguish temperate from virulent phages, whereas chimeric phages are characterized by a high betweenness coefficient. We apply a 2-step clustering method to this graph to generate a reticulate classification of phages: Each phage is associated with a membership vector, which quantitatively characterizes its membership to the set of clusters. Furthermore, we cluster genes based on their "phylogenetic profiles" to define "evolutionary cohesive modules." In virulent phages, evolutionary modules span several functional categories, whereas in temperate phages they correspond better to functional modules. Moreover, despite the fact that modules only cover a fraction of all phage genes, phage groups can be distinguished by their different combination of modules, serving as the basis for a higher level reticulate classification. These 2 classification schemes provide an automatic and dynamic way of representing the relationships within the phage population and can be extended to include newly sequenced phage genomes, as well as other types of genetic elements.
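
The gene-content graph described in this abstract can be illustrated with a minimal sketch. The abstract does not state how edge weights are computed, so the Jaccard index used here is an assumption, and the phage names and gene-family labels are hypothetical toy data rather than anything from the study.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared gene families divided by all distinct families (assumed weight)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_graph(gene_content, min_weight=0.0):
    """Return {(phage_a, phage_b): weight} for every pair above min_weight."""
    edges = {}
    for a, b in combinations(sorted(gene_content), 2):
        weight = jaccard(gene_content[a], gene_content[b])
        if weight > min_weight:
            edges[(a, b)] = weight
    return edges


def components(nodes, edges):
    """Connected components of the graph, found by depth-first search."""
    adjacency = {node: set() for node in nodes}
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, found = set(), []
    for node in sorted(nodes):
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(adjacency[current] - component)
        seen |= component
        found.append(component)
    return found


# Hypothetical toy input: phage name -> set of gene-family identifiers.
GENE_CONTENT = {
    "phageA": {"f1", "f2", "f3"},
    "phageB": {"f2", "f3", "f4"},
    "phageC": {"f4", "f5"},
    "phageD": {"f9"},
}
EDGES = build_graph(GENE_CONTENT)
COMPONENTS = components(GENE_CONTENT, EDGES)
```

In this toy input, phageA, phageB and phageC fall into one component through shared families, while phageD stays isolated. Topology measures such as clustering coefficient or betweenness would be computed on the same edge dictionary.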

3.
The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55–83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

4.
Genomics, 2019, 111(6): 1590-1603
Genomes are not random sequences because natural selection has injected information in biological sequences for billions of years. Inspired by this idea, we developed a simple method to compare genomes considering nucleotide counts in subsequences (blocks) instead of their exact sequences. We introduce the Block Alignment method for comparing two genomes and, based on this comparison method, define a similarity score and a distance. The presented model ignores nucleotide order in the sequence. On the other hand, in this block comparison method, due to exclusion of point mutations and small size variations, there is no need for high coverage sequencing which is responsible for the high costs of data production and storage; moreover, the sequence comparisons could be performed with higher speed. Phylogenetic trees of two sets of bacterial genomes were constructed and the results were in full agreement with their already constructed phylogenetic trees. Furthermore, a weighted and directed similarity network of each set of bacterial genomes was inferred ab initio by this model. Remarkably, the communities of these networks are in agreement with the clades of the corresponding phylogenetic trees which means these similarity networks also contain phylogenetic information about the genomes. Moreover, the block comparison method was used to distinguish rob(15;21)c-associated iAMP21 and sporadic iAMP21 rearrangements in subgroups of chromosome 21 in acute lymphoblastic leukemia. Our results show a meaningful difference between the number of contigs that mapped to chromosomes 15 and 21 in these cases. Furthermore, the presented block alignment model can select the candidate blocks to perform more accurate analysis and it is capable of finding conserved blocks on a set of genomes.
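
The idea of comparing nucleotide counts per block, rather than exact sequences, can be sketched as follows. The abstract does not give the scoring formula, so the distance below (a mean normalised difference between position-matched blocks) is an assumption made for illustration, as are the function names and the default block size; it is not the published Block Alignment score.

```python
from collections import Counter


def block_counts(seq, block_size):
    """Split seq into consecutive blocks and count A, C, G, T in each one."""
    blocks = []
    for start in range(0, len(seq), block_size):
        counts = Counter(seq[start:start + block_size].upper())
        blocks.append(tuple(counts.get(base, 0) for base in "ACGT"))
    return blocks


def block_distance(seq_a, seq_b, block_size=4):
    """Mean per-block composition difference, each block scaled to [0, 1].

    Assumed metric: L1 difference of the count vectors divided by the number
    of counted bases in the two blocks. Blocks are paired by position, and
    unpaired trailing blocks of the longer sequence are ignored.
    """
    blocks_a = block_counts(seq_a, block_size)
    blocks_b = block_counts(seq_b, block_size)
    paired = min(len(blocks_a), len(blocks_b))
    if paired == 0:
        return 0.0
    total = 0.0
    for x, y in zip(blocks_a, blocks_b):
        counted = sum(x) + sum(y)
        if counted:
            total += sum(abs(p - q) for p, q in zip(x, y)) / counted
    return total / paired
```

Because only counts are compared, `block_distance("ACGTACGT", "TGCAACGT")` is zero even though the first blocks differ in order, which mirrors the abstract's remark that the model ignores nucleotide order within a block.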

5.
Although new and emerging next-generation sequencing (NGS) technologies have reduced sequencing costs significantly, much work remains to implement them for de novo sequencing of complex and highly repetitive genomes such as the tetraploid genome of Upland cotton (Gossypium hirsutum L.). Herein we report the results from implementing a novel, hybrid Sanger/454-based BAC-pool sequencing strategy using minimum tiling path (MTP) BACs from Ctg-3301 and Ctg-465, two large genomic segments in A12 and D12 homoeologous chromosomes (Ctg). To enable generation of longer contig sequences in assembly, we implemented a hybrid assembly method to process ~35x data from 454 technology and 2.8-3x data from the Sanger method. Hybrid assemblies offered higher sequence coverage and better sequence assemblies. Homology studies revealed the presence of retrotransposon regions like Copia and Gypsy elements in these contigs and also helped in identifying new genomic SSRs. Unigenes were anchored to the sequences in Ctg-3301 and Ctg-465 to support the physical map. Gene density, gene structure and protein sequence information derived from protein prediction programs were used to obtain the functional annotation of these genes. Comparative analysis of both contigs with the Arabidopsis genome exhibited synteny and microcollinearity with a conserved gene order in both genomes. This study provides insight into the use of an MTP-based BAC-pool sequencing approach for sequencing complex polyploid genomes with limited constraints in generating better sequence assemblies to build reference scaffold sequences. Combining the utilities of MTP-based BAC-pool sequencing with current longer and short read NGS technologies in multiplexed format would provide a new direction to cost-effectively and precisely sequence complex plant genomes.

6.
MicroRNAs (miRNAs) control many important aspects of plant development, suggesting these molecules may also have played key roles in the evolution of developmental processes in plants. However, evolutionary-developmental (evo-devo) studies of miRNAs have been held back by technical difficulties in gene identification. To help solve this problem, we have developed a two-step procedure for the efficient identification of miRNA genes in any plant species. As a test case, we have studied the evolution of the MIR164 family in the angiosperms. We have identified novel MIR164 genes in three species occupying key phylogenetic positions and used these, together with published sequence data, to partially reconstruct the evolution of the MIR164 family since the last common ancestor of the extant flowering plants. We use our evolutionary reconstruction to discuss potential roles for MIR164 genes in the evolution of leaf shape and carpel closure in the angiosperms. The techniques we describe may be applied to any miRNA family and should thus enable plant evo-devo to begin to investigate the contributions miRNAs have made to the evolution of plant development.

7.
Reconstructing a tree of life by inferring evolutionary history is an important focus of evolutionary biology. Phylogenetic reconstructions also provide useful information for a range of scientific disciplines such as botany, zoology, phylogeography, archaeology and biological anthropology. Until the development of protein and DNA sequencing techniques in the 1960s and 1970s, phylogenetic reconstructions were based on fossil records and comparative morphological/physiological analyses. Since then, progress in molecular phylogenetics has compensated for some of the shortcomings of phenotype-based comparisons. Comparisons at the molecular level increase the accuracy of phylogenetic inference because there is no environmental influence on DNA/peptide sequences and evaluation of sequence similarity is not subjective. While the number of morphological/physiological characters that are sufficiently conserved for phylogenetic inference is limited, molecular data provide a large number of datapoints and enable comparisons from diverse taxa. Over the last 20 years, developments in molecular phylogenetics have greatly contributed to our understanding of plant evolutionary relationships. Regions in the plant nuclear and organellar genomes that are optimal for phylogenetic inference have been determined and recent advances in DNA sequencing techniques have enabled comparisons at the whole genome level. Sequences from the nuclear and organellar genomes of thousands of plant species are readily available in public databases, enabling researchers without access to molecular biology tools to investigate phylogenetic relationships by sequence comparisons using the appropriate nucleotide substitution models and tree building algorithms. In the present review, the statistical models and algorithms used to reconstruct phylogenetic trees are introduced and advances in the exploration and utilization of plant genomes for molecular phylogenetic analyses are discussed.

8.
The human mutation rate is an essential parameter for studying the evolution of our species, interpreting present-day genetic variation, and understanding the incidence of genetic disease. Nevertheless, our current estimates of the rate are uncertain. Most notably, recent approaches based on counting de novo mutations in family pedigrees have yielded significantly smaller values than classical methods based on sequence divergence. Here, we propose a new method that uses the fine-scale human recombination map to calibrate the rate of accumulation of mutations. By comparing local heterozygosity levels in diploid genomes to the genetic distance scale over which these levels change, we are able to estimate a long-term mutation rate averaged over hundreds or thousands of generations. We infer a rate of 1.61 ± 0.13 × 10⁻⁸ mutations per base per generation, which falls in between phylogenetic and pedigree-based estimates, and we suggest possible mechanisms to reconcile our estimate with previous studies. Our results support intermediate-age divergences among human populations and between humans and other great apes.

9.
A phylogenomic approach to microbial evolution   Total citations: 21 (self-citations: 2; citations by others: 19)
To study the origin and evolution of biochemical pathways in microorganisms, we have developed methods and software for automatic, large-scale reconstructions of phylogenetic relationships. We define the complete set of phylogenetic trees derived from the proteome of an organism as the phylome and introduce the term phylogenetic connection as a concept that describes the relative relationships between taxa in a tree. A query system has been incorporated into the system so as to allow searches for defined categories of trees within the phylome. As a complement, we have developed the pyphy system for visualising the results of complex queries on phylogenetic connections, genomic locations and functional assignments in a graphical format. Our phylogenomics approach, which links phylogenetic information to the flow of biochemical pathways within and among microbial species, has been used to examine more than 8000 phylogenetic trees from seven microbial genomes. The results have revealed a rich web of phylogenetic connections. However, the separation of Bacteria and Archaea into two separate domains remains robust.

10.
11.
Organellar DNA sequences are widely used in evolutionary and population genetic studies; however, the conservative nature of chloroplast gene and genome evolution often limits phylogenetic resolution and statistical power. To gain maximal access to the historical record contained within chloroplast genomes, we have adapted multiplex sequencing-by-synthesis (MSBS) to simultaneously sequence multiple genomes using the Illumina Genome Analyzer. We PCR-amplified ~120 kb plastomes from eight species (seven Pinus, one Picea) in 35 reactions. Pooled products were ligated to modified adapters that included 3 bp indexing tags and samples were multiplexed at four genomes per lane. Tagged microreads were assembled by de novo and reference-guided assembly methods, using previously published Pinus plastomes as surrogate references. Assemblies for these eight genomes are estimated at 88–94% complete, with an average sequence depth of 55× to 186×. Mononucleotide repeats interrupt contig assembly with increasing repeat length, and we estimate that the limit for their assembly is 16 bp. Comparisons to 37 kb of Sanger sequence show a validated error rate of 0.056%, and conspicuous errors are evident from the assembly process. This efficient sequencing approach yields high-quality draft genomes and should have immediate applicability to genomes with comparable complexity.
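
Sorting multiplexed reads back to their source samples by a 3 bp index tag can be sketched as below. The abstract does not say where in the read the tag sits or which tag sequences were used, so this sketch assumes the tag is the first three bases of each read; the tag sequences and sample names are hypothetical.

```python
def demultiplex(reads, tag_to_sample, tag_len=3):
    """Assign each read to a sample by its leading index tag.

    Returns (bins, unassigned): bins maps sample name -> list of reads with
    the tag trimmed off; unassigned holds reads whose tag is not recognised.
    Exact matching only: a sequencing error in the tag leaves the read
    unassigned.
    """
    bins = {sample: [] for sample in tag_to_sample.values()}
    unassigned = []
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        sample = tag_to_sample.get(tag)
        if sample is None:
            unassigned.append(read)
        else:
            bins[sample].append(insert)
    return bins, unassigned


# Hypothetical tags and reads for illustration only.
TAGS = {"AAG": "sample_1", "CCT": "sample_2"}
READS = ["AAGTTGCA", "CCTGGGAC", "AAGCCCCC", "NNNACGT"]
BINS, UNASSIGNED = demultiplex(READS, TAGS)
```

With tags this short, a single miscalled base can turn one valid tag into another, so a real design would typically choose tags that differ at more than one position.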

12.
Background and Aims  Some plant groups, especially on islands, have been shaped by strong ancestral bottlenecks and rapid, recent radiation of phenotypic characters. Single molecular markers are often not informative enough for phylogenetic reconstruction in such plant groups. Whole plastid genomes and nuclear ribosomal DNA (nrDNA) are viewed by many researchers as sources of information for phylogenetic reconstruction of groups in which expected levels of divergence in standard markers are low. Here we evaluate the usefulness of these data types to resolve phylogenetic relationships among closely related Diospyros species.
Methods  Twenty-two closely related Diospyros species from New Caledonia were investigated using whole plastid genomes and nrDNA data from low-coverage next-generation sequencing (NGS). Phylogenetic trees were inferred using maximum parsimony, maximum likelihood and Bayesian inference on separate plastid and nrDNA and combined matrices.
Key Results  The plastid and nrDNA sequences were, singly and together, unable to provide well supported phylogenetic relationships among the closely related New Caledonian Diospyros species. In the nrDNA, a 6-fold greater percentage of parsimony-informative characters compared with plastid DNA was found, but the total number of informative sites was greater for the much larger plastid DNA genomes. Combining the plastid and nuclear data improved resolution. Plastid results showed a trend towards geographical clustering of accessions rather than following taxonomic species.
Conclusions  In plant groups in which multiple plastid markers are not sufficiently informative, an investigation at the level of the entire plastid genome may also not be sufficient for detailed phylogenetic reconstruction. Sequencing of complete plastid genomes and nrDNA repeats seems to clarify some relationships among the New Caledonian Diospyros species, but the higher percentage of parsimony-informative characters in nrDNA compared with plastid DNA did not help to resolve the phylogenetic tree because the total number of variable sites was much lower than in the entire plastid genome. The geographical clustering of the individuals against a background of overall low sequence divergence could indicate transfer of plastid genomes due to hybridization and introgression following secondary contact.

13.
Ferns and lycophytes have remarkably large genomes. However, little is known about how their genome size evolved in fern lineages. To explore the origins and evolution of chromosome numbers and genome size in ferns, we used flow cytometry to measure the genomes of 240 species (255 samples) of extant ferns and lycophytes comprising 27 families and 72 genera, of which 228 species (242 samples) represent new reports. We analyzed correlations among genome size, spore size, chromosomal features, phylogeny, and habitat type preference within a phylogenetic framework. We also applied ANOVA and multinomial logistic regression analysis to preference of habitat type and genome size. Using the phylogeny, we conducted ancestral character reconstruction for habitat types and tested whether genome size changes simultaneously with shifts in habitat preference. We found that 2C values had weak phylogenetic signal, whereas the base number of chromosomes (x) had a strong phylogenetic signal. Furthermore, our analyses revealed a positive correlation between genome size and chromosome traits, indicating that the base number of chromosomes (x), chromosome size, and polyploidization may be primary contributors to genome expansion in ferns and lycophytes. Genome sizes in different habitat types varied significantly and were significantly correlated with habitat types; specifically, multinomial logistic regression indicated that species with larger 2C values were more likely to be epiphytes. Terrestrial habitat is inferred to be ancestral for both extant ferns and lycophytes, whereas transitions to other habitat types occurred as the major clades emerged. Shifts in habitat types appear to be followed by periods of genomic stability. Based on these results, we inferred that habitat type changes and multiple whole-genome duplications have contributed to the formation of large genomes of ferns and their allies during their evolutionary history.

14.
Ribonucleases play key, often essential, roles in cellular metabolism. Nineteen ribonuclease activities, from 22 different proteins, have so far been described in bacteria, the majority of them from either Escherichia coli or Bacillus subtilis. Here we examine the phylogenetic distribution of all of these ribonucleases in 50 eubacterial and archaeal species whose genomes have been completely sequenced, with particular emphasis on the endoribonucleases. Although some enzymes are very highly conserved throughout evolution, there appears to be no truly universal ribonuclease. While some organisms, like E. coli, have a large selection of ribonucleases, many with overlapping functions, others seem to have relatively few or have many that remain to be discovered.

15.
Our understanding of the evolutionary history of primates is undergoing continual revision due to ongoing genome sequencing efforts. Bolstered by growing fossil evidence, these data have led to increased acceptance of once controversial hypotheses regarding phylogenetic relationships, hybridization and introgression, and the biogeographical history of primate groups. Among these findings is a pattern of recent introgression between species within all major primate groups examined to date, though little is known about introgression deeper in time. To address this and other phylogenetic questions, here, we present new reference genome assemblies for 3 Old World monkey (OWM) species: Colobus angolensis ssp. palliatus (the black and white colobus), Macaca nemestrina (southern pig-tailed macaque), and Mandrillus leucophaeus (the drill). We combine these data with 23 additional primate genomes to estimate both the species tree and individual gene trees using thousands of loci. While our species tree is largely consistent with previous phylogenetic hypotheses, the gene trees reveal high levels of genealogical discordance associated with multiple primate radiations. We use strongly asymmetric patterns of gene tree discordance around specific branches to identify multiple instances of introgression between ancestral primate lineages. In addition, we exploit recent fossil evidence to perform fossil-calibrated molecular dating analyses across the tree. Taken together, our genome-wide data help to resolve multiple contentious sets of relationships among primates, while also providing insight into the biological processes and technical artifacts that led to the disagreements in the first place.

Combining three newly sequenced primate genomes with other published genomes, this study adapts a little-known method for detecting ancient introgression to genome-scale data, revealing multiple previously unknown examples of hybridization between primate species.

16.

Background  

Completed genomes and environmental genomic sequences are bringing a significant contribution to understanding the evolution of gene families, microbial metabolism and community eco-physiology. Here, we used comparative genomics and phylogenetic analyses in conjunction with enzymatic data to probe the evolution and functions of a microbial nitrilase gene family. Nitrilases are relatively rare in bacterial genomes, their biological function being unclear.

17.
Although massively parallel sequencing has facilitated large-scale DNA sequencing, comparisons among distantly related species rely upon small portions of the genome that are easily aligned. Methods are needed to efficiently obtain comparable DNA fragments prior to massively parallel sequencing, particularly for biologists working with non-model organisms. We introduce a new class of molecular marker, anchored by ultraconserved genomic elements (UCEs), that universally enables target enrichment and sequencing of thousands of orthologous loci across species separated by hundreds of millions of years of evolution. Our analyses here focus on use of UCE markers in Amniota because UCEs and phylogenetic relationships are well-known in some amniotes. We perform an in silico experiment to demonstrate that sequence flanking 2030 UCEs contains information sufficient to enable unambiguous recovery of the established primate phylogeny. We extend this experiment by performing an in vitro enrichment of 2386 UCE-anchored loci from nine non-model avian species. We then use alignments of 854 of these loci to unambiguously recover the established evolutionary relationships within and among three ancient bird lineages. Because many organismal lineages have UCEs, this type of genetic marker and the analytical framework we outline can be applied across the tree of life, potentially reshaping our understanding of phylogeny at many taxonomic levels.

18.
Gene structure data can substantially advance our understanding of metazoan evolution and deliver an independent approach to resolve conflicts among existing hypotheses. Here, we used changes of spliceosomal intron positions as a novel phylogenetic marker to reconstruct the animal tree. This kind of data is inferred from orthologous genes containing mutually exclusive introns at pairs of sequence positions in close proximity, so-called near intron pairs (NIPs). NIP data were collected for 48 species and utilized as binary genome-level characters in maximum parsimony (MP) analyses to reconstruct deep metazoan phylogeny. All groupings that were obtained with more than 80% bootstrap support are consistent with currently supported phylogenetic hypotheses. This includes monophyletic Chordata, Vertebrata, Nematoda, Platyhelminthes and Trochozoa. Several other clades such as Deuterostomia, Protostomia, Arthropoda, Ecdysozoa, Spiralia, and Eumetazoa, however, failed to be recovered due to a few problematic taxa such as the mite Ixodes and the warty comb jelly Mnemiopsis. The corresponding unexpected branchings can be explained by the paucity of synapomorphic changes of intron positions shared between some genomes, by the sensitivity of MP analyses to long-branch attraction (LBA), and by the very unequal evolutionary rates of intron loss and intron gain during evolution of the different subclades of metazoans. In addition, we obtained an assemblage of Cnidaria, Porifera, and Placozoa as sister group of Bilateria + Ctenophora with medium support, a disputable, but remarkable result. We conclude that NIPs can be used as phylogenetic characters also within a broader phylogenetic context, given that they have emerged regularly during evolution irrespective of the large variation of intron density across metazoan genomes.

19.
We introduce a weighted graph model to investigate the self-similarity characteristics of eubacterial genomes. Similarity comparisons of genomes are usually used to estimate the evolutionary distance between different genomes. Few studies focus on the overall statistical characteristics of each gene compared with the other genes in the same genome. In our model, each genome is represented as a weighted graph, whose topology describes the similarity relationships among genes in the same genome. Based on weighted graph theory, we extract some quantified statistical variables from the topology, and give the distribution of some variables derived from the largest social structure in the topology. The 23 eubacteria recently studied by Sorimachi and Okayasu are clearly classified into two different groups by their double logarithmic point-plots describing the similarity relationships among genes of the largest social structure in the genome. The results show that the proposed model may provide some new insights into the structures and evolution patterns determined from complete genomes.

20.
This new century's biology promises more of everything--more genes, more organisms, more species and, in short, more data. The flood of data challenges us to find better and quicker ways to summarize and analyse. Here, we present preliminary results and proofs of concept from three of our research projects that are motivated by our search for solutions to the perils of plenty. First, we discuss how models of evolution can accommodate change to better reflect the dynamics of sequence diversity, particularly when it is becoming a lot easier to obtain sequences at different times and across intervals where the probability of new mutations contributing to this diversity is high. Second, we describe our work on the use of a single locus for species delimitation; this research targets the new DNA-barcoding approach that aims to catalogue the entirety of life. We have developed a single-locus test based on the coalescent that tests the null hypothesis of panmixis. Finally, we discuss new sequencing technologies, the types of data available and the efficacy of alignment-free methods to estimate pairwise distances for phylogenetic analyses.
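
One common family of alignment-free approaches compares k-mer frequency profiles. The abstract does not say which alignment-free methods were evaluated, so the sketch below is only an illustration of the general idea, assuming a Euclidean distance between normalised k-mer profiles; the function names and the default `k` are hypothetical choices.

```python
from collections import Counter
from math import sqrt


def kmer_profile(seq, k):
    """Relative frequency of every overlapping k-mer in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def kmer_distance(seq_a, seq_b, k=2):
    """Euclidean distance between two k-mer frequency profiles."""
    profile_a = kmer_profile(seq_a, k)
    profile_b = kmer_profile(seq_b, k)
    kmers = set(profile_a) | set(profile_b)
    return sqrt(sum((profile_a.get(x, 0.0) - profile_b.get(x, 0.0)) ** 2
                    for x in kmers))
```

A matrix of such pairwise distances could then be passed to a distance-based tree method such as neighbour joining, with no multiple alignment step.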
