Similar literature
1.
The giant grouper (Epinephelus lanceolatus) is the largest coral reef teleost, with a native range that spans temperate and tropical waters in the Pacific and the Indian Oceans. It is cultured artificially and used as a breeding species in aquaculture due to its rapid growth rate. Here we report a giant grouper genome assembled at the chromosome scale from sequences generated using Illumina and high‐throughput chromatin conformation capture (Hi‐C) technology. The assembly comprised 1.086 Gb, with 98.4% of the scaffold sequences anchored into 24 chromosomes. The contig and scaffold N50 values were 119.9 kb and 46.2 Mb, respectively. The assembly is of high integrity, including 96.4% universal single‐copy orthologues based on BUSCO analysis. Through chromosome‐scale evolution analysis, we identified alignments of six giant grouper chromosomes to three stickleback chromosomes, and some of the genes located within the breakpoints of reshuffling events may be related to development and growth. From the 24,718 protein‐coding genes, we found that several gene families related to innate immunity and glycan biosynthesis were significantly expanded in the giant grouper genome compared to other teleost genomes. In addition, we identified several genes related to the hormone signalling pathway and innate immunity that have experienced positive selection or accelerated evolution, implicating their roles in immune defence and fast growth of the species. The high‐quality genome assembly will provide a valuable genomic resource for further biological and evolutionary studies, and useful genomic tools for breeding of the giant grouper.
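The contig and scaffold N50 values quoted in this abstract follow the usual definition: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. The following is a minimal sketch of that calculation, not the authors' pipeline; the function name `n50` and the toy lengths are illustrative assumptions.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the length
    at which the running total first reaches half of the assembly size.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (lengths in kb), not data from the paper.
toy_contigs = [2, 3, 4, 5, 6]
print(n50(toy_contigs))  # 5: the contigs of length 6 and 5 cover 11 of 20 kb
```

Applied to real contig and scaffold length lists, the same function would yield figures comparable to the 119.9 kb and 46.2 Mb reported above.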

2.
3.
Whole‐genome resequencing (WGR) is a powerful method for addressing fundamental evolutionary biology questions that have not been fully resolved using traditional methods. WGR includes four approaches: the sequencing of individuals to a high depth of coverage with either unresolved or resolved haplotypes, the sequencing of population genomes to a high depth by mixing equimolar amounts of unlabelled‐individual DNA (Pool‐seq) and the sequencing of multiple individuals from a population to a low depth (lcWGR). These techniques require the availability of a reference genome. This, along with the still high cost of shotgun sequencing and the large demand for computing resources and storage, has limited their implementation in nonmodel species with scarce genomic resources and in fields such as conservation biology. Our goal here is to describe the various WGR methods, their pros and cons and potential applications in conservation biology. WGR offers an unprecedented marker density and surveys a wide diversity of genetic variations not limited to single nucleotide polymorphisms (e.g., structural variants and mutations in regulatory elements), increasing its power for the detection of signatures of selection and local adaptation as well as for the identification of the genetic basis of phenotypic traits and diseases. Currently, though, no single WGR approach fulfils all requirements of conservation genetics, and each method has its own limitations and sources of potential bias. We discuss proposed ways to minimize such biases. We envision a not‐too‐distant future where the analysis of whole genomes becomes a routine task in many nonmodel species and fields including conservation biology.

4.
Understanding the genetics of biological diversification across micro‐ and macro‐evolutionary time scales is a vibrant field of research for molecular ecologists as rapid advances in sequencing technologies promise to overcome former limitations. In palms, an emblematic, economically and ecologically important plant family with high diversity in the tropics, studies of diversification at the population and species levels are still hampered by a lack of genomic markers suitable for the genotyping of large numbers of recently diverged taxa. To fill this gap, we used a whole genome sequencing approach to develop target sequencing for molecular markers in 4,184 genome regions, including 4,051 genes and 133 non‐genic putatively neutral regions. These markers were chosen to cover a wide range of evolutionary rates, allowing future studies at the family, genus, species and population levels. Special emphasis was given to the avoidance of copy number variation during marker selection. In addition, a set of 149 well‐known sequence regions previously used as phylogenetic markers by the palm biological research community were included in the target regions, making it possible to combine and jointly analyse already available data sets with genomic data to be produced with this new toolkit. The bait set was effective for species belonging to all three palm sub‐families tested (Arecoideae, Ceroxyloideae and Coryphoideae), with high mapping rates, specificity and efficiency. The number of high‐quality single nucleotide polymorphisms (SNPs) detected at both the sub‐family and population levels facilitates efficient analyses of genomic diversity across micro‐ and macro‐evolutionary time scales.

5.
The generation of genome‐scale data is critical for a wide range of questions in basic biology using model organisms, but also in questions of applied biology in nonmodel organisms (agriculture, natural resources, conservation and public health biology). Using a genome‐scale approach on a diverse group of nonmodel organisms and with the goal of lowering costs of the method, we modified a multiplexed, high‐throughput genomic scan technique utilizing two restriction enzymes. We analysed several pairs of restriction enzymes and completed double‐digestion RAD sequencing libraries for nine different species and five genera of insects and fish. We found that one particular enzyme pair produced a consistently higher number of sequenceable fragments across all nine species. Building libraries off this enzyme pair, we found between 4,000 and 37,000 usable SNPs per species, and we found a greater number of usable SNPs using reference genomes than de novo pipelines in STACKS. We also found fewer reads in the Read 2 fragments from the paired‐end Illumina HiSeq run. Overall, the results of this study provide empirical evidence of the utility of this method for producing consistent data for diverse nonmodel species and suggest specific considerations for sequencing analysis strategies.

6.
The Chinese Taihu pig breeds are an invaluable component of the world's pig genetic resources, and they are the most prolific breeds of swine in the world. In this study, the genomes of 252 pigs of the six indigenous breeds in the Taihu Lake region were sequenced using the genotyping by genome reducing and sequencing approach. A total of 950 million good reads were obtained using an Illumina HiSeq 2000 at an average depth of 13× (for SNP calling) and an average coverage of 2.3%. In total, 122 632 indels, 31 444 insertions, 44 056 deletions and 455 CNVs (copy number variants) were identified in the genomes of the pigs. Approximately 2.3% of these genetic markers were mapped to gene exon regions, and 25% were in QTL regions related to economically important traits. The KEGG pathway or GO enrichment analyses revealed that genetic variants assumed to be large‐effect mutations were significantly overrepresented in 22 SNP, 56 indel, 26 insertion, 28 deletion and three CNV gene sets. A total of 343 breed‐specific SNPs were also identified in the six Chinese indigenous pigs. The findings from this study can contribute to future investigations of the genetic diversity, population structure, positive selection signals and molecular evolutionary history of these pigs at the genome level and can serve as a valuable reference for improving the breeding and cultivation of these pigs.

7.
Thanks to a dramatic reduction in sequencing costs followed by a rapid development of bioinformatics tools, genome assembly and annotation have become accessible to many researchers in recent years. Among tetrapods, birds have genomes that display many features that facilitate their assembly and annotation, such as small genome size, low number of repeats and highly conserved genomic structure. However, we found that high genomic heterozygosity could have a great impact on the quality of the genome assembly of the thick‐billed murre (Uria lomvia), an arctic colonial seabird. In this study, we tested the performance of three genome assemblers, Ray/SSPACE, SOAPdenovo2 and Platanus, in assembling the highly heterozygous genome of the thick‐billed murre. Our results show that Platanus, an assembler specifically designed for heterozygous genomes, outperforms the other two approaches and produces a highly contiguous (N50 = 15.8 Mb) and complete genome assembly (93% presence of genes from the Benchmarking Universal Single Copy Ortholog [BUSCO] gene set). Additionally, we annotated the thick‐billed murre genome using a homology‐based approach that takes advantage of the genomic resources available for birds and other taxa. Our study will be useful for those researchers who are approaching assembly and annotation of highly heterozygous genomes, or genomes of species of conservation concern, and/or who have limited financial resources.

8.
Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia, and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal‐contaminated farmlands. Thus, the genome sequence of ramie was determined to explore the molecular basis of its fibre quality, protein content and phytoremediation. To further understand the ramie genome, different paired‐end and mate‐pair libraries were combined to generate 134.31 Gb of raw DNA sequences using the Illumina whole‐genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on the ramie genome annotation, 30,237 protein‐coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single‐copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single‐copy gene families and one‐to‐one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information of ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae.

9.
10.
In molecular ecology, the development of efficient molecular markers for fungi remains an important research domain. The nuclear ribosomal internal transcribed spacer (ITS) region was proposed as a universal DNA barcode marker for fungi, but this marker has been criticized for indel‐induced alignment problems and its potential lack of phylogenetic resolution. Our main aim was to develop a new phylogenetic gene and a putative functional marker, from a single‐copy gene, to describe fungal diversity. Thus, we developed a series of primers to amplify a polymorphic region of the Glycoside Hydrolase GH63 gene, encoding exo‐acting α‐glucosidases, in basidiomycetes. These primers were validated on 125 different fungal genomic DNAs, and GH63 amplification yield was compared with that of already published functional markers targeting genes coding for laccases, N‐acetylhexosaminidases, cellobiohydrolases and class II peroxidases. Specific amplicons were recovered for 95% of the fungal species tested, and GH63 amplification success was strikingly higher than rates obtained with other functional genes. We downloaded the GH63 sequences from 483 fungal genomes publicly available at the JGI MycoCosm database. GH63 was present in 461 fungal genomes belonging to all phyla, except the Microsporidia and Neocallimastigomycota divisions. Moreover, the phylogenetic trees built with both GH63 and Rpb1 protein sequences revealed that GH63 is also a promising phylogenetic marker. Finally, a very high proportion of GH63 proteins was predicted to be secreted. This molecular tool could be a new phylogenetic marker of fungal species as well as a potential indicator of the functional diversity of basidiomycete fungal communities in terms of secretory capacities.

11.
Molecular biology has entrenched the gene as the basic hereditary unit and genomes are often considered little more than collections of genes. However, new concepts and genomic data have emerged, which suggest that the genome has a unique place in the hierarchy of life. Despite this, a framework for the genome as a major evolutionary transition has not been fully developed. Instead, genome origin and evolution are frequently considered as a series of neutral or nonadaptive events. In this article, we argue for a Darwinian multilevel selection interpretation for the origin of the genome. We base our arguments on the multilevel selection theory of hypercycles of cooperative interacting genes and predictions that gene‐level trade‐offs in viability and reproduction can help drive evolutionary transitions. We consider genomic data involving mobile genetic elements as a test case of our view. A new concept of the genome as a discrete evolutionary unit emerges and the gene–genome juncture is positioned as a major evolutionary transition in individuality. This framework offers a fresh perspective on the origin of macromolecular life and sets the scene for a new, emerging line of inquiry—the evolutionary ecology of the genome.

12.
Microsatellites, also known as simple sequence repeats (SSRs), are among the most commonly used marker types in evolutionary and ecological studies. Next Generation Sequencing techniques such as 454 pyrosequencing allow the rapid development of microsatellite markers in nonmodel organisms. 454 pyrosequencing is a straightforward approach to develop a high number of microsatellite markers. Therefore, developing microsatellites using 454 pyrosequencing has become the method of choice for marker development. Here, we describe a user‐friendly way of microsatellite development from 454 pyrosequencing data and analyse data sets of 17 nonmodel species (plants, fungi, invertebrates, birds and a mammal) for microsatellite repeats and flanking regions suitable for primer development. We then compare the numbers of successfully lab‐tested microsatellite markers for the various species and furthermore describe diverse challenges that might arise in different study species, for example, large genome size or nonpure extraction of genomic DNA. Successful primer identification was feasible for all species. We found that in species for which large repeat numbers are uncommon, such as fungi, polymorphic markers can nevertheless be developed from 454 pyrosequencing reads containing small repeat numbers (five to six repeats). Furthermore, even with Next Generation Sequencing techniques, developing microsatellite markers was more costly and time‐consuming for species with large genomes than for species with smaller genomes. In this study, we showed that depending on the species, a different amount of 454 pyrosequencing data might be required for successful identification of a sufficient number of microsatellite markers for ecological genetic studies.
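The first step in this kind of marker development is scanning reads for short tandem repeats. The following is a minimal sketch of such a scan using a regular expression; it is not the tool used in the study. The function name `find_microsatellites`, the motif length range of 2 to 6 bp, and the example read are assumptions, while the default threshold of five repeats mirrors the "five to six repeats" mentioned in the abstract.

```python
import re


def find_microsatellites(sequence, min_repeats=5, min_unit=2, max_unit=6):
    """Return (start, motif, repeat_count) for each tandem repeat found.

    A motif of min_unit..max_unit bases must occur at least min_repeats
    times in a row. The lazy quantifier makes the shortest motif win, so
    ACACACACAC is reported as (AC)5 rather than as a longer composite unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits


# Hypothetical read, not data from the study.
print(find_microsatellites("TTGACACACACACGG"))  # [(3, 'AC', 5)]
```

A real workflow would additionally check that enough flanking sequence remains on both sides of each repeat to place primers, which this sketch does not attempt.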

13.
Chiang CW  Derti A  Schwartz D  Chou MF  Hirschhorn JN  Wu CT 《Genetics》2008,180(4):2277-2293
Ultraconserved elements (UCEs) are sequences that are identical between reference genomes of distantly related species. As they are under negative selection and enriched near or in specific classes of genes, one explanation for their ultraconservation may be their involvement in important functions. Indeed, many UCEs can drive tissue-specific gene expression. We have demonstrated that nonexonic UCEs are depleted among segmental duplications (SDs) and copy number variants (CNVs) and proposed that their ultraconservation may reflect a mechanism of copy counting via comparison. Here, we report that nonexonic UCEs are also depleted among 10 of 11 recent genomewide data sets of human CNVs, including 3 obtained with strategies permitting greater precision in determining the extents of CNVs. We further present observations suggesting that nonexonic UCEs per se may contribute to this depletion and that their apparent dosage sensitivity was in effect when they became fixed in the last common ancestor of mammals, birds, and reptiles, consistent with dosage sensitivity contributing to ultraconservation. Finally, in searching for the mechanism(s) underlying the function of nonexonic UCEs, we have found that they are enriched in TAATTA, which is also the recognition sequence for the homeodomain DNA-binding module, and bounded by a change in A + T frequency.

14.
15.
Whole‐genome‐shotgun (WGS) sequencing of total genomic DNA was used to recover ~1 Mbp of novel mitochondrial (mtDNA) sequence from Pinus sylvestris (L.) and three members of the closely related Pinus mugo species complex. DNA was extracted from megagametophyte tissue from six mother trees from locations across Europe, and 100‐bp paired‐end sequencing was performed on the Illumina HiSeq platform. Candidate mtDNA sequences were identified by their size and coverage characteristics, and by comparison with published plant mitochondrial genomes. Novel variants were identified, and primers targeting these loci were trialled on a set of 28 individuals from across Europe. In total, 31 SNP loci were successfully resequenced, characterizing 15 unique haplotypes. This approach offers a cost‐effective means of developing marker resources for mitochondrial genomes in other plant species where reference sequences are unavailable.

16.
High‐throughput sequencing (HTS) is central to the study of population genomics and has an increasingly important role in constructing phylogenies. Choices in research design for sequencing projects can include a wide range of factors, such as sequencing platform, depth of coverage and bioinformatic tools. Simulating HTS data better informs these decisions, as users can validate software by comparing output to the known simulation parameters. However, current standalone HTS simulators cannot generate variant haplotypes under even somewhat complex evolutionary scenarios, such as recombination or demographic change. This greatly reduces their usefulness for fields such as population genomics and phylogenomics. Here I present the R package jackalope that simply and efficiently simulates (i) sets of variant haplotypes from a reference genome and (ii) reads from both Illumina and Pacific Biosciences platforms. Haplotypes can be simulated using phylogenies, gene trees, coalescent‐simulation output, population‐genomic summary statistics, and Variant Call Format (VCF) files. jackalope can simulate single, paired‐end or mate‐pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing and optical/PCR duplicates. It can read reference genomes from fasta files and can simulate new ones, and all outputs can be written to standard file formats. jackalope is available for Mac, Windows and Linux systems.

17.
Targeted capture and enrichment approaches have proven effective for phylogenetic study. Ultraconserved elements (UCEs) in particular have exhibited great utility for phylogenomic analyses, with the software package phyluce being among the most utilized pipelines for UCE phylogenomics, including probe design. Despite the success of UCEs, it is becoming increasingly apparent that diverse lineages require probe sets tailored to focal taxa in order to improve locus recovery. However, factors affecting probe design and methods for optimizing probe sets to focal taxa remain underexplored. Here, we use newly available beetle (Coleoptera) genomic resources to investigate factors affecting UCE probe set design using phyluce. In particular, we explore the effects of stringency during initial design steps, as well as base genome choice on resulting probe sets and locus recovery. We found that both base genome choice and initial bait design stringency parameters greatly alter the number of resultant probes included in final probe sets and strongly affect the number of loci detected and recovered during in silico testing of these probe sets. In addition, we identify attributes of base genomes that correlated with high performance in probe design. Ultimately, we provide a recommended workflow for using phyluce to design an optimized UCE probe set that will work across a targeted lineage, and use our findings to develop a new, open‐source UCE probe set for beetles of the suborder Adephaga.

18.
• Premise of the study: Genome survey sequences (GSS) from massively parallel sequencing have potential to provide large, cost-effective data sets for phylogenetic inference, replace single gene or spacer regions as DNA barcodes, and provide a plethora of data for other comparative molecular evolution studies. Here we report on the application of this method to estimating the molecular phylogeny of core Asparagales, investigating plastid gene losses, assembling complete plastid genomes, and determining the type and quality of assembled genomic data attainable from Illumina 80-120-bp reads. • Methods: We sequenced total genomic DNA from samples in two lineages of monocotyledonous plants, Poaceae and Asparagales, on the Illumina platform in a multiplex arrangement. We compared reference-based assemblies to de novo contigs, evaluated consistency of assemblies resulting from use of various reference sequences, and assessed our methods to obtain sequence assemblies in nonmodel taxa. • Key results: Our method returned reliable, robust organellar and nrDNA sequences in a variety of plant lineages. High quality assemblies are not dependent on genome size, amount of plastid present in the total genomic DNA template, or relatedness of available reference sequences for assembly. Phylogenetic results revealed familial and subfamilial relationships within Asparagales with high bootstrap support, although the monotypic genus Aphyllanthes was placed with only moderate confidence. • Conclusions: The well-supported molecular phylogeny provides evidence for delineation of subfamilies within core Asparagales. With advances in technology and bioinformatics tools, the use of massively parallel sequencing will continue to become easier and more affordable for phylogenomic and molecular evolutionary biology investigations.

19.
20.
Based on published information, we have identified 991 genes and gene-family clusters for cattle and 764 for pigs that have orthologues in the human genome. The relative linear locations of these genes on human sequence maps were used as "rulers" to annotate bovine and porcine genomes based on a CSAM (contiguous sets of autosomal markers) approach. A CSAM is an uninterrupted set of markers in one genome (primary genome; the human genome in this study) that is syntenic in the other genome (secondary genome; the bovine and porcine genomes in this study). The analysis revealed 81 conserved syntenies and 161 CSAMs between human and bovine autosomes and 50 conserved syntenies and 95 CSAMs between human and porcine autosomes. Using the human sequence map as a reference, these 991 and 764 markers could correlate 72 and 74% of the human genome with the bovine and porcine genomes, respectively. Based on the number of contiguous markers in each CSAM, we classified these CSAMs into five size groups as follows: singletons (one marker only), small (2-4 markers), medium (5-10 markers), large (11-20 markers), and very large (> 20 markers). Several bovine and porcine chromosomes appear to be represented as di-CSAM repeats in a tandem or dispersed way on human chromosomes. The number of potential CSAMs for which no markers are currently available was estimated to be 63 between human and bovine genomes and 18 between human and porcine genomes. These results provide basic guidelines for further gene and QTL mapping of the bovine and porcine genomes, as well as insight into the evolution of mammalian genomes.
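The CSAM definition and the five size groups given in this abstract can be sketched in a few lines. This is a simplified illustration rather than the authors' procedure: it treats a CSAM as a run of consecutive markers along a human chromosome that all map to the same chromosome in the secondary genome, and it ignores marker order within the secondary genome. The function names, the tuple layout, and the toy markers are assumptions.

```python
from itertools import groupby


def size_group(n_markers):
    """Map a CSAM's marker count to the five size groups named in the abstract."""
    if n_markers < 1:
        raise ValueError("a CSAM contains at least one marker")
    if n_markers == 1:
        return "singleton"
    if n_markers <= 4:
        return "small"
    if n_markers <= 10:
        return "medium"
    if n_markers <= 20:
        return "large"
    return "very large"


def find_csams(markers):
    """Group markers into uninterrupted runs along the primary genome.

    markers: iterable of (human_chromosome, human_position, secondary_chromosome).
    Returns a list of (human_chromosome, secondary_chromosome, marker_count).
    """
    ordered = sorted(markers, key=lambda m: (m[0], m[1]))
    # groupby only merges adjacent items, so an intervening marker from
    # another secondary chromosome splits a run into two CSAMs.
    return [
        (human_chrom, secondary_chrom, len(list(run)))
        for (human_chrom, secondary_chrom), run in groupby(
            ordered, key=lambda m: (m[0], m[2])
        )
    ]


# Hypothetical markers, not data from the study.
toy = [("1", 10, "3"), ("1", 20, "3"), ("1", 30, "16"), ("1", 40, "3"), ("2", 5, "11")]
for human_chrom, secondary_chrom, count in find_csams(toy):
    print(human_chrom, secondary_chrom, count, size_group(count))
```

In the toy data the marker at position 30 interrupts the run on secondary chromosome 3, so human chromosome 1 yields one small CSAM and two singletons instead of a single three-marker block.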
