Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Traditional approaches for sequencing insertion ends of bacterial artificial chromosome (BAC) libraries are laborious and expensive, which are currently some of the bottlenecks limiting a better understanding of the genomic features of auto‐ or allopolyploid species. Here, we developed a highly efficient and low‐cost BAC end analysis protocol, named BAC‐anchor, to identify paired‐end reads containing large internal gaps. Our approach mainly focused on the identification of high‐throughput sequencing reads carrying restriction enzyme cutting sites and searching for large internal gaps based on the mapping locations of both ends of the reads. We sequenced and analysed eight libraries containing over 3 200 000 BAC end clones derived from the BAC library of the tetraploid potato cultivar C88 digested with two restriction enzymes, Cla I and Mlu I. About 25% of the BAC end reads carrying cutting sites generated a 60–100 kb internal gap in the potato DM reference genome, which was consistent with the mapping results of Sanger sequencing of the BAC end clones and indicated large differences between autotetraploid and haploid genotypes in potato. A total of 5341 Cla I‐ and 165 Mlu I‐derived unique reads were distributed on different chromosomes of the DM reference genome and could be used to establish a physical map of target regions and assemble the C88 genome. The reads that matched different chromosomes are especially significant for the further assembly of complex polyploid genomes. Our study provides an example of analysing high‐coverage BAC end libraries with low sequencing cost and is a resource for further genome sequencing studies.

2.
The 1.5 Gbp/2C genome of pedunculate oak (Quercus robur) has been sequenced. A strategy was established for dealing with the challenges imposed by the sequencing of such a large, complex and highly heterozygous genome by a whole‐genome shotgun (WGS) approach, without the use of costly and time‐consuming methods, such as fosmid or BAC clone‐based hierarchical sequencing methods. The sequencing strategy combined short and long reads. Over 49 million reads provided by Roche 454 GS‐FLX technology were assembled into contigs and combined with shorter Illumina sequence reads from paired‐end and mate‐pair libraries of different insert sizes, to build scaffolds. Errors were corrected and gaps filled with Illumina paired‐end reads and contaminants detected, resulting in a total of 17 910 scaffolds (>2 kb) corresponding to 1.34 Gb. Fifty per cent of the assembly was accounted for by 1468 scaffolds (N50 of 260 kb). Initial comparison with the phylogenetically related Prunus persica gene model indicated that genes for 84.6% of the proteins present in peach (mean protein coverage of 90.5%) were present in our assembly. The second and third steps in this project are genome annotation and the assignment of scaffolds to the oak genetic linkage map. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement, the oak genome data have been released into public sequence repositories in advance of publication. In this presubmission paper, the oak genome consortium describes its principal lines of work and future directions for analyses of the nature, function and evolution of the oak genome.

3.
4.
Phylogenetic relationships among temperate species of bamboo are difficult to resolve, owing to both the challenge of detecting sufficiently variable markers and their polyploid history. Here, we use restriction site–associated DNA sequencing to identify candidate loci with fixed allelic differences segregating between and within two temperate species of bamboos: Arundinaria faberi and Yushania brevipaniculata. Approximately 27 million paired‐end sequencing reads were generated across four samples. From pooled data, we assembled 67 685 and 70 668 de novo contigs from partial overlap among paired‐end reads, with an average length of 240 and 241 bp for the two species, respectively, which were used to investigate functional classification of RAD tags in a blastx search. Analysed separately by population, we recovered 29 443 putatively orthologous RAD tags shared across the four sampled populations, containing 28 023 sequence variants, of which c. 13 000 are segregating between species, and c. 3000 segregating between populations within each species. Analyses based on these RAD tags yielded robust phylogenetic inferences, even with a data set constructed from surprisingly few loci. This study illustrates the potential for reduced‐representation genome data to resolve difficult phylogenetic relationships in temperate bamboos.

5.
Bivalves, a highly diverse and the most evolutionarily successful class of invertebrates native to aquatic habitats, provide valuable molecular resources for understanding evolutionary adaptation and aquatic ecology. Here, we report a high‐quality chromosome‐level genome assembly of the razor clam Sinonovacula constricta using Pacific Bioscience single‐molecule real‐time sequencing, Illumina paired‐end sequencing, 10X Genomics linked‐reads and Hi‐C reads. The genome size was 1,220.85 Mb, with a scaffold N50 of 65.93 Mb and a contig N50 of 976.94 Kb. A total of 899 complete (91.92%) and seven partial (0.72%) matches of the 978 metazoa Benchmarking Universal Single‐Copy Orthologs were determined in this genome assembly. Hi‐C scaffolding of the genome resulted in 19 pseudochromosomes. A total of 28,594 protein‐coding genes were predicted in the S. constricta genome, of which 25,413 genes (88.88%) were functionally annotated. In addition, 39.79% of the assembled genome was composed of repetitive sequences, and 4,372 noncoding RNAs were identified. The enrichment analyses of the significantly expanded and contracted genes suggested an evolutionary adaptation of S. constricta to highly stressful living environments. In summary, the genomic resources generated in this work not only provide a valuable reference genome for investigating the molecular mechanisms of S. constricta biological functions and evolutionary adaptation, but also facilitate its genetic improvement and disease treatment. Meanwhile, the obtained genome greatly improves our understanding of the genetics of molluscs and their comparative evolution.

6.
7.
Research in evolutionary biology involving nonmodel organisms is rapidly shifting from using traditional molecular markers such as mtDNA and microsatellites to higher throughput SNP genotyping methodologies to address questions in population genetics, phylogenetics and genetic mapping. Restriction site associated DNA sequencing (RAD sequencing or RADseq) has become an established method for SNP genotyping on Illumina sequencing platforms. Here, we developed a protocol and adapters for double‐digest RAD sequencing for Ion Torrent (Life Technologies; Ion Proton, Ion PGM) semiconductor sequencing. We sequenced thirteen genomic libraries of three different nonmodel vertebrate species on Ion Proton with PI chips: Arctic charr Salvelinus alpinus, European whitefish Coregonus lavaretus and common lizard Zootoca vivipara. This resulted in ~962 million single‐end reads overall and a mean of ~74 million reads per library. We filtered the genomic data using Stacks, a bioinformatic tool to process RAD sequencing data. On average, we obtained ~11 000 polymorphic loci per library of 6–30 individuals. We validate our new method by technical and biological replication, by reconstructing phylogenetic relationships, and using a hybrid genetic cross to track genomic variants. Finally, we discuss the differences between using the different sequencing platforms in the context of RAD sequencing, assessing possible advantages and disadvantages. We show that our protocol can be used for Ion semiconductor sequencing platforms for the rapid and cost‐effective generation of variable and reproducible genetic markers.

8.
Soybean cyst nematode (SCN, Heterodera glycines) is a major pest of soybean that is spreading across major soybean production regions worldwide. Increased SCN virulence has recently been observed in both the United States and China. However, no study has reported a genome assembly for H. glycines at the chromosome scale. Herein, the first chromosome‐level reference genome of X12, an unusual SCN race with high infection ability, is presented. Using whole‐genome shotgun (WGS) sequencing, Pacific Biosciences (PacBio) sequencing, Illumina paired‐end sequencing, 10X Genomics linked reads and high‐throughput chromatin conformation capture (Hi‐C) genome scaffolding techniques, a 141.01‐megabase (Mb) assembled genome was obtained with scaffold and contig N50 sizes of 16.27 Mb and 330.54 kilobases (kb), respectively. The assembly showed high integrity and quality, with over 90% of Illumina reads mapped to the genome. The assembly quality was evaluated using Core Eukaryotic Genes Mapping Approach and Benchmarking Universal Single‐Copy Orthologs. A total of 11,882 genes were predicted using de novo, homolog and RNAseq data generated from eggs, second‐stage juveniles (J2), third‐stage juveniles (J3) and fourth‐stage juveniles (J4) of X12, and 79.0% of homologous sequences were annotated in the genome. These high‐quality X12 genome data will provide valuable resources for research in a broad range of areas, including fundamental nematode biology, SCN–plant interactions and co‐evolution, and also contribute to the development of technology for overall SCN management.

9.
Many economically important crops have large and complex genomes that hamper their sequencing by standard methods such as whole genome shotgun (WGS). Large tracts of methylated repeats occur in plant genomes that are interspersed by hypomethylated gene‐rich regions. Gene‐enrichment strategies based on methylation profiles offer an alternative to sequencing repetitive genomes. Here, we have applied methyl filtration with McrBC endonuclease digestion to enrich for euchromatic regions in the sugarcane genome. To verify the efficiency of methylation filtration and the assembly quality of sequences submitted to the gene‐enrichment strategy, we have compared assemblies using methyl‐filtered (MF) and unfiltered (UF) libraries. The use of methyl filtration allowed a better assembly by filtering out 35% of the sugarcane genome and by producing 1.5× more scaffolds and 1.7× more assembled Mb in length compared with the unfiltered dataset. The coverage of sorghum coding sequences (CDS) by MF scaffolds was at least 36% higher than by the use of UF scaffolds. Using MF technology, we increased by 134× the coverage of gene regions of the monoploid sugarcane genome. The MF reads assembled into scaffolds that covered all genes of the sugarcane bacterial artificial chromosomes (BACs), 97.2% of sugarcane expressed sequence tags (ESTs), 92.7% of sugarcane RNA‐seq reads and 98.4% of sorghum protein sequences. Analysis of MF scaffolds from encoded enzymes of the sucrose/starch pathway discovered 291 single‐nucleotide polymorphisms (SNPs) in the wild sugarcane species, S. spontaneum and S. officinarum. A large number of microRNA genes was also identified in the MF scaffolds. The information achieved by the MF dataset provides a valuable tool for genomic research in the genus Saccharum and for improvement of sugarcane as a biofuel crop.

10.
The European rabbit (Oryctolagus cuniculus) is a domesticated species with one of the broadest ranges of economic and scientific applications and fields of investigation. Rabbit genome information and assembly are available (oryCun2.0), but so far few studies have investigated its variability, and massive discovery of polymorphisms has not been published yet for this species. Here, we sequenced two reduced representation libraries (RRLs) to identify single nucleotide polymorphisms (SNPs) in the rabbit genome. Genomic DNA of 10 rabbits belonging to different breeds was pooled and digested with two restriction enzymes (HaeIII and RsaI) to create two RRLs which were sequenced using the Ion Torrent Personal Genome Machine. The two RRLs produced 2 917 879 and 4 046 871 reads, for a total of 280.51 Mb (248.49 Mb with quality >20) and 417.28 Mb (360.89 Mb with quality >20) respectively of sequenced DNA. About 90% and 91% respectively of the obtained reads were mapped on the rabbit genome, covering a total of 15.82% of the oryCun2.0 genome version. The mapping and ad hoc filtering procedures allowed 62 491 SNPs to be reliably called. SNPs in a few genomic regions were validated by Sanger sequencing. The Variant Effect Predictor Web tool was used to map SNPs on the current version of the rabbit genome. The obtained results will be useful for many applied and basic research programs for this species and will contribute to the development of cost‐effective solutions for high‐throughput SNP genotyping in the rabbit.

11.
Despite recent advances in high‐throughput sequencing, difficulties are often encountered when developing microsatellites for species with large and complex genomes. This probably reflects the close association in many species of microsatellites with cryptic repetitive elements. We therefore developed a novel approach for isolating polymorphic microsatellites from the club‐legged grasshopper (Gomphocerus sibiricus), an emerging quantitative genetic and behavioral model system. Whole genome shotgun Illumina MiSeq sequencing was used to generate over three million 300 bp paired‐end reads, of which 67.75% were grouped into 40,548 clusters within RepeatExplorer. Annotations of the top 468 clusters, which represent 60.5% of the reads, revealed homology to satellite DNA and a variety of transposable elements. Evaluating 96 primer pairs in eight wild‐caught individuals, we found that primers mined from singleton reads were six times more likely to amplify a single polymorphic microsatellite locus than primers mined from clusters. Our study provides experimental evidence in support of the notion that microsatellites associated with repetitive elements are less likely to successfully amplify. It also reveals how advances in high‐throughput sequencing and graph‐based repetitive DNA analysis can be leveraged to isolate polymorphic microsatellites from complex genomes.

12.
13.
14.
Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole‐genome shotgun sequencing of the nuclear genome of flax. Seven paired‐end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep‐coverage (approximately 94× raw, approximately 69× filtered) short‐sequence reads (44–100 bp), produced a set of scaffolds with N50 = 694 kb, including contigs with N50 = 20.1 kb. The contig assembly contained 302 Mb of non‐redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole‐genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis‐assembly of regions at the genome scale. A total of 43 384 protein‐coding genes were predicted in the whole‐genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (Ks) observed within duplicate gene pairs was consistent with a recent (5–9 MYA) whole‐genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam‐A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole‐genome shotgun short‐sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species.

15.
Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia, and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal‐contaminated farmlands. Thus, the genome sequence of ramie was determined to explore the molecular basis of its fibre quality, protein content and phytoremediation. To further understand the ramie genome, different paired‐end and mate‐pair libraries were combined to generate 134.31 Gb of raw DNA sequences using the Illumina whole‐genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on ramie genome annotations, 30,237 protein‐coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single‐copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single‐copy gene families and one‐to‐one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information of ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae.

16.
Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non‐model organisms is hampered by the scarcity of cost‐effective approaches to uncover genome‐wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD‐PE (Restriction site Associated DNA Paired‐End sequencing) approach. RAD tags were generated from the PstI‐digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired‐end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N50 = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD‐PE as an inexpensive genome‐wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.

17.
Microsporidia are highly successful parasites that infect virtually all known animal lineages, including the model Danio rerio (zebrafish). The widespread use of this aquatic model for biomedical research has resulted in an unexpected increase in infections from the microsporidium Pseudoloma neurophilia, which can lead to significant physical, behavioral, and immunological modifications, resulting in nonprotocol variation during experimental procedures. Here, we seek to obtain insights into the biology of P. neurophilia by investigating its genome content, which was obtained from only 29 nanograms of DNA using the MiSeq technology and paired‐end Illumina sequencing. We found that the genome of P. neurophilia is phylogenetically and genetically related to other fish‐microsporidians, but features unique to this intracellular parasite are also found. The small 5.25‐Mb genome assembly includes 1,139 unique open‐reading frames and an unusually high number of transposable elements for such a small genome. Investigations of intragenomic diversity also provided strong indications that the mononucleate nucleus of this species is diploid. Overall, our study provides insights into the dynamics of microsporidian genomes and a solid sequence reference to be used in future studies of host–parasite interactions using the zebrafish D. rerio and P. neurophilia as a model.

18.
19.
The Tetraodontidae family is known to have relatively small and compact genomes compared to other vertebrates. The obscure puffer fish Takifugu obscurus is an anadromous species that migrates to freshwater from the sea for spawning. Thus the euryhaline characteristics of T. obscurus have been investigated to gain understanding of their survival ability, osmoregulation, and other homeostatic mechanisms in both freshwater and seawater. In this study, a high quality chromosome‐level reference genome for T. obscurus was constructed using long‐read Pacific Biosciences (PacBio) Sequel sequencing and a Hi‐C‐based chromatin contact map platform. The final genome assembly of T. obscurus is 381 Mb, with a contig N50 length of 3,296 kb and longest length of 10.7 Mb, from a total of 62 Gb of raw reads generated using single‐molecule real‐time sequencing technology from a PacBio Sequel platform. The PacBio data were further clustered into chromosome‐scale scaffolds using a Hi‐C approach, resulting in a 373 Mb genome assembly with a contig N50 length of 15.2 Mb and longest length of 28 Mb. When we directly compared the 22 longest scaffolds of T. obscurus to the 22 chromosomes of the tiger puffer Takifugu rubripes, a clear one‐to‐one orthologous relationship was observed between the two species, supporting the chromosome‐level assembly of T. obscurus. This genome assembly can serve as a valuable genetic resource for exploring fugu‐specific compact genome characteristics, and will provide essential genomic information for understanding molecular adaptations to salinity fluctuations and the evolution of osmoregulatory mechanisms.

20.
It is well known that parasitoids are attracted to volatiles emitted by host‐damaged plants; however, this tritrophic interaction may change if plants are attacked by more than one herbivore species. The larval parasitoid Cotesia flavipes Cameron (Hymenoptera: Braconidae) has been used intensively in Brazil to control the sugarcane borer, Diatraea saccharalis Fabricius (Lepidoptera: Pyralidae) in sugarcane crops, where Spodoptera frugiperda (JE Smith) (Lepidoptera: Noctuidae), a non‐stemborer lepidopteran, is also a pest. Here, we investigated the ability of C. flavipes to discriminate between an unsuitable host (S. frugiperda) and a suitable host (D. saccharalis) based on herbivore‐induced plant volatiles (HIPVs) emitted by sugarcane, and whether multiple herbivory (D. saccharalis feeding on stalk + S. frugiperda feeding on leaves) in sugarcane affected the attractiveness of HIPVs to C. flavipes. Olfactometer assays indicated that volatiles of host and non‐host‐damaged plants were attractive to C. flavipes. Even though host‐ and non‐host‐damaged plants emitted considerably different volatile blends, neither naïve nor experienced wasps discriminated suitable and unsuitable hosts by means of HIPVs emitted by sugarcane. With regard to multiple herbivory, wasps innately preferred the odor blend emitted by sugarcane upon non‐host + host herbivory over host‐only damaged plants. Multiple herbivory caused a suppression of some volatiles relative to non‐host‐damaged sugarcane that may have resulted from the unaltered levels of jasmonic acid in host‐damaged plants, or from reduced palatability of host‐damaged plants to S. frugiperda. In conclusion, our study showed that C. flavipes responds to a wide range of plant volatile blends, and does not discriminate host from non‐host and non‐stemborer caterpillars based on HIPVs emitted from sugarcane. Moreover, we showed that multiple herbivory by the sugarcane borer and fall armyworm increases the attractiveness of sugarcane plants to the parasitoids.
