Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
High‐throughput DNA sequencing technologies now make it possible to sequence entire genomes relatively easily. Complete genomic information obtained by whole‐genome resequencing (WGS) can aid in identifying and delineating species even if they are extremely young, cryptic, or morphologically difficult to discern and closely related. Yet, for taxonomic or conservation biology purposes, WGS can remain cost‐prohibitive, too time‐consuming, and often constitute a “data overkill.” Rapid and reliable identification of species (and populations) that is also cost‐effective is made possible by species‐specific markers that can be discovered by WGS. Based on WGS data, we designed a PCR restriction fragment length polymorphism (PCR‐RFLP) assay for 19 Neotropical Midas cichlid populations (Amphilophus cf. citrinellus) that includes all 13 described species of this species complex. Our work illustrates that identification of species and populations (i.e., fish from different lakes) can be greatly improved by designing genetic markers using available “high resolution” genomic information. Yet, our work also shows that even in the best‐case scenario, when whole‐genome resequencing information is available, unequivocal assignments remain challenging when species or populations diverged very recently, or gene flow persists. In summary, we provide a comprehensive workflow on how to design RFLP markers based on genome resequencing data, how to test and evaluate their reliability, and discuss the benefits and pitfalls of our approach.
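The principle behind a PCR‐RFLP marker can be sketched in a few lines: a SNP that creates or abolishes a restriction site changes the fragment lengths obtained after digesting the amplicon. The sketch below is only an illustration under stated assumptions; the two toy amplicons are invented, the function name `digest` is hypothetical, and EcoRI (recognition site GAATTC, cutting after the first G) is used purely as a familiar example, not as an enzyme reported in the abstract.

```python
# Illustration only: toy sequences and helper names are hypothetical.

def digest(amplicon, site, cut_offset):
    """Return fragment lengths after cutting `amplicon` at every `site`.

    `cut_offset` is the position inside the recognition site where the
    top strand is cut (1 for EcoRI, G^AATTC).
    """
    cuts = []
    start = 0
    while True:
        idx = amplicon.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1  # allow overlapping sites
    bounds = [0] + cuts + [len(amplicon)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Two invented alleles that differ by a single SNP inside the EcoRI site.
allele_a = "ATGCCGTAGAATTCGGCTAACGT"  # carries GAATTC
allele_b = "ATGCCGTAGAGTTCGGCTAACGT"  # A>G substitution abolishes the site

frags_a = digest(allele_a, "GAATTC", 1)  # allele is cut into two fragments
frags_b = digest(allele_b, "GAATTC", 1)  # allele stays uncut

print(frags_a, frags_b)
```

On a gel, the two alleles would be distinguished by one uncut band versus two shorter bands; a real assay would additionally need primer design and validation across populations, as the abstract describes.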

2.
3.
Aldabrachelys gigantea (Aldabra giant tortoise) is one of only two giant tortoise species left in the world and survives as a single wild population of over 100,000 individuals on Aldabra Atoll, Seychelles. Despite this large current population size, the species faces an uncertain future because of its extremely restricted distribution range and high vulnerability to the projected consequences of climate change. Captive‐bred A. gigantea are increasingly used in rewilding programs across the region, where they are introduced to replace extinct giant tortoises in an attempt to functionally resurrect degraded island ecosystems. However, there has been little consideration of the current levels of genetic variation and differentiation within and among the islands on Aldabra. As previous microsatellite studies were inconclusive, we combined low‐coverage and double‐digest restriction‐associated DNA (ddRAD) sequencing to analyze samples from 33 tortoises (11 from each main island). Using 5426 variant sites within the tortoise genome, we detected patterns of within‐island population structure, but no differentiation between the islands. These unexpected results highlight the importance of using genome‐wide genetic markers to capture higher‐resolution genetic structure to inform future management plans, even in a seemingly panmictic population. We show that low‐coverage ddRAD sequencing provides an affordable alternative approach to conservation genomic projects of non‐model species with large genomes.

4.
Radish (Raphanus sativus L., n = 9) is one of the major vegetables in Asia. Since the genomes of Brassica and related species including radish underwent genome rearrangement, it is quite difficult to perform functional analysis based on the reported genomic sequence of Brassica rapa. Therefore, we performed genome sequencing of radish. Short reads of genomic sequences of 191.1 Gb were obtained by next-generation sequencing (NGS) for a radish inbred line, and 76,592 scaffolds of ≥300 bp were constructed along with the bacterial artificial chromosome-end sequences. Finally, the whole draft genomic sequence of 402 Mb spanning 75.9% of the estimated genomic size and containing 61,572 predicted genes was obtained. Subsequently, 221 single nucleotide polymorphism markers and 768 PCR-RFLP markers were used together with the 746 markers produced in our previous study for the construction of a linkage map. The map was combined further with another radish linkage map constructed mainly with expressed sequence tag-simple sequence repeat markers into a high-density integrated map of 1,166 cM with 2,553 DNA markers. A total of 1,345 scaffolds were assigned to the linkage map, spanning 116.0 Mb. Bulked PCR products amplified by 2,880 primer pairs were sequenced by NGS, and SNPs in eight inbred lines were identified.

5.
We used synthetic oligonucleotide DNA probes specific for the four-base repetitive core sequences (GACA)n and (AGGC)n to examine human genomic variation. The results of hybridizing these oligonucleotides to human genomic digests indicate that they are useful and accessible markers for ubiquitously repeated regions of DNA in the human genome. Furthermore, these sequences appear to be highly conserved in eukaryotic genomes, but their function remains largely unknown.

6.
All neurodegenerative diseases feature aggregates, which usually contain disease‐specific diagnostic proteins; non‐protein constituents, however, have rarely been explored. Aggregates from SY5Y‐APPSw neuroblastoma, a cell model of familial Alzheimer's disease, were crosslinked and sequences of linked peptides identified. We constructed a normalized “contactome” comprising 11 subnetworks, centered on 24 high‐connectivity hubs. Remarkably, all 24 are nucleic acid‐binding proteins. This led us to isolate and sequence RNA and DNA from Alzheimer's and control aggregates. RNA fragments were mapped to the human genome by RNA‐seq and DNA by ChIP‐seq. Nearly all aggregate RNA sequences mapped to specific genes, whereas DNA fragments were predominantly intergenic. These nucleic acid mappings are all significantly nonrandom, making an artifactual origin extremely unlikely. RNA (mostly cytoplasmic) exceeded DNA (chiefly nuclear) by twofold to fivefold. RNA fragments recovered from AD tissue were ~1.5‐ to 2.5‐fold more abundant than those recovered from control tissue, similar to the increase in protein. Aggregate abundances of specific RNA sequences were strikingly differential between cultured SY5Y‐APPSw glioblastoma cells expressing APOE3 vs. APOE4, consistent with APOE4 competition for E‐box/CLEAR motifs. We identified many G‐quadruplex and viral sequences within RNA and DNA of aggregates, suggesting that sequestration of viral genomes may have driven the evolution of disordered nucleic acid‐binding proteins. After RNA‐interference knockdown of the translational‐procession factor EEF2 to suppress translation in SY5Y‐APPSw cells, the RNA content of aggregates declined by >90%, while reducing protein content by only 30% and altering DNA content by ≤10%. This implies that cotranslational misfolding of nascent proteins may ensnare polysomes into aggregates, accounting for most of their RNA content.

7.
The aphid Schlechtendalia chinensis is an economically important insect that can induce horned galls, which are valuable for the medicinal and chemical industries. Up to now, more than twenty aphid genomes have been reported. Most of the sequenced genomes are derived from free‐living aphids. Here, we generated a high‐quality genome assembly from a galling aphid. The final genome assembly is 271.52 Mb, representing one of the smallest sequenced genomes of aphids. The contig and scaffold N50 values of the assembly are 3.77 Mb and 20.41 Mb, respectively. Ninety‐seven percent of the assembled sequences were anchored onto 13 chromosomes. Based on BUSCO analysis, the assembly contained 96.9% of the conserved arthropod and 98.5% of the conserved Hemiptera single‐copy orthologous genes. A total of 14,089 protein‐coding genes were predicted. Phylogenetic analysis revealed that S. chinensis diverged from the common ancestor of Eriosoma lanigerum approximately 57 million years ago (MYA). In addition, 35 genes encoding salivary gland proteins were differentially expressed when S. chinensis forms a gall, suggesting they have potential roles in gall formation and plant defense suppression. Taken together, this high‐quality S. chinensis genome assembly and annotation provide a solid genetic foundation for future research to reveal the mechanism of gall formation and to explore the interaction between aphids and their host plants.
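For readers unfamiliar with the N50 statistic quoted in this abstract, a minimal sketch of the usual definition follows: N50 is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The function name `n50` and the toy contig lengths are invented for illustration and are unrelated to the S. chinensis assembly.

```python
# Illustration only: the contig lengths below are invented.

def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no sequence lengths given")

toy_contigs = [80, 70, 50, 40, 30, 20, 10]  # total length 300
contig_n50 = n50(toy_contigs)
print(contig_n50)
```

Because the statistic depends only on the length distribution, a high N50 indicates contiguity but says nothing about correctness, which is why assemblies are usually also checked with measures such as the BUSCO completeness reported above.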

8.
The case rate of Q fever in Europe has increased dramatically in recent years, mainly because of an epidemic in the Netherlands in 2009. Consequently, there is a need for more extensive genetic characterization of the disease agent Coxiella burnetii in order to better understand the epidemiology and spread of this disease. Genome reference data are essential for this purpose, but only thirteen genome sequences are currently available. Current methods for typing C. burnetii are criticized because their results are difficult to compare across laboratories, because they require genomic control DNA, and/or because they rely on markers in highly variable regions. In this work, we developed a method for single nucleotide polymorphism (SNP) typing of C. burnetii isolates and tissue samples based on new assays targeting ten phylogenetically stable synonymous canonical SNPs (canSNPs). These canSNPs represent previously known phylogenetic branches and were here identified from sequence comparisons of twenty-one C. burnetii genomes, eight of which were sequenced in this work. Importantly, synthetic control templates were developed to make the method useful to laboratories lacking genomic control DNA. An analysis of twenty-one C. burnetii genomes confirmed that the species exhibits high sequence identity. Most of its SNPs (7,493/7,559 shared by >1 genome) follow a clonal inheritance pattern and are therefore stable phylogenetic typing markers. The assays were validated using twenty-six genetically diverse C. burnetii isolates and three tissue samples from small ruminants infected during the epidemic in the Netherlands. Each sample was assigned to a clade. Synthetic controls (vector and PCR amplified) gave identical results compared to the corresponding genomic controls and are viable alternatives to genomic DNA. The results from the described method indicate that it could be useful for cheap and rapid disease source tracking at non-specialized laboratories, which requires accurate genotyping, assay accessibility and inter-laboratory comparisons.

9.
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or complex genomic arrangements. While TEs strongly affect genome function and evolution, most current de novo assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly-parallel library preparation and local assembly of short read data and which achieve lengths of 1.5–18.5 Kbp with an extremely low error rate (0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism Drosophila melanogaster (reference genome strain y; cn, bw, sp) achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long-reads, offer a powerful approach to improve de novo assemblies of whole genomes.

10.
11.
DNA gel-blot and in situ hybridization with genome-specific repeated sequences have proven to be valuable tools in analyzing genome structure and relationships in species with complex allopolyploid genomes such as hexaploid oat (Avena sativa L., 2n = 6x = 42; AACCDD genome). In this report, we describe a systematic approach for isolating genome-, chromosome-, and region-specific repeated and low-copy DNA sequences from oat that can presumably be applied to any complex genome species. Genome-specific DNA sequences were first identified in a random set of A. sativa genomic DNA cosmid clones by gel-blot hybridization using labeled genomic DNA from different Avena species. Because no repetitive sequences were identified that could distinguish between the A and D genomes, sequences specific to these two genomes are referred to as A/D genome specific. A/D or C genome specific DNA subfragments were used as screening probes to identify additional genome-specific cosmid clones in the A. sativa genomic library. We identified clustered and dispersed repetitive DNA elements for the A/D and C genomes that could be used as cytogenetic markers for discrimination of the various oat chromosomes. Some analyzed cosmids appeared to be composed entirely of genome-specific elements, whereas others represented regions with genome- and non-specific repeated sequences with interspersed low-copy DNA sequences. Thus, genome-specific hybridization analysis of restriction digests of random and selected A. sativa cosmids also provides insight into the sequence organization of the oat genome.

12.
The value of genome-specific repetitive DNA sequences for use as molecular markers in studying genome differentiation was investigated. Five repetitive DNA sequences from wild species of rice were cloned. Four of the clones, pOm1, pOm4, pOmA536, and pOmPB10, were isolated from Oryza minuta accession 101141 (BBCC genomes), and one clone, pOa237, was isolated from Oryza australiensis accession 100882 (EE genome). Southern blot hybridization to different rice genomes showed strong hybridization of all five clones to O. minuta genomic DNA and no cross hybridization to genomic DNA from Oryza sativa (AA genome). The pOm1 and pOmA536 sequences showed cross hybridization only to all of the wild rice species containing the C genome. However, the pOm4, pOmPB10, and pOa237 sequences showed cross hybridization to O. australiensis genomic DNA in addition to showing hybridization to the O. minuta genomic DNA.

13.
Low-biomass samples from nitrate and heavy metal contaminated soils yield DNA amounts that have limited use for direct, native analysis and screening. Multiple displacement amplification (MDA) using φ29 DNA polymerase was used to amplify whole genomes from environmental, contaminated, subsurface sediments. By first amplifying the genomic DNA (gDNA), biodiversity analysis and gDNA library construction of microbes found in contaminated soils were made possible. The MDA method was validated by analyzing amplified genome coverage from approximately five Escherichia coli cells, resulting in 99.2% genome coverage. The method was further validated by confirming overall representative species coverage and also an amplification bias when amplifying from a mix of eight known bacterial strains. We extracted DNA from samples with extremely low cell densities from a U.S. Department of Energy contaminated site. After amplification, small-subunit rRNA analysis revealed relatively even distribution of species across several major phyla. Clone libraries were constructed from the amplified gDNA, and a small subset of clones was used for shotgun sequencing. BLAST analysis of the library clone sequences showed that 64.9% of the sequences had significant similarities to known proteins, and “clusters of orthologous groups” (COG) analysis revealed that more than half of the sequences from each library contained sequence similarity to known proteins. The libraries can be readily screened for native genes or any target of interest. Whole-genome amplification of metagenomic DNA from very minute microbial sources, while introducing an amplification bias, will allow access to genomic information that was not previously accessible.

14.
Simple sequence repeats (SSRs) are widely used genetic markers in ecology, evolution, and conservation even in the genomics era, while a general limitation to their application is the difficulty of developing polymorphic SSR markers. Next‐generation sequencing (NGS) offers the opportunity for the rapid development of SSRs; however, previous studies that developed SSRs using genomic data from only one individual needed redundant experiments to test the polymorphism of the SSRs. In this study, we designed a pipeline for the rapid development of polymorphic SSR markers from multi‐sample genomic data. We used bioinformatic software to genotype multiple individuals using resequencing data and detected highly polymorphic SSRs prior to experimental validation, which significantly improved the efficiency and reduced the experimental effort. The pipeline was successfully applied to a globally threatened species, the brown eared‐pheasant (Crossoptilon mantchuricum), which showed very low genomic diversity. The 20 newly developed SSR markers were highly polymorphic, and their average number of alleles was much higher than the genomic average. We also evaluated the effect of the number of individuals and sequencing depth on the SSR mining results, and we found that 10 individuals and ~10X sequencing data were enough to obtain a sufficient number of polymorphic SSRs, even for species with low genetic diversity. Furthermore, the genome assembly of NGS data from the optimal number of individuals and sequencing depth can be used as an alternative reference genome if a high‐quality genome is not available. Our pipeline provides a paradigm for the application of NGS technology to mining and developing molecular markers for ecological and evolutionary studies.

15.
Population genomic analyses have demonstrated power to address major questions in evolutionary and molecular microbiology. Collecting populations of genomes is hindered in many microbial species by the absence of a cost effective and practical method to collect ample quantities of sufficiently pure genomic DNA for next-generation sequencing. Here we present a simple method to amplify genomes of a target microbial species present in a complex, natural sample. The selective whole genome amplification (SWGA) technique amplifies target genomes using nucleotide sequence motifs that are common in the target microbe genome, but rare in the background genomes, to prime the highly processive phi29 polymerase. SWGA thus selectively amplifies the target genome from samples in which it originally represented a minor fraction of the total DNA. The post-SWGA samples are enriched in target genomic DNA, which is ideal for population resequencing. We demonstrate the efficacy of SWGA using both laboratory-prepared mixtures of cultured microbes as well as a natural host–microbe association. Targeted amplification of Borrelia burgdorferi mixed with Escherichia coli at genome ratios of 1:2000 resulted in >10^5-fold amplification of the target genomes with <6.7-fold amplification of the background. SWGA-treated genomic extracts from Wolbachia pipientis-infected Drosophila melanogaster resulted in up to 70% of high-throughput resequencing reads mapping to the W. pipientis genome. By contrast, 2–9% of sequencing reads were derived from W. pipientis without prior amplification. The SWGA technique results in high sequencing coverage at a fraction of the sequencing effort, thus allowing population genomic studies at affordable costs.
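The motif-selection idea behind SWGA (motifs common in the target genome but rare in the background) can be sketched with simple k-mer counting. This is a deliberately reduced illustration, not the authors' procedure: the function names, thresholds, and toy sequences are assumptions, and a practical primer search would also need to consider both strands, motif spacing along the genome, and primer melting temperature.

```python
# Illustration only: sequences, thresholds, and helper names are hypothetical.
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer on the given strand of `seq`."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def selective_motifs(target, background, k, min_target, max_background):
    """Return k-mers seen at least `min_target` times in the target and
    at most `max_background` times in the background, in sorted order."""
    t = kmer_counts(target, k)
    b = kmer_counts(background, k)
    return sorted(
        motif for motif, n in t.items()
        if n >= min_target and b.get(motif, 0) <= max_background
    )

toy_target = "GATCGATCGATCAAGATC"        # invented "target genome"
toy_background = "AAAAAACCCCCCGATCAAAA"  # invented "background genome"

motifs = selective_motifs(toy_target, toy_background, k=4,
                          min_target=2, max_background=0)
print(motifs)
```

Raw counts are used here only because the toy sequences have similar lengths; for genomes of very different sizes, per-base motif frequencies would be the more meaningful quantity to compare.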

16.
Advances in DNA synthesis and assembly methods over the past decade have made it possible to construct genome-size fragments from oligonucleotides. Early work focused on synthesis of small viral genomes, followed by hierarchical synthesis of wild-type bacterial genomes and subsequently on transplantation of synthesized bacterial genomes into closely related recipient strains. More recently, a synthetic designer version of yeast Saccharomyces cerevisiae chromosome III has been generated, with numerous changes from the wild-type sequence without having an impact on cell fitness and phenotype, suggesting plasticity of the yeast genome. A project to generate the first synthetic yeast genome - the Sc2.0 Project - is currently underway.

17.
To improve the metagenomic analysis of complex microbiomes, we have repurposed restriction endonucleases as methyl-specific DNA binding proteins. As an example, we use DpnI immobilized on magnetic beads. The ten-minute extraction technique allows specific binding of genomes containing the DpnI Gm6ATC motif common in the genomic DNA of many bacteria including γ-proteobacteria. Using synthetic genome mixtures, we demonstrate 80% recovery of Escherichia coli genomic DNA even when only femtogram quantities are spiked into 10 µg of human DNA background. Binding is very specific with less than 0.5% of human DNA bound. Next Generation Sequencing of input and enriched synthetic mixtures results in over 100-fold enrichment of target genomes relative to human and plant DNA. We also show comparable enrichment when sequencing complex microbiomes such as those from creek water and human saliva. The technique can be broadened to other restriction enzymes allowing for the selective enrichment of trace and unculturable organisms from complex microbiomes and the stratification of organisms according to restriction enzyme enrichment.

18.
Background and Aims: Tandemly repeated DNA and transposable elements represent most of the DNA in higher plant genomes. High-throughput sequencing allows a survey of the DNA in a genome, but whole-genome assembly can miss a substantial fraction of highly repeated sequence motifs. Chrysanthemum nankingense (2n = 2x = 18; genome size = 3.07 Gb; Asteraceae), a diploid reference for the many auto- and allopolyploids in the genus, was considered as an ancestral species and serves as an ornamental plant and high-value food. We aimed to characterize the major repetitive DNA motifs, understand their structure and identify key features that are shaped by genome and sequence evolution. Methods: Graph-based clustering with RepeatExplorer was used to identify and classify repetitive motifs in 2.14 million 250-bp paired-end Illumina reads from total genomic DNA of C. nankingense. Independently, the frequency of all canonical motifs k-bases long was counted in the raw read data and abundant k-mers (16, 21, 32, 64 and 128) were extracted and assembled to generate longer contigs for repetitive motif identification. For comparison, long terminal repeat retrotransposons were checked in the published C. nankingense reference genome. Fluorescent in situ hybridization was performed to show the chromosomal distribution of the main types of repetitive motifs. Key Results: Apart from rDNA (0.86 % of the total genome), a few microsatellites (0.16 %), and telomeric sequences, no highly abundant tandem repeats were identified. There were many transposable elements: 40 % of the genome had sequences with recognizable domains related to transposable elements. Long terminal repeat retrotransposons showed widespread distribution over chromosomes, although different sequence families had characteristic features such as abundance at or exclusion from centromeric or subtelomeric regions. Another group of very abundant repetitive motifs, including those most identified as low-complexity sequences (9.07 %) in the genome, showed no similarity to known sequence motifs or tandemly repeated elements. Conclusions: The Chrysanthemum genome has an unusual structure with a very low proportion of tandemly repeated sequences (~1.02 %) in the genome, and a high proportion of low-complexity sequences, most likely degenerated remains of transposable elements. Identifying the presence, nature and genomic organization of major genome fractions enables inference of the evolutionary history of sequences, including degeneration and loss, critical to understanding biodiversity and diversification processes in the genomes of diploid and polyploid Chrysanthemum, Asteraceae and plants more widely.

19.
Recent pan-genome studies have revealed an abundance of DNA sequences in human genomes that are not present in the reference genome. A lion’s share of these non-reference sequences (NRSs) cannot be reliably assembled or placed on the reference genome. Improvements in long-read and synthetic long-read (aka linked-read) technologies have great potential for the characterization of NRSs. While synthetic long reads require less input DNA than long-read datasets, they are algorithmically more challenging to use. Except for computationally expensive whole-genome assembly methods, there is no synthetic long-read method for NRS detection. We propose a novel integrated alignment-based and local assembly-based algorithm, Novel-X, that uses the barcode information encoded in synthetic long reads to improve the detection of such events without a whole-genome de novo assembly. Our evaluations demonstrate that Novel-X finds many non-reference sequences that cannot be found by state-of-the-art short-read methods. We applied Novel-X to a diverse set of 68 samples from the Polaris HiSeq 4000 PGx cohort. Novel-X discovered 16 691 NRS insertions of size > 300 bp (total length 18.2 Mb). Many of them are population specific or may have a functional impact.

20.
In sequenced genomes of prokaryotes, anomalous DNA (aDNA) can be recognized, among others, by atypical clustering of dinucleotides. We hypothesized that atypical clustering of hexameric endonuclease recognition sites in aDNA allows the specific isolation of anomalous sequences in vitro. Clustering of endonuclease recognition sites in aDNA regions of eight published prokaryotic genome sequences was demonstrated. In silico digestion of the Neisseria meningitidis MC58 genome, using four selected endonucleases, revealed that of the 27 small fragments predicted (<5 kb), 21 were located in known genomic islands. Of the 24 calculated fragments (>300 bp and <5 kb), 22 met our criteria for aDNA, i.e. a high dinucleotide dissimilarity and/or aberrant GC content. The four enzymes also allowed the identification of aDNA fragments from the related Z2491 strain. Similarly, the sequenced genomes of three strains of Escherichia coli assessed by in silico digestion using XbaI yielded strain-specific sets of fragments of anomalous composition. In vitro applicability of the method was demonstrated by using adaptor-linked PCR, yielding the predicted fragments from the N. meningitidis MC58 genome. In conclusion, this strategy allows the selective isolation of aDNA from prokaryotic genomes by a simple restriction digest–amplification–cloning–sequencing scheme.
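The in silico digestion step described in this abstract can be sketched as follows: locate every recognition site, split the sequence there, keep fragments within a size window, and inspect their GC content. The sketch makes several simplifying assumptions that are not taken from the abstract: the sequence is treated as linear, each cut is placed at the start of the recognition site rather than at the enzyme's exact cleavage position, the toy sequence and the size window are invented, and the function names are hypothetical. XbaI's recognition site (TCTAGA) is used because the abstract mentions that enzyme.

```python
# Illustration only: toy sequence, size window, and helper names are hypothetical.
import re

def in_silico_digest(sequence, sites):
    """Split a linear sequence at the start of every recognition site."""
    cuts = sorted({
        m.start()
        for site in sites
        # Zero-width lookahead so overlapping sites are all reported.
        for m in re.finditer(f"(?={re.escape(site)})", sequence)
    })
    bounds = [0] + [c for c in cuts if 0 < c < len(sequence)] + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:])]

def gc_content(fragment):
    """Fraction of G and C bases in a fragment."""
    return (fragment.count("G") + fragment.count("C")) / len(fragment)

XBA_I = "TCTAGA"
toy_sequence = "GGGCCC" + XBA_I + "ATATAT" + XBA_I + "GCGCGC"

fragments = in_silico_digest(toy_sequence, [XBA_I])
# Keep fragments inside a (toy) size window, mirroring the >300 bp / <5 kb filter.
selected = [f for f in fragments if 10 <= len(f) <= 20]

for frag in selected:
    print(len(frag), round(gc_content(frag), 2))
```

In the study, fragments passing the size filter were then compared against compositional criteria (dinucleotide dissimilarity and GC content); only the GC part is illustrated here, and the dinucleotide measure would require a genome-wide baseline that this toy example does not have.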
