Similar articles
20 similar articles found (search time: 31 ms)
1.
Multiplexed high-throughput pyrosequencing is currently limited in complexity (number of samples sequenced in parallel), and in capacity (number of sequences obtained per sample). Physical-space segregation of the sequencing platform into a fixed number of channels allows limited multiplexing, but obscures available sequencing space. To overcome these limitations, we have devised a novel barcoding approach to allow for pooling and sequencing of DNA from independent samples, and to facilitate subsequent segregation of sequencing capacity. Forty-eight forward–reverse barcode pairs are described: each forward and each reverse barcode unique with respect to at least 4 nt positions. With improved read lengths of pyrosequencers, combinations of forward and reverse barcodes may be used to sequence from as many as n² independent libraries for each set of ‘n’ forward and ‘n’ reverse barcodes, for each defined set of cloning-linkers. In two pilot series of barcoded sequencing using the GS20 Sequencer (454/Roche), we found that over 99.8% of obtained sequences could be assigned to 25 independent, uniquely barcoded libraries based on the presence of either a perfect forward or a perfect reverse barcode. The false-discovery rate, as measured by the percentage of sequences with unexpected perfect pairings of unmatched forward and reverse barcodes, was estimated to be <0.005%.
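
The barcoding scheme in this abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the 6-nt barcodes, the names `FORWARD`, `REVERSE`, `demultiplex` and `max_libraries`, and the assumption that the forward barcode sits at the start of the read and the reverse barcode at its end (in read orientation) are all hypothetical. The sketch checks the "differs in at least 4 nt positions" property, assigns reads by perfect barcode match, and counts the n × n forward/reverse combinations.

```python
from itertools import combinations
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical 6-nt barcodes for illustration; NOT the 48 pairs from the paper.
FORWARD: Dict[str, str] = {"ACGTAC": "F1", "TGCATG": "F2"}
REVERSE: Dict[str, str] = {"GATCGA": "R1", "CTAGCT": "R2"}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length barcodes differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes: Iterable[str]) -> int:
    """Smallest Hamming distance between any two barcodes in the set."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


def demultiplex(read: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (forward id, reverse id); None where no perfect match is found.

    Assumption: the forward barcode starts the read and the reverse barcode
    ends it, both stored in the orientation in which they appear in the read.
    """
    fwd = next((name for bc, name in FORWARD.items() if read.startswith(bc)), None)
    rev = next((name for bc, name in REVERSE.items() if read.endswith(bc)), None)
    return fwd, rev


def max_libraries(n_forward: int, n_reverse: int) -> int:
    """Number of distinct forward/reverse combinations (n^2 when both are n)."""
    return n_forward * n_reverse
```

A read carrying only one perfect barcode still yields a partial assignment such as `("F1", None)`, which mirrors the abstract's rule of assigning on either a perfect forward or a perfect reverse barcode.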

2.
We propose a general working strategy to deal with incomplete reference libraries in the DNA barcoding identification of species. Considering that (1) queries with a large genetic distance from their best DNA barcode match are more likely to be misidentified and (2) imposing a distance threshold profitably reduces identification errors, we modelled the relationship between identification performance and distance thresholds in four DNA barcode libraries of Diptera (n = 4270), Lepidoptera (n = 7577), Hymenoptera (n = 2067) and Tephritidae (n = 602 DNA barcodes). In all cases, more restrictive distance thresholds produced a gradual increase in the proportion of true negatives, a gradual decrease of false positives and more abrupt variations in the proportions of true positives and false negatives. More restrictive distance thresholds improved precision, yet negatively affected accuracy due to the higher proportions of queries discarded (i.e. those whose distance to the best match exceeded the threshold). Using a simple linear regression we calculated an ad hoc distance threshold for the tephritid library producing an estimated relative identification error <0.05. As expected, when we used this threshold for the identification of 188 independently collected tephritids, less than 5% of queries with a query-to-best-match distance below the threshold were misidentified. Ad hoc thresholds can be calculated for each particular reference library of DNA barcodes and should be used as a cut-off defining whether we can proceed with identifying the query with a known estimated error probability (e.g. 5%) or whether we should discard the query and consider alternative/complementary identification methods.
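
The cut-off rule described in this abstract can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' procedure: the function name `identify`, the dictionary input and the example distances are hypothetical, and the regression used to derive the ad hoc threshold is not reproduced. The sketch only shows the decision step: take the best match, then discard the query if its distance exceeds the threshold.

```python
from typing import Dict, Optional


def identify(distances: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the best-matching species, or None if the query is discarded.

    `distances` maps each reference species to the genetic distance between
    the query and that species' closest barcode (hypothetical input format).
    """
    if not distances:
        return None
    best_species, best_distance = min(distances.items(), key=lambda kv: kv[1])
    # Queries whose best match lies above the threshold are discarded rather
    # than identified, trading coverage for a lower expected error rate.
    if best_distance > threshold:
        return None
    return best_species
```

A discarded query (return value `None`) is the case where the abstract recommends alternative or complementary identification methods.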

3.
Gene, 1997, 186(1): 135-142
The generation of expressed sequence tags (ESTs) depends on the arbitrary selection of individual cDNA clones from libraries. The efficiency of this process reflects the clonal structure of the library used and can be significantly increased using size selected, directional, normalized cDNA libraries. This strategy, however, is not readily applicable when mRNA is limiting, as is the case in the study of complex microorganisms such as parasites, fetal tissues or tumor biopsies. We show here that the construction and systematic sequencing of minilibraries of cDNAs produced by arbitrarily primed PCR provides an alternative means of efficiently generating ESTs in situations where only nanogram quantities of RNA are available. This methodology greatly compensates for unequal message abundance, avoids the need for complex library construction, is equally applicable to the analysis of abundant or rare biological material and is ideally suited to multicenter programmes.

4.
Goodman AL, Wu M, Gordon JI. Nature Protocols, 2011, 6(12): 1969-1980
Insertion sequencing (INSeq) is a method for determining the insertion site and relative abundance of large numbers of transposon mutants in a mixed population of isogenic mutants of a sequenced microbial species. INSeq is based on a modified mariner transposon containing MmeI sites at its ends, allowing cleavage at chromosomal sites 16-17 bp from the inserted transposon. Genomic regions adjacent to the transposons are amplified by linear PCR with a biotinylated primer. Products are bound to magnetic beads, digested with MmeI and barcoded with sample-specific linkers appended to each restriction fragment. After limited PCR amplification, fragments are sequenced using a high-throughput instrument. The sequence of each read can be used to map the location of a transposon in the genome. Read count measures the relative abundance of that mutant in the population. Solid-phase library preparation makes this protocol rapid (18 h), easy to scale up, amenable to automation and useful for a variety of samples. A protocol for characterizing libraries of transposon mutant strains clonally arrayed in a multiwell format is provided.

5.
Fly larvae living on corpses can be used to estimate post-mortem intervals. The identification of these flies is decisive in forensic casework and can be facilitated by using DNA barcodes provided that a representative and comprehensive reference library of DNA barcodes is available. We constructed a local (Belgium and France) reference library of 85 sequences of the COI DNA barcode fragment (mitochondrial cytochrome c oxidase subunit I gene), from 16 fly species of forensic interest (Calliphoridae, Muscidae, Fanniidae). This library was then used to evaluate the ability of two public libraries (GenBank and the Barcode of Life Data Systems – BOLD) to identify specimens from Belgian and French forensic cases. The public libraries indeed allow a correct identification of most specimens. Yet, some of the identifications remain ambiguous and some forensically important fly species are not, or insufficiently, represented in the reference libraries. Several search options offered by GenBank and BOLD can be used to further improve the identifications obtained from both libraries using DNA barcodes.

6.
Two new diphasmid vectors (lambda SK17 and SK22) and a novel procedure to construct linking libraries are described. A partial filling-in reaction provides counter-selection against false linking clones in the library, and obviates the need for supF selection. The diphasmid vectors, in combination with the novel selection procedure, have been used to construct a chromosome 3 specific NotI linking library from a human chromosome 3/mouse microcell hybrid cell line. The application of the new vectors and the strong biochemical and biological selections resulted in a library of 60,000 NotI linking clones. As practically all of them are real NotI linking clones (no false recombinants) the library represents approximately 3,000 human recombinants (equal to 10-15 genomic equivalents of chromosome 3). Previously published methods for construction of linking libraries are compared with the procedure described in the present paper. The advantages of the new vectors and the novel protocol are discussed.

7.
Using an Escherichia coli-Streptomyces shuttle vector derived from a bacterial artificial chromosome (BAC), we developed methodologies for the construction of BAC libraries of filamentous actinomycetes. Libraries of Streptomyces coelicolor, the model actinomycete, and Planobispora rosea, a genetically intractable strain, were constructed. Both libraries have an average insert size of 60 kb, with maximal insert larger than 150 kb. The S. coelicolor library was evaluated by selected hybridisations to DraI fragments and by end sequencing of a few clones. Hybridisation of the P. rosea library to selected probes indicates a good representation of the P. rosea genome and that the library can be used to facilitate the genomic analysis of this actinomycete.

8.
Libraries constructed in bacterial artificial chromosome (BAC) vectors have become the choice for clone sets in high throughput genomic sequencing projects primarily because of their high stability. BAC libraries have been proposed as a source for minimally overlapping clones for sequencing large genomic regions, and the use of BAC end sequences (i.e. sequences adjoining the insert sites) has been proposed as a primary means for selecting minimally overlapping clones for sequencing large genomic regions. For this strategy to be effective, high throughput methods for BAC end sequencing of all the clones in deep coverage BAC libraries needed to be developed. Here we describe a low cost, efficient, 96 well procedure for BAC end sequencing. These methods allow us to generate BAC end sequences from human and Arabidopsis libraries with an average read length of >450 bases and with a single pass sequencing average accuracy of >98%. Application of BAC end sequences in genomic sequencing is discussed.

9.
In an effort to identify and characterize genes expressed during multicellular development in Dictyostelium, we have undertaken a cDNA sequencing project. Using size-fractionated subsets of cDNA from the first finger stage, two sets of gridded libraries were constructed for cDNA sequencing. One, library S, consisting of 9984 clones, carries relatively short inserts, and the other, library L, which consists of 8448 clones, has longer inserts. We sequenced all the selected clones in library S from their 3'-ends, and this generated 3093 non-redundant, expressed sequence tags (ESTs). Among them, 246 ESTs hit known Dictyostelium genes and 910 showed significant similarity to genes of Dictyostelium and other organisms. For library L, 1132 clones were randomly sequenced and 471 non-redundant ESTs were obtained. In combination, the ESTs from the two libraries represent approximately 40% of genes expressed in late development, assuming that the non-redundant ESTs correspond to independent genes. They will provide a useful resource for investigating the genetic networks that regulate multicellular development of this organism.

10.
Complex microbial communities remain poorly characterized despite their ubiquity and importance to human and animal health, agriculture, and industry. Attempts to describe microbial communities by either traditional microbiological methods or molecular methods have been limited in both scale and precision. The availability of genomics technologies offers an unprecedented opportunity to conduct more comprehensive characterizations of microbial communities. Here we describe the application of an established molecular diagnostic method based on the chaperonin-60 sequence, in combination with high-throughput sequencing, to the profiling of a microbial community: the pig intestinal microbial community. Four libraries of cloned cpn60 sequences were generated by two genomic DNA extraction procedures in combination with two PCR protocols. A total of 1,125 cloned cpn60 sequences from the four libraries were sequenced. Among the 1,125 cloned cpn60 sequences, we identified 398 different nucleotide sequences encoding 280 unique peptide sequences. Pairwise comparisons of the 398 unique nucleotide sequences revealed a high degree of sequence diversity within the library. Identification of the likely taxonomic origins of cloned sequences ranged from imprecise, with clones assigned to a taxonomic subclass, to precise, for cloned sequences with 100% DNA sequence identity with a species in our reference database. The compositions of the four libraries were compared and differences related to library construction parameters were observed. Our results indicate that this method is an alternative to 16S rRNA sequence-based studies which can be scaled up for the purpose of performing a potentially comprehensive assessment of a given microbial community or for comparative studies.

11.
Next-generation sequencing (NGS) technologies have transformed genomic research and have the potential to revolutionize clinical medicine. However, the background error rates of sequencing instruments and limitations in targeted read coverage have precluded the detection of rare DNA sequence variants by NGS. Here we describe a method, termed CypherSeq, which combines double-stranded barcoding error correction and rolling circle amplification (RCA)-based target enrichment to vastly improve NGS-based rare variant detection. The CypherSeq methodology involves the ligation of sample DNA into circular vectors, which contain double-stranded barcodes for computational error correction and adapters for library preparation and sequencing. CypherSeq is capable of detecting rare mutations genome-wide as well as those within specific target genes via RCA-based enrichment. We demonstrate that CypherSeq is capable of correcting errors incurred during library preparation and sequencing to reproducibly detect mutations down to a frequency of 2.4 × 10⁻⁷ per base pair, and report the frequency and spectra of spontaneous and ethyl methanesulfonate-induced mutations across the Saccharomyces cerevisiae genome.

12.
Screening large numbers of target regions in multiple DNA samples for sequence variation is an important application of next-generation sequencing but an efficient method to enrich the samples in parallel has yet to be reported. We describe an advanced method that combines DNA samples using indexes or barcodes prior to target enrichment to facilitate this type of experiment. Sequencing libraries for multiple individual DNA samples, each incorporating a unique 6-bp index, are combined in equal quantities, enriched using a single in-solution target enrichment assay and sequenced in a single reaction. Sequence reads are parsed based on the index, allowing sequence analysis of individual samples. We show that the use of indexed samples does not impact on the efficiency of the enrichment reaction. For three- and nine-indexed HapMap DNA samples, the method was found to be highly accurate for SNP identification. Even with sequence coverage as low as 8x, 99% of sequence SNP calls were concordant with known genotypes. Within a single experiment, this method can sequence the exonic regions of hundreds of genes in tens of samples for sequence and structural variation using as little as 1 μg of input DNA per sample.

13.
Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be “domesticated” for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7–15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

14.
Rice is an important crop and a model system for monocot genomics, and is a target for whole genome sequencing by the International Rice Genome Sequencing Project (IRGSP). The IRGSP is using a clone by clone approach to sequence rice based on minimum tiles of BAC or PAC clones. For chromosomes 10 and 3 we are using an integrated physical map based on two fingerprinted and end-sequenced BAC libraries to identify a minimum tiling path of clones. In this study we constructed and tested two rice genomic libraries with an average insert size of 10 kb (10-kb library) to support the gap closure and finishing phases of the rice genome sequencing project. The HaeIII library contains 166,752 clones covering approximately 4.6x rice genome equivalents with an average insert size of 10.5 kb. The Sau3AI library contains 138,960 clones covering 4.2x genome equivalents with an average insert size of 11.6 kb. Both libraries were gridded in duplicate onto 11 high-density filters in a 5 x 5 pattern to facilitate screening by hybridization. The libraries contain an unbiased coverage of the rice genome with less than 5% contamination by clones containing organelle DNA or no insert. An efficient method was developed, consisting of pooled overgo hybridization, the selection of 10-kb gap spanning clones using end sequences, transposon sequencing and utilization of in silico draft sequence, to close relatively small gaps between sequenced BAC clones. Using this method we were able to close a majority of the gaps (up to approximately 50 kb) identified during the finishing phase of chromosome-10 sequencing. This method represents a useful way to close clone gaps and thus to complete the entire rice genome.
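
The coverage figures in this abstract follow the usual relation: coverage = number of clones × mean insert size / genome size. The Python sketch below is an illustration, not the authors' calculation. It inverts that relation to show the genome size implied by the stated numbers. The function name is hypothetical, and the genome size itself is not given in the abstract.

```python
def implied_genome_size_mb(n_clones: int, mean_insert_kb: float, coverage: float) -> float:
    """Genome size (Mb) implied by a library's clone count, insert size and coverage.

    Rearranged from: coverage = n_clones * mean_insert_kb / genome_size_kb.
    """
    if coverage <= 0:
        raise ValueError("coverage must be positive")
    total_insert_kb = n_clones * mean_insert_kb
    return total_insert_kb / coverage / 1000.0  # kb -> Mb


# Figures quoted in the abstract for the two 10-kb libraries.
hae3 = implied_genome_size_mb(166_752, 10.5, 4.6)   # HaeIII library
sau3a = implied_genome_size_mb(138_960, 11.6, 4.2)  # Sau3AI library
print(f"HaeIII: {hae3:.1f} Mb, Sau3AI: {sau3a:.1f} Mb")
```

Both libraries imply a genome size in the low 380s of megabases, so the two coverage figures are mutually consistent to within rounding of the quoted "approximately 4.6x" and "4.2x".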

15.
Here we demonstrate a method for unbiased multiplexed deep sequencing of RNA and DNA libraries using a novel, efficient and adaptable barcoding strategy called Post Amplification Ligation-Mediated (PALM). PALM barcoding is performed as the very last step of library preparation, eliminating a potential barcode-induced bias and allowing the flexibility to synthesize as many barcodes as needed. We sequenced PALM barcoded micro RNA (miRNA) and DNA reference samples and evaluated the quantitative barcode-induced bias in comparison to the same reference samples prepared using the Illumina TruSeq barcoding strategy. The Illumina TruSeq small RNA strategy introduces the barcode during the PCR step using differentially barcoded primers, while the TruSeq DNA strategy introduces the barcode before the PCR step by ligation of differentially barcoded adaptors. Results show virtually no bias between the differentially barcoded miRNA and DNA samples, both for the PALM and the TruSeq sample preparation methods. We also multiplexed miRNA reference samples using a pre-PCR barcode ligation. This barcoding strategy results in significant bias.

16.
Chloroplast genomes supply indispensable information that helps improve phylogenetic resolution and can even serve as organelle-scale barcodes. Next-generation sequencing technologies have helped promote sequencing of complete chloroplast genomes, but compared with the number of angiosperms, relatively few chloroplast genomes have been sequenced. There are two major reasons for the paucity of completely sequenced chloroplast genomes: (i) massive amounts of fresh leaves are needed for chloroplast sequencing and (ii) there are considerable gaps in the sequenced chloroplast genomes of many plants because of the difficulty of isolating high-quality chloroplast DNA, preventing complete chloroplast genomes from being assembled. To overcome these obstacles, we analysed all angiosperm chloroplast genomes available to date and designed nine universal primer pairs corresponding to the highly conserved regions. Using these primers, angiosperm whole chloroplast genomes can be amplified using long-range PCR and sequenced using next-generation sequencing methods. The primers showed high universality, which was tested using 24 species representing major clades of angiosperms. To validate the functionality of the primers, eight species representing major groups of angiosperms, that is, early-diverging angiosperms, magnoliids, monocots, Saxifragales, fabids, malvids and asterids, were sequenced and their complete chloroplast genomes assembled. In our trials, only 100 mg of fresh leaves was used. The results show that the universal primer set provided an easy, effective and feasible approach for sequencing whole chloroplast genomes in angiosperms. The designed universal primer pairs make it possible to accelerate genome-scale data acquisition and will therefore improve phylogenetic resolution and species identification in angiosperms.

17.
Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062–147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217–73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples.

18.
SAGE of the developing wheat caryopsis (total citations: 2; self-citations: 1; citations by others: 1)

19.
The Cytophaga-Flavobacterium group is known to be abundant in aquatic ecosystems and to have a potentially unique role in the utilization of organic material. However, relatively little is known about the diversity and abundance of uncultured members of this bacterial group, in part because they are underrepresented in clone libraries of 16S rRNA genes. To circumvent a suspected bias in PCR, a primer set was designed to amplify 16S rRNA genes from the Cytophaga-Flavobacterium group and was used to construct a library of these genes from the Delaware Estuary. This library had several novel Cytophaga-like 16S rRNA genes, of which about 40% could be grouped together into two clusters (DE clusters 1 and 2) defined by sequences initially observed only in the Delaware library; the other 16S rRNA genes were classified into an additional four clades containing sequences from other environments. An oligonucleotide probe was designed for the cluster with the most clones (DE cluster 2) and was used in fluorescence in situ hybridization assays. Bacteria in DE cluster 2 accounted for about 10% of the total prokaryotic abundance in the Delaware Estuary and in a depth profile of the Chukchi Sea (Arctic Ocean). The presence of DE cluster 2 in the Arctic Ocean was confirmed by results from 16S rRNA clone libraries. The contribution of this cluster to the total bacterial biomass is probably larger than is indicated by the abundance of its members, because the average cell volume of bacteria in DE cluster 2 was larger than those of other bacteria and prokaryotes in the Delaware Estuary and Chukchi Sea. DE cluster 2 may be one of the more abundant bacterial groups in the Delaware Estuary and possibly other marine environments.

20.
To help develop an understanding of the genes that govern the developmental characteristics of the potato (Solanum tuberosum), as well as the genes associated with responses to specified pathogens and storage conditions, The Canadian Potato Genome Project (CPGP) carried out 5′ end sequencing of regular, normalized and full-length cDNA libraries of the Shepody potato cultivar, generating over 66,600 expressed sequence tags (ESTs). Libraries sequenced represented tuber developmental stages, pathogen-challenged tubers, as well as leaf, floral developmental stages, suspension cultured cells and roots. All libraries analysed to date have contributed unique sequences, with the normalized libraries high on the list. In addition, a low molecular weight library has enhanced the 3′ ends of our sequence assemblies. Using the combined assembly dataset, unique tuber developmental, cold storage and pathogen-challenged sequences have been identified. A comparison of the ESTs specific to the pathogen-challenged tuber and foliar libraries revealed minimal overlap between these libraries. Mixed assemblies using over 189,000 potato EST sequences from CPGP and The Institute for Genomic Research (TIGR) have revealed common sequences, as well as CPGP- and TIGR-unique sequences. Electronic supplementary material is available for this article and accessible for authorised users.
