Similar literature
Found 20 similar articles (search time: 15 ms)
1.
《TARGETS》2003,2(6):245-252
Comparative genomic sequencing and analysis offers a new wealth of information for target selection and the development of therapeutics. This article focuses on two key innovations in mapping and sequencing: first, shotgun sequencing of clone pools to combine the benefits of whole-genome shotgun and clone-by-clone strategies, and second, the leveraging of newly available assembled genomic sequences to improve the effectiveness of new sequencing projects through comparative mapping and comparative sequence assembly. The following specific sequencing and mapping methods are discussed in detail: clone-array pooled shotgun sequencing (CAPSS); transversal shotgun pooling designs; clone-array pooled shotgun mapping (CAPS-MAP); pooled genomic indexing (PGI); short-tag pooled genomic indexing (ST-PGI); and comparative sequence assembly (the CSA™ method). The methods can be implemented with only modest modifications of current large-scale sequencing pipelines and are highly synergistic with the next generation of sequencing technologies.
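
As a rough illustration of the clone-array pooling idea, the sketch below assumes clones laid out in a rectangular grid in which every row and every column is pooled and shotgun sequenced. A sequence seen in exactly one row pool and exactly one column pool is assigned to the clone at their intersection. The function name, the toy data, and the use of shared sequence tags in place of real overlap detection are simplifying assumptions, not the published CAPSS procedure.

```python
from collections import defaultdict

def assign_to_clones(row_pools, col_pools):
    """Assign each sequence tag to a (row, col) clone position.

    row_pools / col_pools: dict mapping pool index -> set of sequence tags
    observed in that pool (a hypothetical stand-in for assembled contigs).
    """
    rows_seen = defaultdict(set)
    cols_seen = defaultdict(set)
    for r, tags in row_pools.items():
        for tag in tags:
            rows_seen[tag].add(r)
    for c, tags in col_pools.items():
        for tag in tags:
            cols_seen[tag].add(c)

    assignment = {}
    for tag in rows_seen.keys() & cols_seen.keys():
        # Only unambiguous tags (one row, one column) are assigned.
        if len(rows_seen[tag]) == 1 and len(cols_seen[tag]) == 1:
            row = next(iter(rows_seen[tag]))
            col = next(iter(cols_seen[tag]))
            assignment[tag] = (row, col)
    return assignment

# Toy 2x2 clone array; tag "x" appears in both row pools, so it stays unassigned.
row_pools = {0: {"a", "b", "x"}, 1: {"c", "d", "x"}}
col_pools = {0: {"a", "c", "x"}, 1: {"b", "d"}}
result = assign_to_clones(row_pools, col_pools)
```

With an R × C array this design needs R + C pooled libraries instead of R × C individual ones, which is the saving the pooling strategy is after.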

2.
Hierarchical shotgun sequencing remains the method of choice for assembling high‐quality reference sequences of complex plant genomes. The efficient exploitation of current high‐throughput technologies and powerful computational facilities for large‐insert clone sequencing necessitates the sequencing and assembly of a large number of clones in parallel. We developed a multiplexed pipeline for shotgun sequencing and assembling individual bacterial artificial chromosomes (BACs) using the Illumina sequencing platform. We illustrate our approach by sequencing 668 barley BACs (Hordeum vulgare L.) in a single Illumina HiSeq 2000 lane. Using a newly designed parallelized computational pipeline, we obtained sequence assemblies of individual BACs that consist, on average, of eight sequence scaffolds and represent >98% of the genomic inserts. Our BAC assemblies are clearly superior to a whole‐genome shotgun assembly regarding contiguity, completeness and the representation of the gene space. Our methods may be employed to rapidly obtain high‐quality assemblies of a large number of clones to assemble map‐based reference sequences of plant and animal species with complex genomes by sequencing along a minimum tiling path.

3.
High throughput sequencing technologies are being applied to an increasing number of model species with a high-quality reference genome. The application and analyses of whole-genome sequence data in non-model species with no prior genomic information are currently under way. Recent sequencing technologies provide new opportunities for gathering genomic data in natural populations, laying the empirical foundation for future research in the field of conservation and population genomics. Here we present the case study of the Bornean elephant, which is the most endangered subspecies of Asian elephant and exhibits very low genetic diversity. We used two different sequencing platforms, the Roche 454 FLX (shotgun) and the Illumina GAIIx (restriction site associated DNA, RAD), to evaluate the feasibility of the two methodologies for the discovery of de novo markers (single nucleotide polymorphisms, SNPs, and microsatellites) using low coverage data. Approximately 6,683 (shotgun) and 14,724 (RAD) SNPs were detected within our elephant sequence dataset. Genotyping of a representative sample of 194 SNPs resulted in a SNP validation rate of ~83 to 94%, and 17% of the loci were polymorphic with a low diversity (Ho = 0.057). Different numbers of microsatellites were identified through shotgun (27,226) and RAD (868) techniques. Out of all di-, tri-, and tetra-microsatellite loci, 1,706 loci had sufficient flanking regions (shotgun) while only 7 were found with RAD. All microsatellites were monomorphic in the Bornean but polymorphic in another elephant subspecies. Despite the different sample sizes and the well-known differences between the two platforms in sequence length and throughput, the two approaches showed high validation rates.
The approaches used here for marker development in a threatened species demonstrate the utility of high throughput sequencing technologies as a starting point for the development of genomic tools in a non-model species and in particular for a species with low genetic diversity.
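
For readers unfamiliar with the diversity statistic quoted above, observed heterozygosity is commonly computed as the fraction of genotyped individuals that are heterozygous at a locus, averaged over loci. The sketch below shows that calculation on made-up genotypes; the function name and the data layout are illustrative assumptions, and the abstract does not say exactly how its value was derived.

```python
def observed_heterozygosity(genotypes_by_locus):
    """Mean over loci of the fraction of heterozygous individuals.

    genotypes_by_locus: list of loci, each a list of (allele1, allele2)
    tuples, one tuple per genotyped individual (hypothetical layout).
    """
    per_locus = []
    for genotypes in genotypes_by_locus:
        if not genotypes:
            continue  # skip loci with no genotype calls
        het = sum(1 for a, b in genotypes if a != b)
        per_locus.append(het / len(genotypes))
    return sum(per_locus) / len(per_locus) if per_locus else 0.0

# Invented example: one locus with 1 of 4 heterozygotes, one monomorphic locus.
loci = [
    [("A", "A"), ("A", "G"), ("A", "A"), ("A", "A")],
    [("C", "C"), ("C", "C"), ("C", "C"), ("C", "C")],
]
ho = observed_heterozygosity(loci)
```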

4.
Physical mapping has been rediscovered as an important component of large-scale sequencing projects. Restriction maps provide landmark sequences at defined intervals, and high-resolution restriction maps can be assembled from ensembles of single molecules by optical means. Such optical maps can be constructed from both large-insert clones and genomic DNA, and are used as a scaffold for accurately aligning sequence contigs generated by shotgun sequencing.

5.
Libraries constructed in bacterial artificial chromosome (BAC) vectors have become the choice for clone sets in high throughput genomic sequencing projects primarily because of their high stability. BAC libraries have been proposed as a source for minimally overlapping clones for sequencing large genomic regions, and the use of BAC end sequences (i.e. sequences adjoining the insert sites) has been proposed as a primary means for selecting minimally overlapping clones for sequencing large genomic regions. For this strategy to be effective, high throughput methods for BAC end sequencing of all the clones in deep coverage BAC libraries needed to be developed. Here we describe a low cost, efficient, 96-well procedure for BAC end sequencing. These methods allow us to generate BAC end sequences from human and Arabidopsis libraries with an average read length of >450 bases and with a single pass sequencing average accuracy of >98%. Application of BAC end sequences in genomic sequencing is discussed.

6.
ToxoDB: accessing the Toxoplasma gondii genome   Total citations: 1, self-citations: 0, citations by others: 1
ToxoDB (http://ToxoDB.org) provides a genome resource for the protozoan parasite Toxoplasma gondii. Several sequencing projects devoted to T. gondii have been completed or are in progress: an EST project (http://genome.wustl.edu/est/index.php?toxoplasma=1), a BAC clone end-sequencing project (http://www.sanger.ac.uk/Projects/T_gondii/) and an 8X random shotgun genomic sequencing project (http://www.tigr.org/tdb/e2k1/tga1/). ToxoDB was designed to provide a central point of access for all available T. gondii data, and a variety of data mining tools useful for the analysis of unfinished, un-annotated draft sequence during the early phases of the genome project. In later stages, as more and different types of data become available (microarray, proteomic, SNP, QTL, etc.) the database will provide an integrated data analysis platform facilitating user-defined queries across the different data types.

7.
8.
Common bean (Phaseolus vulgaris L.) is a legume that is an important source of dietary protein in developing countries throughout the world. Utilizing the G19833 BAC library for P. vulgaris from Clemson University, 89,017 BAC-end sequences were generated giving 62,588,675 base pairs of genomic sequence covering approximately 9.54% of the genome. Analysis of these sequences in combination with 1,404 shotgun sequences from the cultivar Bat7 revealed that approximately 49.2% of the genome contains repetitive sequence and 29.3% is genic. Compared to other legume BAC-end sequencing projects, it appears that P. vulgaris has higher predicted levels of repetitive sequence, but this may be due to a more intense identification strategy combining both similarity-based matches as well as de novo identification of repeats. In addition, fingerprints for 41,717 BACs were obtained and assembled into a draft physical map consisting of 1,183 clone contigs and 6,385 singletons with ~9x coverage of the genome.
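
The figures quoted above can be cross-checked with simple arithmetic. The sketch below assumes that the 9.54% figure is the total BAC-end sequence divided by the genome size; that reading is an assumption, and the variable names are illustrative.

```python
# Values taken from the abstract.
bes_bases = 62_588_675      # total BAC-end sequence in base pairs
n_reads = 89_017            # number of BAC-end sequences
fraction_covered = 0.0954   # 9.54% of the genome

# Assumed relation: fraction_covered = bes_bases / genome_size.
implied_genome_bp = bes_bases / fraction_covered
mean_read_len = bes_bases / n_reads

print(f"implied genome size: {implied_genome_bp / 1e6:.0f} Mb")
print(f"mean BAC-end read length: {mean_read_len:.0f} bp")
```

Under that assumption the numbers imply a genome of roughly 656 Mb and a mean read length of about 703 bp.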

9.
A data set consisting of DNA sequences from a large-scale shotgun DNA cloning and sequencing project has been collected and posted for public release. The purpose is to propose a standard genomic DNA sequencing data set by which various algorithms and implementations can be tested. This set of data is divided into two subsets, one containing raw DNA sequence data (1023 clones) and the other consisting of the corresponding partially refined or edited DNA sequence data (820 clones). Suggested criteria or guidelines for this data refinement are presented so that algorithms for preprocessing and screening raw sequences may be developed. Development of such preprocessing, screening, aligning, and assembling algorithms will expedite large-scale DNA sequencing projects so that the complete unambiguous consensus DNA sequences will be made available to the general research community in a quicker manner. Smaller scale routine DNA sequencing projects will also be greatly aided by such computational efforts.

10.
Expected-value models have long provided a rudimentary theoretical foundation for random DNA sequencing. Here, we are interested in improving characterization of genome coverage in terms of its underlying probability distributions. We find that the mathematical notion of occupancy serves as a good model for evolution of the coverage distribution function and reveals new insights related to sequence redundancy. Established concepts, such as “full shotgun depth,” have been assumed invariant, but actually depend on project size and decrease over time. For most microbial projects, the full shotgun milestone should be revised downward by about 30%. Accordingly, many already-completed genomes appear to have been over-sequenced. Results also suggest that read lengths for emerging high-throughput sequencing methods must be increased substantially before they can be considered as possible successors to the standard Sanger method. In particular, gains in throughput and sequence depth cannot be made to compensate for diminished read length. Limits are well approximated by a simple logarithmic equation, which should be useful in estimating maximum coverage-based redundancy for future projects.
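
For context, the classical expected-value (Lander-Waterman style) approximation predicts that at redundancy c, the total read bases divided by the genome length, the expected fraction of the genome covered is 1 − e^(−c). The sketch below computes that textbook baseline only; it is not the occupancy model or the logarithmic limit proposed in this abstract, and the function names are illustrative.

```python
import math

def expected_coverage_fraction(n_reads, read_len, genome_len):
    """Expected fraction of the genome covered under the classical model."""
    redundancy = n_reads * read_len / genome_len  # c = N * L / G
    return 1.0 - math.exp(-redundancy)

def redundancy_for_fraction(target_fraction):
    """Redundancy c needed so that 1 - exp(-c) equals target_fraction."""
    return -math.log(1.0 - target_fraction)

# Example: 8,000 reads of 500 bp on a 1 Mb genome gives c = 4.
frac_at_4x = expected_coverage_fraction(8000, 500, 1_000_000)
c_for_99 = redundancy_for_fraction(0.99)
```

Because the uncovered fraction shrinks as e^(−c), each additional unit of redundancy buys less new sequence, which is why the choice of stopping depth matters for project cost.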

11.
The improvements in high throughput sequencing technologies (HTS) made clinical sequencing projects such as ClinSeq and Genomics England feasible. Although there are significant improvements in accuracy and reproducibility of HTS based analyses, the usability of these types of data for diagnostic and prognostic applications necessitates near-perfect data generation. To assess the usability of a widely used HTS platform for accurate and reproducible clinical applications in terms of robustness, we generated whole genome shotgun (WGS) sequence data from the genomes of two human individuals in two different genome sequencing centers. After analyzing the data to characterize SNPs and indels using the same tools (BWA, SAMtools, and GATK), we observed a significant number of discrepancies in the call sets. As expected, most of the disagreements between the call sets were found within genomic regions containing common repeats and segmental duplications, albeit only a small fraction of the discordant variants were within the exons and other functionally relevant regions such as promoters. We conclude that although HTS platforms are sufficiently powerful for providing data for first-pass clinical tests, the variant predictions still need to be confirmed using orthogonal methods before use in clinical applications.

12.

Background  

Due to recent advances in whole genome shotgun sequencing and assembly technologies, the financial cost of decoding an organism's DNA has been drastically reduced, resulting in a recent explosion of genomic sequencing projects. This increase in related genomic data will allow for in depth studies of evolution in closely related species through multiple whole genome comparisons.

13.

Background  

Recent high throughput sequencing technologies are capable of generating a huge amount of data for bacterial genome sequencing projects. Although current sequence assemblers successfully merge the overlapping reads, often several contigs remain which cannot be assembled any further. It is still costly and time consuming to close all the gaps in order to acquire the whole genomic sequence.

14.
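In recent years, the genome projects of five unicellular organisms have been completed. This article presents some of the results obtained from the whole-genome sequences of these five organisms, covering three aspects: whole-genome shotgun sequencing, genome analysis, and new comparative genomics. It also discusses the research methods used in genome projects.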

15.
We propose a genome sequencing strategy, which is neither divide-and-conquer (clone by clone) nor the shotgun approach. Random PCR-based and PCR relay sequencing constitute the basis of this novel strategy. Most of the genome is sequenced by the former process that requires only a set of non-specific primers and a template DNA. Random PCR-based sequencing reduces redundancy in sequencing by exploiting known sequence information. The number of primers required for random PCR was significantly diminished by using a combination of primers. The former process can be partially replaced by the shotgun method, if necessary. The gap-filling process can be effectively performed by way of PCR relay. The feasibility of this strategy was demonstrated using the Escherichia coli genome. This strategy enhances the global effort towards genome sequencing by being available through the Internet and by allowing the use of preexisting sequence data.

16.
The increasing availability of high-quality reference genomic sequences has created a demand for ways to survey the sequence differences present in individual genomes. Here we describe a DNA sequencing method based on hybridization of a universal panel of tiling probes. Millions of shotgun fragments are amplified in situ and subjected to sequential hybridization with short fluorescent probes. Long fragments of 200 bp facilitate unique placement even in large genomes. The sequencing chemistry is simple, enzyme-free and consumes only dilute solutions of the probes, resulting in reduced sequencing cost and substantially increased speed. A prototype instrument based on commonly available equipment was used to resequence the Bacteriophage lambda and Escherichia coli genomes to better than 99.93% accuracy with a raw throughput of 320 Mbp/day, albeit with a significant number of small gaps attributed to losses in sample preparation.

17.
Early analytical clone screening is important during Chinese hamster ovary (CHO) cell line development of biotherapeutic proteins to select a clonally derived cell line with the most favorable stability and product quality. Sensitive sequence confirmation methods using mass spectrometry have limitations in throughput and turnaround time. Next‐generation sequencing (NGS) technologies emerged as alternatives for CHO clone analytics. We report an efficient NGS workflow applying the targeted locus amplification (TLA) strategy for genomic screening of antibody‐expressing CHO clones. In contrast to previously reported RNA sequencing approaches, TLA allows for targeted sequencing of genomically integrated transgenic DNA without prior locus information, and for robust detection of single‐nucleotide variants (SNVs) and transgenic rearrangements. During clone selection, TLA/NGS revealed CHO clones with high‐level SNVs within the antibody gene, and we report in another case the utility of TLA/NGS to identify rearrangements at the transgenic DNA level. We also determined detection limits for SNV calling and the potential to identify clone contaminations by TLA/NGS. TLA/NGS also allows identification of genetically identical clones. In summary, we demonstrate that TLA/NGS is a robust screening method useful for routine clone analytics during cell line development with the potential to process up to 24 CHO clones in less than 7 workdays.

18.
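Whole-genome shotgun sequencing of microbes   Total citations: 4, self-citations: 0, citations by others: 4
Luo Chunqing, Yang Huanming 《遗传》2002,24(3):310-314
There are two main strategies for whole-genome sequencing: hierarchical shotgun sequencing and whole-genome shotgun sequencing. Microbes are a very important genetic resource, and the whole-genome shotgun method makes it convenient and fast to sequence their genomes. This article describes in some detail library construction, the ratio of long to short inserts, the number of sequencing reactions required, assembly, and gap filling in microbial whole-genome shotgun sequencing, with worked examples for some steps.
Abstract: Two strategies are used for whole-genome sequencing: the clone-by-clone method and whole-genome shotgun sequencing. For microbes, which are very important to us, the whole-genome shotgun method is very convenient. In this article we discuss library construction, the long-to-short ratio of inserts, the total number of reads that should be sequenced, assembly, and gap-filling technologies for microbial whole-genome shotgun sequencing, with some examples presented.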

19.
Target sequence capture is an efficient technique to enrich specific genomic regions for high‐throughput sequencing in ecological and evolutionary studies. In recent years, many sequence capture approaches have been proposed, but most of them rely on commercial synthetic baits, which make the experiment expensive. Here, we present a novel sequence capture approach called AFLP‐based genome sequence capture (AFLP Capture). This method uses the AFLP (amplified fragment length polymorphism) technique to generate homemade capture baits without the need for prior genome information, and is thus applicable to any organism. In this approach, biotinylated AFLP fragments representing a random fraction of the genome are used as baits to capture the homologous fragments from genomic shotgun sequencing libraries. In a trial study, by using AFLP Capture, we successfully obtained 511 orthologous loci (>700,000 bp in total length) from 11 Odorrana species and more than 100,000 single nucleotide polymorphisms (SNPs) in four analyzed individuals of an Odorrana species. This result shows that our method can be used to address questions of various evolutionary depths (from interspecies level to intraspecies level). We also discuss the flexibility in bait preparation and how the sequencing data are analyzed. In summary, AFLP Capture is a rapid and flexible tool and can significantly reduce the experimental cost for phylogenetic studies that require analyzing genome‐scale data (hundreds or thousands of loci).

20.
The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fecal samples. The interpretation of this new kind of data poses a wide variety of exciting and difficult bioinformatics problems. The aim of this review is to introduce the bioinformatics community to this emerging field by surveying existing techniques and promising new approaches for several of the most interesting of these computational problems.
