首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 31 毫秒
1.

Background

The domestic pig (Sus scrofa) is both an important livestock species and a model for biomedical research. Exome sequencing has accelerated identification of protein-coding variants underlying phenotypic traits in human and mouse. We aimed to develop and validate a similar resource for the pig.

Results

We developed probe sets to capture pig exonic sequences based upon the current Ensembl pig gene annotation supplemented with mapped expressed sequence tags (ESTs) and demonstrated proof-of-principle capture and sequencing of the pig exome in 96 pigs, encompassing 24 capture experiments. For most of the samples at least 10x sequence coverage was achieved for more than 90% of the target bases. Bioinformatic analysis of the data revealed over 236,000 high confidence predicted SNPs and over 28,000 predicted indels.

Conclusions

We have achieved coverage statistics similar to those seen with commercially available human and mouse exome kits. Exome capture in pigs provides a tool to identify coding region variation associated with production traits, including loss of function mutations which may explain embryonic and neonatal losses, and to improve genomic assemblies in the vicinity of protein coding genes in the pig.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-550) contains supplementary material, which is available to authorized users.  相似文献   

2.

Background

Next-generation sequencing technologies are rapidly generating whole-genome datasets for an increasing number of organisms. However, phylogenetic reconstruction of genomic data remains difficult because de novo assembly for non-model genomes and multi-genome alignment are challenging.

Results

To greatly simplify the analysis, we present an Assembly and Alignment-Free (AAF) method (https://sourceforge.net/projects/aaf-phylogeny) that constructs phylogenies directly from unassembled genome sequence data, bypassing both genome assembly and alignment. Using mathematical calculations, models of sequence evolution, and simulated sequencing of published genomes, we address both evolutionary and sampling issues caused by direct reconstruction, including homoplasy, sequencing errors, and incomplete sequencing coverage. From these results, we calculate the statistical properties of the pairwise distances between genomes, allowing us to optimize parameter selection and perform bootstrapping. As a test case with real data, we successfully reconstructed the phylogeny of 12 mammals using raw sequencing reads. We also applied AAF to 21 tropical tree genome datasets with low coverage to demonstrate its effectiveness on non-model organisms.

Conclusion

Our AAF method opens up phylogenomics for species without an appropriate reference genome or high sequence coverage, and rapidly creates a phylogenetic framework for further analysis of genome structure and diversity among non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1647-5) contains supplementary material, which is available to authorized users.  相似文献   

3.

Background

Generation of long (>5 Kb) DNA sequencing reads provides an approach for interrogation of complex regions in the human genome. Currently, large-insert whole genome sequencing (WGS) technologies from Pacific Biosciences (PacBio) enable analysis of chromosomal structural variations (SVs), but the cost to achieve the required sequence coverage across the entire human genome is high.

Results

We developed a method (termed PacBio-LITS) that combines oligonucleotide-based DNA target-capture enrichment technologies with PacBio large-insert library preparation to facilitate SV studies at specific chromosomal regions. PacBio-LITS provides deep sequence coverage at the specified sites at substantially reduced cost compared with PacBio WGS. The efficacy of PacBio-LITS is illustrated by delineating the breakpoint junctions of low copy repeat (LCR)-associated complex structural rearrangements on chr17p11.2 in patients diagnosed with Potocki–Lupski syndrome (PTLS; MIM#610883). We successfully identified previously determined breakpoint junctions in three PTLS cases, and also were able to discover novel junctions in repetitive sequences, including LCR-mediated breakpoints. The new information has enabled us to propose mechanisms for formation of these structural variants.

Conclusions

The new method leverages the cost efficiency of targeted capture-sequencing as well as the mappability and scaffolding capabilities of long sequencing reads generated by the PacBio platform. It is therefore suitable for studying complex SVs, especially those involving LCRs, inversions, and the generation of chimeric Alu elements at the breakpoints. Other genomic research applications, such as haplotype phasing and small insertion and deletion validation could also benefit from this technology.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1370-2) contains supplementary material, which is available to authorized users.  相似文献   

4.

Background

One of the most significant issues surrounding next generation sequencing is the cost and the difficulty assembling short read lengths. Targeted capture enrichment of longer fragments using single molecule sequencing (SMS) is expected to improve both sequence assembly and base-call accuracy but, at present, there are very few examples of successful application of these technologic advances in translational research and clinical testing. We developed a targeted single molecule sequencing (T-SMS) panel for genes implicated in ovarian response to controlled ovarian hyperstimulation (COH) for infertility.

Results

Target enrichment was carried out using droplet-base multiplex polymerase chain reaction (PCR) technology (RainDance®) designed to yield amplicons averaging 1 kb fragment size from candidate 44 loci (99.8% unique base-pair coverage). The total targeted sequence was 3.18 Mb per sample. SMS was carried out using single molecule, real-time DNA sequencing (SMRT® Pacific Biosciences®), average raw read length = 1178 nucleotides, 5% of the amplicons >6000 nucleotides). After filtering with circular consensus (CCS) reads, the mean read length was 3200 nucleotides (97% CCS accuracy). Primary data analyses, alignment and filtering utilized the Pacific Biosciences® SMRT portal. Secondary analysis was conducted using the Genome Analysis Toolkit for SNP discovery l and wANNOVAR for functional analysis of variants. Filtered functional variants 18 of 19 (94.7%) were further confirmed using conventional Sanger sequencing. CCS reads were able to accurately detect zygosity. Coverage within GC rich regions (i.e.VEGFR; 72% GC rich) was achieved by capturing long genomic DNA (gDNA) fragments and reading into regions that flank the capture regions. As proof of concept, a non-synonymous LHCGR variant captured in two severe OHSS cases, and verified by conventional sequencing.

Conclusions

Combining emulsion PCR-generated 1 kb amplicons and SMRT DNA sequencing permitted greater depth of coverage for T-SMS and facilitated easier sequence assembly. To the best of our knowledge, this is the first report combining emulsion PCR and T-SMS for long reads using human DNA samples, and NGS panel designed for biomarker discovery in OHSS.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1451-2) contains supplementary material, which is available to authorized users.  相似文献   

5.
6.

Background

Human leukocyte antigen (HLA) is a group of genes that are extremely polymorphic among individuals and populations and have been associated with more than 100 different diseases and adverse drug effects. HLA typing is accordingly an important tool in clinical application, medical research, and population genetics. We have previously developed a phase-defined HLA gene sequencing method using MiSeq sequencing.

Results

Here we report a simple, high-throughput, and cost-effective sequencing method that includes normalized library preparation and adjustment of DNA molar concentration. We applied long-range PCR to amplify HLA-B for 96 samples followed by transposase-based library construction and multiplex sequencing with the MiSeq sequencer. After sequencing, we observed low variation in read percentages (0.2% to 1.55%) among the 96 demultiplexed samples. On this basis, all the samples were amenable to haplotype phasing using our phase-defined sequencing method. In our study, a sequencing depth of 800x was necessary and sufficient to achieve full phasing of HLA-B alleles with reliable assignment of the allelic sequence to the 8 digit level.

Conclusions

Our HLA sequencing method optimized for 96 multiplexing samples is highly time effective and cost effective and is especially suitable for automated multi-sample library preparation and sequencing.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-645) contains supplementary material, which is available to authorized users.  相似文献   

7.

Background

Recent developments in deep (next-generation) sequencing technologies are significantly impacting medical research. The global analysis of protein coding regions in genomes of interest by whole exome sequencing is a widely used application. Many technologies for exome capture are commercially available; here we compare the performance of four of them: NimbleGen’s SeqCap EZ v3.0, Agilent’s SureSelect v4.0, Illumina’s TruSeq Exome, and Illumina’s Nextera Exome, all applied to the same human tumor DNA sample.

Results

Each capture technology was evaluated for its coverage of different exome databases, target coverage efficiency, GC bias, sensitivity in single nucleotide variant detection, sensitivity in small indel detection, and technical reproducibility. In general, all technologies performed well; however, our data demonstrated small, but consistent differences between the four capture technologies. Illumina technologies cover more bases in coding and untranslated regions. Furthermore, whereas most of the technologies provide reduced coverage in regions with low or high GC content, the Nextera technology tends to bias towards target regions with high GC content.

Conclusions

We show key differences in performance between the four technologies. Our data should help researchers who are planning exome sequencing to select appropriate exome capture technology for their particular application.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-449) contains supplementary material, which is available to authorized users.  相似文献   

8.

Background

Distinct, partly competing, “waves” have been proposed to explain human migration in(to) today’s Island Southeast Asia and Australia based on genetic (and other) evidence. The paucity of high quality and high resolution data has impeded insights so far. In this study, one of the first in a forensic environment, we used the Ion Torrent Personal Genome Machine (PGM) for generating complete mitogenome sequences via stand-alone massively parallel sequencing and describe a standard data validation practice.

Results

In this first representative investigation on the mitochondrial DNA (mtDNA) variation of East Timor (Timor-Leste) population including >300 individuals, we put special emphasis on the reconstruction of the initial settlement, in particular on the previously poorly resolved haplogroup P1, an indigenous lineage of the Southwest Pacific region. Our results suggest a colonization of southern Sahul (Australia) >37 kya, limited subsequent exchange, and a parallel incubation of initial settlers in northern Sahul (New Guinea) followed by westward migrations <28 kya.

Conclusions

The temporal proximity and possible coincidence of these latter dispersals, which encompassed autochthonous haplogroups, with the postulated “later” events of (South) East Asian origin pinpoints a highly dynamic migratory phase.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-014-1201-x) contains supplementary material, which is available to authorized users.  相似文献   

9.

Background

DNA methylation is a heritable mechanism that acts in response to environmental changes, lifestyle and diseases by influencing gene expression in eukaryotes. Epigenetic studies of wild organisms are mandatory to understand their role in e.g. adaptational processes in the great variety of ecological niches. However, strategies to address those questions on a methylome scale are widely missing. In this study we present such a strategy and describe a whole genome sequence and methylome analysis of the wild guinea pig.

Results

We generated a full Wild guinea pig (Cavia aperea) genome sequence with enhanced coverage of methylated regions, benefiting from the available sequence of the domesticated relative Cavia porcellus. This new genome sequence was then used as reference to map the sequence reads of bisulfite treated Wild guinea pig sequencing libraries to investigate DNA-methylation patterns at nucleotide-specific level, by using our here described method, named ‘DNA-enrichment-bisulfite-sequencing’ (MEBS). The results achieved using MEBS matched those of standard methods in other mammalian model species. The technique is cost efficient, and incorporates both methylation enrichment results and a nucleotide-specific resolution even without a whole genome sequence available. Thus MEBS can be easily applied to extend methylation enrichment studies to a nucleotide-specific level.

Conclusions

The approach is suited to study methylomes of not yet sequenced mammals at single nucleotide resolution. The strategy is transferable to other mammalian species by applying the nuclear genome sequence of a close relative. It is therefore of interest for studies on a variety of wild species trying to answer evolutionary, adaptational, ecological or medical questions by epigenetic mechanisms.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1036) contains supplementary material, which is available to authorized users.  相似文献   

10.

Background

CRISPR-Cas9 is a revolutionary genome editing technique that allows for efficient and directed alterations of the eukaryotic genome. This relatively new technology has already been used in a large number of ‘loss of function’ experiments in cultured cells. Despite its simplicity and efficiency, screening for mutated clones remains time-consuming, laborious and/or expensive.

Results

Here we report a high-throughput screening strategy that allows parallel screening of up to 96 clones, using next-generation sequencing. As a proof of principle, we used CRISPR-Cas9 to disrupt the coding sequence of the homeobox gene, Evx1 in mouse embryonic stem cells. We screened 67 CRISPR-Cas9 transfected clones simultaneously by next-generation sequencing on the Ion Torrent PGM. We were able to identify both homozygous and heterozygous Evx1 mutants, as well as mixed clones, which must be identified to maintain the integrity of subsequent experiments.

Conclusions

Our CRISPR-Cas9 screening strategy could be widely applied to screen for CRISPR-Cas9 mutants in a variety of contexts including the generation of mutant cell lines for in vitro research, the generation of transgenic organisms and for assessing the veracity of CRISPR-Cas9 homology directed repair. This technique is cost and time-effective, provides information on clonal heterogeneity and is adaptable for use on various sequencing platforms.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1002) contains supplementary material, which is available to authorized users.  相似文献   

11.

Background

Turkey is a crossroads of major population movements throughout history and has been a hotspot of cultural interactions. Several studies have investigated the complex population history of Turkey through a limited set of genetic markers. However, to date, there have been no studies to assess the genetic variation at the whole genome level using whole genome sequencing. Here, we present whole genome sequences of 16 Turkish individuals resequenced at high coverage (32 × -48×).

Results

We show that the genetic variation of the contemporary Turkish population clusters with South European populations, as expected, but also shows signatures of relatively recent contribution from ancestral East Asian populations. In addition, we document a significant enrichment of non-synonymous private alleles, consistent with recent observations in European populations. A number of variants associated with skin color and total cholesterol levels show frequency differentiation between the Turkish populations and European populations. Furthermore, we have analyzed the 17q21.31 inversion polymorphism region (MAPT locus) and found increased allele frequency of 31.25% for H1/H2 inversion polymorphism when compared to European populations that show about 25% of allele frequency.

Conclusion

This study provides the first map of common genetic variation from 16 western Asian individuals and thus helps fill an important geographical gap in analyzing natural human variation and human migration. Our data will help develop population-specific experimental designs for studies investigating disease associations and demographic history in Turkey.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-963) contains supplementary material, which is available to authorized users.  相似文献   

12.

Background

Pakistan covers a key geographic area in human history, being both part of the Indus River region that acted as one of the cradles of civilization and as a link between Western Eurasia and Eastern Asia. This region is inhabited by a number of distinct ethnic groups, the largest being the Punjabi, Pathan (Pakhtuns), Sindhi, and Baloch.

Results

We analyzed the first ethnic male Pathan genome by sequencing it to 29.7-fold coverage using the Illumina HiSeq2000 platform. A total of 3.8 million single nucleotide variations (SNVs) and 0.5 million small indels were identified by comparing with the human reference genome. Among the SNVs, 129,441 were novel, and 10,315 nonsynonymous SNVs were found in 5,344 genes. SNVs were annotated for health consequences and high risk diseases, as well as possible influences on drug efficacy. We confirmed that the Pathan genome presented here is representative of this ethnic group by comparing it to a panel of Central Asians from the HGDP-CEPH panels typed for ~650 k SNPs. The mtDNA (H2) and Y haplogroup (L1) of this individual were also typical of his geographic region of origin. Finally, we reconstruct the demographic history by PSMC, which highlights a recent increase in effective population size compatible with admixture between European and Asian lineages expected in this geographic region.

Conclusions

We present a whole-genome sequence and analyses of an ethnic Pathan from the north-west province of Pakistan. It is a useful resource to understand genetic variation and human migration across the whole Asian continent.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1290-1) contains supplementary material, which is available to authorized users.  相似文献   

13.

Background

Endogenous murine leukemia retroviruses (MLVs) are high copy number proviral elements difficult to comprehensively characterize using standard low throughput sequencing approaches. However, high throughput approaches generate data that is challenging to process, interpret and present.

Results

Next generation sequencing (NGS) data was generated for MLVs from two wild caught Mus musculus domesticus (from mainland France and Corsica) and for inbred laboratory mouse strains C3H, LP/J and SJL. Sequence reads were grouped using a novel sequence clustering approach as applied to retroviral sequences. A Markov cluster algorithm was employed, and the sequence reads were queried for matches to specific xenotropic (Xmv), polytropic (Pmv) and modified polytropic (Mpmv) viral reference sequences.

Conclusions

Various MLV subtypes were more widespread than expected among the mice, which may be due to the higher coverage of NGS, or to the presence of similar sequence across many different proviral loci. The results did not correlate with variation in the major MLV receptor Xpr1, which can restrict exogenous MLVs, suggesting that endogenous MLV distribution may reflect gene flow more than past resistance to infection.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1766-z) contains supplementary material, which is available to authorized users.  相似文献   

14.

Background

Molecular characterization of highly diverse gene families can be time consuming, expensive, and difficult, especially when considering the potential for relatively large numbers of paralogs and/or pseudogenes. Here we investigate the utility of Pacific Biosciences single molecule real-time (SMRT) circular consensus sequencing (CCS) as an alternative to traditional cloning and Sanger sequencing PCR amplicons for gene family characterization. We target vomeronasal gene receptors, one of the most diverse gene families in mammals, with the goal of better understanding intra-specific V1R diversity of the gray mouse lemur (Microcebus murinus). Our study compares intragenomic variation for two V1R subfamilies found in the mouse lemur. Specifically, we compare gene copy variation within and between two individuals of M. murinus as characterized by different methods for nucleotide sequencing. By including the same individual animal from which the M. murinus draft genome was derived, we are able to cross-validate gene copy estimates from Sanger sequencing versus CCS methods.

Results

We generated 34,088 high quality circular consensus sequences of two diverse V1R subfamilies (here referred to as V1RI and V1RIX) from two individuals of Microcebus murinus. Using a minimum threshold of 7× coverage, we recovered approximately 90% of V1RI sequences previously identified in the draft M. murinus genome (59% being identical at all nucleotide positions). When low coverage sequences were considered (i.e. < 7× coverage) 100% of V1RI sequences identified in the draft genome were recovered. At least 13 putatively novel V1R loci were also identified using CCS technology.

Conclusions

Recent upgrades to the Pacific Biosciences RS instrument have improved the CCS technology and offer an alternative to traditional sequencing approaches. Our results suggest that the Microcebus murinus V1R repertoire has been underestimated in the draft genome. In addition to providing an improved understanding of V1R diversity in the mouse lemur, this study demonstrates the utility of CCS technology for characterizing complex regions of the genome. We anticipate that long-read sequencing technologies such as PacBio SMRT will allow for the assembly of multigene family clusters and serve to more accurately characterize patterns of gene copy variation in large gene families, thus revealing novel micro-evolutionary patterns within non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-720) contains supplementary material, which is available to authorized users.  相似文献   

15.

Background

Deviations in the amount of genomic content that arise during tumorigenesis, called copy number alterations, are structural rearrangements that can critically affect gene expression patterns. Additionally, copy number alteration profiles allow insight into cancer discrimination, progression and complexity. On data obtained from high-throughput sequencing, improving quality through GC bias correction and keeping false positives to a minimum help build reliable copy number alteration profiles.

Results

We introduce seqCNA, a parallelized R package for an integral copy number analysis of high-throughput sequencing cancer data. The package includes novel methodology on (i) filtering, reducing false positives, and (ii) GC content correction, improving copy number profile quality, especially under great read coverage and high correlation between GC content and copy number. Adequate analysis steps are automatically chosen based on availability of paired-end mapping, matched normal samples and genome annotation.

Conclusions

seqCNA, available through Bioconductor, provides accurate copy number predictions in tumoural data, thanks to the extensive filtering and better GC bias correction, while providing an integrated and parallelized workflow.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-178) contains supplementary material, which is available to authorized users.  相似文献   

16.

Background

Rigorous study of mitochondrial functions and cell biology in the budding yeast, Saccharomyces cerevisiae has advanced our understanding of mitochondrial genetics. This yeast is now a powerful model for population genetics, owing to large genetic diversity and highly structured populations among wild isolates. Comparative mitochondrial genomic analyses between yeast species have revealed broad evolutionary changes in genome organization and architecture. A fine-scale view of recent evolutionary changes within S. cerevisiae has not been possible due to low numbers of complete mitochondrial sequences.

Results

To address challenges of sequencing AT-rich and repetitive mitochondrial DNAs (mtDNAs), we sequenced two divergent S. cerevisiae mtDNAs using a single-molecule sequencing platform (PacBio RS). Using de novo assemblies, we generated highly accurate complete mtDNA sequences. These mtDNA sequences were compared with 98 additional mtDNA sequences gathered from various published collections. Phylogenies based on mitochondrial coding sequences and intron profiles revealed that intraspecific diversity in mitochondrial genomes generally recapitulated the population structure of nuclear genomes. Analysis of intergenic sequence indicated a recent expansion of mobile elements in certain populations. Additionally, our analyses revealed that certain populations lacked introns previously believed conserved throughout the species, as well as the presence of introns never before reported in S. cerevisiae.

Conclusions

Our results revealed that the extensive variation in S. cerevisiae mtDNAs is often population specific, thus offering a window into the recent evolutionary processes shaping these genomes. In addition, we offer an effective strategy for sequencing these challenging AT-rich mitochondrial genomes for small scale projects.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1664-4) contains supplementary material, which is available to authorized users.  相似文献   

17.

Background

Whole-genome sequencing represents a promising approach to pinpoint chemically induced mutations in genetic model organisms, thereby short-cutting time-consuming genetic mapping efforts.

Principal Findings

We compare here the ability of two leading high-throughput platforms for paired-end deep sequencing, SOLiD (ABI) and Genome Analyzer (Illumina; “Solexa”), to achieve the goal of mutant detection. As a test case we used a mutant C. elegans strain that harbors a mutation in the lsy-12 locus which we compare to the reference wild-type genome sequence. We analyzed the accuracy, sensitivity, and depth-coverage characteristics of the two platforms. Both platforms were able to identify the mutation that causes the phenotype of the mutant C. elegans strain, lsy-12. Based on a 4 MB genomic region in which individual variants were validated by Sanger sequencing, we observe tradeoffs between rates of false positives and false negatives when using both platforms under similar coverage and mapping criteria.

Significance

In conclusion, whole-genome sequencing conducted by either platform is a viable approach for the identification of single-nucleotide variations in the C. elegans genome.  相似文献   

18.

Background

Recent studies used the contact data or three-dimensional (3D) genome reconstructions from Hi-C (chromosome conformation capture with next-generation sequencing) to assess the co-localization of functional genomic annotations in the nucleus. These analyses dichotomized data point pairs belonging to a functional annotation as “close” or “far” based on some threshold and then tested for enrichment of “close” pairs. We propose an alternative approach that avoids dichotomization of the data and instead directly estimates the significance of distances within the 3D reconstruction.

Results

We applied this approach to 3D genome reconstructions for Plasmodium falciparum, the causative agent of malaria, and Saccharomyces cerevisiae and compared the results to previous approaches. We found significant 3D co-localization of centromeres, telomeres, virulence genes, and several sets of genes with developmentally regulated expression in P. falciparum; and significant 3D co-localization of centromeres and long terminal repeats in S. cerevisiae. Additionally, we tested the experimental observation that telomeres form three to seven clusters in P. falciparum and S. cerevisiae. Applying affinity propagation clustering to telomere coordinates in the 3D reconstructions yielded six telomere clusters for both organisms.

Conclusions

Distance-based assessment replicated key findings, while avoiding dichotomization of the data (which previously yielded threshold-sensitive results).

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-992) contains supplementary material, which is available to authorized users.  相似文献   

19.

Background

The ~17 Gb hexaploid bread wheat genome is a high priority and a major technical challenge for genomic studies. In particular, the D sub-genome is relatively lacking in genetic diversity, making it both difficult to map genetically, and a target for introgression of agriculturally useful traits. Elucidating its sequence and structure will therefore facilitate wheat breeding and crop improvement.

Results

We generated shotgun sequences from each arm of flow-sorted Triticum aestivum chromosome 5D using 454 FLX Titanium technology, giving 1.34× and 1.61× coverage of the short (5DS) and long (5DL) arms of the chromosome respectively. By a combination of sequence similarity and assembly-based methods, ~74% of the sequence reads were classified as repetitive elements, and coding sequence models of 1314 (5DS) and 2975 (5DL) genes were generated. The order of conserved genes in syntenic regions of previously sequenced grass genomes were integrated with physical and genetic map positions of 518 wheat markers to establish a virtual gene order for chromosome 5D.

Conclusions

The virtual gene order revealed a large-scale chromosomal rearrangement in the peri-centromeric region of 5DL, and a concentration of non-syntenic genes in the telomeric region of 5DS. Although our data support the large-scale conservation of Triticeae chromosome structure, they also suggest that some regions are evolving rapidly through frequent gene duplications and translocations.

Sequence accessions

EBI European Nucleotide Archive, Study no. ERP002330

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-1080) contains supplementary material, which is available to authorized users.  相似文献   

20.

Background

Rapid and accurate retrieval of whole genome sequences of human pathogens from disease vectors or animal reservoirs will enable fine-resolution studies of pathogen epidemiological and evolutionary dynamics. However, next generation sequencing technologies have not yet been fully harnessed for the study of vector-borne and zoonotic pathogens, due to the difficulty of obtaining high-quality pathogen sequence data directly from field specimens with a high ratio of host to pathogen DNA.

Results

We addressed this challenge by using custom probes for multiplexed hybrid capture to enrich for and sequence 30 Borrelia burgdorferi genomes from field samples of its arthropod vector. Hybrid capture enabled sequencing of nearly the complete genome (~99.5 %) of the Borrelia burgdorferi pathogen with 132-fold coverage, and identification of up to 12,291 single nucleotide polymorphisms per genome.

Conclusions

The proprosed culture-independent method enables efficient whole genome capture and sequencing of pathogens directly from arthropod vectors, thus making population genomic study of vector-borne and zoonotic infectious diseases economically feasible and scalable. Furthermore, given the similarities of invertebrate field specimens to other mixed DNA templates characterized by a high ratio of host to pathogen DNA, we discuss the potential applicabilty of hybrid capture for genomic study across diverse study systems.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1634-x) contains supplementary material, which is available to authorized users.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号