Similar literature
1.
Mapping-by-sequencing combines genetic mapping with whole-genome sequencing in order to accelerate mutant identification. However, applying mapping-by-sequencing requires decisions about various practical aspects of the experimental design that cannot be answered intuitively. Following an experimentally determined recombination landscape of Arabidopsis and next generation sequencing-specific biases, we simulated more than 400,000 mapping-by-sequencing experiments. This allowed us to evaluate a broad range of different types of experiments and to develop general rules for mapping-by-sequencing in Arabidopsis. Most importantly, these rules describe the properties of different crossing scenarios and the number of recombinants and the sequencing depth needed for successful mapping experiments.

2.
Rapid development of next generation sequencing (NGS) technologies in recent years has made whole genome sequencing of bacterial genomes widely accessible. However, it is often unnecessary or not feasible to sequence the whole genome for most applications of genetic analyses in bacteria. Selectively capturing defined genomic regions followed by NGS analysis could be a promising approach for high-resolution molecular typing of a large set of strains. In this study, we describe a novel and straightforward PCR-based target-capturing method, hairpin-primed multiplex amplification (HPMA), which allows for simultaneous amplification of numerous target genes. To test the feasibility of NGS-based strain typing using HPMA, 20 target gene sequences were simultaneously amplified with barcode tagging in each of 41 Salmonella strains. The amplicons were then pooled and analyzed by 454 pyrosequencing. Analysis of the sequence data, as an extension of multilocus sequence typing (MLST), demonstrated the utility and potential of this novel typing method, MLST-seq, as a high-resolution strain typing method. With the rapidly increasing sequencing capacity of NGS, MLST-seq or its variations using different target enrichment methods can be expected to become a high-resolution typing method in the near future for high-throughput analysis of a large collection of bacterial strains.

3.
Studies in tunicates such as Ciona have revealed new insights into the evolutionary origins of chordate development. Ciona populations are characterized by high levels of natural genetic variation, between 1 and 5%. This variation has provided abundant material for forward genetic studies. In the current study, we make use of deep sequencing and homozygosity mapping to map spontaneous mutations in outbred populations. With this method we have mapped two spontaneous developmental mutants. In Ciona intestinalis we mapped a short-tail mutation with strong phenotypic similarity to a previously identified mutant in the related species Ciona savignyi. Our bioinformatic approach mapped the mutation to a narrow interval containing a single mutated gene, α-laminin3,4,5, which is the gene previously implicated in C. savignyi. In addition, we mapped a novel genetic mutation disrupting neural tube closure in C. savignyi to a T-type Ca2+ channel gene. The high efficiency and unprecedented mapping resolution of our study are a powerful advantage for developmental genetics in Ciona, and may find application in other outbred species.

4.
5.

Background

A large single nucleotide polymorphism (SNP) dataset was used to analyze genome-wide diversity in a diverse collection of watermelon cultivars representing the genetic diversity of globally cultivated watermelon. The marker density required for conducting successful association mapping depends on the extent of linkage disequilibrium (LD) within a population. Genotyping by sequencing reveals large numbers of SNPs that in turn generate opportunities in genome-wide association mapping and marker-assisted selection, even in crops such as watermelon for which few genomic resources are available. In this paper, we used genome-wide genetic diversity to study LD, selective sweeps, and pairwise FST distributions among worldwide cultivated watermelons to track signals of domestication.

Results

We examined 183 Citrullus lanatus var. lanatus accessions representing domesticated watermelon and generated a set of 11,485 SNP markers using genotyping by sequencing. With a diverse panel of worldwide cultivated watermelons, we identified a set of 5,254 SNPs with a minor allele frequency of ≥ 0.05, distributed across the genome. All ancestries were traced to Africa and an admixture of various ancestries constituted secondary gene pools across various continents. A sliding window analysis using pairwise FST values was used to resolve selective sweeps. We identified strong selection on chromosomes 3 and 9 that might have contributed to the domestication process. Pairwise analysis of adjacent SNPs within a chromosome as well as within a haplotype allowed us to estimate genome-wide LD decay. LD was also detected within individual genes on various chromosomes. Principal component and ancestry analyses were used to account for population structure in a genome-wide association study. We further mapped important genes for soluble solid content using a mixed linear model.

Conclusions

Information concerning the SNP resources, population structure, and LD developed in this study will help in identifying agronomically important candidate genes from the genomic regions underlying selection and for mapping quantitative trait loci using a genome-wide association study in sweet watermelon.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-767) contains supplementary material, which is available to authorized users.
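
The minor allele frequency (MAF) cutoff of ≥ 0.05 mentioned in the Results above can be illustrated with a short sketch. This is not the authors' pipeline: the function names, the genotype encoding (two-letter diploid calls, `None` for missing data), and the toy SNP names are assumptions made for illustration, and the sketch assumes biallelic SNPs.

```python
from collections import Counter


def minor_allele_frequency(genotypes):
    """Return the minor allele frequency of one biallelic SNP.

    `genotypes` is a list of diploid calls such as "AA", "AG", "GG";
    missing calls are given as None and skipped (hypothetical encoding).
    """
    counts = Counter()
    for call in genotypes:
        if call is None:
            continue
        counts.update(call)  # adds one count per allele character
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic or entirely missing
    return min(counts.values()) / total


def filter_by_maf(snps, threshold=0.05):
    """Keep only SNPs whose minor allele frequency is >= threshold."""
    return {
        name: calls
        for name, calls in snps.items()
        if minor_allele_frequency(calls) >= threshold
    }


# Toy data: snp1 has MAF 1/40 = 0.025, snp2 has MAF 3/6 = 0.5.
toy_snps = {
    "snp1": ["AA"] * 19 + ["AG"],
    "snp2": ["AA", "AG", "GG", None],
}
kept = filter_by_maf(toy_snps)
```

With the toy data only `snp2` passes the 0.05 cutoff, which mirrors how a raw marker set shrinks after frequency filtering.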

6.
7.
RNA sequencing (RNA-seq) not only measures total gene expression but may also measure allele-specific gene expression in diploid individuals. RNA-seq data collected from F1 reciprocal crosses in mice can powerfully dissect strain and parent-of-origin effects on allelic imbalance of gene expression. In this article, we develop a novel statistical approach to analyze RNA-seq data from F1 and inbred strains. Method development was motivated by a study of F1 reciprocal crosses derived from highly divergent mouse strains, to which we apply the proposed method. Our method jointly models the total number of reads and the number of allele-specific reads of each gene, which significantly boosts power for detecting strain and particularly parent-of-origin effects. The method deals with the overdispersion problem commonly observed in read counts and can flexibly adjust for the effects of covariates such as sex and read depth. The X chromosome in mouse presents particular challenges. As in other mammals, X chromosome inactivation silences one of the two X chromosomes in each female cell, although the choice of which chromosome is silenced can be highly skewed by alleles at the X-linked X-controlling element (Xce) and stochastic effects. Our model accounts for these chromosome-wide effects on an individual level, allowing proper analysis of chromosome X expression. Furthermore, we propose a genomic control procedure to properly control type I error for RNA-seq studies. A number of these methodological improvements can also be applied to RNA-seq data from other species as well as other types of next-generation sequencing data sets. Finally, we show through simulations that increasing the number of samples is more beneficial than increasing the library size for mapping both the strain and parent-of-origin effects. Unless sample recruiting is too expensive to conduct, we recommend sequencing more samples with lower coverage.
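
The abstract does not name the distribution used for the overdispersed allele-specific counts. A beta-binomial is one common choice for this kind of data, and the sketch below uses it purely as an assumption to show what "overdispersion" adds over a plain binomial. The parameterisation (`pi` for the expected allelic proportion, `rho` for the overdispersion) and all function names are hypothetical.

```python
from math import exp, lgamma


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def log_choose(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def beta_binomial_logpmf(k, n, pi, rho):
    """Log-probability of k reads from one allele out of n allele-specific reads.

    pi  : expected proportion of reads from that allele, 0 < pi < 1
    rho : overdispersion, 0 < rho < 1 (rho -> 0 approaches the binomial)
    """
    s = (1.0 - rho) / rho          # alpha + beta
    a, b = pi * s, (1.0 - pi) * s  # shape parameters of the Beta prior
    return log_choose(n, k) + log_beta(k + a, n - k + b) - log_beta(a, b)


def beta_binomial_pmf(k, n, pi, rho):
    """Probability mass corresponding to beta_binomial_logpmf."""
    return exp(beta_binomial_logpmf(k, n, pi, rho))


# Sanity check: the probabilities over k = 0..n sum to one.
total_mass = sum(beta_binomial_pmf(k, 10, 0.6, 0.1) for k in range(11))
```

A likelihood built from such terms, summed over genes and samples, is the kind of quantity a joint model could combine with a separate model for total read counts; how the authors actually do this is not stated in the abstract.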

8.

Background

Long-read sequencing technologies were launched a few years ago, and in contrast with short-read sequencing technologies, they offered a promise of solving assembly problems for large and complex genomes. Moreover, by providing long-range information, they could also resolve haplotype phasing. However, existing long-read technologies still have several limitations that complicate their use for most research laboratories, as well as in large and/or complex genome projects. In 2014, Oxford Nanopore released the MinION® device, a small and low-cost single-molecule nanopore sequencer, which offers the possibility of sequencing long DNA fragments.

Results

The assembly of long reads generated using the Oxford Nanopore MinION® instrument is challenging, as existing assemblers were not designed to deal with long reads exhibiting error rates close to 30%. Here, we present a hybrid approach developed to take advantage of data generated using the MinION® device. We sequenced a well-known bacterium, Acinetobacter baylyi ADP1, and applied our method to obtain a highly contiguous (one single contig) and accurate genome assembly even in repetitive regions, in contrast to an Illumina-only assembly. Our hybrid strategy was able to generate NaS (Nanopore Synthetic-long) reads up to 60 kb that aligned entirely and with no error to the reference genome and that spanned highly conserved repetitive regions. The average accuracy of NaS reads reached 99.99% without losing the initial size of the input MinION® reads.

Conclusions

We describe the NaS tool, a hybrid approach allowing the sequencing of microbial genomes using the MinION® device. Our method, based ideally on 20x and 50x of NaS and Illumina reads respectively, provides an efficient and cost-effective way of sequencing microbial or small eukaryotic genomes in a very short time even in small facilities. Moreover, we demonstrate that although the Oxford Nanopore technology is a relatively new sequencing technology, currently with a high error rate, it is already useful in the generation of high-quality genome assemblies.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1519-z) contains supplementary material, which is available to authorized users.
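
The 20x and 50x coverage targets in the Conclusions above translate into read counts through the usual relation coverage = (number of reads × mean read length) / genome size. The sketch below only illustrates that arithmetic. The genome size of 3.6 Mb and both mean read lengths are illustrative assumptions, not figures taken from the abstract.

```python
from math import ceil


def reads_needed(genome_size_bp, target_coverage, mean_read_length_bp):
    """Number of reads required to reach a target mean coverage.

    Rearranges coverage = reads * read_length / genome_size.
    """
    return ceil(genome_size_bp * target_coverage / mean_read_length_bp)


# Hypothetical inputs: a 3.6 Mb bacterial genome,
# 250 bp short reads and 10 kb synthetic long reads.
GENOME_SIZE_BP = 3_600_000
short_reads_for_50x = reads_needed(GENOME_SIZE_BP, 50, 250)
long_reads_for_20x = reads_needed(GENOME_SIZE_BP, 20, 10_000)
```

Under these assumed inputs, 50x of short reads needs 720,000 reads, while 20x of long reads needs only 7,200, which shows why long reads reach a given depth with far fewer molecules.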

9.
Spiders are the most common and predominant predators in terrestrial ecosystems. The predatory behavior of spiders affects the energy flow across the food web within an ecosystem. Traditional methods for analyzing spider diets, such as field observation, dissection and faeces analysis, are not suitable for spider experiments because of spiders' special dietary behavior. The molecular method based on specific primers for prey DNA also appears to be inefficient, despite its wide application in diet analysis. As next-generation sequencing (NGS) technology has become prevalent in many different areas, several cases of NGS-based analysis of mammal diets have been published. This study analyzed the diet differences of Pardosa pseudoannulata (Araneae: Lycosidae) in four habitats (a wetland, a tea plantation, an alpine meadow and a paddy field) by using NGS technology combined with the DNA barcode method. The results suggested that Pardosa pseudoannulata feeds on a broad range of prey, and 7 orders and 24 families of insects were detected in the four investigated habitats. Moreover, the diet diversity of Pardosa pseudoannulata was found to be greatly influenced by living environment and season. In summary, this study established an NGS-based methodology for spider diet analysis, and the results provide basic material to inform the protection and utilization of Pardosa pseudoannulata as a potential eco-friendly predator of pests.

10.
Next‐generation genomic sequencing technologies have made it possible to directly map mutations responsible for phenotypes of interest via direct sequencing. However, most mapping strategies proposed to date require some prior genetic analysis, which can be very time‐consuming even in genetically tractable organisms. Here we present a de novo method for rapidly and robustly mapping the physical location of EMS mutations by sequencing a small pooled F2 population. This method, called Next Generation Mapping (NGM), uses a chastity statistic to quantify the relative contribution of the parental mutant and mapping lines to each SNP in the pooled F2 population. It then uses this information to objectively localize the candidate mutation based on its exclusive segregation with the mutant parental line. A user‐friendly, web‐based tool for performing NGM analysis is available at http://bar.utoronto.ca/NGM. We used NGM to identify three genes involved in cell‐wall biology in Arabidopsis thaliana, and, in a power analysis, demonstrate success in test mappings using as few as ten F2 lines and a single channel of Illumina Genome Analyzer data. This strategy can easily be applied to other model organisms, and we expect that it will also have utility in crops and any other eukaryote with a completed genome sequence.
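
The idea of localizing a mutation by its exclusive segregation with the mutant parent can be sketched as a scan for the genomic window in which pooled reads come almost entirely from the mutant parental line. The code below is a simplified stand-in and not the NGM tool's chastity statistic, whose exact definition is not given in the abstract. The per-SNP read counts, the window size, and all names are hypothetical.

```python
def mutant_fraction(mutant_reads, mapping_reads):
    """Fraction of reads at one SNP that carry the mutant-parent allele."""
    total = mutant_reads + mapping_reads
    return mutant_reads / total if total else None


def best_window(snps, window_size):
    """Find the run of consecutive SNPs with the highest mean mutant fraction.

    `snps` is a list of (position, mutant_reads, mapping_reads) tuples
    sorted by position. Returns (start_pos, end_pos, mean_fraction),
    or None if there are fewer covered SNPs than `window_size`.
    """
    fractions = [(pos, mutant_fraction(m, w)) for pos, m, w in snps]
    fractions = [(pos, f) for pos, f in fractions if f is not None]
    best = None
    for i in range(len(fractions) - window_size + 1):
        chunk = fractions[i:i + window_size]
        mean = sum(f for _, f in chunk) / window_size
        if best is None or mean > best[2]:
            best = (chunk[0][0], chunk[-1][0], mean)
    return best


# Toy pooled-F2 data: reads near positions 400-500 come only from the mutant parent.
toy_pool = [
    (100, 5, 5), (200, 6, 4), (300, 9, 1),
    (400, 10, 0), (500, 10, 0), (600, 7, 3),
]
candidate = best_window(toy_pool, window_size=2)
```

In a pooled F2 population selected for a recessive mutant phenotype, unlinked regions hover around a 0.5 fraction while the causal region approaches 1.0, so the window returned here marks the candidate interval in the toy data.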

11.

Background

High light tolerance of microalgae is a desired phenotype for efficient cultivation in large scale production systems under fluctuating outdoor conditions. Outdoor cultivation requires the use of either wild-type or non-GMO derived mutant strains due to safety concerns. The identification and molecular characterization of such mutants derived from untagged forward genetics approaches was limited previously by the tedious and time-consuming methods involving techniques such as classical meiotic mapping. The combination of mapping with next generation sequencing technologies offers alternative strategies to identify genes involved in high light adaptation in untagged mutants.

Results

We used the model alga Chlamydomonas reinhardtii in a non-GMO mutation strategy without any preceding crossing step or pooled progeny to identify genes involved in the regulatory processes of high light adaptation. To generate high light tolerant mutants, wild-type cells were mutagenized only to a low extent, followed by a stringent selection. We performed whole-genome sequencing of two independent mutants, hit1 and hit2, and the parental wild type. The availability of a reference genome sequence and the removal of shared background variants between the wild-type strain and each mutant enabled us to identify two single nucleotide polymorphisms within the same gene, Cre02.g085050, hereafter called LRS1 (putative Light Response Signaling protein 1). These two independent single amino acid exchanges are both located in the putative WD40 propeller domain of the corresponding protein LRS1. Both mutants exhibited an increased rate of non-photochemical quenching (NPQ) and an improved resistance against chemically induced reactive oxygen species. In silico analyses revealed homology of LRS1 to the photoregulatory protein COP1 in plants.

Conclusions

In this work we identified the nuclear-encoded gene LRS1 as an essential factor for high light adaptation in C. reinhardtii. The causative random mutation within this gene was identified by a rapid and efficient method, avoiding any preceding crossing step, meiotic mapping, or pooled progeny. Our results open up new insights into mechanisms of high light adaptation in microalgae and at the same time provide a simplified strategy for non-GMO forward genetics, a crucial precondition that could result in the identification of key factors for economically relevant biological processes within algae.

12.
Sequence Analysis of the Genome of an Oil-Bearing Tree, Jatropha curcas L.   Total citations: 2, self-citations: 0, citations by others: 2
DNA Research, 2011, 18(1): 65-76
The whole genome of Jatropha curcas was sequenced, using a combination of the conventional Sanger method and new-generation multiplex sequencing methods. The total length of the non-redundant sequences thus obtained was 285 858 490 bp, consisting of 120 586 contigs and 29 831 singlets. They accounted for ∼95% of the gene-containing regions, with an average G + C content of 34.3%. A total of 40 929 complete and partial structures of protein-encoding genes have been deduced. Comparison with genes of other plant species indicated that 1529 (4%) of the putative protein-encoding genes are specific to the Euphorbiaceae family. A high degree of microsynteny was observed with the genome of castor bean and, to a lesser extent, with those of soybean and Arabidopsis thaliana. In parallel with genome sequencing, cDNAs derived from leaf and callus tissues were subjected to pyrosequencing, and a total of 21 225 unigene data have been generated. Polymorphism analysis using microsatellite markers developed from the genomic sequence data obtained was performed with 12 J. curcas lines collected from various parts of the world to estimate their genetic diversity. The genomic sequence and accompanying information presented here are expected to serve as valuable resources for the acceleration of fundamental and applied research with J. curcas, especially in the fields of environment-related research such as biofuel production. Further information on the genomic sequences and DNA markers is available at http://www.kazusa.or.jp/jatropha/.

13.
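High-throughput sequencing technologies and their application in microbiological research   Total citations: 18, self-citations: 0, citations by others: 18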
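The nucleic acid sequencing technology invented in the 1970s contributed enormously to the development of genomics and related disciplines. The high-throughput sequencing technologies developed at the beginning of this century, represented by Illumina's HiSeq 2000, ABI's SOLiD, and Roche's 454, have injected new vitality into genomics. After describing these technologies, this article focuses on the applications of next-generation sequencing in microbiology.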

14.

Background

Molecular characterization of highly diverse gene families can be time consuming, expensive, and difficult, especially when considering the potential for relatively large numbers of paralogs and/or pseudogenes. Here we investigate the utility of Pacific Biosciences single molecule real-time (SMRT) circular consensus sequencing (CCS) as an alternative to traditional cloning and Sanger sequencing of PCR amplicons for gene family characterization. We target vomeronasal gene receptors, one of the most diverse gene families in mammals, with the goal of better understanding intra-specific V1R diversity of the gray mouse lemur (Microcebus murinus). Our study compares intragenomic variation for two V1R subfamilies found in the mouse lemur. Specifically, we compare gene copy variation within and between two individuals of M. murinus as characterized by different methods for nucleotide sequencing. By including the same individual animal from which the M. murinus draft genome was derived, we are able to cross-validate gene copy estimates from Sanger sequencing versus CCS methods.

Results

We generated 34,088 high quality circular consensus sequences of two diverse V1R subfamilies (here referred to as V1RI and V1RIX) from two individuals of Microcebus murinus. Using a minimum threshold of 7× coverage, we recovered approximately 90% of V1RI sequences previously identified in the draft M. murinus genome (59% being identical at all nucleotide positions). When low coverage sequences were considered (i.e. < 7× coverage) 100% of V1RI sequences identified in the draft genome were recovered. At least 13 putatively novel V1R loci were also identified using CCS technology.

Conclusions

Recent upgrades to the Pacific Biosciences RS instrument have improved the CCS technology and offer an alternative to traditional sequencing approaches. Our results suggest that the Microcebus murinus V1R repertoire has been underestimated in the draft genome. In addition to providing an improved understanding of V1R diversity in the mouse lemur, this study demonstrates the utility of CCS technology for characterizing complex regions of the genome. We anticipate that long-read sequencing technologies such as PacBio SMRT will allow for the assembly of multigene family clusters and serve to more accurately characterize patterns of gene copy variation in large gene families, thus revealing novel micro-evolutionary patterns within non-model organisms.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2164-15-720) contains supplementary material, which is available to authorized users.

15.
Next generation sequencing (NGS) technologies are being used to generate whole genome sequences for a wide range of crop species. When combined with precise phenotyping methods, these technologies provide a powerful and rapid tool for identifying the genetic basis of agriculturally important traits and for predicting the breeding value of individuals in a plant breeding population. Here we summarize current trends and future prospects for utilizing NGS-based technologies to develop crops with improved trait performance and increase the efficiency of modern plant breeding. It is our hope that the application of NGS technologies to plant breeding will help us to meet the challenge of feeding a growing world population.
This article is part of the PLOS Biology Collection “The Promise of Plant Translational Research.”

16.
Genome-wide physical protein–protein interaction (PPI) mapping remains a major challenge for current technologies. Here, we report a high-efficiency BiFC-seq method, yeast-enhanced green fluorescent protein-based bimolecular fluorescence complementation (yEGFP-BiFC) coupled with next-generation DNA sequencing, for interactome mapping. We first applied the yEGFP-BiFC method to systematically investigate an intraviral network of the Ebola virus. Two-thirds (9/14) of known interactions of EBOV were recaptured, and five novel interactions were discovered. Next, we used the BiFC-seq method to map the interactome of the tumor protein p53. We identified 97 interactors of p53, more than three-quarters of which were novel. Furthermore, in a more complex background, we screened potential interactors by pooling two BiFC libraries together and revealed a network of 229 interactions among 205 proteins. These results show that BiFC-seq is a highly sensitive, rapid, and economical method for genome-wide interactome mapping.

17.
18.
19.
The pufferfish Takifugu flavidus is an economically important species due to its outstanding flavour and high market value. It has also been regarded as an excellent model for genetic study for decades. In the present study, three mate-pair libraries of the T. flavidus genome were sequenced on the SOLiD 4 next-generation sequencing platform, and the draft genome was constructed from the short reads using an assisted assembly strategy. The draft consists of 50,947 scaffolds with an N50 value of 305.7 kb, and the average GC content was 45.2%. The combined length of repetitive sequences was 26.5 Mb, which accounted for 6.87% of the genome, indicating that the compactness of the T. flavidus genome is comparable to that of the T. rubripes genome. A total of 1,253 non-coding RNA genes and 30,285 protein-encoding genes were assigned to the genome. There were 132, 775, and 394 presumptive genes playing roles in colour pattern variation, relatively slow growth, and lipid metabolism, respectively. Among them, genes involved in the microtubule-dependent transport system, angiogenesis, decapentaplegic pathway and lipid mobilization were significantly expanded in the T. flavidus genome. This draft genome provides a valuable resource for understanding and improving both fundamental and applied research with pufferfish in the future.
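
The N50 value quoted above is a standard assembly contiguity measure: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the total assembly length. A minimal sketch of that calculation follows; the function name and the toy scaffold lengths are illustrative and unrelated to the T. flavidus data.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths.

    Scaffolds are sorted from longest to shortest and accumulated until
    at least half of the total length is covered; the length of the
    scaffold that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("n50() requires at least one scaffold length")


# Toy assembly: total length 54, so the threshold is 27 (10 + 9 + 8).
toy_n50 = n50([2, 3, 4, 5, 6, 7, 8, 9, 10])
```

For the toy lengths the result is 8. A higher N50 for the same total length means that more of the assembly sits in long scaffolds.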

20.