1.
Background
High-throughput DNA sequencing technologies are generating vast amounts of data. Fast, flexible and memory-efficient implementations are needed to facilitate analyses of thousands of samples simultaneously.
Results
We present a multithreaded program suite called ANGSD. This program can calculate various summary statistics, and perform association mapping and population genetic analyses utilizing the full information in next-generation sequencing data by working directly on the raw sequencing data or by using genotype likelihoods.
Conclusions
The open source C/C++ program ANGSD is available at http://www.popgen.dk/angsd. The program is tested and validated on GNU/Linux systems. The program supports multiple input formats, including BAM and imputed Beagle genotype probability files. The program allows the user to choose between combinations of existing methods and can perform analyses that are not implemented elsewhere.
Electronic supplementary material
The online version of this article (doi:10.1186/s12859-014-0356-4) contains supplementary material, which is available to authorized users.
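To make the idea of a genotype likelihood concrete, here is a minimal Python sketch of one widely used per-site model: each read base is assumed to come from either allele with equal probability and to be miscalled with a probability derived from its Phred quality. The function name `genotype_log_likelihoods` and the toy pileup are hypothetical, and this is not ANGSD's code; the abstract does not say which likelihood models ANGSD implements.

```python
import math
from itertools import combinations_with_replacement

BASES = "ACGT"

def genotype_log_likelihoods(bases, quals):
    """Return {genotype: log10 likelihood} for the 10 diploid genotypes at one site.

    bases: observed read bases at the site, e.g. "AAGAA"
    quals: Phred quality scores, one per base
    """
    result = {}
    for a1, a2 in combinations_with_replacement(BASES, 2):
        loglik = 0.0
        for b, q in zip(bases, quals):
            err = 10 ** (-q / 10)                    # Phred score -> error probability
            p1 = 1 - err if b == a1 else err / 3     # P(base | allele 1)
            p2 = 1 - err if b == a2 else err / 3     # P(base | allele 2)
            loglik += math.log10(0.5 * p1 + 0.5 * p2)
        result[a1 + a2] = loglik
    return result

# Hypothetical pileup: five reads, one of which shows G, all at Q30.
gl = genotype_log_likelihoods("AAAGA", [30, 30, 30, 30, 30])
best = max(gl, key=gl.get)
print(best, round(gl[best], 3))
```

Working with the full likelihood vector rather than a single called genotype is what lets downstream statistics carry the uncertainty of low-depth data forward.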
3.
Minfeng Chen Pengfei Song Dan Zou Xuesong Hu Shancen Zhao Shengjie Gao Fei Ling 《PloS one》2014,9(12)
Single-cell sequencing promotes our understanding of the heterogeneity of cellular populations, including the haplotypes and genomic variability among different generations of cells. Whole-genome amplification is crucial to generate sufficient DNA fragments for single-cell sequencing projects. Using sequencing data from single sperm cells, we quantitatively compare two prevailing amplification methods that are extensively applied in single-cell sequencing: multiple displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Our results show that MALBAC, as a combination of modified MDA and tweaked PCR, has a higher level of uniformity, specificity and reproducibility.
4.
Adriana Maria Antunes Júlio Gabriel Nunes Stival Cíntia Pelegrineti Targueta Mariana Pires de Campos Telles Thannya Nascimentos Soares 《Current Genomics》2022,23(3):175
Background: Also known as simple sequence repeats (SSRs), microsatellites are highly informative molecular markers and powerful tools in genetic and ecological studies of plants. Objective: This research presents a workflow for developing microsatellite markers using genome skimming. Methods: The pipeline comprises several stages that must be performed sequentially: obtaining DNA sequences, identifying microsatellite regions, designing primers, and selecting candidate microsatellite regions to develop the markers. The efficiency of the pipeline was analyzed using Illumina sequencing data from the non-model tree species Pterodon emarginatus Vog. Results: The pipeline revealed 4,382 microsatellite regions and designed 7,411 primer pairs for P. emarginatus. However, a much larger number of microsatellite regions with the potential to develop markers were discovered from our pipeline. We selected 50 microsatellite regions with high potential for developing markers and organized 29 microsatellite regions in sets for multiplex PCR. Conclusion: The proposed pipeline is a powerful tool for fast and efficient development of microsatellite markers on a large scale in several species, especially non-model plant species.
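The "identifying microsatellite regions" stage can be illustrated with a small Python sketch that scans a sequence for perfect tandem repeats using a regular expression. The function name `find_ssrs`, the motif lengths (2 to 6 bp) and the minimum of 5 repeats are illustrative assumptions; the abstract does not state which tool or thresholds the authors used.

```python
import re

def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=5):
    """Return a list of (start, motif, repeat_count) for perfect tandem repeats."""
    # Lazy group: try the shortest motif first, then require it to repeat.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        # Skip motifs made of a single base (e.g. "AA"): those are homopolymer runs.
        if len(set(motif)) == 1:
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Toy sequence with an (AC)6 and an (AGAT)5 repeat embedded in flanking bases.
example = "TTG" + "AC" * 6 + "GGTC" + "AGAT" * 5 + "CC"
ssrs = find_ssrs(example)
for start, motif, n in ssrs:
    print(start, motif, n)
```

A real pipeline would also keep the flanking sequence of each hit, since primer design needs unique sequence on both sides of the repeat.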
8.
Mahesh Adhikari Sang Woo Kim Hyun Seung Kim Ki Young Kim Hyo Bin Park Ki Jung Kim Youn Su Lee 《The Plant Pathology Journal》2021,37(6):521
Knowledge and a better understanding of the functions of the microbial community are pivotal for crop management. This study examined bacterial community structure and diversity, including those of Acidovorax species, in watermelon-cultivated soils from different regions of South Korea. Soil samples were collected from watermelon cultivation areas across South Korea, and microbiome analysis was performed to characterize the bacterial communities, including the Acidovorax species community. Next-generation sequencing (NGS) was performed on genomic DNA extracted from 92 soil samples from 8 provinces using a fast genomic DNA extraction kit. NGS data analysis yielded a total of 39,367 operational taxonomic units (OTUs). The most dominant phylum across all soil samples was Proteobacteria (37.3%), and the most abundant genus was Acidobacterium (1.8%). To analyze species diversity among the collected soil samples, OTUs, community diversity, and the Shannon index were measured. The Shannon (9.297) and inverse Simpson (0.996) indices were highest in the greenhouse soil sample from Gyeonggi-do province (GG4). The NGS results suggest that most of the soil samples show similar trends in bacterial community composition and diversity. Environmental factors play a key role in shaping the bacterial community and its diversity. To address this, further correlation analysis between soil physical and chemical parameters and the dominant bacterial communities will be carried out to observe their interactions.
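For readers unfamiliar with the diversity indices named above, the following Python sketch computes them from a vector of OTU read counts. The function names and the toy counts are hypothetical, and the natural logarithm is an assumption; the abstract does not state the log base or the exact Simpson variant used. Note that 1/D is always at least 1, so a reported value between 0 and 1 is consistent with the 1 − D form.

```python
import math

def shannon_index(counts):
    """Shannon index H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson_dominance(counts):
    """Simpson's D = sum(p_i^2): probability that two random reads share an OTU."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)

# Hypothetical OTU table for one soil sample (reads per OTU).
otu_counts = [50, 30, 15, 5]
H = shannon_index(otu_counts)
D = simpson_dominance(otu_counts)
gini_simpson = 1 - D        # ranges from 0 to just under 1
inverse_simpson = 1 / D     # ranges from 1 to the number of OTUs
print(round(H, 3), round(gini_simpson, 3), round(inverse_simpson, 3))
```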
9.
Baltasar Mayo Caio T. C. C Rachid Ángel Alegría Analy M. O Leite Raquel S Peixoto Susana Delgado 《Current Genomics》2014,15(4):293-309
Taking Maxam-Gilbert and Sanger sequencing as the first generation, recent years have seen an explosion of newly developed sequencing strategies, which are usually referred to as next generation sequencing (NGS) techniques. NGS techniques are high-throughput and produce thousands or even millions of sequences at the same time. These sequences allow for the accurate identification of microbial taxa, including uncultivable organisms and those present in small numbers. In specific applications, NGS provides a complete inventory of all microbial operons and genes present or being expressed under different study conditions. NGS techniques are revolutionizing the field of microbial ecology and have recently been used to examine several food ecosystems. After a short introduction to the most common NGS systems and platforms, this review addresses how NGS techniques have been employed in the study of food microbiota and food fermentations, and discusses their limits and perspectives. The most important findings are reviewed, including those made in the study of the microbiota of milk, fermented dairy products, and plant-, meat- and fish-derived fermented foods. The knowledge that can be gained on microbial diversity, population structure and population dynamics via the use of these technologies could be vital in improving the monitoring and manipulation of foods and fermented food products. It should also improve their safety.
10.
Background
The assembly of viral or endosymbiont genomes from Next Generation Sequencing (NGS) data is often hampered by the predominant abundance of reads originating from the host organism. These reads increase the memory and CPU time usage of the assembler and can lead to misassemblies.
Results
We developed RAMBO-K (Read Assignment Method Based On K-mers), a tool which allows rapid and sensitive removal of unwanted host sequences from NGS datasets. Reaching a speed of 10 Megabases/s on 4 CPU cores and a standard hard drive, RAMBO-K is faster than any tool we tested, while showing a consistently high sensitivity and specificity across different datasets.
Conclusions
RAMBO-K rapidly and reliably separates reads from different species without data preprocessing. It is suitable as a straightforward standard solution for workflows dealing with mixed datasets. Binaries and source code (Java and Python) are available from http://sourceforge.net/projects/rambok/.
11.
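The prevention and treatment of genetic diseases is a major public health issue, and identifying the cause is a key step in that effort. High-throughput sequencing (also known as second-generation sequencing) offers high throughput, low cost and high accuracy; it provides direct evidence for genetic diagnosis and counseling and has become an indispensable tool for genetic testing. Third-generation sequencing has also earned a place in clinical practice thanks to its unique advantage of long read lengths. Second- and third-generation sequencing technologies each have their own characteristics and complement one another, and a variety of sequencing strategies are available to meet different clinical testing needs. On this basis, this article reviews the principles and classification of second- and third-generation sequencing technologies and the progress in their application to genetic diagnosis, with the aim of offering ideas and guidance for selecting clinical sequencing strategies.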
12.
Penelope K. Lindeque Helen E. Parry Rachel A. Harmer Paul J. Somerfield Angus Atkinson 《PloS one》2013,8(11)
Background
Zooplankton play an important role in our oceans, in biogeochemical cycling and in providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long-term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel.
Methodology/Principal Findings
Plankton net hauls (200 µm) were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units (OTUs) using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order; <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups.
Conclusions
Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton, it may become increasingly attractive in future if sequence reference libraries of accurately identified individuals are better populated.
13.
Aarti Desai Veer Singh Marwah Akshay Yadav Vineet Jha Kishor Dhaygude Ujwala Bangar Vivek Kulkarni Abhay Jere 《PloS one》2013,8(4)
Next Generation Sequencing (NGS) is a disruptive technology that has found widespread acceptance in the life sciences research community. The high throughput and low cost of sequencing have encouraged researchers to undertake ambitious genomic projects, especially in de novo genome sequencing. Currently, NGS systems generate sequence data as short reads, and de novo genome assembly using these short reads is computationally very intensive. Due to the lower cost of sequencing and higher throughput, NGS systems now provide the ability to sequence genomes at high depth. However, no report is currently available highlighting the impact of high sequence depth on genome assembly using real data sets and multiple assembly algorithms. Recently, some studies have evaluated the impact of sequence coverage, error rate and average read length on genome assembly using multiple assembly algorithms; however, these evaluations were performed using simulated datasets. One limitation of using simulated datasets is that variables such as error rates, read length and coverage, which are known to impact genome assembly, are carefully controlled. Hence, this study was undertaken to identify the minimum depth of sequencing required for de novo assembly of different-sized genomes using graph-based assembly algorithms and real datasets. Illumina reads for E. coli (4.6 Mb), S. kudriavzevii (11.18 Mb) and C. elegans (100 Mb) were assembled using SOAPdenovo, Velvet, ABySS, Meraculous and IDBA-UD. Our analysis shows that 50X is the optimum read depth for assembling these genomes using all assemblers except Meraculous, which requires 100X read depth. Moreover, our analysis shows that de novo assembly from 50X read data requires only 6–40 GB RAM depending on the genome size and assembly algorithm used. We believe that this information can be extremely valuable for researchers in designing experiments and multiplexing, which will enable optimum utilization of sequencing as well as analysis resources.
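The relationship between read depth, read length and genome size that underlies such experiment planning is simple arithmetic: depth = (number of reads × read length) / genome size. The Python sketch below applies it to the three genome sizes quoted above. The function names are hypothetical and the 100 bp read length is an assumption chosen for illustration; the abstract does not give the read lengths used.

```python
import math

def reads_needed(genome_size_bp, depth, read_length_bp):
    """Number of reads required to reach a target average depth."""
    return math.ceil(depth * genome_size_bp / read_length_bp)

def depth_from_reads(n_reads, read_length_bp, genome_size_bp):
    """Average depth of coverage obtained from a given number of reads."""
    return n_reads * read_length_bp / genome_size_bp

# Genome sizes in bp as quoted in the abstract.
genomes = {
    "E. coli": 4_600_000,
    "S. kudriavzevii": 11_180_000,
    "C. elegans": 100_000_000,
}
for name, size in genomes.items():
    # 100 bp reads are an assumed value for illustration only.
    print(name, reads_needed(size, depth=50, read_length_bp=100))
```

The same calculation, run in reverse with `depth_from_reads`, indicates how many samples can share one sequencing run while still reaching the target depth.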
14.
Mechanism of chimera formation during the Multiple Displacement Amplification reaction
Background
Multiple Displacement Amplification (MDA) is a method used for amplifying limiting DNA sources. The high molecular weight amplified DNA is ideal for DNA library construction. While this has enabled genomic sequencing from one or a few cells of unculturable microorganisms, the process is complicated by the tendency of MDA to generate chimeric DNA rearrangements in the amplified DNA. Determining the source of the DNA rearrangements would be an important step towards reducing or eliminating them.
15.
Noncoding DNA sequences (NCS) have attracted much attention recently due to their functional potential. Here we attempted to reveal the functional roles of noncoding sequences from the point of view of natural selection, which typically indicates the functional potential of genomic elements. We analyzed nearly 37 million single nucleotide polymorphisms (SNPs) from Phase I data of the 1000 Genomes Project. We estimated a series of key parameters of population genetics and molecular evolution to characterize sequence variation in the noncoding genome within and between populations, and identified the footprints of natural selection in NCS in worldwide human populations. Our results showed that purifying selection is prevalent and there is substantial constraint on variation in NCS, while positive selection is more likely to be specific to particular genomic regions and regional populations. Intriguingly, we observed a larger fraction of non-conserved NCS variants with lower derived allele frequency in the genome, indicating possible functional gain of non-conserved NCS. Notably, NCS elements are enriched for potentially functional markers such as eQTLs, TF motifs, and DNase I footprints in the genome. More interestingly, some NCS variants associated with diseases such as Alzheimer's disease, Type 1 diabetes, and immune-related bowel disorder (IBD) showed signatures of positive selection, although the majority of NCS variants reported as risk alleles by genome-wide association studies showed signatures of negative selection. Our analyses provided compelling evidence of natural selection forces on noncoding sequences in the human genome and advanced our understanding of their functional potential, which plays important roles in disease etiology and human evolution.
17.
Next Generation Sequencing to Define Prokaryotic and Fungal Diversity in the Bovine Rumen
Derrick E. Fouts Sebastian Szpakowski Janaki Purushe Manolito Torralba Richard C. Waterman Michael D. MacNeil Leeson J. Alexander Karen E. Nelson 《PloS one》2012,7(11)
A combination of Sanger and 454 sequences of small subunit rRNA loci was used to interrogate microbial diversity in the bovine rumen of 12 cows consuming a forage diet. Observed bacterial species richness, based on the V1–V3 region of the 16S rRNA gene, was between 1,903 and 2,432 species-level operational taxonomic units (OTUs) when 5,520 reads were sampled per animal. Eighty percent of species-level OTUs were dominated by members of the orders Clostridiales, Bacteroidales, Erysipelotrichales and unclassified TM7. Abundance of Prevotella species varied widely among the 12 animals. Archaeal species richness, also based on 16S rRNA, was between 8 and 13 OTUs, representing 5 genera. The majority of archaeal OTUs (84%) found in this study were previously observed in public databases, with only two new OTUs discovered. Observed rumen fungal species richness, based on the 18S rRNA gene, was between 21 and 40 OTUs with 98.4–99.9% of OTUs represented by more than one read, using Good’s coverage. Examination of the fungal community identified numerous novel groups. Prevotella and Tannerella were overrepresented in the liquid fraction of the rumen while Butyrivibrio and Blautia were significantly overrepresented in the solid fraction of the rumen. No statistical difference was observed between the liquid and solid fractions in biodiversity of archaea and fungi. The survey of microbial communities and analysis of cross-domain correlations suggested there is a far greater extent of microbial diversity in the bovine rumen than previously appreciated, and that next generation sequencing technologies promise to reveal novel species, interactions and pathways that can be studied further in order to better understand how rumen microbial community structure and function affect ruminant feed efficiency, biofuel production, and environmental impact.
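Good's coverage, mentioned above, estimates how completely a sample has been sequenced from the number of singleton OTUs: C = 1 − F1/N, where F1 is the number of OTUs seen exactly once and N is the total number of reads. The Python sketch below shows the calculation on a made-up count vector; the function name and the counts are hypothetical and are not taken from the study.

```python
def goods_coverage(otu_counts):
    """Good's coverage estimator: 1 - (singleton OTUs / total reads)."""
    n_reads = sum(otu_counts)
    singletons = sum(1 for c in otu_counts if c == 1)
    return 1 - singletons / n_reads

# Hypothetical sample: 5 OTUs, two of them observed only once, 20 reads in total.
counts = [10, 5, 1, 1, 3]
coverage = goods_coverage(counts)
print(f"{coverage:.1%}")
```

A value close to 1 suggests that further sequencing of the same sample would mostly return OTUs that have already been observed.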
18.
Alan J. Fox Matthew C. Hiemenz David B. Lieberman Shrey Sukhadia Barnett Li Joseph Grubb Patrick Candrea Karthik Ganapathy Jianhua Zhao David Roth Evan Alley Alison Loren Jennifer J. D. Morrissette 《Journal of visualized experiments : JoVE》2016,(115)
As our understanding of the driver mutations necessary for initiation and progression of cancers improves, we gain critical information on how specific molecular profiles of a tumor may predict responsiveness to therapeutic agents or provide knowledge about prognosis. At our institution a tumor genotyping program was established as part of routine clinical care, screening both hematologic and solid tumors for a wide spectrum of mutations using two next-generation sequencing (NGS) panels: a custom, 33-gene hematological malignancies panel for use with peripheral blood and bone marrow, and a commercially produced solid tumor panel for use with formalin-fixed paraffin-embedded tissue that targets 47 genes commonly mutated in cancer. Our workflow includes a pathologist review of the biopsy to ensure there is an adequate amount of tumor for the assay, followed by customized DNA extraction from the specimen. Quality control of the specimen includes checks of quantity, quality and integrity, and only after the extracted DNA passes these metrics is an amplicon library generated and sequenced. The resulting data are analyzed through an in-house bioinformatics pipeline and the variants are reviewed and interpreted for pathogenicity. Here we provide a snapshot of the utility of each panel using two clinical cases to provide insight into how a well-designed NGS workflow can contribute to optimizing clinical outcomes.
19.
Douglas G. Ward Laura Baxter Naheema S. Gordon Sascha Ott Richard S. Savage Andrew D. Beggs Jonathan D. James Jennifer Lickiss Shaun Green Yvonne Wallis Wenbin Wei Nicholas D. James Maurice P. Zeegers KK Cheng Glenn M. Mathews Prashant Patel Michael Griffiths Richard T. Bryan 《PloS one》2016,11(2)
Background
Highly sensitive and specific urine-based tests to detect either primary or recurrent bladder cancer have proved elusive to date. Our ever increasing knowledge of the genomic aberrations in bladder cancer should enable the development of such tests based on urinary DNA.
Methods
DNA was extracted from urine cell pellets, and PCR was used to amplify the regions of the TERT promoter and the coding regions of FGFR3, PIK3CA, TP53, HRAS, KDM6A and RXRA that are frequently mutated in bladder cancer. The PCR products were barcoded and pooled, and paired-end 2 x 250 bp sequencing was performed on an Illumina MiSeq. Urinary DNA was analysed from 20 non-cancer controls, 120 primary bladder cancer patients (41 pTa, 40 pT1, 39 pT2+) and 91 bladder cancer patients post-TURBT (89 cancer-free).
Results
Despite the small quantities of DNA extracted from some urine cell pellets, 96% of the samples yielded mean read depths >500. Analysing only previously reported point mutations, TERT mutations were found in 55% of patients with bladder cancer (independent of stage), FGFR3 mutations in 30%, PIK3CA mutations in 14% and TP53 mutations in 12%. Overall, these previously reported bladder cancer mutations were detected in 86 out of 122 bladder cancer patients (70% sensitivity) and in only 3 out of 109 patients with no detectable bladder cancer (97% specificity).
Conclusion
This simple, cost-effective approach could be used for the non-invasive surveillance of patients with non-muscle-invasive bladder cancers harbouring these mutations. The method has a low DNA input requirement and can detect low levels of mutant DNA in a large excess of normal DNA. These genes represent a minimal biomarker panel to which extra markers could be added to develop a highly sensitive diagnostic test for bladder cancer.
20.
Receptor Displacement in the Cell Membrane by Hydrodynamic Force Amplification through Nanoparticles
Silvan Türkcan Maximilian U. Richly Cedric I. Bouzigues Jean-Marc Allain Antigoni Alexandrou 《Biophysical journal》2013,105(1):116-126
We introduce an intrinsically multiplexed and easy to implement method to apply an external force to a biomolecule and thus probe its interaction with a second biomolecule or, more generally, its environment (for example, the cell membrane). We take advantage of the hydrodynamic interaction with a controlled fluid flow within a microfluidic channel to apply a force. By labeling the biomolecule with a nanoparticle that acts as a kite and increases the hydrodynamic interaction with the fluid, the drag induced by convection becomes important. We use this approach to track the motion of single membrane receptors, the Clostridium perfringens ε-toxin (CPεT) receptors that are confined in lipid raft platforms, and probe their interaction with the environment. Under external force, we observe displacements over distances up to 10 times the confining domain diameter due to elastic deformation of a barrier and return to the initial position after the flow is stopped. Receptors can also jump over such barriers. Analysis of the receptor motion characteristics before, during, and after a force is applied via the flow indicates that the receptors are displaced together with their confining raft platform. Experiments before and after incubation with latrunculin B reveal that the barriers are part of the actin cytoskeleton and have an average spring constant of 2.5 ± 0.6 pN/μm before vs. 0.6 ± 0.2 pN/μm after partial actin depolymerization. Our data, in combination with our previous work demonstrating that the ε-toxin receptor confinement is not influenced by the cytoskeleton, imply that it is the raft platform and its constituents rather than the receptor itself that encounters and deforms the barriers formed by the actin cytoskeleton.