
1.

Non-referenced genome assembly from epigenomic short-read data

Antony Kaspi Mark Ziemann Samuel T Keating Ishant Khurana Timothy Connor Briana Spolding Adrian Cooper Ross Lazarus Ken Walder Paul Zimmet Assam El-Osta 《Epigenetics》2014,9(10):1329-1338

Current computational methods used to analyze changes in DNA methylation and chromatin modification rely on sequenced genomes. Here we describe a pipeline for the detection of these changes from short-read sequence data that does not require a reference genome. Open source software packages were used for sequence assembly, alignment, and measurement of differential enrichment. The method was evaluated by comparing results with reference-based results showing a strong correlation between chromatin modification and gene expression. We then used our de novo sequence assembly to build the DNA methylation profile for the non-referenced Psammomys obesus genome. The pipeline described uses open source software for fast annotation and visualization of unreferenced genomic regions from short-read data.

2.

X-MATE: a flexible system for mapping short read data

Wood DL Xu Q Pearson JV Cloonan N Grimmond SM 《Bioinformatics (Oxford, England)》2011,27(4):580-581

SUMMARY: Accurate and complete mapping of short-read sequencing to a reference genome greatly enhances the discovery of biological results and improves statistical predictions. We recently presented RNA-MATE, a pipeline for the recursive mapping of RNA-Seq datasets. With the rapid increase in genome re-sequencing projects, progression of available mapping software and the evolution of file formats, we now present X-MATE, an updated version of RNA-MATE, capable of mapping both RNA-Seq and DNA datasets and with improved performance, output file formats, configuration files, and flexibility in core mapping software. AVAILABILITY: Executables, source code, junction libraries, test data and results and the user manual are available from http://grimmond.imb.uq.edu.au/X-MATE/.

3.

An Integrated Pipeline for de Novo Assembly of Microbial Genomes

Andrew Tritt Jonathan A. Eisen Marc T. Facciotti Aaron E. Darling 《PloS one》2012,7(9)

Remarkable advances in DNA sequencing technology have created a need for de novo genome assembly methods tailored to work with the new sequencing data types. Many such methods have been published in recent years, but assembling raw sequence data to obtain a draft genome has remained a complex, multi-step process, involving several stages of sequence data cleaning, error correction, assembly, and quality control. Successful application of these steps usually requires intimate knowledge of a diverse set of algorithms and software. We present an assembly pipeline called A5 (Andrew And Aaron's Awesome Assembly pipeline) that simplifies the entire genome assembly process by automating these stages, by integrating several previously published algorithms with new algorithms for quality control and automated assembly parameter selection. We demonstrate that A5 can produce assemblies of quality comparable to a leading assembly algorithm, SOAPdenovo, without any prior knowledge of the particular genome being assembled and without the extensive parameter tuning required by the other assembly algorithm. In particular, the assemblies produced by A5 exhibit 50% or more reduction in broken protein coding sequences relative to SOAPdenovo assemblies. The A5 pipeline can also assemble Illumina sequence data from libraries constructed by the Nextera (transposon-catalyzed) protocol, which have markedly different characteristics to mechanically sheared libraries. Finally, A5 has modest compute requirements, and can assemble a typical bacterial genome on current desktop or laptop computer hardware in under two hours, depending on depth of coverage.

4.

Dynamics of Nucleosome Assembly and Effects of DNA Methylation

Ju Yeon Lee Jaehyoun Lee Hongjun Yue Tae-Hee Lee 《The Journal of biological chemistry》2015,290(7):4291-4303

The nucleosome is the fundamental packing unit of the eukaryotic genome, and CpG methylation is an epigenetic modification associated with gene repression and silencing. We investigated nucleosome assembly mediated by histone chaperone Nap1 and the effects of CpG methylation based on three-color single molecule FRET measurements, which enabled direct monitoring of histone binding in the context of DNA wrapping. According to our observation, (H3-H4)₂ tetramer incorporation must precede H2A-H2B dimer binding, which is independent of DNA termini wrapping. Upon CpG methylation, (H3-H4)₂ tetramer incorporation and DNA termini wrapping are facilitated, whereas proper incorporation of H2A-H2B dimers is inhibited. We suggest that these changes are due to rigidified DNA and increased random binding of histones to DNA. According to the results, CpG methylation expedites nucleosome assembly in the presence of abundant DNA and histones, which may help facilitate gene packaging in chromatin. The results also indicate that the slowest steps in nucleosome assembly are DNA termini wrapping and tetramer positioning, both of which are affected heavily by changes in the physical properties of DNA.

5.

A biologist's guide to de novo genome assembly using next-generation sequence data: A test with fungal genomes

Haridas S Breuill C Bohlmann J Hsiang T 《Journal of microbiological methods》2011,86(3):368-375

We offer a guide to de novo genome assembly using sequence data generated by the Illumina platform for biologists working with fungi or other organisms whose genomes are less than 100 Mb in size. The guide requires no familiarity with sequencing assembly technology or associated computer programs. It defines commonly used terms in genome sequencing and assembly; provides examples of assembling short-read genome sequence data for four strains of the fungus Grosmannia clavigera using four assembly programs; gives examples of protocols and software; and presents a commented flowchart that extends from DNA preparation for submission to a sequencing center, through to processing and assembly of the raw sequence reads using freely available operating systems and software.

6.

Accuracy of de novo assembly of DNA sequences from double‐digest libraries varies substantially among software

Melanie E. F. LaCava Ellen O. Aikens Libby C. Megna Gregg Randolph Charley Hubbard C. Alex Buerkle 《Molecular ecology resources》2020,20(2):360-370

Advances in DNA sequencing have made it feasible to gather genomic data for non‐model organisms and large sets of individuals, often using methods for sequencing subsets of the genome. Several of these methods sequence DNA associated with endonuclease restriction sites (various RAD and GBS methods). For use in taxa without a reference genome, these methods rely on de novo assembly of fragments in the sequencing library. Many of the software options available for this application were originally developed for other assembly types and we do not know their accuracy for reduced representation libraries. To address this important knowledge gap, we simulated data from the Arabidopsis thaliana and Homo sapiens genomes and compared de novo assemblies by six software programs that are commonly used or promising for this purpose (ABySS, CD‐HIT, Stacks, Stacks2, Velvet and VSEARCH). We simulated different mutation rates and types of mutations, and then applied the six assemblers to the simulated data sets, varying assembly parameters. We found substantial variation in software performance across simulations and parameter settings. ABySS failed to recover any true genome fragments, and Velvet and VSEARCH performed poorly for most simulations. Stacks and Stacks2 produced accurate assemblies of simulations containing SNPs, but the addition of insertion and deletion mutations decreased their performance. CD‐HIT was the only assembler that consistently recovered a high proportion of true genome fragments. Here, we demonstrate the substantial difference in the accuracy of assemblies from different software programs and the importance of comparing assemblies that result from different parameter settings.
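As an illustration of what "recovering a high proportion of true genome fragments" could mean in practice, the sketch below scores an assembly by the fraction of simulated fragments that reappear exactly (on either strand) among the assembled contigs. This is a simplified assumption for illustration only; the abstract does not state the metric the authors used, and the function names here are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    complement = str.maketrans("ACGT", "TGCA")
    return seq.translate(complement)[::-1]


def recovered_fraction(true_fragments: list[str], contigs: list[str]) -> float:
    """Fraction of true fragments found as an exact contig on either strand.

    Assumption: exact string identity is the matching criterion. A real
    evaluation would more likely use alignment with a mismatch tolerance.
    """
    if not true_fragments:
        return 0.0
    forward = {c.upper() for c in contigs}
    both_strands = forward | {reverse_complement(c) for c in forward}
    hits = sum(1 for frag in true_fragments if frag.upper() in both_strands)
    return hits / len(true_fragments)


if __name__ == "__main__":
    truth = ["AAGT", "CCGA", "GGTT"]
    assembled = ["AAGT", "TCGG"]  # TCGG is the reverse complement of CCGA
    print(recovered_fraction(truth, assembled))
```

Running the same score across several parameter settings for each assembler would give a simple way to compare them, in the spirit of the comparison the abstract describes.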

7.

Improved reduced representation bisulfite sequencing for epigenomic profiling of clinical samples

Yew Kok Lee Shengnan Jin Shiwei Duan Yen Ching Lim Desmond PY Ng Xueqin Michelle Lin George SH Yeo Chunming Ding 《Biological procedures online》2014,16(1):1-9

Background

DNA methylation plays crucial roles in epigenetic gene regulation in normal development and disease pathogenesis. Efficient and accurate quantification of DNA methylation at single base resolution can greatly advance the knowledge of disease mechanisms and be used to identify potential biomarkers. We developed an improved pipeline based on reduced representation bisulfite sequencing (RRBS) for cost-effective genome-wide quantification of DNA methylation at single base resolution. A selection of two restriction enzymes (TaqαI and MspI) enables a more unbiased coverage of genomic regions of different CpG densities. We further developed a highly automated software package to analyze bisulfite sequencing results from the Solexa GAIIx system.

Results

With two sequencing lanes, we were able to quantify ~1.8 million individual CpG sites at a minimum sequencing depth of 10. Overall, about 76.7% of CpG islands, 54.9% of CpG island shores and 52.2% of core promoters in the human genome were covered with at least 3 CpG sites per region.
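The two thresholds mentioned here (a minimum sequencing depth of 10 per CpG site, and at least 3 CpG sites per region) can be sketched as a small filter. This is a minimal illustration under assumed inputs, not the authors' software: the data layout (per-site counts of methylated and unmethylated reads, half-open region coordinates) and the function names are hypothetical.

```python
Site = tuple[str, int]  # (chromosome, position)


def methylation_levels(
    counts: dict[Site, tuple[int, int]], min_depth: int = 10
) -> dict[Site, float]:
    """Per-CpG methylation level for sites with enough read depth.

    counts maps each site to (methylated_reads, unmethylated_reads).
    Sites with total depth below min_depth are dropped.
    """
    levels = {}
    for site, (meth, unmeth) in counts.items():
        depth = meth + unmeth
        if depth >= min_depth:
            levels[site] = meth / depth
    return levels


def region_is_covered(
    region: tuple[str, int, int], sites: set[Site], min_sites: int = 3
) -> bool:
    """True if the region (chrom, start, end), end exclusive, holds enough sites."""
    chrom, start, end = region
    n = sum(1 for c, pos in sites if c == chrom and start <= pos < end)
    return n >= min_sites


if __name__ == "__main__":
    counts = {
        ("chr1", 100): (8, 2),   # depth 10, kept
        ("chr1", 120): (3, 3),   # depth 6, dropped
        ("chr1", 150): (0, 12),  # depth 12, kept
        ("chr1", 180): (15, 5),  # depth 20, kept
    }
    levels = methylation_levels(counts)
    print(levels)
    print(region_is_covered(("chr1", 90, 200), set(levels)))
```

Applying the depth filter before counting sites per region means a region is reported as covered only when it contains enough reliably quantified CpGs.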

Conclusions

With this new pipeline, it is now possible to perform whole-genome DNA methylation analysis at single base resolution for a large number of samples for understanding how DNA methylation and its changes are involved in development, differentiation, and disease pathogenesis.

8.

Empirical methods for controlling false positives and estimating confidence in ChIP-Seq peaks

David A Nix Samir J Courdy Kenneth M Boucher 《BMC bioinformatics》2008,9(1):523


9.

DNA methylation and epigenetic control of cellular differentiation

《Cell cycle (Georgetown, Tex.)》2013,12(19):3880-3883


10.

A Distinct DNA-Methylation Boundary in the 5′-Upstream Sequence of the FMR1 Promoter Binds Nuclear Proteins and Is Lost in Fragile X Syndrome

Anja Naumann Stefanie Weber Walter Doerfler 《American journal of human genetics》2009,85(5):606-616

We have discovered a distinct DNA-methylation boundary at a site between 650 and 800 nucleotides upstream of the CGG repeat in the first exon of the human FMR1 gene. This boundary, identified by bisulfite sequencing, is present in all human cell lines and cell types, irrespective of age, gender, and developmental stage. The same boundary is found also in different mouse tissues, although sequence homology between human and mouse in this region is only 46.7%. This boundary sequence, in both the unmethylated and the CpG-methylated modes, binds specifically to nuclear proteins from human cells. We interpret this boundary as carrying a specific chromatin structure that delineates a hypermethylated area in the genome from the unmethylated FMR1 promoter and protecting it from the spreading of DNA methylation. In individuals with the fragile X syndrome (FRAXA), the methylation boundary is lost; methylation has penetrated into the FMR1 promoter and inactivated the FMR1 gene. In one FRAXA genome, the upstream terminus of the methylation boundary region exhibits decreased methylation as compared to that of healthy individuals. This finding suggests changes in nucleotide sequence and chromatin structure in the boundary region of this FRAXA individual. In the completely de novo methylated FMR1 promoter, there are isolated unmethylated CpG dinucleotides that are, however, not found when the FMR1 promoter and upstream sequences are methylated in vitro with the bacterial M-SssI DNA methyltransferase. They may arise during de novo methylation only in DNA that is organized in chromatin and be due to the binding of specific proteins.

11.

Scaffolder - software for manual genome scaffolding

Barton MD Barton HA 《Source code for biology and medicine》2012,7(1):4-6

ABSTRACT: BACKGROUND: The assembly of next-generation short-read sequencing data can result in a fragmented non-contiguous set of genomic sequences. Therefore a common step in a genome project is to join neighbouring sequence regions together and fill gaps. This scaffolding step is non-trivial and requires manually editing large blocks of nucleotide sequence. Joining these sequences together also hides the source of each region in the final genome sequence. Taken together these considerations may make reproducing or editing an existing genome scaffold difficult. METHODS: The software outlined here, "Scaffolder," is implemented in the Ruby programming language and can be installed via the RubyGems software management system. Genome scaffolds are defined using YAML - a data format which is both human and machine-readable. Command line binaries and extensive documentation are available. RESULTS: This software allows a genome build to be defined in terms of the constituent sequences using a relatively simple syntax. This syntax further allows unknown regions to be specified and additional sequence to be used to fill known gaps in the scaffold. Defining the genome construction in a file makes the scaffolding process reproducible and easier to edit compared with large FASTA nucleotide sequences. CONCLUSIONS: Scaffolder is easy-to-use genome scaffolding software which promotes reproducibility and continuous development in a genome project. Scaffolder can be found at http://next.gs.
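The idea of defining a genome build declaratively, as an ordered list of constituent sequences and unknown regions, can be sketched as follows. This is not Scaffolder's actual YAML syntax or its Ruby API: the entry keys (`type`, `source`, `start`, `stop`, `length`) and the function name are hypothetical, chosen only to show why a build file is easier to reproduce and edit than a hand-joined FASTA sequence.

```python
def build_scaffold(entries: list[dict], sequences: dict[str, str]) -> str:
    """Concatenate a scaffold from an ordered, declarative list of entries.

    Hypothetical entry shapes (assumed for this sketch):
      {"type": "sequence", "source": name, "start": int, "stop": int}
          start/stop are optional, 1-based, inclusive
      {"type": "unresolved", "length": int}
          an unknown region, emitted as a run of N
    """
    parts = []
    for entry in entries:
        if entry["type"] == "sequence":
            seq = sequences[entry["source"]]
            start = entry.get("start", 1)
            stop = entry.get("stop", len(seq))
            parts.append(seq[start - 1:stop])
        elif entry["type"] == "unresolved":
            parts.append("N" * entry["length"])
        else:
            raise ValueError(f"unknown entry type: {entry['type']!r}")
    return "".join(parts)


if __name__ == "__main__":
    contigs = {"contig1": "ACGT", "contig2": "TTGA"}
    layout = [
        {"type": "sequence", "source": "contig1"},
        {"type": "unresolved", "length": 3},
        {"type": "sequence", "source": "contig2", "start": 2, "stop": 3},
    ]
    print(build_scaffold(layout, contigs))
```

Because the layout records which source each region came from, changing the scaffold means editing one entry and rebuilding, rather than editing nucleotide text by hand.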

12.

Tiling Assembly: a new tool for reference annotation-independent transcript assembly and novel gene identification by RNA-sequencing

Kenneth A. Watanabe Arielle Homayouni Tara Tufano Jennifer Lopez Patricia Ringler Paul Rushton Qingxi J. Shen 《DNA research》2015,22(5):319-329


13.

Plastid Genome Assembly Using Long-read data

Wenbin Zhou Carolina E. Armijos Chaehee Lee Ruisen Lu Jeremy Wang Tracey A. Ruhlman Robert K. Jansen Alan M. Jones Corbin D. Jones 《Molecular ecology resources》2023,23(6):1442-1457

Although plastid genome (plastome) structure is highly conserved across most seed plants, investigations during the past two decades have revealed several disparately related lineages that experienced substantial rearrangements. Most plastomes contain a large inverted repeat and two single-copy regions, and a few dispersed repeats; however, the plastomes of some taxa harbour long repeat sequences (>300 bp). These long repeats make it challenging to assemble complete plastomes using short-read data, leading to misassemblies and consensus sequences with spurious rearrangements. Single-molecule, long-read sequencing has the potential to overcome these challenges, yet there is no consensus on the most effective method for accurately assembling plastomes using long-read data. We generated a pipeline, plastid Genome Assembly Using Long-read data (ptGAUL), to address the problem of plastome assembly using long-read data from Oxford Nanopore Technologies (ONT) or Pacific Biosciences platforms. We demonstrated the efficacy of the ptGAUL pipeline using 16 published long-read data sets. We showed that ptGAUL quickly produces accurate and unbiased assemblies using only ~50× coverage of plastome data. Additionally, we deployed ptGAUL to assemble four new Juncus (Juncaceae) plastomes using ONT long reads. Our results revealed many long repeats and rearrangements in Juncus plastomes compared with basal lineages of Poales. The ptGAUL pipeline is available on GitHub: https://github.com/Bean061/ptgaul.

14.

Large-insert genome analysis technology detects structural variation in Pseudomonas aeruginosa clinical strains from cystic fibrosis patients

Hayden HS Gillett W Saenphimmachak C Lim R Zhou Y Jacobs MA Chang J Rohmer L D'Argenio DA Palmieri A Levy R Haugen E Wong GK Brittnacher MJ Burns JL Miller SI Olson MV Kaul R 《Genomics》2008,91(6):530-537

Large-insert genome analysis (LIGAN) is a broadly applicable, high-throughput technology designed to characterize genome-scale structural variation. Fosmid paired-end sequences and DNA fingerprints from a query genome are compared to a reference sequence using the Genomic Variation Analysis (GenVal) suite of software tools to pinpoint locations of insertions, deletions, and rearrangements. Fosmids spanning regions that contain new structural variants can then be sequenced. Clonal pairs of Pseudomonas aeruginosa isolates from four cystic fibrosis patients were used to validate the LIGAN technology. Approximately 1.5 Mb of inserted sequences were identified, including 743 kb containing 615 ORFs that are absent from published P. aeruginosa genomes. Six rearrangement breakpoints and 220 kb of deleted sequences were also identified. Our study expands the “genome universe” of P. aeruginosa and validates a technology that complements emerging, short-read sequencing methods that are better suited to characterizing single-nucleotide polymorphisms than structural variation.

15.

RFTS-deleted DNMT1 enhances tumorigenicity with focal hypermethylation and global hypomethylation

Bo-Kuan Wu Szu-Chieh Mei 《Cell cycle (Georgetown, Tex.)》2014,13(20):3222-3231

Site-specific hypermethylation of tumor suppressor genes accompanied by genome-wide hypomethylation are epigenetic hallmarks of malignancy. However, the molecular mechanisms that drive these linked changes in DNA methylation remain obscure. DNA methyltransferase 1 (DNMT1), the principal enzyme responsible for maintaining methylation patterns, is commonly dysregulated in tumors. Replication foci targeting sequence (RFTS) is an N-terminal domain of DNMT1 that inhibits DNA-binding and catalytic activity, suggesting that RFTS deletion would result in a gain of DNMT1 function. However, a substantial body of data suggested that RFTS is required for DNMT1 activity. Here, we demonstrate that deletion of RFTS alters DNMT1-dependent DNA methylation during malignant transformation. Compared to full-length DNMT1, ectopic expression of hyperactive DNMT1-ΔRFTS caused greater malignant transformation and enhanced promoter methylation with condensed chromatin structure that silenced DAPK and DUOX1 expression. Simultaneously, deletion of RFTS impaired DNMT1 chromatin association with pericentromeric Satellite 2 (SAT2) repeat sequences and produced DNA demethylation at SAT2 repeats and globally. To our knowledge, RFTS-deleted DNMT1 is the first single factor that can reprogram focal hypermethylation and global hypomethylation in parallel during malignant transformation. Our evidence suggests that the RFTS domain of DNMT1 is a target responsible for epigenetic changes in cancer.

16.

Histone modifications and mRNA expression in the inner cell mass and trophectoderm of bovine blastocysts

Doris Herrmann John Arne Dahl Andrea Lucas-Hahn Philippe Collas Heiner Niemann 《Epigenetics》2013,8(3):281-289


17.

Single‐base methylome analysis reveals dynamic epigenomic differences associated with water deficit in apple