1.

Characterising chromosome rearrangements: recent technical advances in molecular cytogenetics

Le Scouarnec S Gribble SM 《Heredity》2012,108(1):75-85

Genomic rearrangements can result in losses, amplifications, translocations and inversions of DNA fragments, thereby modifying genome architecture and potentially having clinical consequences. Many genomic disorders caused by structural variation were initially uncovered by early cytogenetic methods. The last decade has seen significant progress in molecular cytogenetic techniques, allowing rapid and precise detection of structural rearrangements on a whole-genome scale. The high resolution attainable with these recently developed techniques has also uncovered the role of structural variants in normal genetic variation alongside single-nucleotide polymorphisms (SNPs). We describe how array-based comparative genomic hybridisation, SNP arrays, array painting and next-generation sequencing analytical methods (read depth, read pair and split read) allow the extensive characterisation of chromosome rearrangements in human genomes.
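
As a rough illustration of the read-pair approach named in this abstract, the sketch below labels a read pair as discordant when its mapped insert size departs from what the sequencing library would predict. The function name, the three-standard-deviation cutoff and the library values are assumptions made for illustration; none of them comes from the review.

```python
def classify_insert_size(insert_size, lib_mean, lib_sd, n_sd=3.0):
    """Label a read pair by comparing its mapped insert size with the library.

    lib_mean and lib_sd describe the expected insert-size distribution;
    n_sd is an assumed cutoff, not a value taken from the review.
    """
    if insert_size > lib_mean + n_sd * lib_sd:
        # Mates map farther apart than expected: sequence may be missing in the sample.
        return "deletion-like"
    if insert_size < lib_mean - n_sd * lib_sd:
        # Mates map closer together than expected: sequence may have been gained.
        return "insertion-like"
    return "concordant"
```

A real caller would cluster many such pairs before reporting an event; a single discordant pair is weak evidence on its own.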

2.

SVachra: a tool to identify genomic structural variation in mate pair sequencing data containing inward and outward facing reads

Oliver A. Hampton Adam C. English Mark Wang William J. Salerno Yue Liu Donna M. Muzny Yi Han David A. Wheeler Kim C. Worley James R. Lupski 《BMC genomics》2017,18(6):691

Background

Characterization of genomic structural variation (SV) is essential to expanding the research and clinical applications of genome sequencing. Reliance upon short DNA fragment paired end sequencing has yielded a wealth of single nucleotide variants and internal sequencing read insertions-deletions, at the cost of limited SV detection. Multi-kilobase DNA fragment mate pair sequencing has helped fill the void in SV detection, but has introduced new analytic challenges requiring SV detection tools specifically designed for mate pair sequencing data. Here, we introduce SVachra (Structural Variation Assessment of CHRomosomal Aberrations), a breakpoint calling program that identifies large insertions-deletions, inversions, and inter- and intra-chromosomal translocations utilizing both inward and outward facing read types generated by mate pair sequencing.

Results

We demonstrate SVachra’s utility by executing the program on large-insert (Illumina Nextera) mate pair sequencing data from the personal genome of a single subject (HS1011). An additional long-read (Pacific BioSciences RSII) data set was also generated to validate SV calls from SVachra and from the other SV calling programs used for comparison. SVachra exhibited the highest validation rate and reported the widest distribution of SV types and size ranges when compared to other SV callers.

Conclusions

SVachra is a highly specific breakpoint calling program that exhibits a more unbiased SV detection methodology than other callers.
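
The distinction between inward and outward facing reads can be sketched as follows. This is a minimal illustration, assuming both mates map to the same chromosome and that each mate is described by its leftmost coordinate and strand; the function name and labels are hypothetical and are not taken from the SVachra program.

```python
def pair_orientation(pos1, reverse1, pos2, reverse2):
    """Classify a read pair by the relative orientation of its two mates.

    pos1/pos2: leftmost mapped coordinates of the mates (same chromosome assumed).
    reverse1/reverse2: True if the mate aligns to the reverse strand.
    """
    if reverse1 == reverse2:
        return "same-strand"
    # Pick the mate with the smaller coordinate and look at its strand.
    left_is_reverse = min((pos1, reverse1), (pos2, reverse2))[1]
    # Left mate forward, right mate reverse: the reads point toward each other.
    return "outward" if left_is_reverse else "inward"
```

Separating the two classes matters because each has its own expected insert-size distribution, so discordance would have to be judged against the matching distribution.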

3.

Enhancer detection in the zebrafish using pseudotyped murine retroviruses

Laplante M Kikuta H König M Becker TS 《Methods (San Diego, Calif.)》2006,39(3):189-198

Vectors based on murine retroviruses are among the most efficient means to insert reporter constructs into the context of a vertebrate chromosome with the aim of visualizing cis-regulatory information available to a basal promoter at the site of insertion. In combination with using the zebrafish embryo as a readout for the activity of regulatory elements, enhancer detection becomes a powerful technique for gene discovery and for the mapping of the extent of regulatory domains in a vertebrate genome. Our laboratory has performed the only large-scale enhancer detection screen to date in any vertebrate, and we describe in this paper the methods we developed to generate viral particles, to insert reporter constructs into the zebrafish germ line, to screen for detection events in heterozygous F1 embryos, and to isolate genomic sequence flanking the inserted vector for the purpose of genomic mapping. Given sufficient scale, the technology described here can be used to obtain cis-regulatory information across the entire zebrafish genome for any given basal promoter.

4.

A genome-wide approach for detecting novel insertion-deletion variants of mid-range size

Li C. Xia Sukolsak Sakshuwong Erik S. Hopmans John M. Bell Susan M. Grimes David O. Siegmund Hanlee P. Ji Nancy R. Zhang 《Nucleic acids research》2016,44(15):e126

We present SWAN, a statistical framework for robust detection of genomic structural variants in next-generation sequencing data, and an analysis of mid-range size insertions and deletions (<10 Kb) for whole genome analysis and DNA mixtures. To identify these mid-range size events, SWAN collectively uses information from read-pair, read-depth and one end mapped reads through statistical likelihoods based on Poisson field models. SWAN also uses soft-clip/split read remapping to supplement the likelihood analysis and determine variant boundaries. The accuracy of SWAN is demonstrated by in silico spike-ins and by identification of known variants in the NA12878 genome. We used SWAN to identify a set of novel mid-range insertions and deletions that were confirmed by targeted deep re-sequencing. An R package implementation of SWAN is open source and freely available.
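
To give a feel for a Poisson-based read-depth likelihood, the sketch below scores a single genomic window. It is far simpler than the Poisson field models the abstract refers to and is not SWAN's method: the function name, the single-window setup and the `copy_ratio` parameter are all assumptions for illustration.

```python
import math


def poisson_depth_llr(count, expected, copy_ratio):
    """Log-likelihood ratio for one window under a Poisson read-count model.

    H0: reads arrive at rate `expected` (normal copy number).
    H1: reads arrive at rate `expected * copy_ratio` (e.g. 0.5 for a
        heterozygous deletion). copy_ratio must be positive.
    The log(count!) term is identical under both hypotheses and cancels.
    """
    lam0 = expected
    lam1 = expected * copy_ratio
    return count * math.log(lam1 / lam0) - (lam1 - lam0)
```

A positive score favours the altered copy number and a negative score favours the normal state; a window holding half the expected reads scores positive for `copy_ratio=0.5`.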

5.

SoftSearch: Integration of Multiple Sequence Features to Identify Breakpoints of Structural Variations

Steven N. Hart Vivekananda Sarangi Raymond Moore Saurabh Baheti Jaysheel D. Bhavsar Fergus J. Couch Jean-Pierre A. Kocher 《PloS one》2013,8(12)

Background

Structural variation (SV) represents a significant, yet poorly understood contribution to an individual’s genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints.

Results

We developed and validated SoftSearch using real and synthetic datasets. SoftSearch’s key features are 1) no requirement for secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call.

Conclusions

We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance.
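
The soft-clipped bases mentioned in this abstract are recorded in the CIGAR string of an aligned read. The sketch below shows one way to turn a SAM-style position and CIGAR string into candidate breakpoint coordinates. It is an illustration only, not SoftSearch's code; the function name and the shape of the returned tuples are hypothetical.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def soft_clip_breakpoints(pos, cigar):
    """Return candidate breakpoints implied by soft-clipped read ends.

    pos is the 1-based leftmost aligned reference position (SAM POS field).
    Each result is (side, reference_coordinate, clipped_length).
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    core = [o for o in ops if o[1] != "H"]  # ignore hard clips at the ends
    ref_len = sum(n for n, op in core if op in REF_CONSUMING)
    breakpoints = []
    if core and core[0][1] == "S":
        # Clip on the left: the breakpoint sits at the first aligned base.
        breakpoints.append(("left", pos, core[0][0]))
    if len(core) > 1 and core[-1][1] == "S":
        # Clip on the right: the breakpoint sits at the last aligned base.
        breakpoints.append(("right", pos + ref_len - 1, core[-1][0]))
    return breakpoints
```

Coordinates gathered this way could then be checked against nearby discordant read pairs, in the spirit of the co-localization step the abstract describes.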

6.

Structure of simian virus 40-phiX174 recombinant genomes isolated from single cells. (Total citations: 4; self-citations: 3; citations by others: 1)

E Winocour V Lavie I Keshet 《Journal of virology》1983,48(1):229-238

Three simian virus 40 (SV40)-phi X174 recombinant genomes were isolated from single BSC-1 monkey cells cotransfected with SV40 and phi X174 RF1 DNAs. The individual cell progenies were amplified, cloned, and mapped by a combination of restriction endonuclease and heteroduplex analyses. In each case, the 600 to 1,000 base pairs of phi X174 DNA (derived from different regions of the phi X174 genome) were present as single inserts, located in either the early or late SV40 regions; the deletion of SV40 DNA was greater than the size of the insert; and the remaining portions of the hybrid genome were indistinguishable from wild-type SV40 DNA, as judged by both mapping and biological tests. Hence, apart from the deletion which accommodates the phi X174 DNA insert, no other rearrangements of SV40 DNA were detected. The restriction map of a SV40-phi X174 recombinant DNA isolate before molecular cloning was indistinguishable from those of two separate cloned derivatives of that isolate, indicating that the species cloned was the major amplifiable recombinant structure generated by a single recombinant-producing cell. The relative simplicity of the SV40-phi X174 recombinant DNA examined is consistent with the notion that most recombinant-producing BSC-1 cells support single recombination events generating only one amplifiable recombinant structure.

7.

Expression of a human alpha-globin/fibronectin gene hybrid generates two mRNAs by alternative splicing. (Total citations: 23; self-citations: 5; citations by others: 18)

K Vibe-Pedersen A R Kornblihtt F E Baralle 《The EMBO journal》1984,3(11):2511-2516

We have isolated genomic clones for human fibronectin (FN), by screening a human gene library with previously isolated FN cDNA clones. We have recently reported two different FN mRNAs, one of them containing an additional 270 nucleotide insert coding for a structural domain ED. Restriction mapping and DNA sequencing of the genomic clones show that the ED type III unit corresponds to exactly one exon in the gene, whilst the two flanking type III units are split in two exons at variable positions. When an alpha-globin/FN gene hybrid construct, containing the ED exon, flanking introns and neighbouring FN exons, is transfected into HeLa cells, two hybrid mRNAs differing by the ED exon are synthesized. These experiments confirmed that the two FN mRNAs observed in vivo arise from the same gene by alternative splicing.

8.

The Subread aligner: fast, accurate and scalable read mapping by seed-and-vote

Yang Liao Gordon K. Smyth Wei Shi 《Nucleic acids research》2013,41(10):e108

Read alignment is an ongoing challenge for the analysis of data from sequencing technologies. This article proposes an elegantly simple multi-seed strategy, called seed-and-vote, for mapping reads to a reference genome. The new strategy chooses the mapped genomic location for the read directly from the seeds. It uses a relatively large number of short seeds (called subreads) extracted from each read and allows all the seeds to vote on the optimal location. When the read length is <160 bp, overlapping subreads are used. More conventional alignment algorithms are then used to fill in detailed mismatch and indel information between the subreads that make up the winning voting block. The strategy is fast because the overall genomic location has already been chosen before the detailed alignment is done. It is sensitive because no individual subread is required to map exactly, nor are individual subreads constrained to map close by other subreads. It is accurate because the final location must be supported by several different subreads. The strategy extends easily to find exon junctions, by locating reads that contain sets of subreads mapping to different exons of the same gene. It scales up efficiently for longer reads.
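
The voting idea in this abstract can be sketched with a toy example. The code below is a deliberately simplified illustration and not the Subread implementation: it uses exact k-mer lookup in a Python dictionary, and the function names, the seed length `k` and the spacing `step` are assumptions chosen for readability.

```python
from collections import Counter, defaultdict


def build_index(reference, k):
    """Map every k-mer in the reference to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def seed_and_vote(read, index, k, step):
    """Let each subread vote for the read start position that its hits imply."""
    votes = Counter()
    for offset in range(0, len(read) - k + 1, step):
        for hit in index.get(read[offset:offset + k], ()):
            # A subread found at `hit` implies the read starts at hit - offset.
            votes[hit - offset] += 1
    if not votes:
        return None, 0
    location, support = votes.most_common(1)[0]
    return location, support


# Toy data: the read is reference[10:30] with one substituted base.
reference = "ACGTTGCATGCCGATAGCTAGGCTTAACGGATCCGATTACA"
read = "CCGATAGCTTGGCTTAACGG"
location, support = seed_and_vote(read, build_index(reference, 4), k=4, step=2)
```

In this toy case the subreads overlapping the substituted base fail to vote for the true position, but the remaining subreads still agree on it, which is the robustness property the abstract highlights. Detailed alignment around the chosen location would follow as a separate step.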

9.

ViVar: A Comprehensive Platform for the Analysis and Visualization of Structural Genomic Variation

Tom Sante Sarah Vergult Pieter-Jan Volders Wigard P. Kloosterman Geert Trooskens Katleen De Preter Annelies Dheedene Frank Speleman Tim De Meyer Björn Menten 《PloS one》2014,9(12)

Structural genomic variations play an important role in human disease and phenotypic diversity. With the rise of high-throughput sequencing tools, mate-pair/paired-end/single-read sequencing has become an important technique for the detection and exploration of structural variation. Several analysis tools exist to handle different parts and aspects of such sequencing-based structural variation analysis pipelines. A comprehensive analysis platform to handle all steps, from processing the sequencing data to the discovery and visualization of structural variants, is missing. The ViVar platform is built to handle the discovery of structural variants, from Depth Of Coverage analysis and aberrant read pair clustering to split read analysis. ViVar provides you with powerful visualization options, enables easy reporting of results and offers better usability and data management. The platform facilitates the processing, analysis and visualization of structural variation based on massively parallel sequencing data, enabling the rapid identification of disease loci or genes. ViVar allows you to scale your analysis with your work load over multiple (cloud) servers, has user access control to keep your data safe and is easily expandable as analysis techniques advance. URL: https://www.cmgg.be/vivar/

10.

Development of TBSPG Pipelines for Refining Unique Mapping and Repetitive Sequence Detection Using the Two Halves of Each Illumina Sequence Read

Heng Xiang Xiu-Qing Li 《Plant Molecular Biology Reporter》2016,34(1):172-181

11.

Assessing structural variation in a personal genome—towards a human reference diploid genome

Adam C English William J Salerno Oliver A Hampton Claudia Gonzaga-Jauregui Shruthi Ambreth Deborah I Ritter Christine R Beck Caleb F Davis Mahmoud Dahdouli Singer Ma Andrew Carroll Narayanan Veeraraghavan Jeremy Bruestle Becky Drees Alex Hastie Ernest T Lam Simon White Pamela Mishra Min Wang Yi Han Feng Zhang Pawel Stankiewicz David A Wheeler Jeffrey G Reid Donna M Muzny Jeffrey Rogers Aniko Sabo Kim C Worley James R Lupski Eric Boerwinkle Richard A Gibbs 《BMC genomics》2015,16(1)

Background

Characterizing large genomic variants is essential to expanding the research and clinical applications of genome sequencing. While multiple data types and methods are available to detect these structural variants (SVs), they remain less characterized than smaller variants because of SV diversity, complexity, and size. These challenges are exacerbated by the experimental and computational demands of SV analysis. Here, we characterize the SV content of a personal genome with Parliament, a publicly available consensus SV-calling infrastructure that merges multiple data types and SV detection methods.

Results

We demonstrate Parliament’s efficacy via integrated analyses of whole-genome array comparative genomic hybridization, short-read next-generation sequencing, long-read (Pacific BioSciences RSII), long-insert (Illumina Nextera), and whole-genome architecture (BioNano Irys) data from the personal genome of a single subject (HS1011). From this genome, Parliament identified 31,007 genomic loci between 100 bp and 1 Mbp that are inconsistent with the hg19 reference assembly. Of these loci, 9,777 are supported as putative SVs by hybrid local assembly, long-read PacBio data, or multi-source heuristics. These SVs span 59 Mbp of the reference genome (1.8%) and include 3,801 events identified only with long-read data. The HS1011 data and complete Parliament infrastructure, including a BAM-to-SV workflow, are available on the cloud-based service DNAnexus.

Conclusions

HS1011 SV analysis reveals the limits and advantages of multiple sequencing technologies, specifically the impact of long-read SV discovery. With the full Parliament infrastructure, the HS1011 data constitute a public resource for novel SV discovery, software calibration, and personal genome structural variation analysis.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1479-3) contains supplementary material, which is available to authorized users.
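
Merging calls from several detection methods requires some rule for deciding when two calls describe the same event. The abstract does not say how Parliament does this, so the sketch below uses reciprocal overlap, a widely used general criterion, purely as an illustration. The function names, the tuple layout and the 0.5 threshold are assumptions.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two overlap fractions for half-open intervals [start, end)."""
    overlap = min(a_end, b_end) - max(a_start, b_start)
    if overlap <= 0:
        return 0.0
    return min(overlap / (a_end - a_start), overlap / (b_end - b_start))


def same_event(call_a, call_b, min_ro=0.5):
    """Treat two calls as one event if chromosome and type match and they overlap enough.

    Each call is a (chrom, start, end, sv_type) tuple; min_ro is an assumed threshold.
    """
    return (
        call_a[0] == call_b[0]
        and call_a[3] == call_b[3]
        and reciprocal_overlap(call_a[1], call_a[2], call_b[1], call_b[2]) >= min_ro
    )
```

Requiring the overlap to be large relative to both calls prevents a small call from being absorbed into a much larger one that merely contains it.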

12.

The fine-scale architecture of structural variants in 17 mouse genomes

Yalcin B Wong K Bhomra A Goodson M Keane TM Adams DJ Flint J 《Genome biology》2012,13(3):R18-12

Background

Accurate catalogs of structural variants (SVs) in mammalian genomes are necessary to elucidate the potential mechanisms that drive SV formation and to assess their functional impact. Next generation sequencing methods for SV detection are an advance on array-based methods, but are almost exclusively limited to four basic types: deletions, insertions, inversions and copy number gains.

Results

By visual inspection of 100 Mbp of genome to which next generation sequence data from 17 inbred mouse strains had been aligned, we identify and interpret 21 paired-end mapping patterns, which we validate by PCR. These paired-end mapping patterns reveal a greater diversity and complexity in SVs than previously recognized. In addition, Sanger-based sequence analysis of 4,176 breakpoints at 261 SV sites reveals additional complexity at approximately a quarter of structural variants analyzed. We find micro-deletions and micro-insertions at SV breakpoints, ranging from 1 to 107 bp, and SNPs that extend breakpoint micro-homology and may catalyze SV formation.

Conclusions

An integrative approach using experimental analyses to train computational SV calling is essential for the accurate resolution of the architecture of SVs. We find considerable complexity in SV formation; about a quarter of SVs in the mouse are composed of a complex mixture of deletion, insertion, inversion and copy number gain. Computational methods can be adapted to identify most paired-end mapping patterns.

13.

New bioinformatic tool for quick identification of functionally relevant endogenous retroviral inserts in human genome

Andrew Garazha Alena Ivanova Maria Suntsova Galina Malakhova Sergey Roumiantsev Alex Zhavoronkov Anton Buzdin 《Cell cycle (Georgetown, Tex.)》2015,14(9):1476-1484

14.

Comparative genomics approach to detecting split-coding regions in a low-coverage genome: lessons from the chimaera Callorhinchus milii (Holocephali, Chondrichthyes)

Dessimoz C Zoller S Manousaki T Qiu H Meyer A Kuraku S 《Briefings in bioinformatics》2011,12(5):474-484

Recent development of deep sequencing technologies has facilitated de novo genome sequencing projects, now conducted even by individual laboratories. However, this will yield more and more genome sequences that are not well assembled, and will hinder thorough annotation when no closely related reference genome is available. One of the challenging issues is the identification of protein-coding sequences split into multiple unassembled genomic segments, which can confound orthology assignment and various laboratory experiments requiring the identification of individual genes. In this study, using the genome of a cartilaginous fish, Callorhinchus milii, as a test case, we performed gene prediction using a model specifically trained for this genome. We implemented an algorithm, designated ESPRIT, to identify possible linkages between multiple protein-coding portions derived from a single genomic locus split into multiple unassembled genomic segments. We developed a validation framework based on an artificially fragmented human genome, improvements between early and recent mouse genome assemblies, comparison with experimentally validated sequences from GenBank, and phylogenetic analyses. Our strategy provided insights into practical solutions for efficient annotation of only partially sequenced (low-coverage) genomes. To our knowledge, our study is the first formulation of a method to link unassembled genomic segments based on proteomes of relatively distantly related species as references.

15.

Comprehensive DNA copy number profile and BAC library construction of an Indian individual

Chakrabarty S D'Souza RR Bellampalli R Rotti H Saadi AV Gopinath PM Acharya RV Govindaraj P Thangaraj K Satyamoorthy K 《Gene》2012,500(2):186-193

Bacterial artificial chromosomes (BACs) are used in genomic variation studies due to their capacity to carry a large insert, their high clonal stability, low rate of chimerism and ease of manipulation. In the present study, an attempt was made to create the first genomic BAC library of an anonymous Indian male (IMBL4) consisting of 100,224 clones covering the human genome more than three times. Restriction mapping of 255 BAC clones by pulsed-field gel electrophoresis confirmed an average insert size of 120 kb. The library was screened by PCR using SHANK3 (SH3 and multiple ankyrin repeat domains 3) and OLFM3 (olfactomedin 3) specific primers. A selection of clones was analyzed by fluorescent in situ hybridization (FISH) and sequencing. Fine mapping of copy number variable regions by array-based comparative genomic hybridization identified 467 CNVRs in the IMBL4 genome. The IMBL4 BAC library represents the first cataloged Indian genome resource for applications in basic and clinical research.
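
The coverage claim in this abstract can be checked with simple arithmetic: the number of clones times the mean insert size, divided by the genome size. The sketch below uses the clone count and insert size reported in the abstract; the haploid genome size of 3.1 Gb is an assumed round figure that the abstract does not state, and the function name is hypothetical.

```python
def library_coverage(n_clones, mean_insert_bp, genome_size_bp):
    """Expected fold coverage of a clone library (total cloned bases / genome size)."""
    return n_clones * mean_insert_bp / genome_size_bp


# 100,224 clones and a 120 kb mean insert come from the abstract;
# the 3.1e9 bp genome size is an assumption made for this estimate.
coverage = library_coverage(100_224, 120_000, 3.1e9)
```

Under that assumption the library holds roughly 12 Gb of cloned DNA, or a little under four genome equivalents, which agrees with the statement that it covers the genome more than three times.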

16.

Manifold Based Optimization for Single-Cell 3D Genome Reconstruction

Jonas Paulsen Odin Gramstad Philippe Collas 《PLoS computational biology》2015,11(8)

The three-dimensional (3D) structure of the genome is important for orchestration of gene expression and cell differentiation. While mapping genomes in 3D has for a long time been elusive, recent adaptations of high-throughput sequencing to chromosome conformation capture (3C) techniques allow for genome-wide structural characterization for the first time. However, reconstruction of "consensus" 3D genomes from 3C-based data is a challenging problem, since the data are aggregated over millions of cells. Recent single-cell adaptations to the 3C-technique, however, allow for non-aggregated structural assessment of genome structure, but data suffer from sparse and noisy interaction sampling. We present a manifold based optimization (MBO) approach for the reconstruction of 3D genome structure from chromosomal contact data. We show that MBO is able to reconstruct 3D structures based on the chromosomal contacts, imposing fewer structural violations than comparable methods. Additionally, MBO is suitable for efficient high-throughput reconstruction of large systems, such as entire genomes, allowing for comparative studies of genomic structure across cell-lines and different species.

17.

Complementary packing of alpha-helices in proteins (Total citations: 10; self-citations: 0; citations by others: 10)

Efimov AV 《FEBS letters》1999,452(1-2):3-6

18.

A comparison study: applying segmentation to array CGH data for downstream analyses

Willenbrock H Fridlyand J 《Bioinformatics (Oxford, England)》2005,21(22):4084-4091

MOTIVATION: Array comparative genomic hybridization (CGH) allows detection and mapping of copy number of DNA segments. A challenge is to make inferences about the copy number structure of the genome. Several statistical methods have been proposed to determine genomic segments with different copy number levels. However, to date, no comprehensive comparison of various characteristics of these methods exists. Moreover, the segmentation results have not been utilized in downstream analyses. RESULTS: We describe a comparison of three popular and publicly available methods for the analysis of array CGH data and we demonstrate how segmentation results may be utilized in downstream analyses such as testing and classification, yielding higher power and prediction accuracy. Since the methods operate on individual chromosomes, we also propose a novel procedure for merging segments across the genome, which results in an interpretable set of copy number levels and thus facilitates identification of copy number alterations in each genome. AVAILABILITY: http://www.bioconductor.org
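
The idea of merging per-chromosome segments into a small set of genome-wide copy number levels can be sketched as below. This is not the procedure proposed in the paper, whose details the abstract does not give; it is a simple threshold-based grouping, and the function name, the input layout and the `tol` value are assumptions.

```python
def merge_levels(segments, tol=0.1):
    """Assign each segment a shared level by grouping similar segment means.

    segments: list of (mean_log2_ratio, n_probes) pairs from any chromosome.
    Segments whose sorted means differ by at most `tol` from their neighbour
    join the same group; each group gets its probe-weighted mean.
    Returns one merged level per input segment, in the input order.
    """
    order = sorted(range(len(segments)), key=lambda i: segments[i][0])
    levels = [None] * len(segments)

    def flush(group):
        total = sum(segments[i][1] for i in group)
        level = sum(segments[i][0] * segments[i][1] for i in group) / total
        for i in group:
            levels[i] = level

    group = []
    for i in order:
        # Start a new group when the gap to the previous mean exceeds tol.
        if group and segments[i][0] - segments[group[-1]][0] > tol:
            flush(group)
            group = []
        group.append(i)
    if group:
        flush(group)
    return levels
```

Because neighbours are chained, a long run of closely spaced means can end up in one group even when its extremes differ by more than `tol`; a statistical test between groups would be a more careful alternative.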

19.

Enhancers as information integration hubs in development: lessons from genomics (Total citations: 1; self-citations: 0; citations by others: 1)

Buecker C Wysocka J 《Trends in genetics : TIG》2012,28(6):276-284

20.

Genome sequencing of mouse induced pluripotent stem cells reveals retroelement stability and infrequent DNA rearrangement during reprogramming

Quinlan AR Boland MJ Leibowitz ML Shumilina S Pehrson SM Baldwin KK Hall IM 《Cell Stem Cell》2011,9(4):366-373

The biomedical utility of induced pluripotent stem cells (iPSCs) will be diminished if most iPSC lines harbor deleterious genetic mutations. Recent microarray studies have shown that human iPSCs carry elevated levels of DNA copy number variation compared with those in embryonic stem cells, suggesting that these and other classes of genomic structural variation (SV), including inversions, smaller duplications and deletions, complex rearrangements, and retroelement transpositions, may frequently arise as a consequence of reprogramming. Here we employ whole-genome paired-end DNA sequencing and sensitive mapping algorithms to identify all classes of SV in three fully pluripotent mouse iPSC lines. Despite the improved scope and resolution of this study, we find few spontaneous mutations per line (one or two) and no evidence for endogenous retroelement transposition. These results show that genome stability can persist throughout reprogramming, and argue that it is possible to generate iPSCs lacking gene-disrupting mutations using current reprogramming methods.