Similar literature
 20 similar articles found; search took 15 ms
1.
Complex genomic rearrangements (CGRs) consisting of two or more breakpoint junctions have been observed in genomic disorders. Recently, a chromosome catastrophe phenomenon termed chromothripsis, in which numerous genomic rearrangements are apparently acquired in one single catastrophic event, was described in multiple cancers. Here, we show that constitutionally acquired CGRs share similarities with cancer chromothripsis. In the 17 CGR cases investigated, we observed localization and multiple copy number changes including deletions, duplications, and/or triplications, as well as extensive translocations and inversions. The genomic rearrangements involved varied in size and complexity; in one case, array comparative genomic hybridization revealed 18 copy number changes. Breakpoint sequencing identified characteristic features, including small templated insertions at breakpoints and microhomology at breakpoint junctions, which have been attributed to replicative processes. The resemblance between CGR and chromothripsis suggests similar mechanistic underpinnings. Such chromosome catastrophic events appear to reflect basic DNA metabolism operative throughout an organism's life cycle.

2.
We investigated complex genomic rearrangements (CGRs) consisting of triplication copy-number variants (CNVs) that were accompanied by extended regions of copy-number-neutral absence of heterozygosity (AOH) in subjects with multiple congenital abnormalities. Molecular analyses provided observational evidence that in humans, post-zygotically generated CGRs can lead to regional uniparental disomy (UPD) due to template switches between homologs versus sister chromatids by using microhomology to prime DNA replication, a prediction of the replicative repair model MMBIR. Our findings suggest that replication-based mechanisms might underlie the formation of diverse types of genomic alterations (CGRs and AOH) implicated in constitutional disorders.

3.
Genome synthesis gives scientists the ability to create, de novo, genomes absent in nature by thoroughly redesigning DNA sequences and introducing numerous custom features. However, genome synthesis is labor- and time-consuming, and it is challenging to verify and quantify a synthetic genome rapidly and precisely. For this reason, specific DNA sequences that differ from the native genomic sequences, termed genomic markers, are designed into synthetic genomes during synthesis. Genomic markers can be easily detected by PCR, whole-genome sequencing (WGS), and a variety of other methods to distinguish the synthetic genome from the native one. Here, we review the types and applications of genomic markers used in synthetic genomes, with the hope of providing guidance for future work.

4.
Transposons are genomic parasites, and their new insertions can cause instability and spur the evolution of their host genomes. Rapid accumulation of short-read whole-genome sequencing data provides a great opportunity for studying new transposon insertions and their impacts on the host genome. Although many algorithms are available for detecting transposon insertions, the task remains challenging and existing tools are not designed for identifying de novo insertions. Here, we present a new benchmark fly dataset based on PacBio long-read sequencing and a new method, TEMP2, for detecting germline insertions and measuring de novo ‘singleton’ insertion frequencies in eukaryotic genomes. TEMP2 achieves high sensitivity and precision for detecting germline insertions when compared with existing tools using both simulated data in fly and experimental data in fly and human. Furthermore, TEMP2 can accurately assess the frequencies of de novo transposon insertions even with high levels of chimeric reads in simulated datasets; such chimeric reads often occur during the construction of short-read sequencing libraries. By applying TEMP2 to published data on hybrid dysgenic flies inflicted by de-repressed P-elements, we confirmed the continuous new insertions of P-elements in dysgenic offspring before they regain piRNAs for P-element repression. TEMP2 is freely available at GitHub: https://github.com/weng-lab/TEMP2.

5.
6.
Long INterspersed Elements (LINE-1s or L1s) are abundant non-LTR retrotransposons in mammalian genomes that are capable of insertional mutagenesis. They have been associated with target site deletions upon insertion in cell culture studies of retrotransposition. Here, we report 50 deletion events in the human and chimpanzee genomes directly linked to the insertion of L1 elements, resulting in the loss of ~18 kb of sequence from the human genome and ~15 kb from the chimpanzee genome. Our data suggest that during the primate radiation, L1 insertions may have deleted up to 7.5 Mb of target genomic sequences. While the results of our in vivo analysis differ from those of previous cell culture assays of L1 insertion-mediated deletions in terms of the size and rate of sequence deletion, evolutionary factors can reconcile the differences. We report a pattern of genomic deletion sizes similar to those created during the retrotransposition of Alu elements. Our study provides support for the existence of different mechanisms for small and large L1-mediated deletions, and we present a model for the correlation of L1 element size and the corresponding deletion size. In addition, we show that internal rearrangements can modify L1 structure during retrotransposition events associated with large deletions.

7.
Long terminal repeat (LTR) retrotransposons and endogenous retroviruses (ERVs) are transposable elements in eukaryotic genomes well suited for computational identification. De novo identification tools determine the position of potential LTR retrotransposon or ERV insertions in genomic sequences. For further analysis, it is desirable to obtain an annotation of the internal structure of such candidates. This article presents LTRdigest, a novel software tool for automated annotation of internal features of putative LTR retrotransposons. It uses local alignment and hidden Markov model-based algorithms to detect retrotransposon-associated protein domains as well as primer binding sites and polypurine tracts. As an example, we used LTRdigest results to identify 88 (near) full-length ERVs in the chromosome 4 sequence of Mus musculus, separating them from truncated insertions and other repeats. Furthermore, we propose a workflow for the use of LTRdigest in de novo LTR retrotransposon classification and perform an exemplary de novo analysis on the Drosophila melanogaster genome as a proof of concept. Using a new method solely based on the annotations generated by LTRdigest, 518 potential LTR retrotransposons were automatically assigned to 62 candidate groups. Representative sequences from 41 of these 62 groups were matched to reference sequences with >80% global sequence similarity.

8.
The fully annotated genome sequence of the European strain 26695 was first published in 1997 and, in 1999, it was directly compared to the USA isolate J99, promoting two standard laboratory isolates for Helicobacter pylori (H. pylori) research. With the genomic scaffolds available from these important genomes and the advent of benchtop high-throughput sequencing technology, a bacterial genome can now be sequenced within a few days. We sequenced and analysed strains J99 and 26695 using the benchtop-sequencing machines Ion Torrent PGM and the Illumina MiSeq Nextera and Nextera XT methodologies. Using publicly available algorithms, we analysed the raw data and interrogated both genomes by mapping the data and by de novo assembly. We compared the accuracy of the coding sequence assemblies to the originally published sequences. With the Ion Torrent PGM, we found an inherently high error rate in the raw sequence data. Using the Illumina MiSeq, we found significantly more non-covered nucleotides when using the less expensive Illumina Nextera XT compared with the Illumina Nextera library creation method. We found the most accurate de novo assemblies using the Nextera technology; however, extracting an accurate multi-locus sequence type was inconsistent compared to the Ion Torrent PGM. We found the cagPAI failed to assemble onto a single contig in all technologies but was more accurate using the Nextera. Our results indicate the Illumina MiSeq Nextera method is the most accurate for de novo whole genome sequencing of H. pylori.

9.
High-throughput DNA sequencing technologies have revolutionized genomic analysis, including the de novo assembly of whole genomes. Nevertheless, assembly of complex genomes remains challenging, in part due to the presence of dispersed repeats which introduce ambiguity during genome reconstruction. Transposable elements (TEs) can be particularly problematic, especially for TE families exhibiting high sequence identity, high copy number, or complex genomic arrangements. While TEs strongly affect genome function and evolution, most current de novo assembly approaches cannot resolve long, identical, and abundant families of TEs. Here, we applied a novel Illumina technology called TruSeq synthetic long-reads, which are generated through highly-parallel library preparation and local assembly of short read data and which achieve lengths of 1.5–18.5 Kbp with an extremely low error rate (0.03% per base). To test the utility of this technology, we sequenced and assembled the genome of the model organism Drosophila melanogaster (reference genome strain y; cn, bw, sp) achieving an N50 contig size of 69.7 Kbp and covering 96.9% of the euchromatic chromosome arms of the current reference genome. TruSeq synthetic long-read technology enables placement of individual TE copies in their proper genomic locations as well as accurate reconstruction of TE sequences. We entirely recovered and accurately placed 4,229 (77.8%) of the 5,434 annotated transposable elements with perfect identity to the current reference genome. As TEs are ubiquitous features of genomes of many species, TruSeq synthetic long-reads, and likely other methods that generate long-reads, offer a powerful approach to improve de novo assemblies of whole genomes.

10.
As next-generation sequencing continues to have an expanding presence in the clinic, the identification of the most cost-effective and robust strategy for identifying copy number changes and translocations in tumor genomes is needed. We hypothesized that performing shallow whole genome sequencing (WGS) of 900–1000-bp inserts (long insert WGS, LI-WGS) improves our ability to detect these events, compared with shallow WGS of 300–400-bp inserts. A priori analyses show that LI-WGS requires less sequencing compared with short insert WGS to achieve a target physical coverage, and that LI-WGS requires less sequence coverage to detect a heterozygous event with a power of 0.99. We thus developed an LI-WGS library preparation protocol based on Illumina’s WGS library preparation protocol and illustrate the feasibility of performing LI-WGS. We additionally applied LI-WGS to three separate tumor/normal DNA pairs collected from patients diagnosed with different cancers to demonstrate our application of LI-WGS on actual patient samples for identification of somatic copy number alterations and translocations. With the evolution of sequencing technologies and bioinformatics analyses, we show that modifications to current approaches may improve our ability to interrogate cancer genomes.
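The relationship between insert size and physical coverage mentioned in this abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions (physical coverage = read pairs × insert size / genome size; sequence coverage = read pairs × 2 × read length / genome size). The genome size, read length, target coverage, and insert sizes below are illustrative assumptions, not values taken from the paper, and the function names are hypothetical.

```python
import math

# Illustrative assumptions (not taken from the abstract):
GENOME_BP = 3_100_000_000   # approximate human genome size
READ_BP = 100               # length of each read in a pair
TARGET_PHYSICAL = 30        # hypothetical target physical coverage

def physical_coverage(n_pairs, insert_bp, genome_bp=GENOME_BP):
    """Average number of inserts (fragments) spanning each base."""
    return n_pairs * insert_bp / genome_bp

def sequence_coverage(n_pairs, read_bp=READ_BP, genome_bp=GENOME_BP):
    """Average number of sequenced bases covering each base."""
    return n_pairs * 2 * read_bp / genome_bp

def pairs_for_physical_coverage(target, insert_bp, genome_bp=GENOME_BP):
    """Read pairs needed to reach a target physical coverage."""
    return math.ceil(target * genome_bp / insert_bp)

# Midpoints of the two insert-size ranges quoted in the abstract.
short_pairs = pairs_for_physical_coverage(TARGET_PHYSICAL, 350)
long_pairs = pairs_for_physical_coverage(TARGET_PHYSICAL, 950)

print("short-insert pairs needed:", short_pairs)
print("long-insert pairs needed: ", long_pairs)
print("sequence coverage, short: ", round(sequence_coverage(short_pairs), 2))
print("sequence coverage, long:  ", round(sequence_coverage(long_pairs), 2))
```

Under these assumptions, the number of read pairs needed scales inversely with insert size, so longer inserts reach the same physical coverage with less sequencing. The paper's own a priori analysis may use different definitions or parameters.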

11.
Autism spectrum disorder (ASD) is a neurodevelopmental disorder with considerable clinical and genetic heterogeneity. In this study, we identified all classes of genomic variants from a whole-genome sequencing (WGS) dataset of 32 Chinese trios with ASD, including de novo mutations, inherited variants, copy number variants (CNVs), and genomic structural variants. A higher mutation rate (Poisson test, P<2.2×10⁻¹⁶) in exonic (1.37×10⁻⁸) and 3'-UTR regions (1.42×10⁻⁸) was revealed in comparison with that of the whole genome (1.05×10⁻⁸). Using an integrated model, we identified 87 potential risk genes (P<0.01) from 4832 genes harboring various rare deleterious variants, including CHD8 and NRXN2, suggesting that the disorder may be consistent with a multiple-hit model. In particular, frequent rare inherited mutations of several microcephaly-associated genes (ASPM, WDR62, and ZNF335) were found in ASD. In chromosomal structure analyses, we found four de novo CNVs and one de novo chromosomal rearrangement event, including a de novo duplication of the UBE3A-containing region at 15q11.2-q13.1, which causes Angelman syndrome and microcephaly, and a disrupted TNR due to the de novo chromosomal translocation t(1;5)(q25.1;q33.2). Taken together, our results suggest that abnormalities of centrosomal function and chromatin remodeling of the microcephaly-associated genes may be implicated in the pathogenesis of ASD. Adoption of WGS as a new yet efficient technique to illustrate the full genetic spectrum in complex disorders, such as ASD, could provide novel insights into pathogenesis, diagnosis and treatment.

12.
13.
Genomic rearrangements can cause both Mendelian and complex disorders. Currently, several major mechanisms causing genomic rearrangements, such as non-allelic homologous recombination (NAHR), non-homologous end joining (NHEJ), fork stalling and template switching (FoSTeS), and microhomology-mediated break-induced replication (MMBIR), have been proposed. However, to what extent these mechanisms contribute to gene-specific pathogenic copy-number variations (CNVs) remains understudied. Furthermore, few studies have resolved these pathogenic alterations at the nucleotide level. Accordingly, our aim was to explore which mechanisms contribute to a large, unique set of locus-specific non-recurrent genomic rearrangements causing the genetic neurocutaneous disorder neurofibromatosis type 1 (NF1). Through breakpoint-spanning PCR as well as array comparative genomic hybridization, we have identified the breakpoints in 85 unrelated individuals carrying an NF1 intragenic CNV. Furthermore, we characterized the likely rearrangement mechanisms of these 85 CNVs, along with those of two additional previously published NF1 intragenic CNVs. Unlike the most typical recurrent rearrangements mediated by flanking low-copy repeats (LCRs), NF1 intragenic rearrangements vary in size, location, and rearrangement mechanisms. We propose the DNA-replication-based mechanisms comprising both FoSTeS and/or MMBIR and serial replication stalling to be the predominant mechanisms leading to NF1 intragenic CNVs. In addition to the loop within a 197-bp palindrome located in intron 40, four Alu elements located in introns 1, 2, 3, and 50 were also identified as intragenic-rearrangement hotspots within NF1.

14.
A well-known mechanism through which new protein-coding genes originate is by modification of pre-existing genes, e.g. by duplication or horizontal transfer. In contrast, many viruses generate protein-coding genes de novo, via the overprinting of a new reading frame onto an existing (“ancestral”) frame. This mechanism is thought to play an important role in viral pathogenicity, but has been poorly explored, perhaps because identifying the de novo frames is very challenging. Therefore, a new approach to detect them was needed. We assembled a reference set of overlapping genes for which we could reliably determine the ancestral frames, and found that their codon usage was significantly closer to that of the rest of the viral genome than the codon usage of de novo frames. Based on this observation, we designed a method that allowed the identification of de novo frames based on their codon usage with a very good specificity, but intermediate sensitivity. Using our method, we predicted that the Rex gene of deltaretroviruses has originated de novo by overprinting the Tax gene. Intriguingly, several genes in the same genomic region have also originated de novo and encode proteins that regulate the functions of Tax. Such “gene nurseries” may be common in viral genomes. Finally, our results confirm that the genomic GC content is not the only determinant of codon usage in viruses and suggest that a constraint linked to translation must influence codon usage.

15.
Most studies of tumor instability are PCR-based. PCR-based methods may underestimate mutation frequencies of heterogeneous tumor genomes. Using a novel PCR-free random cloning/sequencing method, we analyzed 100 kb of total genomic DNA from blood lymphocytes, normal prostate and tumor prostate taken from six individuals. Variations were identified by comparison of the sequence of the cloned fragments with the nr-database in Genbank. After excluding known polymorphisms (by comparison to the NCBI dbSNP), we report a significant over-representation of variants in the tumors: 0.66 variations per kilobase of sequence, compared with the corresponding normal prostates (0.14 variations/kb) or blood (0.09 variations/kb). Extrapolating the observed difference between tumor and normal prostate DNA, we estimate 1.8 million somatic (de novo) alterations per tumor cell genome, a much higher frequency than previous measurements obtained by mostly PCR-based methods in other tumor types. Moreover, unlike the normal prostate and blood, most of the tumor variations occur in a specific motif (P = 0.046), suggesting common etiology. We further report high tumor cell-to-cell heterogeneity. These data have important implications for selecting appropriate technologies for cancer genome projects as well as for understanding prostate cancer progression.

16.
Transposable elements (TEs) are DNA sequences capable of moving from one location to another in the genome. Since the discovery of the ‘Dissociation (Ds) locus’ by Barbara McClintock in maize (1), mounting evidence in the era of genomics indicates that a significant fraction of most eukaryotic genomes is composed of TE sequences, which are involved in various biological processes such as development, physiology, disease, and evolution. Although technical advances in genomics have revealed numerous functional impacts of TEs across species, our understanding of TEs is still incomplete because of the challenges posed by the complexity and abundance of TEs in the genome. In this mini-review, we briefly summarize the biology of TEs and their impacts on the host genome, emphasizing the importance of understanding the TE landscape of the genome. We then introduce recent endeavors, especially in vivo retrotransposition assays and long-read sequencing technology, for identifying de novo insertions and TE polymorphisms, which will broaden our knowledge of the extraordinary relationship between these genomic cohabitants and their host.

17.

Background

Congenital malformations are present in approximately 2–3% of liveborn babies and 20% of stillborn fetuses. The mechanisms underlying the majority of sporadic and isolated congenital malformations are poorly understood, although it is hypothesized that the accumulation of rare genetic, genomic and epigenetic variants converge to deregulate developmental networks.

Methodology/Principal Findings

We selected samples from 95 fetuses with congenital malformations not ascribed to a specific syndrome (68 with isolated malformations, 27 with multiple malformations). Karyotyping and Multiplex Ligation-dependent Probe Amplification (MLPA) discarded recurrent genomic and cytogenetic rearrangements. DNA extracted from the affected tissue (46%) or from lung or liver (54%) was analyzed by molecular karyotyping. Validations and inheritance were obtained by MLPA. We identified 22 rare copy number variants (CNV) [>100 kb, either absent (n = 7) or very uncommon (n = 15, <1/2,000) in the control population] in 20/95 fetuses with congenital malformations (21%), including 11 deletions and 11 duplications. One of the 9 tested rearrangements was de novo while the remaining were inherited from a healthy parent. The highest frequency was observed in fetuses with heart hypoplasia (8/17, 62.5%), with two events previously related with the phenotype. Double events hitting candidate genes were detected in two samples with brain malformations. Globally, the burden of deletions was significantly higher in fetuses with malformations compared to controls.

Conclusions/Significance

Our data reveal a significant contribution of rare deletion-type CNV, mostly inherited but also de novo, to human congenital malformations, especially heart hypoplasia, and reinforce the hypothesis of a multifactorial etiology in most cases.

18.
The Hardness (Ha) locus controls grain hardness in hexaploid wheat (Triticum aestivum) and its relatives (Triticum and Aegilops species) and represents a classical example of a trait whose variation arose from gene loss after polyploidization. In this study, we investigated the molecular basis of the evolutionary events observed at this locus by comparing corresponding sequences of diploid, tetraploid, and hexaploid wheat species (Triticum and Aegilops). Genomic rearrangements, such as transposable element insertions, genomic deletions, duplications, and inversions, were shown to constitute the major differences when the same genomes (i.e., the A, B, or D genomes) were compared between species of different ploidy levels. The comparative analysis allowed us to determine the extent and sequences of the rearranged regions as well as rearrangement breakpoints and sequence motifs at their boundaries, which suggest rearrangement by illegitimate recombination. Among these genomic rearrangements, the previously reported loss of the Pina and Pinb genes from the Ha locus of polyploid wheat species was caused by a large genomic deletion that probably occurred independently in the A and B genomes. Moreover, the Ha locus in the D genome of hexaploid wheat (T. aestivum) is 29 kb smaller than in the D genome of its diploid progenitor Ae. tauschii, principally because of transposable element insertions and two large deletions caused by illegitimate recombination. Our data suggest that illegitimate DNA recombination, leading to various genomic rearrangements, constitutes one of the major evolutionary mechanisms in wheat species.

19.
20.
