Similar articles (20 results)
1.

Background

Copy number variants (CNVs), including deletions, amplifications, and other rearrangements, are common in human and cancer genomes. Copy number data from array comparative genome hybridization (aCGH) and next-generation DNA sequencing are widely used to measure copy number variants. Comparison of copy number data from multiple individuals reveals recurrent variants. Typically, the interior of a recurrent CNV is examined for genes or other loci associated with a phenotype. However, in some cases, such as gene truncations and fusion genes, the target of the variant lies at the boundary of the variant.

Results

We introduce Neighborhood Breakpoint Conservation (NBC), an algorithm for identifying rearrangement breakpoints that are highly conserved at the same locus in multiple individuals. NBC detects recurrent breakpoints at varying levels of resolution, including breakpoints whose location is exactly conserved and breakpoints whose location varies within a gene. NBC also identifies pairs of recurrent breakpoints such as those that result from fusion genes. We apply NBC to aCGH data from 36 primary prostate tumors and identify 12 novel rearrangements, one of which is the well-known TMPRSS2-ERG fusion gene. We also apply NBC to 227 glioblastoma tumors and predict 93 novel rearrangements which we further classify as gene truncations, germline structural variants, and fusion genes. A number of these variants involve the protein phosphatase PTPN12 suggesting that deregulation of PTPN12, via a variety of rearrangements, is common in glioblastoma.

Conclusions

We demonstrate that NBC is useful for detection of recurrent breakpoints resulting from copy number variants or other structural variants, and in particular identifies recurrent breakpoints that result in gene truncations or fusion genes. Software is available at http://cs.brown.edu/people/braphael/software.html.
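The idea of a breakpoint recurring "at varying levels of resolution" can be illustrated with a very small counting sketch. This is not the NBC algorithm, whose details are not given in the abstract; the function name, the fixed-width binning, and the sample data below are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def recurrent_breakpoints(breakpoints_by_sample, window=0, min_samples=2):
    """Group breakpoints shared by at least `min_samples` samples.

    breakpoints_by_sample maps a sample name to a list of (chrom, position).
    window=0 demands exactly conserved positions; window>0 places positions
    into fixed-width bins, a crude stand-in for "varies within a gene".
    """
    hits = defaultdict(set)
    for sample, breakpoints in breakpoints_by_sample.items():
        for chrom, pos in breakpoints:
            # Key on the exact position, or on the start of the enclosing bin.
            key = (chrom, pos if window == 0 else pos // window * window)
            hits[key].add(sample)
    return {key: samples for key, samples in hits.items()
            if len(samples) >= min_samples}


# Hypothetical breakpoint calls for three samples.
calls = {
    "s1": [("chr21", 1000), ("chr7", 500)],
    "s2": [("chr21", 1040)],
    "s3": [("chr21", 1000)],
}
exact = recurrent_breakpoints(calls)               # exactly conserved only
binned = recurrent_breakpoints(calls, window=100)  # coarser resolution
```

Fixed bins can separate two nearby breakpoints that fall on either side of a bin edge, so a real method would need a more careful notion of neighborhood than this sketch uses.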

2.
3.
Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95-99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. 
CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
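The link between depth of coverage and an integer copy-number genotype can be sketched with a naive ratio. This is not CopySeq's statistical framework, which the abstract does not describe in detail; the function name and the depth values are assumptions made up for illustration.

```python
def naive_copy_number(region_depth, diploid_depth, ploidy=2):
    """Estimate an integer copy number from read depth.

    region_depth:  mean read depth over the locus of interest.
    diploid_depth: mean read depth over regions assumed to be at `ploidy` copies.
    Returns the nearest integer to ploidy * region_depth / diploid_depth.
    """
    if diploid_depth <= 0:
        raise ValueError("diploid_depth must be positive")
    return round(ploidy * region_depth / diploid_depth)


# Hypothetical depths with a 30x diploid baseline.
homozygous_deletion = naive_copy_number(0.4, 30.0)     # almost no reads
heterozygous_deletion = naive_copy_number(14.6, 30.0)  # about half the baseline
amplified = naive_copy_number(61.0, 30.0)              # about twice the baseline
```

At low coverage the ratio is noisy, which is one reason a statistical model rather than plain rounding would be needed in practice.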

4.
Chromosome 15 is frequently involved in the formation of structural rearrangements. We report the molecular characterisation of 16 independent interstitial duplications, including those of one individual who carried a duplication on both of her chromosomes 15, and three interstitial triplications of the Prader-Willi/Angelman syndrome critical region (PWACR). In all probands except one, the rearrangement was maternal in origin. In one family, the duplication was paternal in origin, yet appeared to segregate in a sibship of three with an abnormal phenotype that included developmental delay and a behavioural disorder. Ten duplications were familial, five de novo and one unknown. All 16 duplications, including two not visible by routine G-banding, were of an almost uniform size and shared the common deletion breakpoints of Prader-Willi syndrome and Angelman syndrome. Like deletions, the formation of duplications can occur in both male and female meiosis and involve both inter- and intrachromosomal events. This implies that at least some deletions and duplications are the reciprocal products of each other. We observed no instances of meiotic instability in the transmission of a duplication, although recombination within the PWACR occurred in two members of the same family between the normal and the duplicated chromosome 15 homologues. All three triplications arose de novo and included alleles from both maternal chromosomes 15. Triplication breakpoints were more variable and extended distally beyond the PWACR. The molecular characteristics of duplications and triplications suggest that they are formed by different mechanisms.

5.
Nontandem segmental duplications provide a useful alternative to conventional recombination mapping for determining gene order in a haploid organism such as Neurospora. When an insertional or terminal rearrangement is crossed by Normal sequence, a class of progeny is produced that have a precisely delimited chromosome segment duplicated. In such Duplication progeny, a recessive gene in the Normal-sequence donor chromosome may or may not be masked (“covered”) by its dominant wild-type allele in the translocation-sequence recipient chromosome. Coverage depends upon whether the gene in question is left or right of the rearrangement breakpoint. The recessive gene will be heterozygous and covered (not expressed) if its locus is within the duplicated segment, but it will be haploid and expressed if the locus is outside the segment. Not only genes but also centromeres can be mapped by means of duplications, because genes included in the same viable duplication must reside in the same chromosome arm. Numerous sequences in the current genetic maps of N. crassa have been determined using duplications. Gene order in the albino region and in the centromere region of linkage group I provide examples. Over 50 insertional or terminal rearrangements are available from which nontandem duplications of defined content can be obtained at will; collectively these cover about 75% of the genome. Intercrosses between partially overlapping chromosome rearrangements also produce Duplication progeny containing two copies of regions between the breakpoints. The 180 mapped reciprocal translocations and inversions include numerous overlapping combinations that can be used for duplication mapping.

6.
Inverted duplications are a common type of copy number variation (CNV) in germline and somatic genomes. Large duplications that include many genes can lead to both neurodevelopmental phenotypes in children and gene amplifications in tumors. There are several models for inverted duplication formation, most of which include a dicentric chromosome intermediate followed by breakage-fusion-bridge (BFB) cycles, but the mechanisms that give rise to the inverted dicentric chromosome in most inverted duplications remain unknown. Here we have combined high-resolution array CGH, custom sequence capture, next-generation sequencing, and long-range PCR to analyze the breakpoints of 50 nonrecurrent inverted duplications in patients with intellectual disability, autism, and congenital anomalies. For half of the rearrangements in our study, we sequenced at least one breakpoint junction. Sequence analysis of breakpoint junctions reveals a normal-copy disomic spacer between inverted and non-inverted copies of the duplication. Further, short inverted sequences are present at the boundary of the disomic spacer and the inverted duplication. These data support a mechanism of inverted duplication formation whereby a chromosome with a double-strand break intrastrand pairs with itself to form a “fold-back” intermediate that, after DNA replication, produces a dicentric inverted chromosome with a disomic spacer corresponding to the site of the fold-back loop. This process can lead to inverted duplications adjacent to terminal deletions, inverted duplications juxtaposed to translocations, and inverted duplication ring chromosomes.

7.
Duplicate genes emerge as copy-number variations (CNVs) at the population level, and remain copy-number polymorphic until they are fixed or lost. The successful establishment of such structural polymorphisms in the genome plays an important role in evolution by promoting genetic diversity, complexity and innovation. To characterize the early evolutionary stages of duplicate genes and their potential adaptive benefits, we combine comparative genomics with population genomics analyses to evaluate the distribution and impact of CNVs across natural populations of an eco-genomic model, the three-spined stickleback. With whole genome sequences of 66 individuals from populations inhabiting three distinct habitats, we find that CNVs generally occur at low frequencies and are often only found in one of the 11 populations surveyed. A subset of CNVs, however, displays copy-number differentiation between populations, showing elevated within-population frequencies consistent with local adaptation. By comparing teleost genomes to identify lineage-specific genes and duplications in sticklebacks, we highlight rampant gene content differences among individuals in which over 30% of young duplicate genes are CNVs. These CNV genes are evolving rapidly at the molecular level and are enriched with functional categories associated with environmental interactions, depicting the dynamic early copy-number polymorphic stage of genes during population differentiation.

8.
Copy-number variations cause genomic disorders. Triplications, unlike deletions and duplications, are poorly understood because of challenges in molecular identification, the choice of a proper model system for study, and awareness of their phenotypic consequences. We investigated the genomic disorder Charcot-Marie-Tooth disease type 1A (CMT1A), a dominant peripheral neuropathy caused by a 1.4 Mb recurrent duplication occurring by nonallelic homologous recombination. We identified CMT1A triplications in families in which the duplication segregates. The triplications arose de novo from maternally transmitted duplications and caused a more severe distal symmetric polyneuropathy phenotype. The recombination that generated the triplication occurred between sister chromatids on the duplication-bearing chromosome and could accompany gene conversions with the homologous chromosome. Diagnostic testing for CMT1A (n = 20,661 individuals) identified 13% (n = 2,752 individuals) with duplication and 0.024% (n = 5 individuals) with segmental tetrasomy, suggesting that triplications emerge from duplications at a rate as high as ∼1:550, which is more frequent than the rate of de novo duplication. We propose that individuals with duplications are predisposed to acquiring triplications and that the population prevalence of triplication is underascertained.
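The percentages and the roughly 1:550 ratio quoted in this abstract follow directly from the reported counts. A short check, with variable names that are illustrative only:

```python
# Counts reported in the abstract (CMT1A diagnostic testing).
tested = 20_661       # individuals tested
duplications = 2_752  # individuals with the CMT1A duplication
tetrasomy = 5         # individuals with segmental tetrasomy (triplication)

dup_fraction = duplications / tested  # share of tested individuals with a duplication
tet_fraction = tetrasomy / tested     # share of tested individuals with tetrasomy
dups_per_triplication = duplications / tetrasomy  # duplication carriers per triplication carrier

print(f"duplication: {dup_fraction:.1%}")
print(f"tetrasomy: {tet_fraction:.3%}")
print(f"about 1 triplication per {dups_per_triplication:.0f} duplications")
```

The ratio compares triplication carriers with duplication carriers in the tested cohort; treating it as a per-transmission rate is the authors' interpretation, not something this arithmetic establishes.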

9.
10.
11.
Fabry disease, an inborn error of glycosphingolipid catabolism, results from mutations in the X-linked gene encoding the lysosomal enzyme, alpha-galactosidase A (EC 3.2.1.22). Six alpha-galactosidase A gene rearrangements that cause Fabry disease were investigated to assess the role of Alu repetitive elements and short direct and/or inverted repeats in the generation of these germinal mutations. The breakpoints of five partial gene deletions and one partial gene duplication were determined by either cloning and sequencing the mutant gene from an affected hemizygote, or by polymerase chain reaction amplifying and sequencing the genomic region containing the novel junction. Although the alpha-galactosidase A gene contains 12 Alu repetitive elements (representing approximately 30% of the 12-kilobase (kb) gene or approximately 1 Alu/1.0 kb), only one deletion resulted from an Alu-Alu recombination. The remaining five rearrangements involved illegitimate recombinational events between short direct repeats of 2 to 6 base pairs (bp) at the deletion or duplication breakpoints. Of these rearrangements, one had a 3' short direct repeat within an Alu element, while another was unusual having two deletions of 1.7 kb and 14 bp separated by a 151-bp inverted sequence. These findings suggested that slipped mispairing or intrachromosomal exchanges involving short direct repeats were responsible for the generation of most of these gene rearrangements. There were no inverted repeat sequences or alternating purine-pyrimidine regions which may have predisposed the gene to these rearrangements. Intriguingly, the tetranucleotide CCAG and the trinucleotide CAG (or their respective complements, CTGG and CTG) occurred within or adjacent to the direct repeats at the 5' breakpoints in three and four of the five alpha-galactosidase A gene rearrangements, respectively, suggesting a possible functional role in these illegitimate recombinational events. 
These studies indicate that short direct repeats are important in the formation of gene rearrangements, even in human genes like alpha-galactosidase A that are rich in Alu repetitive elements.

12.
Dosage sensitivity is an important evolutionary force which impacts on gene dispensability and duplicability. The newly available data on human copy-number variation (CNV) allow an analysis of the most recent and ongoing evolution. Provided that heterozygous gene deletions and duplications actually change gene dosage, we expect to observe negative selection against CNVs encompassing dosage sensitive genes. In this study, we make use of several sources of population genetic data to identify selection on structural variations of dosage sensitive genes. We show that CNVs can directly affect expression levels of contained genes. We find that genes encoding members of protein complexes exhibit limited expression variation and overlap significantly with a manually derived set of dosage sensitive genes. We show that complexes and other dosage sensitive genes are underrepresented in CNV regions, with a particular bias against frequent variations and duplications. These results suggest that dosage sensitivity is a significant force of negative selection on regions of copy-number variation.

13.
Complex genomic rearrangements (CGRs) consisting of two or more breakpoint junctions have been observed in genomic disorders. Recently, a chromosome catastrophe phenomenon termed chromothripsis, in which numerous genomic rearrangements are apparently acquired in one single catastrophic event, was described in multiple cancers. Here, we show that constitutionally acquired CGRs share similarities with cancer chromothripsis. In the 17 CGR cases investigated, we observed localization and multiple copy number changes including deletions, duplications, and/or triplications, as well as extensive translocations and inversions. Genomic rearrangements involved varied in size and complexities; in one case, array comparative genomic hybridization revealed 18 copy number changes. Breakpoint sequencing identified characteristic features, including small templated insertions at breakpoints and microhomology at breakpoint junctions, which have been attributed to replicative processes. The resemblance between CGR and chromothripsis suggests similar mechanistic underpinnings. Such chromosome catastrophic events appear to reflect basic DNA metabolism operative throughout an organism's life cycle.

14.
The recent availability of genomic sequence information for the class I region of the MHC has provided an opportunity to examine the genomic organization of HLA class I (HLAcI) and PERB11/MIC genes with a view to explaining their evolution from the perspective of extended genomic duplications rather than by simple gene duplications and/or gene conversion events. Analysis of genomic sequence from two regions of the MHC (the alpha- and beta-blocks) revealed that at least 6 PERB11 and 14 HLAcI genes, pseudogenes, and gene fragments are contained within extended duplicated segments. Each segment was searched for the presence of shared (paralogous) retroelements by RepeatMasker in order to use them as markers of evolution, genetic rearrangements, and evidence of segmental duplications. Shared Alu elements and other retroelements allowed the duplicated segments to be classified into five distinct groups (A to E) that could be further distilled down to an ancient preduplication segment containing a HLA and PERB11 gene, an endogenous retrovirus (HERV-16), and distinctive retroelements. The breakpoints within and between the different HLAcI segments were found mainly within the PERB11 and HLA genes, HERV-16, and other retroelements, suggesting that the latter have played a major role in duplication and indel events leading to the present organization of PERB11 and HLAcI genes. On the basis of the features contained within the segments, a coevolutionary model premised on tandem duplication of single and multipartite genomic segments is proposed. The model is used to explain the origins and genomic organization of retroelements, HERV-16, DNA transposons, PERB11, and HLAcI genes as distinct segmental combinations within the alpha- and beta-blocks of the human MHC. Received: 5 December 1998 / Accepted: 27 January 1999

15.
Gene duplication plays important roles in organismal evolution, because duplicate genes provide raw materials for the evolution of mechanisms controlling physiological and/or morphological novelties. Gene duplication can occur via several mechanisms, including segmental duplication, tandem duplication and retroposition. Although segmental and tandem duplications have been found to be important for the expansion of a number of multigene families, the contribution of retroposition is not clear. Here we show that plant SKP1 genes have evolved by multiple duplication events from a single ancestral copy in the most recent common ancestor (MRCA) of eudicots and monocots, resulting in 19 ASK (Arabidopsis SKP1-like) and 28 OSK (Oryza SKP1-like) genes. The estimated birth rates are more than ten times the average rate of gene duplication, and are even higher than that of other rapidly duplicating plant genes, such as type I MADS box genes, R genes, and genes encoding receptor-like kinases. Further analyses suggest that a relatively large proportion of the duplication events may be explained by tandem duplication, but few, if any, are likely to be due to segmental duplication. In addition, by mapping the gain/loss of a specific intron on gene phylogenies, and by searching for the features that characterize retrogenes/retrosequences, we show that retroposition is an important mechanism for expansion of the plant SKP1 gene family. Specifically, we propose that two and three ancient retroposition events occurred in lineages leading to Arabidopsis and rice, respectively, followed by repeated tandem duplications and chromosome rearrangements. Our study represents a thorough investigation showing that retroposition can play an important role in the evolution of a plant gene family whose members do not encode mobile elements.

16.
There is growing evidence that duplications have played a major role in eucaryotic genome evolution. Sequencing data revealed the presence of large duplicated regions in the genomes of many eucaryotic organisms, and comparative studies have suggested that duplication of large DNA segments has been a continuing process during evolution. However, little experimental data have been produced regarding this issue. Using a gene dosage assay for growth recovery in Saccharomyces cerevisiae, we demonstrate that a majority of the revertant strains (58%) resulted from the spontaneous duplication of large DNA segments, either intra- or interchromosomally, ranging from 41 to 655 kb in size. These events result in the concomitant duplication of dozens of genes and in some cases in the formation of chimeric open reading frames at the junction of the duplicated blocks. The types of sequences at the breakpoints as well as their superposition with the replication map suggest that spontaneous large segmental duplications result from replication accidents. Aneuploidization events or suppressor mutations that do not involve large-scale rearrangements accounted for the rest of the reversion events (in 26 and 16% of the strains, respectively).

17.
Insertion mutants of herpes simplex virus type 1, containing a second copy of the sequences of BamHI fragment L (map coordinates 0.706 to 0.744) inserted in inverted orientation into the thymidine kinase gene (at map coordinate 0.315), have been further characterized. We reported previously that, as a result of intramolecular or intermolecular recombination between copies of the BamHI-L sequence at the normal locus and inserted locus, a high proportion of progeny genomes exhibited either inversions of the unique sequence flanked by these inverted repeats or other rearrangements. Now we report that a genetic marker (syn-1 or syn-1+) originally present only in the inserted copy of BamHI fragment L appears in progeny at both the normal and inserted loci, and vice versa, at high frequency. Because these phenomena have not been observed with other insertion mutants containing duplications of other sequences from unique regions of the genome, we conclude that BamHI fragment L contains an element that enhances the rate of homologous recombination in adjacent sequences, resulting in genome rearrangements and gene conversion-like events.

18.
We describe genomic structures of 59 X-chromosome segmental duplications that include the proteolipid protein 1 gene (PLP1) in patients with Pelizaeus-Merzbacher disease. We provide the first report of 13 junction sequences, which gives insight into underlying mechanisms. Although proximal breakpoints were highly variable, distal breakpoints tended to cluster around low-copy repeats (LCRs) (50% of distal breakpoints), and each duplication event appeared to be unique (100 kb to 4.6 Mb in size). Sequence analysis of the junctions revealed no large homologous regions between proximal and distal breakpoints. Most junctions had microhomology of 1-6 bases, and one had a 2-base insertion. Boundaries between single-copy and duplicated DNA were identical to the reference genomic sequence in all patients investigated. Taken together, these data suggest that the tandem duplications are formed by a coupled homologous and nonhomologous recombination mechanism. We suggest repair of a double-stranded break (DSB) by one-sided homologous strand invasion of a sister chromatid, followed by DNA synthesis and nonhomologous end joining with the other end of the break. This is in contrast to other genomic disorders that have recurrent rearrangements formed by nonallelic homologous recombination between LCRs. Interspersed repetitive elements (Alu elements, long interspersed nuclear elements, and long terminal repeats) were found at 18 of the 26 breakpoint sequences studied. No specific motif that may predispose to DSBs was revealed, but single or alternating tracts of purines and pyrimidines that may cause secondary structures were common. Analysis of the 2-Mb region susceptible to duplications identified proximal-specific repeats and distal LCRs in addition to the previously reported ones, suggesting that the unique genomic architecture may have a role in nonrecurrent rearrangements by promoting instability.

19.
Duplications at 15q11.2-q13.3 overlapping the Prader-Willi/Angelman syndrome (PWS/AS) region have been associated with developmental delay (DD), autism spectrum disorder (ASD) and schizophrenia (SZ). Due to presence of imprinted genes within the region, the parental origin of these duplications may be key to the pathogenicity. Duplications of maternal origin are associated with disease, whereas the pathogenicity of paternal ones is unclear. To clarify the role of maternal and paternal duplications, we conducted the largest and most detailed study to date of parental origin of 15q11.2-q13.3 interstitial duplications in DD, ASD and SZ cohorts. We show, for the first time, that paternal duplications lead to an increased risk of developing DD/ASD/multiple congenital anomalies (MCA), but do not appear to increase risk for SZ. The importance of the epigenetic status of 15q11.2-q13.3 duplications was further underlined by analysis of a number of families, in which the duplication was paternally derived in the mother, who was unaffected, whereas her offspring, who inherited a maternally derived duplication, suffered from psychotic illness. Interestingly, the most consistent clinical characteristics of SZ patients with 15q11.2-q13.3 duplications were learning or developmental problems, found in 76% of carriers. Despite their lower pathogenicity, paternal duplications are less frequent in the general population with a general population prevalence of 0.0033% compared to 0.0069% for maternal duplications. This may be due to lower fecundity of male carriers and differential survival of embryos, something echoed in the findings that both types of duplications are de novo in just over 50% of cases. Isodicentric chromosome 15 (idic15) or interstitial triplications were not observed in SZ patients or in controls. 
Overall, this study refines the distinct roles of maternal and paternal interstitial duplications at 15q11.2-q13.3, underlining the critical importance of maternally expressed imprinted genes in the contribution of Copy Number Variants (CNVs) at this interval to the incidence of psychotic illness. This work will have tangible benefits for patients with 15q11.2-q13.3 duplications by aiding genetic counseling.

20.
In Neurospora crassa, DNA sequence duplications are detected and altered efficiently during the sexual cycle by a process known as RIP (repeat-induced point mutation). Affected sequences are subjected to multiple GC-to-AT mutations. To explore the pattern in which base changes are laid down by RIP we examined two sets of strains. First, we examined the products of a presumptive spontaneous RIP event at the mtr locus. Results of sequencing suggested that a single RIP event produces two distinct patterns of change, descended from the two strands of an affected DNA duplex. Equivalent results were obtained using an exceptional tetrad from a cross with a known duplication flanking the zeta-eta (ζ-η) locus. The mtr sequence data were also used to further examine the basis for the differential severity of C-to-T mutations on the coding and noncoding strands in genes. The known bias of RIP toward CpA/TpG sites in conjunction with the sequence bias of Neurospora accounts for the differential effect. Finally, we used a collection of tandem repeats (from 16 to 935 bp in length) within the mtr gene to examine the length requirement for RIP. No evidence of RIP was found with duplications shorter than 400 bp while all longer tandem duplications were frequently affected. A comparison of these results with vegetative reversion data for the same duplications is consistent with the idea that reversion of long tandem duplications and RIP share a common step.
