Similar literature
 20 similar articles found (search time: 512 ms)
1.
Structural variation is an important cause of genetic variation. Whole genome analysis techniques can efficiently identify copy-number variable regions, but there is a need for targeted methods to verify and accurately size variable regions and to diagnose large sample cohorts. We have developed a technique based on multiplex amplification of size-coded selectively circularized genomic fragments, which is robust, cheaper and more rapid than current multiplex targeted copy-number assays.

2.
Extensive copy-number variation of the human olfactory receptor gene family   (Total citations: 3; self-citations: 0; citations by others: 3)
As much as a quarter of the human genome has been reported to vary in copy number between individuals, including regions containing about half of the members of the olfactory receptor (OR) gene family. We have undertaken a detailed study of copy-number variation of ORs to elucidate the selective and mechanistic forces acting on this gene family and the true impact of copy-number variation on human OR repertoires. We argue that the properties of copy-number variants (CNVs) and other sets of large genomic regions violate the assumptions of statistical methods that are commonly used in the assessment of gene enrichment. Using more appropriate methods, we provide evidence that OR enrichment in CNVs is not due to positive selection but is because of OR preponderance in segmentally duplicated regions, which are known to be frequently copy-number variable, and because purifying selection against CNVs is lower in OR-containing regions than in regions containing essential genes. We also combine multiplex ligation-dependent probe amplification (MLPA) and PCR to assay the copy numbers of 37 candidate CNV ORs in a panel of ~50 human individuals. We confirm copy-number variation of 18 ORs but find no variation in this human-diversity panel for 16 other ORs, highlighting the caveat that reported intervals often overrepresent true CNVs. The copy-number variation we describe is likely to underpin significant variation in olfactory abilities among human individuals. Finally, we show that both homology-based and homology-independent processes have played a recent role in remodeling the OR family.

3.
Studies of copy-number variation and linkage disequilibrium (LD) have typically excluded complex regions of the genome that are rich in duplications and prone to rearrangement. In an attempt to assess the heritability and LD of copy-number polymorphisms (CNPs) in duplication-rich regions of the genome, we profiled copy-number variation in 130 putative "rearrangement hotspot regions" among 269 individuals of European, Yoruba, Chinese, and Japanese ancestry analyzed by the International HapMap Consortium. Eighty-four hotspot regions, corresponding to 257 bacterial artificial chromosome (BAC) probes, showed evidence of copy-number differences. Despite a predisposing genetic architecture, no polymorphism was ever observed in the remaining 46 "rearrangement hotspots," and we suggest these represent excellent candidate sites for pathogenic rearrangements. We used a combination of BAC-based and high-density customized oligonucleotide arrays to resolve the molecular basis of structural rearrangements. For common variants (frequency >10%), we observed a distinct bias against copy-number losses, suggesting that deletions are subject to purifying selection. Heritability estimates did not differ significantly from 1.0 among the majority (30 of 34) of loci analyzed, consistent with normal Mendelian inheritance. Some of the CNPs in duplication-rich regions showed strong LD with nearby single-nucleotide polymorphisms (SNPs) and were observed to segregate on ancestral SNP haplotypes. However, LD with the best available SNP markers was weaker than has been reported for deletion polymorphisms in less complex regions of the genome. These observations may be accounted for by a low density of SNP data in duplicated regions, challenges in mapping and typing the CNPs, and the possibility that CNPs in these regions have rearranged on multiple haplotype backgrounds. Our results underscore the need for complete maps of genetic variation in duplication-rich regions of the genome.

4.
Segmental duplications and copy-number variation in the human genome   (Total citations: 33; self-citations: 0; citations by others: 33)
The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. 
Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders.

5.
Targeted genome enrichment is a powerful tool for making use of the massive throughput of novel DNA-sequencing instruments. We herein present a simple and scalable protocol for multiplex amplification of target regions based on the Selector technique. The updated version exhibits improved coverage and compatibility with next-generation-sequencing (NGS) library-construction procedures for shotgun sequencing with NGS platforms. To demonstrate the performance of the technique, all 501 exons from 28 genes frequently involved in cancer were enriched for and sequenced in specimens derived from cell lines and tumor biopsies. DNA from both fresh frozen and formalin-fixed paraffin-embedded biopsies was analyzed, and 94% specificity and 98% coverage of the targeted region were achieved. Reproducibility between replicates was high (R² = 0.98) and readily enabled detection of copy-number variations. The procedure can be carried out in <24 h and does not require any dedicated instrumentation.

6.
Dosage sensitivity is an important evolutionary force which impacts on gene dispensability and duplicability. The newly available data on human copy-number variation (CNV) allow an analysis of the most recent and ongoing evolution. Provided that heterozygous gene deletions and duplications actually change gene dosage, we expect to observe negative selection against CNVs encompassing dosage sensitive genes. In this study, we make use of several sources of population genetic data to identify selection on structural variations of dosage sensitive genes. We show that CNVs can directly affect expression levels of contained genes. We find that genes encoding members of protein complexes exhibit limited expression variation and overlap significantly with a manually derived set of dosage sensitive genes. We show that complexes and other dosage sensitive genes are underrepresented in CNV regions, with a particular bias against frequent variations and duplications. These results suggest that dosage sensitivity is a significant force of negative selection on regions of copy-number variation.

7.
We present a new random array format together with a decoding scheme for targeted multiplex digital molecular analyses. DNA samples are analyzed using multiplex sets of padlock or selector probes that create circular DNA molecules upon target recognition. The circularized DNA molecules are amplified through rolling-circle amplification (RCA) to generate amplified single molecules (ASMs). A random array is generated by immobilizing all ASMs on a microscopy glass slide. The ASMs are identified and counted through serial hybridizations of small sets of tag probes, according to a combinatorial decoding scheme. We show that the random array format permits at least 10 iterations of hybridization, imaging and dehybridization, a process required for the combinatorial decoding scheme. We further investigated the quantitative dynamic range and precision of the random array format. Finally, as a demonstration, the decoding scheme was applied for multiplex quantitative analysis of genomic loci in samples having verified copy-number variations. Of 31 analyzed loci, all but one were correctly identified and responded according to the known copy-number variations. The decoding strategy is generic in that the target can be any biomolecule which has been encoded into a DNA circle via a molecular probing reaction.
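The combinatorial decoding idea in the abstract above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' code assignment: it assumes each locus is given a unique code word (one tag colour per hybridization round), and the function names, the four colours and the three rounds are all hypothetical.

```python
from collections import Counter
from itertools import product

def build_codebook(loci, colours, rounds):
    """Assign each locus a unique sequence of tag colours, one per round."""
    codes = list(product(colours, repeat=rounds))
    if len(loci) > len(codes):
        raise ValueError("not enough code words for the number of loci")
    # zip stops at the shorter input, so surplus code words stay unassigned
    return {code: locus for code, locus in zip(codes, loci)}

def decode_counts(observations, codebook):
    """Count molecules per locus; unassigned code words are tallied under None."""
    return Counter(codebook.get(tuple(obs)) for obs in observations)

loci = [f"locus_{i}" for i in range(1, 32)]          # 31 loci, as in the abstract
colours = ("A", "B", "C", "D")                       # hypothetical tag colours
codebook = build_codebook(loci, colours, rounds=3)   # 4**3 = 64 code words

# Each tuple is the colour seen for one molecule across the three rounds.
observed = [("A", "A", "A"), ("A", "A", "B"), ("A", "A", "A"), ("D", "D", "D")]
counts = decode_counts(observed, codebook)
```

With k colours and n rounds the scheme distinguishes up to k**n targets, which is why a small tag set per round can cover many loci. Counting molecules per locus is what makes the readout digital.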

8.
Diagnostic genome profiling in mental retardation   (Total citations: 16; self-citations: 0; citations by others: 16)
Mental retardation (MR) occurs in 2%-3% of the general population. Conventional karyotyping has a resolution of 5-10 million bases and detects chromosomal alterations in approximately 5% of individuals with unexplained MR. The frequency of smaller submicroscopic chromosomal alterations in these patients is unknown. Novel molecular karyotyping methods, such as array-based comparative genomic hybridization (array CGH), can detect submicroscopic chromosome alterations at a resolution of 100 kb. In this study, 100 patients with unexplained MR were analyzed using array CGH for DNA copy-number changes by use of a novel tiling-resolution genomewide microarray containing 32,447 bacterial artificial clones. Alterations were validated by fluorescence in situ hybridization and/or multiplex ligation-dependent probe amplification, and parents were tested to determine de novo occurrence. Reproducible DNA copy-number changes were present in 97% of patients. The majority of these alterations were inherited from phenotypically normal parents, which reflects normal large-scale copy-number variation. In 10% of the patients, de novo alterations considered to be clinically relevant were found: seven deletions and three duplications. These alterations varied in size from 540 kb to 12 Mb and were scattered throughout the genome. Our results indicate that the diagnostic yield of this approach in the general population of patients with MR is at least twice as high as that of standard GTG-banded karyotyping.

9.
We describe here a protocol for obtaining clones containing sequences present in low copy-number from genomic DNA where moderately and highly repeated sequences predominate. Specific chromosomal regions can be targeted by using deletion or addition line material. We have used this protocol to identify a sequence which has been deleted in both the tetraploid and hexaploid wheat mutants for the homoeologous chromosome pairing locus.

10.
11.
Despite considerable excitement over the potential functional significance of copy-number variants (CNVs), we still lack knowledge of the fine-scale architecture of the large majority of CNV regions in the human genome. In this study, we used a high-resolution array-based comparative genomic hybridization (aCGH) platform that targeted known CNV regions of the human genome at approximately 1 kb resolution to interrogate the genomic DNAs of 30 individuals from four HapMap populations. Our results revealed that 1020 of 1153 CNV loci (88%) were actually smaller in size than what is recorded in the Database of Genomic Variants based on previously published studies. A reduction in size of more than 50% was observed for 876 CNV regions (76%). We conclude that the total genomic content of currently known common human CNVs is likely smaller than previously thought. In addition, approximately 8% of the CNV regions observed in multiple individuals exhibited genomic architectural complexity in the form of smaller CNVs within larger ones and CNVs with interindividual variation in breakpoints. Future association studies that aim to capture the potential influences of CNVs on disease phenotypes will need to consider how to best ascertain this previously uncharacterized complexity.
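The size comparison reported above amounts to a simple per-locus calculation. The sketch below shows one way it could be tabulated; the function name and the toy interval sizes are hypothetical, and only the counts 1020, 876 and 1153 come from the abstract.

```python
def size_reduction_summary(recorded_bp, refined_bp, threshold=0.5):
    """Return (fraction of loci that shrank, fraction reduced by more than threshold)."""
    if len(recorded_bp) != len(refined_bp) or not recorded_bp:
        raise ValueError("need two non-empty lists of equal length")
    smaller = 0
    strongly_reduced = 0
    for recorded, refined in zip(recorded_bp, refined_bp):
        reduction = (recorded - refined) / recorded  # relative size reduction
        if reduction > 0:
            smaller += 1
        if reduction > threshold:
            strongly_reduced += 1
    n = len(recorded_bp)
    return smaller / n, strongly_reduced / n

# Toy example with made-up interval sizes in base pairs.
recorded = [100_000, 50_000, 20_000, 10_000]
refined = [40_000, 30_000, 20_000, 12_000]
frac_smaller, frac_halved = size_reduction_summary(recorded, refined)

# The percentages quoted in the abstract follow from its counts.
pct_smaller = round(100 * 1020 / 1153)   # 88
pct_halved = round(100 * 876 / 1153)     # 76
```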

12.

Background

Tandem repeat variation in protein-coding regions will alter protein length and may introduce frameshifts. Tandem repeat variants are associated with variation in pathogenicity in bacteria and with human disease. We characterized tandem repeat polymorphism in human proteins, using the UniGene database, and tested whether these were associated with host defense roles.

Results

Protein-coding tandem repeat copy-number polymorphisms were detected in 249 tandem repeats found in 218 UniGene clusters; observed length differences ranged from 2 to 144 nucleotides, with unit copy lengths ranging from 2 to 57. This corresponded to 1.59% (218/13,749) of proteins investigated carrying detectable polymorphisms in the copy-number of protein-coding tandem repeats. We found no evidence that tandem repeat copy-number polymorphism was significantly elevated in defense-response proteins (p = 0.882). An association with the Gene Ontology term 'protein-binding' remained significant after covariate adjustment and correction for multiple testing. Combining this analysis with previous experimental evaluations of tandem repeat polymorphism, we estimate the approximate mean frequency of tandem repeat polymorphisms in human proteins to be 6%. Because 13.9% of the polymorphisms were not a multiple of three nucleotides, up to 1% of proteins may contain frameshifting tandem repeat polymorphisms.

Conclusion

Around 1 in 20 human proteins are likely to contain tandem repeat copy-number polymorphisms within coding regions. Such polymorphisms are not more frequent among defense-response proteins; their prevalence among protein-binding proteins may reflect lower selective constraints on their structural modification. The impact of frameshifting and longer copy-number variants on protein function and disease merits further investigation.
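The percentages in the Results and Conclusion above can be checked with a few lines of arithmetic. Note the assumption: the abstract does not state how the "up to 1%" frameshift figure was obtained, so multiplying the estimated 6% polymorphism frequency by the 13.9% non-triplet share is only one plausible reading.

```python
polymorphic_clusters = 218
proteins_investigated = 13_749

# Share of investigated proteins with a detected polymorphism (reported as 1.59%).
detected_fraction = polymorphic_clusters / proteins_investigated
detected_percent = round(100 * detected_fraction, 2)

estimated_polymorphic = 0.06   # estimated mean frequency across human proteins
non_triplet_fraction = 0.139   # length differences that are not a multiple of three

# Assumed derivation of the frameshift bound: product of the two fractions.
frameshift_upper = estimated_polymorphic * non_triplet_fraction
frameshift_percent = round(100 * frameshift_upper, 2)

print(detected_percent)    # 1.59
print(frameshift_percent)  # 0.83
```

Under that reading, about 0.83% of proteins would carry a frameshifting polymorphism, which is consistent with the "up to 1%" in the Results. The 6% estimate is roughly 1 in 17, which the Conclusion rounds to "around 1 in 20".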

13.
Differences between individuals in the copy-number of whole genes have been found in every multicellular species examined thus far. Such differences result in unique complements of protein-coding genes in all individuals, and have been shown to underlie adaptive phenotypic differences. Here, we review the evidence for copy-number variants (CNVs), focusing on the methods used to detect them and the molecular mechanisms responsible for generating this type of variation. Although there are multiple technical and computational challenges inherent to these experimental methods, next-generation sequencing technologies are making such experiments accessible in any system with a sequenced genome. We further discuss the connection between copy-number variation within species and copy-number divergence between species, showing that these values are exactly what one would expect from similar comparisons of nucleotide polymorphism and divergence. We conclude by reviewing the growing body of evidence for natural selection on copy-number variants. While it appears that most genic CNVs, especially deletions, are quickly eliminated by selection, there are now multiple studies demonstrating a strong link between copy-number differences at specific genes and phenotypic differences in adaptive traits. We argue that a complete understanding of the molecular basis for adaptive natural selection necessarily includes the study of copy-number variation.

14.
We describe methods with enhanced power and specificity to identify genes targeted by somatic copy-number alterations (SCNAs) that drive cancer growth. By separating SCNA profiles into underlying arm-level and focal alterations, we improve the estimation of background rates for each category. We additionally describe a probabilistic method for defining the boundaries of selected-for SCNA regions with user-defined confidence. Here we detail this revised computational approach, GISTIC2.0, and validate its performance in real and simulated datasets.

15.
Structural variation is an important class of genetic variation in mammals. High-throughput sequencing (HTS) technologies promise to revolutionize copy-number variation (CNV) detection but present substantial analytic challenges. Converging evidence suggests that multiple types of CNV-informative data (e.g. read-depth, read-pair, split-read) need to be considered, and that sophisticated methods are needed for more accurate CNV detection. We observed that various sources of experimental biases in HTS confound read-depth estimation, and note that bias correction has not been adequately addressed by existing methods. We present a novel read-depth-based method, GENSENG, which uses a hidden Markov model and negative binomial regression framework to identify regions of discrete copy-number changes while simultaneously accounting for the effects of multiple confounders. Based on extensive calibration using multiple HTS data sets, we conclude that our method outperforms existing read-depth-based CNV detection algorithms. The concept of simultaneous bias correction and CNV detection can serve as a basis for combining read-depth with other types of information such as read-pair or split-read in a single analysis. A user-friendly and computationally efficient implementation of our method is freely available.
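To make the "hidden Markov model with negative binomial emissions" idea concrete, here is a minimal sketch. It is not the GENSENG implementation: it assumes the expected diploid read count per window (`expected`, which must be positive) has already been predicted from the confounders, and the transition probability, dispersion value and function names are hypothetical choices for illustration.

```python
import math

def nb_logpmf(k, mean, size):
    """Log-probability of count k under a negative binomial with the given mean and size."""
    p = size / (size + mean)
    return (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
            + size * math.log(p) + k * math.log1p(-p))

def viterbi_copy_number(depths, expected, states=(0, 1, 2, 3, 4),
                        stay=0.99, size=100.0):
    """Most likely copy-number path given read counts and expected diploid counts."""
    n = len(states)
    switch = (1.0 - stay) / (n - 1)
    log_trans = [[math.log(stay if i == j else switch) for j in range(n)]
                 for i in range(n)]

    def emit(s, i):
        # The mean scales with copy number; a small floor models stray reads at 0 copies.
        mean = max(states[s] / 2 * expected[i], 0.01 * expected[i])
        return nb_logpmf(depths[i], mean, size)

    score = [math.log(1.0 / n) + emit(s, 0) for s in range(n)]
    back = []
    for i in range(1, len(depths)):
        new_score, pointers = [], []
        for j in range(n):
            best = max(range(n), key=lambda s: score[s] + log_trans[s][j])
            new_score.append(score[best] + log_trans[best][j] + emit(j, i))
            pointers.append(best)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return [states[s] for s in reversed(path)]

depths = [100, 102, 98, 100, 50, 52, 49, 51, 100, 99, 101,
          150, 148, 152, 151, 100, 100, 100]
calls = viterbi_copy_number(depths, expected=[100] * len(depths))
```

In a fuller treatment the `expected` values would come from a regression on covariates such as GC content and mappability. That regression is the bias-correction half of the approach, and it is deliberately left out of this sketch.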

16.
The study of somatic genetic alterations in tumors contributes to the understanding and management of cancer. Genetic alterations, such as copy number or copy neutral changes, generate allelic imbalances (AIs) that can be determined using polymorphic markers. Here we report the development of a simple set of calculations for analyzing microsatellite multiplex PCR data from control-tumor pairs that allows us to obtain accurate information not only regarding the AI status of tumors, but also the percentage of tumor-infiltrating normal cells, the locus copy-number status and the mechanism involved in AI. We validated this new approach by re-analyzing a set of Neurofibromatosis type 1-associated dermal neurofibromas and comparing newly generated data with results obtained for the same tumors in a previous study using MLPA, Paralog Ratio Analysis and SNP-array techniques. Microsatellite multiplex PCR analysis (MMPA) should be particularly useful for analyzing specific regions of the genome containing tumor suppressor genes and also for determining the percentage of infiltrating normal cells within tumors, allowing them to be sorted before they are analyzed by more expensive techniques.

17.
Nuclear DNA-based markers for plant evolutionary biology   (Total citations: 8; self-citations: 0; citations by others: 8)
While DNA-based markers can provide a wealth of information for the study of plant evolutionary biology, progress is limited by the lack of primers available for PCR. To overcome this limitation, we outline a protocol for developing oligonucleotide primers targeting regions of low copy-number nuclear genes. This protocol is intended to lead to universally useful primer sets. To test our approach, we designed eight primer sets and tested their abilities to amplify targets from representatives of each dicot and one monocot subclass. Five of the eight primer sets amplified targets from at least five of the seven taxa and thus exhibited broad taxonomic usefulness; the remaining primers were rather specific, however, and amplified targets from at most three taxa. In only one primer-taxon combination was a complex multiple-banded amplification produced. Overall, the protocol outlined proved quite useful at identifying broadly applicable primers targeted to low copy-number nuclear genes. Wider application of this approach should be effective at greatly increasing the amount of genetic information available for a diversity of plant nuclear genomes.

18.
Height is a model polygenic trait that is highly heritable. Genome-wide association studies have identified hundreds of single-nucleotide polymorphisms associated with stature, but the role of structural variation in determining height is largely unknown. We performed a genome-wide association study of copy-number variation and stature in a clinical cohort of children who had undergone comparative genomic hybridization (CGH) microarray analysis for clinical indications. We found that subjects with short stature had a greater global burden of copy-number variants (CNVs) and a greater average CNV length than did controls (p < 0.002). These associations were present for lower-frequency (<5%) and rare (<1%) deletions, but there were no significant associations seen for duplications. Known gene-deletion syndromes did not account for our findings, and we saw no significant associations with tall stature. We then extended our findings into a population-based cohort and found that, in agreement with the clinical cohort study, an increased burden of lower-frequency deletions was associated with shorter stature (p = 0.015). Our results suggest that in individuals undergoing copy-number analysis for clinical indications, short stature increases the odds that a low-frequency deletion will be found. Additionally, copy-number variation might contribute to genetic variation in stature in the general population.

19.
Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95-99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. 
CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
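The core idea of inferring integer copy-number genotypes from depth of coverage can be sketched as below. This is a deliberately simplified illustration and not CopySeq itself, whose statistical framework is not described in enough detail here to reproduce. It assumes depth scales linearly with copy number and that a set of reference regions is known to be diploid; the names and depth values are hypothetical.

```python
from statistics import median

def copy_number_genotypes(region_depths, reference_depths, ploidy=2, max_copies=9):
    """Estimate an integer copy number per region from mean read depth.

    region_depths: mapping of region name to mean depth of coverage.
    reference_depths: depths of regions assumed to be present in `ploidy` copies.
    """
    baseline = median(reference_depths)
    if baseline <= 0:
        raise ValueError("reference depth must be positive")
    genotypes = {}
    for region, depth in region_depths.items():
        estimate = ploidy * depth / baseline          # fractional copy number
        genotypes[region] = min(max_copies, max(0, round(estimate)))
    return genotypes

reference = [4.1, 3.9, 4.0, 4.2, 3.8]   # low-coverage genome, about 4x
regions = {"OR_a": 0.1, "OR_b": 2.1, "OR_c": 4.0, "OR_d": 17.6}
genotypes = copy_number_genotypes(regions, reference)
```

Separating 0, 1 and 2 copies is what lets a method tell homozygous from heterozygous deletions. At low coverage the rounding step is noisy, which is presumably why the published approach pools information across samples and can add paired-end and breakpoint evidence.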

20.
Multiplex polymerase chain reaction (PCR) has multiple applications in molecular biology, including developing new targeted next-generation sequencing (NGS) panels. We present NGS-PrimerPlex, an efficient and versatile command-line application that designs primers for different refined types of amplicon-based genome target enrichment. It supports nested and anchored multiplex PCR, redistribution among multiplex reactions of primers constructed earlier, and extension of existing NGS-panels. The primer design process takes into consideration the formation of secondary structures, non-target amplicons between all primers of a pool, and overlap of primers with high-frequency genomic single-nucleotide polymorphisms (SNPs). Moreover, users of NGS-PrimerPlex do not need to define input genome regions manually, because this can be done automatically from a list of genes or their parts, such as exon or codon numbers. Using the program, the NGS-panel for sequencing the LRRK2 gene coding regions was created, and 354 DNA samples were studied successfully with a median coverage of 97.4% of target regions by at least 30 reads. To show that NGS-PrimerPlex can also be applied to bacterial genomes, we designed primers to detect the foodborne pathogens Salmonella enterica, Escherichia coli O157:H7, Listeria monocytogenes, and Staphylococcus aureus, taking into account variable positions of the genomes.
