Similar articles
20 similar articles found (search time: 31 ms)
1.
Segmental duplications and copy-number variation in the human genome   Total citations: 33 (self-citations: 0; citations by others: 33)
The human genome contains numerous blocks of highly homologous duplicated sequence. This higher-order architecture provides a substrate for recombination and recurrent chromosomal rearrangement associated with genomic disease. However, an assessment of the role of segmental duplications in normal variation has not yet been made. On the basis of the duplication architecture of the human genome, we defined a set of 130 potential rearrangement hotspots and constructed a targeted bacterial artificial chromosome (BAC) microarray (with 2,194 BACs) to assess copy-number variation in these regions by array comparative genomic hybridization. Using our segmental duplication BAC microarray, we screened a panel of 47 normal individuals, who represented populations from four continents, and we identified 119 regions of copy-number polymorphism (CNP), 73 of which were previously unreported. We observed an equal frequency of duplications and deletions, as well as a 4-fold enrichment of CNPs within hotspot regions, compared with control BACs (P < .000001), which suggests that segmental duplications are a major catalyst of large-scale variation in the human genome. Importantly, segmental duplications themselves were also significantly enriched >4-fold within regions of CNP. Almost without exception, CNPs were not confined to a single population, suggesting that these either are recurrent events, having occurred independently in multiple founders, or were present in early human populations. Our study demonstrates that segmental duplications define hotspots of chromosomal rearrangement, likely acting as mediators of normal variation as well as genomic disease, and it suggests that the consideration of genomic architecture can significantly improve the ascertainment of large-scale rearrangements. 
Our specialized segmental duplication BAC microarray and associated database of structural polymorphisms will provide an important resource for the future characterization of human genomic disorders.

2.
YP Zhang, FY Deng, TL Yang, F Zhang, XD Chen, H Shen, XZ Zhu, Q Tian, HW Deng. PLoS ONE 2012, 7(9): e44292

Introduction

Human height is a highly heritable trait and is considered an important factor in health. There has been limited success in identifying the genetic factors underlying height variation. We aim to identify sequence variants associated with adult height through a genome-wide association study of copy number variants (CNVs) in Chinese individuals.

Methods

Genome-wide CNV association analyses for height variation were conducted in 1,625 unrelated Chinese adults and in sex-specific subgroups. Height was measured with a stadiometer. The Affymetrix SNP6.0 genotyping platform was used to identify copy number polymorphisms (CNPs). We constructed a genomic map containing 1,009 CNPs in Chinese individuals and performed a genome-wide association study of CNPs with height.

Results

We detected 10 significant association signals for height (p < 0.05) in the whole population, and 9 and 11 association signals in the Chinese female and male populations, respectively. A copy number polymorphism (CNP12587, chr18:54081842-54086942, p = 2.41 × 10⁻⁴) was found to be significantly associated with height variation in Chinese females even after strict Bonferroni correction (p = 0.048). Confirmatory real-time PCR experiments lent further support for CNV validation. Compared to female subjects with two copies of the CNP, carriers of three copies had an average of 8.1% decrease in height. An important candidate gene, ubiquitin-protein ligase NEDD4-like (NEDD4L), was detected in this region; it plays important roles in bone metabolism by binding to bone formation regulators.

Conclusions

Our findings point to important genetic variants underlying height variation in Chinese individuals.

3.
Genome structural variation shows remarkable complexity with respect to copy number, sequence content and distribution. While the discovery of copy number polymorphisms (CNP) has increased exponentially in recent years, the transition from discovery to genotyping has proved challenging, particularly for CNPs embedded in complex regions of the genome. CNPs that are collectively common in the population and possess a dynamic range of copy numbers have proved the most difficult to genotype in association studies. This is in some part due to technical limitations of genotyping assays and the sequence properties of the genomic region being analyzed. Here we describe in detail the basis of a number of molecular techniques used to genotype complex CNPs, compare and contrast these approaches for determination of multi-allelic copy number, and discuss the potential application of these techniques in genetic studies.

4.

Background  

Copy number variations (CNVs) and polymorphisms (CNPs) have only recently gained the genetic community's attention. Conservative estimates have shown that CNVs and CNPs might affect more than 10% of the genome and that they may be at least as important as single nucleotide polymorphisms in assessing human variability. Widely used tools for CNP analysis have been implemented in Birdsuite and PLINK for the purpose of conducting genetic association studies based on the unpartitioned total number of CNP copies provided by the intensities from Affymetrix's Genome-Wide Human SNP Array. Here, we are interested in partitioning copy number variations and polymorphisms in extended pedigrees for the purpose of linkage analysis on familial data.

5.
Kim W, Gordon D, Sebat J, Ye KQ, Finch SJ. PLoS ONE 2008, 3(10): e3475
Recent studies suggest that copy number polymorphisms (CNPs) may play an important role in disease susceptibility and onset. Currently, the detection of CNPs mainly depends on microarray technology. For case-control studies, conventionally, subjects are assigned to a specific CNP category based on the continuous quantitative measure produced by microarray experiments, and cases and controls are then compared using a chi-square test of independence. The purpose of this work is to specify the likelihood ratio test statistic (LRTS) for case-control sampling design based on the underlying continuous quantitative measurement, and to assess its power and relative efficiency (as compared to the chi-square test of independence on CNP counts). The sample size and power formulas of both methods are given. For the latter, the CNPs are classified using the Bayesian classification rule. The LRTS is more powerful than this chi-square test for the alternatives considered, especially alternatives in which the at-risk CNP categories have low frequencies. An example of the application of the LRTS is given for a comparison of CNP distributions in individuals of Caucasian or Taiwanese ethnicity, where the LRTS appears to be more powerful than the chi-square test, possibly due to misclassification of the most common CNP category into a less common category.
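The conventional baseline described in this abstract, a chi-square test of independence on CNP category counts, can be sketched in a few lines. This is only an illustration of the standard Pearson statistic, not the authors' LRTS; the function name and the counts are hypothetical, and the sketch assumes every expected cell count is positive.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c count table.

    Assumes every row and column total is positive (hypothetical helper).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of status and CNP category
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Made-up counts: rows are cases and controls, columns are 1, 2 and 3 copies
counts = [[10, 70, 20], [20, 70, 10]]
stat, dof = chi_square_independence(counts)
print(f"chi-square = {stat:.3f}, degrees of freedom = {dof}")
```

The statistic is compared against a chi-square distribution with the returned degrees of freedom. Because this test uses only the assigned categories, any misclassification of the continuous microarray signal is invisible to it, which is the limitation the LRTS is meant to address.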

6.
Chimpanzee populations are diminishing as a consequence of human activities, and as a result this species is now endangered. In the context of conservation programmes, genetic data can add vital information, for instance on the genetic diversity and structure of threatened populations. Single nucleotide polymorphisms (SNP) are biallelic markers that are widely used in human molecular studies and can be implemented in efficient microarray systems. This technology offers the potential of robust, multiplexed SNP genotyping at low reagent cost in organisms other than humans, but it is not yet commonly used in wild population studies. Here, we describe the characterization of new SNPs in Y-chromosomal intronic regions in chimpanzees and also identify SNPs from mitochondrial genes, with the aim of developing a microarray system that permits the simultaneous study of both paternal and maternal lineages. Our system consists of 42 SNPs for the Y chromosome and 45 SNPs for the mitochondrial genome. We demonstrate the applicability of this microarray in a captive population where genotypes accurately reflected its large pedigree. Two wild-living populations were also analysed and the results show that the microarray will be a useful tool alongside microsatellite markers, since it supplies complementary information about population structure and ecology. SNP genotyping using microarray technology, therefore, is a promising approach and may become an essential tool in conservation genetics to help in the management and study of captive and wild-living populations. Moreover, microarrays that combine SNPs from different genomic regions could replace microsatellite typing in the future.

7.
Copy-number polymorphisms: mining the tip of an iceberg   Total citations: 5 (self-citations: 0; citations by others: 5)
Copy-number polymorphisms (CNPs) represent a greatly underestimated aspect of human genetic variation. Recently, two landmark studies reported genome-wide analyses of CNPs in normal individuals and represent the beginning of an understanding of this type of large-scale variation. Future array-CGH-based CNP analyses should include standard criteria on a common microarray platform. It is only when parallel analyses of CNPs and SNPs are performed in an integrated format that we will obtain a global picture of our genetic diversity.

8.
We present a protocol for reliably detecting DNA copy number aberrations in a single human cell. Multiple displacement-amplified DNAs of a cell are hybridized to a 3,000-bacterial artificial chromosome (BAC) array and to an Affymetrix 250,000 (250K)-SNP array. Subsequent copy number calling is based on the integration of BAC probe-specific copy number probabilities that are estimated by comparing probe intensities with a single-cell whole-genome amplification (WGA) reference model for diploid chromosomes, as well as SNP copy number and loss-of-heterozygosity states estimated by hidden Markov models (HMM). All methods for detecting DNA copy number aberrations in single human cells have difficulty in confidently discriminating WGA artifacts from true genetic variants. Furthermore, some methods lack thorough validation for segmental DNA imbalance detection. Our protocol minimizes false-positive variant calling and enables uniparental isodisomy detection in single cells. Additionally, it provides quality assessment, allowing the exclusion of uninterpretable single-cell WGA samples. The protocol takes 5-7 d.

9.
DNA copy number variants (CNVs) that alter the copy number of a particular DNA segment in the genome play an important role in human phenotypic variability and disease susceptibility. A number of CNVs overlapping with genes have been shown to confer risk to a variety of human diseases, thus highlighting the relevance of addressing the variability of CNVs at a higher resolution. So far, it has not been possible to deterministically infer the allelic composition of different haplotypes present within the CNV regions. We have developed a novel computational method, called PiCNV, which resolves the haplotype sequence composition within CNV regions in nuclear families based on SNP genotyping microarray data. The algorithm makes it possible to i) phase normal and CNV-carrying haplotypes in the copy number variable regions, ii) resolve the allelic copies of rearranged DNA sequence within the haplotypes and iii) infer the heritability of identified haplotypes in trios or larger nuclear families. To our knowledge this is the first program available that can deterministically phase null, mono-, di-, tri- and tetraploid genotypes in CNV loci. We applied our method to study the composition and inheritance of haplotypes in CNV regions of 30 HapMap Yoruban trios and 34 Estonian families. For 93.6% of the CNV loci, PiCNV made it possible to unambiguously phase normal and CNV-carrying haplotypes and follow their transmission in the corresponding families. Furthermore, allelic composition analysis identified the co-occurrence of alternative allelic copies within 66.7% of haplotypes carrying copy number gains. We also observed less frequent transmission of CNV-carrying haplotypes from parents to children compared to normal haplotypes and identified an emergence of several de novo deletions and duplications in the offspring.

10.
A previously detected copy number polymorphism (Ep CNP) in patients affected with neuroectodermal tumors led us to investigate its frequency and length in the normal population. For this purpose, a program called Sequence Allocator was developed and applied for the construction of an array that consisted of unique and duplicated fragments, allowing the assessment of copy number variation within regions of segmental duplications. The average resolution of this array was 11 kb and we determined the size of the Ep CNP to be 290 kb. Analysis of normal controls identified 7.7 and 7.1% gains in peripheral blood and lymphoblastoid cell line (LCL) DNA, respectively, while deletions were found only in the LCL group (7.1%). This array platform allows the detection of DNA copy number variation within regions of pronounced genomic complexity, which constitutes an improvement over available technologies.

11.
Smoking is a major public health problem, but the genetic factors associated with smoking behaviors are not fully elucidated. Here, we have conducted an integrated genome-wide association study to identify common copy number polymorphisms (CNPs) and single nucleotide polymorphisms (SNPs) associated with the number of cigarettes smoked per day (CPD) in Japanese smokers (n = 17,158). Our analysis identified a common CNP with a strong effect on CPD (rs8102683) in the 19q13 region, encompassing the CYP2A6 locus. After adjustment for the associated CNP, we found an additional associated SNP (rs11878604) located 30 kb downstream of the CYP2A6 gene. Imputation of the CYP2A6 locus revealed that haplotypes underlying the CNP and the SNP corresponded to classical, functional alleles of the CYP2A6 gene that regulate nicotine metabolism and explained 2% of the phenotypic variance of CPD (ANOVA F-test). These haplotypes were also associated with smoking-related diseases, including lung cancer, chronic obstructive pulmonary disease and arteriosclerosis obliterans.

12.
The detection of copy number variants (CNV) by array-based platforms provides valuable insight into understanding human diversity. However, suboptimal study design and data processing negatively affect CNV assessment. We quantitatively evaluate their impact when short-sequence oligonucleotide arrays are applied (Affymetrix Genome-Wide Human SNP Array 6.0) by evaluating 42 HapMap samples for CNV detection. Several processing and segmentation strategies are implemented, and results are compared to CNV assessment obtained using an oligonucleotide array CGH platform designed to query CNVs at high resolution (Agilent). We quantitatively demonstrate that different reference models (e.g. single versus pooled sample reference) used to detect CNVs are a major source of inter-platform discrepancy (up to 30%) and that CNVs residing within segmental duplication regions (higher reference copy number) are significantly harder to detect (P < 0.0001). After adjusting Affymetrix data to mimic the Agilent experimental design (reference sample effect), we applied several common segmentation approaches and evaluated differential sensitivity and specificity for CNV detection, ranging 39–77% and 86–100% for non-segmental duplication regions, respectively, and 18–55% and 39–77% for segmental duplications. Our results are relevant to any array-based CNV study and provide guidelines to optimize performance based on study-specific objectives.

13.
Segmental copy-number polymorphisms (CNPs) represent a significant component of human genetic variation and are likely to contribute to disease susceptibility. These potentially multiallelic and highly polymorphic systems present new challenges to family-based genetic-analysis tools that commonly assume codominant markers and allow for no genotyping error. The copy-number quantitation (CNP phenotype) represents the total number of segmental copies present in an individual and provides a means to infer, rather than to observe, the underlying allele segregation. We present an integrated approach to meet these challenges, in the form of a graphical model in which we infer the underlying CNP phenotype from the (single or replicate) quantitative measure within the analysis while assuming an allele-based system segregating through the pedigree. This approach can be readily applied to the study of any form of genetic measure, and the construction permits extension to a wide variety of hypothesis tests. We have implemented the basic model for use with nuclear families, and we illustrate its application through an analysis of the CNP located in gene CCL3L1 in 201 families with asthma.

14.
Array-based technologies have been used to detect chromosomal copy number changes (aneuploidies) in the human genome. Recent studies identified numerous copy number variants (CNV) and some are common polymorphisms that may contribute to disease susceptibility. We developed, and experimentally validated, a novel computational framework (QuantiSNP) for detecting regions of copy number variation from BeadArray SNP genotyping data using an Objective Bayes Hidden-Markov Model (OB-HMM). Objective Bayes measures are used to set certain hyperparameters in the priors using a novel re-sampling framework to calibrate the model to a fixed Type I (false positive) error rate. Other parameters are set via maximum marginal likelihood to prior training data of known structure. QuantiSNP provides probabilistic quantification of state classifications and significantly improves the accuracy of segmental aneuploidy identification and mapping, relative to existing analytical tools (Beadstudio, Illumina), as demonstrated by validation of breakpoint boundaries. QuantiSNP identified both novel and validated CNVs. QuantiSNP was developed using BeadArray SNP data but it can be adapted to other platforms and we believe that the OB-HMM framework has widespread applicability in genomic research. In conclusion, QuantiSNP is a novel algorithm for high-resolution CNV/aneuploidy detection with application to clinical genetics, cancer and disease association studies.

15.

Background

Single nucleotide polymorphisms (SNPs) have been used extensively in genetics and epidemiology studies. Traditionally, SNPs that did not pass the Hardy-Weinberg equilibrium (HWE) test were excluded from these analyses. Many investigators have addressed possible causes for departure from HWE, including genotyping errors, population admixture and segmental duplication. Recent large-scale surveys have revealed abundant structural variations in the human genome, including copy number variations (CNVs). This suggests that a significant number of SNPs must be within these regions, which may cause deviation from HWE.

Results

We performed a Bayesian analysis of the potential effect of copy number variation, segmental duplication and genotyping errors on the behavior of SNPs. Our results suggest that copy number variation is a major factor in HWE violation for SNPs with a small minor allele frequency, when the sample size is large and the genotyping error rate is 0–1%.

Conclusions

Our study provides the posterior probability that a SNP falls in a CNV or a segmental duplication, given the observed allele frequency of the SNP, sample size and the significance level of HWE testing.
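For readers unfamiliar with the HWE test this abstract refers to, the standard one-degree-of-freedom chi-square version can be sketched as follows. This is the textbook goodness-of-fit test, not the Bayesian analysis the authors performed; the function name and genotype counts are hypothetical.

```python
def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square statistic for departure from Hardy-Weinberg equilibrium.

    Takes observed genotype counts at a biallelic SNP (hypothetical helper).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    if p == 0.0 or q == 0.0:
        raise ValueError("monomorphic SNP: the HWE test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Made-up counts with too few heterozygotes for the allele frequencies
stat = hwe_chi_square(40, 20, 40)
print(f"HWE chi-square = {stat:.1f}")
```

A statistic above 3.84 rejects HWE at the 0.05 level with one degree of freedom. One way a CNV can produce such a result is a deletion on one chromosome: hemizygous carriers are genotyped as homozygotes, which depletes the apparent heterozygote count.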

16.
Summary: High-density single-nucleotide polymorphism (SNP) microarrays provide a useful tool for the detection of copy number variants (CNVs). The analysis of such large amounts of data is complicated, especially with regard to determining where copy numbers change and their corresponding values. In this article, we propose a Bayesian multiple change-point model (BMCP) for segmentation and estimation of SNP microarray data. Segmentation concerns separating a chromosome into regions of equal copy number differences between the sample of interest and some reference, and involves the detection of locations of copy number difference changes. Estimation concerns determining true copy number for each segment. Our approach not only gives posterior estimates for the parameters of interest, namely locations for copy number difference changes and true copy number estimates, but also useful confidence measures. In addition, our algorithm can segment multiple samples simultaneously, and infer both common and rare CNVs across individuals. Finally, for studies of CNVs in tumors, we incorporate an adjustment factor for signal attenuation due to tumor heterogeneity or normal contamination that can improve copy number estimates.
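To make the segmentation step concrete, the sketch below finds a single change point in a series of probe-level values by least squares. It is a deliberately simplified stand-in, not the BMCP: it handles one change point, one sample, and gives no posterior or confidence measure. The function name and the log2-ratio values are hypothetical.

```python
def best_single_change_point(values):
    """Return the split index k that minimises within-segment squared error.

    The series is divided into values[:k] and values[k:], with 1 <= k < len(values).
    """
    if len(values) < 2:
        raise ValueError("need at least two values to place a change point")

    def sse(segment):
        # Sum of squared deviations from the segment mean
        mean = sum(segment) / len(segment)
        return sum((x - mean) ** 2 for x in segment)

    return min(range(1, len(values)),
               key=lambda k: sse(values[:k]) + sse(values[k:]))


# Made-up log2 ratios along a chromosome: a gain starts at the fifth probe
log_ratios = [0.0, 0.1, -0.1, 0.0, 0.6, 0.5, 0.7, 0.6]
k = best_single_change_point(log_ratios)
print(k, sum(log_ratios[k:]) / len(log_ratios[k:]))
```

The mean of each resulting segment plays the role of the estimation step. A full method would search for an unknown number of change points and quantify uncertainty about their locations, which is what the Bayesian formulation adds.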

17.
The 2',3'-cyclic nucleotide 3'-phosphodiesterases (CNPs) are closely related oligodendrocyte proteins whose in vivo function is unknown. To identify subcellular sites of CNP function, the distribution of CNP and CNP mRNA was determined in tissue sections from rats of various developmental ages. Our results indicate that CNP gene products were expressed exclusively by oligodendrocytes in the CNS. CNP mRNA was concentrated around oligodendrocyte perinuclear regions during all stages of myelination. Developmentally, initial detection of CNP mRNA closely paralleled initial detection of its translation products. In electron micrographs of immunostained ultrathin cryosections, CNP was associated with oligodendrocyte membranes during the earliest phase of axonal ensheathment. In more mature fibers, immunocytochemistry established that the CNPs are not major components of compact myelin but are concentrated within specific regions of the oligodendrocyte and myelin internode. These include (a) the plasma membrane of oligodendrocytes and their processes, (b) the periaxonal membrane and inner mesaxon, (c) the outer tongue process, (d) the paranodal myelin loops, and (e) the "incisure-like" membranes found in many larger CNS myelin sheaths. A cytoplasmic pool of CNP was also detected in oligodendrocyte perikarya and larger oligodendrocyte processes. CNP was also enriched in similar locations in myelinated fibers of the PNS.

18.
DNA variants, such as single nucleotide polymorphisms (SNPs) and copy number variants (CNVs), are unevenly distributed across the human genome. Currently, dbSNP contains more than 6 million human SNPs, and whole-genome genotyping arrays can assay more than 4 million of them simultaneously. In our study, we first asked whether the assays used in published genome-wide association studies (GWASs) cover all regions of the genome well. Using dbSNP build 135 data, we identified 50 genomic regions longer than 100 Kb that do not contain any common SNPs, i.e., those with minor allele frequency (MAF) ≥ 1%. Secondly, because conserved regions are generally of functional importance, we examined the genes in those large genomic regions without common SNPs. We found 97 genes, which were enriched for reproductive functions. In addition, we further filtered out regions with CNVs listed in the Database of Genomic Variants (DGV), segmental duplications from the Human Genome Project and common variants identified by personal genome sequencing (UCSC). No region survived this filtering. Our analysis suggests that, while there may not be many large genomic regions free of common variants, there are still some "holes" in the current human genomic map for common SNPs. Because GWASs focus only on common SNPs, interpretation of GWAS results should take this limitation into account. In particular, two recent GWASs of fertility may be incomplete due to the map deficit. Additional SNP discovery efforts should pay close attention to these regions.
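The first step described here, finding stretches longer than 100 Kb with no common SNP, amounts to scanning sorted SNP positions for large gaps. The sketch below shows that idea for a single chromosome. It is an assumption about how such a scan could be done, not the authors' pipeline; the function name and positions are hypothetical, and it ignores the stretches between the chromosome ends and the outermost SNPs.

```python
def snp_free_gaps(positions, min_length=100_000):
    """Return (left_snp, right_snp) pairs of adjacent common SNPs on one
    chromosome that lie more than min_length base pairs apart."""
    ordered = sorted(positions)
    return [(left, right)
            for left, right in zip(ordered, ordered[1:])
            if right - left > min_length]


# Made-up base-pair positions of common SNPs (MAF >= 1%) on one chromosome
common_snps = [1_000, 50_000, 200_000, 250_000, 400_001]
for left, right in snp_free_gaps(common_snps):
    print(f"no common SNP between {left} and {right} ({right - left} bp)")
```

In practice the input would first be restricted to SNPs passing the MAF threshold, and the resulting gaps would then be intersected with gene, CNV and segmental duplication annotations, as the abstract describes.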

19.
Copy-number variations (CNVs) are widespread in the human genome, but comprehensive assignments of integer locus copy-numbers (i.e., copy-number genotypes) that, for example, enable discrimination of homozygous from heterozygous CNVs, have remained challenging. Here we present CopySeq, a novel computational approach with an underlying statistical framework that analyzes the depth-of-coverage of high-throughput DNA sequencing reads, and can incorporate paired-end and breakpoint junction analysis based CNV-analysis approaches, to infer locus copy-number genotypes. We benchmarked CopySeq by genotyping 500 chromosome 1 CNV regions in 150 personal genomes sequenced at low-coverage. The assessed copy-number genotypes were highly concordant with our performed qPCR experiments (Pearson correlation coefficient 0.94), and with the published results of two microarray platforms (95-99% concordance). We further demonstrated the utility of CopySeq for analyzing gene regions enriched for segmental duplications by comprehensively inferring copy-number genotypes in the CNV-enriched >800 olfactory receptor (OR) human gene and pseudogene loci. CopySeq revealed that OR loci display an extensive range of locus copy-numbers across individuals, with zero to two copies in some OR loci, and two to nine copies in others. Among genetic variants affecting OR loci we identified deleterious variants including CNVs and SNPs affecting ~15% and ~20% of the human OR gene repertoire, respectively, implying that genetic variants with a possible impact on smell perception are widespread. Finally, we found that for several OR loci the reference genome appears to represent a minor-frequency variant, implying a necessary revision of the OR repertoire for future functional studies. 
CopySeq can ascertain genomic structural variation in specific gene families as well as at a genome-wide scale, where it may enable the quantitative evaluation of CNVs in genome-wide association studies involving high-throughput sequencing.
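The core intuition behind depth-of-coverage genotyping is that read depth scales with copy number, so a locus covered at half the diploid depth probably carries one copy. The sketch below illustrates only that ratio. It is not CopySeq, which uses a statistical framework and can add paired-end and breakpoint evidence; the function name, the `max_copies` cap and the depth values are hypothetical.

```python
def copy_number_from_depth(region_depth, diploid_depth, max_copies=10):
    """Naive integer copy-number estimate from mean read depth.

    region_depth is the mean depth over the locus; diploid_depth is the mean
    depth over regions assumed to carry two copies.
    """
    if diploid_depth <= 0:
        raise ValueError("diploid_depth must be positive")
    estimate = 2.0 * region_depth / diploid_depth
    # Round to the nearest integer and clamp to the range [0, max_copies]
    return min(max_copies, max(0, round(estimate)))


# Made-up mean depths for three loci in a genome sequenced at about 6x
for depth in (0.2, 3.1, 9.2):
    print(depth, copy_number_from_depth(depth, diploid_depth=6.0))
```

At low coverage the raw ratio is noisy, and duplicated regions attract ambiguously mapped reads, which is why a model-based approach is needed for loci such as the olfactory receptor genes discussed above.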

20.

Background

Intellectual disability (ID) affects 2–3% of the population and may occur with or without multiple congenital anomalies (MCA) or other medical conditions. Established genetic syndromes and visible chromosome abnormalities account for a substantial percentage of ID diagnoses, although for ∼50% the molecular etiology is unknown. Individuals with features suggestive of various syndromes but lacking their associated genetic anomalies pose a formidable clinical challenge. With the advent of microarray techniques, submicroscopic genome alterations not associated with known syndromes are emerging as a significant cause of ID and MCA.

Methodology/Principal Findings

High-density SNP microarrays were used to determine genome wide copy number in 42 individuals: 7 with confirmed alterations in the WS region but atypical clinical phenotypes, 31 with ID and/or MCA, and 4 controls. One individual from the first group had the most telomeric gene in the WS critical region deleted along with 2 Mb of flanking sequence. A second person had the classic WS deletion and a rearrangement on chromosome 5p within the Cri du Chat syndrome (OMIM:123450) region. Six individuals from the ID/MCA group had large rearrangements (3 deletions, 3 duplications), one of whom had a large inversion associated with a deletion that was not detected by the SNP arrays.

Conclusions/Significance

Combining SNP microarray analyses and qPCR allowed us to clone and sequence 21 deletion breakpoints in individuals with atypical deletions in the WS region and/or ID or MCA. Comparison of these breakpoints to databases of genomic variation revealed that 52% occurred in regions harboring structural variants in the general population. For two probands the genomic alterations were flanked by segmental duplications, which frequently mediate recurrent genome rearrangements; these may represent new genomic disorders. While SNP arrays and related technologies can identify potentially pathogenic deletions and duplications, obtaining sequence information from the breakpoints frequently provides additional information.
