Similar articles
 20 similar articles found (search time: 46 ms)
1.

Background

Array-based comparative genomic hybridization (array CGH) is a highly efficient technique that allows the simultaneous measurement of genomic DNA copy number at hundreds or thousands of loci and the reliable detection of local one-copy-level variations. Characterization of these DNA copy number changes is important for both the basic understanding of cancer and its diagnosis. To develop effective methods for identifying aberration regions from array CGH data, much recent research has focused on both smoothing-based and segmentation-based data processing. In this paper, we propose a stationary wavelet packet transform based approach to smooth array CGH data. Our purpose is to remove CGH noise across the whole frequency range while preserving the true signal by using a bivariate model.

Results

In both synthetic and real CGH data, the Stationary Wavelet Packet Transform (SWPT) is the best wavelet transform for analyzing the CGH signal across the whole frequency range. We also introduce a new bivariate shrinkage model that describes the relationship between noisy CGH coefficients at two scales of the SWPT. Before smoothing, symmetric extension is applied as a preprocessing step to preserve information at the borders.

Conclusion

We have designed the SWTP and the SWPT-Bi, which use the stationary wavelet packet transform with hard thresholding and with the new bivariate shrinkage estimator, respectively, to smooth array CGH data. We demonstrate the effectiveness of our approach through theoretical and experimental exploration of a set of array CGH data, including both synthetic and real data. The comparison results show that our method outperforms the previous approaches.
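
The smoothing pipeline described in this abstract (symmetric extension at the borders, an undecimated wavelet transform, thresholding of the detail coefficients) can be illustrated with a much simpler stand-in. The sketch below is not the authors' SWPT or SWPT-Bi method: it assumes a single-level undecimated Haar transform with hard thresholding, and the function name `smooth_haar_hard` and its `threshold` and `pad` parameters are hypothetical.

```python
import numpy as np


def smooth_haar_hard(x, threshold, pad=4):
    """Smooth a 1-D log-ratio profile with a single-level undecimated Haar
    transform and hard thresholding (a simplified stand-in, not SWPT)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    # Symmetric extension so that border probes have neighbours on both sides.
    xp = np.pad(x, pad, mode="symmetric")
    # Undecimated (stationary) Haar analysis: coefficients are not downsampled.
    approx = (xp[:-1] + xp[1:]) / 2.0
    detail = (xp[:-1] - xp[1:]) / 2.0
    # Hard thresholding: zero the small detail coefficients, keep the rest as-is.
    detail = np.where(np.abs(detail) > threshold, detail, 0.0)
    # Each interior sample is recovered from two neighbouring pairs; average them.
    rec = 0.5 * ((approx[1:] + detail[1:]) + (approx[:-1] - detail[:-1]))
    # rec[k] estimates xp[k + 1], so crop the extension back off.
    return rec[pad - 1 : pad - 1 + n]


# Example: a one-copy gain of 1.0 starting at probe 3, with small fluctuations.
profile = np.array([0.0, 0.1, -0.1, 1.0, 1.1, 0.9])
smoothed = smooth_haar_hard(profile, threshold=0.2)
```

With `threshold=0` the transform is inverted exactly and the input is returned unchanged, which is a convenient sanity check; larger thresholds flatten small fluctuations while leaving the large jump at the breakpoint in place.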

2.
Comparative genomic hybridization (CGH) microarrays have been used to determine copy number variations (CNVs) and their effects on complex diseases. Detection of absolute CNVs independent of genomic variants of an arbitrary reference sample has been a critical issue in CGH array experiments. Whole genome analysis using massively parallel sequencing with multiple ultra-high resolution CGH arrays provides an opportunity to catalog highly accurate genomic variants of the reference DNA (NA10851). Using information on variants, we developed a new method, the CGH array reference-free algorithm (CARA), which can determine reference-unbiased absolute CNVs from any CGH array platform. The algorithm enables the removal and rescue of false positive and false negative CNVs, respectively, which appear due to the effects of genomic variants of the reference sample in raw CGH array experiments. We found that the CARA remarkably enhanced the accuracy of CGH array in determining absolute CNVs. Our method thus provides a new approach to interpret CGH array data for personalized medicine.

3.
MOTIVATION: Chromosomal copy number changes (aneuploidies) are common in cell populations that undergo multiple cell divisions including yeast strains, cell lines and tumor cells. Identification of aneuploidies is critical in evolutionary studies, where changes in copy number serve an adaptive purpose, as well as in cancer studies, where amplifications and deletions of chromosomal regions have been identified as a major pathogenetic mechanism. Aneuploidies can be studied on whole-genome level using array CGH (a microarray-based method that measures the DNA content), but their presence also affects gene expression. In gene expression microarray analysis, identification of copy number changes is especially important in preventing aberrant biological conclusions based on spurious gene expression correlation or masked phenotypes that arise due to aneuploidies. Previously suggested approaches for aneuploidy detection from microarray data mostly focus on array CGH, address only whole-chromosome or whole-arm copy number changes, and rely on thresholds or other heuristics, making them unsuitable for fully automated general application to gene expression datasets. There is a need for a general and robust method for identification of aneuploidies of any size from both array CGH and gene expression microarray data. RESULTS: We present ChARM (Chromosomal Aberration Region Miner), a robust and accurate expectation-maximization based method for identification of segmental aneuploidies (partial chromosome changes) from gene expression and array CGH microarray data. Systematic evaluation of the algorithm on synthetic and biological data shows that the method is robust to noise, aneuploidal segment size and P-value cutoff. Using our approach, we identify known chromosomal changes and predict novel potential segmental aneuploidies in commonly used yeast deletion strains and in breast cancer. 
ChARM can be routinely used to identify aneuploidies in array CGH datasets and to screen gene expression data for aneuploidies or array biases. Our methodology is sensitive enough to detect statistically significant and biologically relevant aneuploidies even when expression or DNA content changes are subtle as in mixed populations of cells. AVAILABILITY: Code available by request from the authors and on Web supplement at http://function.cs.princeton.edu/ChARM/

4.
MOTIVATION: Array Comparative Genomic Hybridization (CGH) can reveal chromosomal aberrations in the genomic DNA. These amplifications and deletions at the DNA level are important in the pathogenesis of cancer and other diseases. While a large number of approaches have been proposed for analyzing the large array CGH datasets, the relative merits of these methods in practice are not clear. RESULTS: We compare 11 different algorithms for analyzing array CGH data. These include both segment detection methods and smoothing methods, based on diverse techniques such as mixture models, Hidden Markov Models, maximum likelihood, regression, wavelets and genetic algorithms. We compute the Receiver Operating Characteristic (ROC) curves using simulated data to quantify sensitivity and specificity for various levels of signal-to-noise ratio and different sizes of abnormalities. We also characterize their performance on chromosomal regions of interest in a real dataset obtained from patients with Glioblastoma Multiforme. While comparisons of this type are difficult due to possibly sub-optimal choice of parameters in the methods, they nevertheless reveal general characteristics that are helpful to the biological investigator.
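
The ROC evaluation mentioned in this abstract can be sketched as follows. This is a generic illustration, not the authors' evaluation code: it assumes each probe has a per-probe aberration score from some method and a known true label from the simulation, that the scores are distinct, and the names `roc_curve_points` and `auc` are hypothetical.

```python
import numpy as np


def roc_curve_points(scores, labels):
    """Return (fpr, tpr) arrays obtained by sweeping a threshold over scores.

    labels are True for probes that are truly aberrant in the simulation.
    Assumes distinct scores and at least one positive and one negative label.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    # Visit probes from the highest score to the lowest.
    order = np.argsort(-scores, kind="stable")
    hits = labels[order]
    # Cumulative true and false positives, starting from the point (0, 0).
    tp = np.concatenate(([0], np.cumsum(hits)))
    fp = np.concatenate(([0], np.cumsum(~hits)))
    return fp / (~hits).sum(), tp / hits.sum()


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))


fpr, tpr = roc_curve_points([0.9, 0.8, 0.7, 0.6], [True, False, True, False])
area = auc(fpr, tpr)
```

Sensitivity is the true positive rate and specificity is one minus the false positive rate, so each point on the curve corresponds to one sensitivity/specificity trade-off.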

5.
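Array comparative genomic hybridization (array CGH) can detect chromosomal copy number changes at the genome-wide level and/or at high resolution, and is mainly applied in genetics and oncology research. The microarray probes in array CGH are usually PCR-amplified BAC clones or cDNA molecules. In recent years, oligonucleotide array CGH (oaCGH) has gradually come into use. Compared with BAC array CGH, oaCGH offers several advantages, including simpler operation, more flexible probe design and higher resolution, and it is expected to gradually replace array CGH based on BAC clone fragments or cDNA molecules. The application of oaCGH, and its combination with other high-throughput detection technologies, will promote the discovery of new cancer-related genes and tumor drug-resistance genes. This article reviews the characteristics of the main existing oaCGH platforms in terms of spatial resolution, probe length, sensitivity and specificity, together with their applications, and summarizes recent progress in oaCGH.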

6.
SUMMARY: We describe a tool, called aCGH-Smooth, for the automated identification of breakpoints and smoothing of microarray comparative genomic hybridization (array CGH) data. aCGH-Smooth is written in Visual C++ and has a user-friendly interface that includes a visualization of the results and user-defined parameters for adapting the performance of data smoothing and breakpoint recognition. aCGH-Smooth can handle array-CGH data generated by all array-CGH platforms: BAC, PAC, cosmid, cDNA and oligo CGH arrays. The tool has been successfully applied to real-life data. AVAILABILITY: aCGH-Smooth is free for researchers at academic and non-profit institutions at http://www.few.vu.nl/~vumarray/.

7.

Background

Molecular alterations critical to the development of cancer include mutations, copy number alterations (amplifications and deletions) and genomic rearrangements resulting in gene fusions. Massively parallel next generation sequencing, which enables the discovery of such changes, uses considerable quantities of genomic DNA (> 5 µg), a serious limitation in ever smaller clinical samples. However, commonly available microarray platforms such as array comparative genomic hybridization (array CGH) allow the characterization of gene copy number at single gene resolution using much smaller amounts of genomic DNA. In this study we evaluate the sensitivity of ultra-dense array CGH platforms developed by Agilent, especially that of the 1 million probe array (1 M array), and their application when whole genome amplification is required because of limited sample quantities.

Methods

We performed array CGH on whole genome amplified and not amplified genomic DNA from MCF-7 breast cancer cells, using 244 K and 1 M Agilent arrays. The ADM-2 algorithm was used to identify micro-copy number alterations that measured less than 1 Mb in genomic length.

Results

DNA from MCF-7 breast cancer cells was analyzed for micro-copy number alterations (micro-CNAs), defined as measuring less than 1 Mb in genomic length. The 4-fold extra resolution of the 1 M array platform relative to the less dense 244 K array platform led to improved detection of copy number variations (CNVs) and micro-CNAs. The identification of intra-genic breakpoints in areas of DNA copy number gain signaled the possible presence of gene fusion events. However, the ultra-dense platforms, especially the densest 1 M array, detect artifacts inherent to whole genome amplification and should be used only with non-amplified DNA samples.

Conclusions

This is a first report using 1 M array CGH for the discovery of cancer genes and biomarkers. We show the remarkable capacity of this technology to discover CNVs, micro-copy number alterations and even gene fusions. However, these platforms require excellent genomic DNA quality and do not tolerate even the relatively small imperfections introduced by whole genome amplification.

8.
The array CGH technique (Array Comparative Genome Hybridization) has been developed to detect chromosomal copy number changes on a genome-wide and/or high-resolution scale. It is used in human genetics and oncology, with great promise for clinical application. Until recently, primarily PCR-amplified bacterial artificial chromosomes (BACs) or cDNAs have been spotted as elements on the array. The large-scale DNA isolations or PCR amplifications of the large-insert clones necessary for manufacturing the arrays are elaborate and time-consuming. Lack of a high-resolution, highly sensitive (commercial) alternative has undoubtedly hindered the implementation of array CGH in research and diagnostics. Recently, synthetic oligonucleotides as arrayed elements have been introduced as an alternative substrate for array CGH, both by academic institutions and by commercial providers. Oligonucleotide libraries or ready-made arrays can be bought off-the-shelf, saving considerable time and effort. For RNA expression profiling, we have seen a gradual transition from in-house printed cDNA-based expression arrays to oligonucleotide arrays and we expect a similar transition for array CGH. This review compares the different platforms and will attempt to shine a light on the ‘BAC to the future’ of the array CGH technique.

9.
MOTIVATION: Array comparative genomic hybridization (CGH) allows detection and mapping of copy number of DNA segments. A challenge is to make inferences about the copy number structure of the genome. Several statistical methods have been proposed to determine genomic segments with different copy number levels. However, to date, no comprehensive comparison of various characteristics of these methods exists. Moreover, the segmentation results have not been utilized in downstream analyses. RESULTS: We describe a comparison of three popular and publicly available methods for the analysis of array CGH data and we demonstrate how segmentation results may be utilized in downstream analyses such as testing and classification, yielding higher power and prediction accuracy. Since the methods operate on individual chromosomes, we also propose a novel procedure for merging segments across the genome, which results in an interpretable set of copy number levels and thus facilitates identification of copy number alterations in each genome. AVAILABILITY: http://www.bioconductor.org

10.
The use of array comparative genomic hybridization (array CGH) as a diagnostic tool in molecular genetics has facilitated the identification of many new microdeletion/microduplication syndromes (MMSs). Furthermore, this method has allowed for the identification of copy number variations (CNVs) whose pathogenic role has yet to be uncovered. Here, we report on our application of array CGH for the identification of pathogenic CNVs in 79 Russian children with intellectual disability (ID). Twenty-six pathogenic or likely pathogenic changes in copy number were detected in 22 patients (28%): 8 CNVs corresponded to known MMSs, and 17 were not associated with previously described syndromes. In this report, we describe our findings and comment on genes potentially associated with ID that are located within the CNV regions.

11.
Array-based comparative genomic hybridization (aCGH) has gained prevalence as an effective technique for measuring structural variations in the genome. Copy-number variations (CNVs) form a large source of genomic structural variation, but it is not known whether phenotypic differences between intra-species groups, such as divergent human populations or breeds of a domestic animal, can be attributed to CNVs. Several computational methods have been proposed to improve the detection of CNVs from array CGH data, but few population studies have used CGH data for identification of intra-species differences. In this paper we propose a novel method of genome-wide comparison and classification using CGH data that condenses whole genome information, aimed at quantification of intra-species variations and discovery of shared ancestry. Our strategy includes smoothing CGH data using an appropriate denoising algorithm, extracting features via wavelets, quantifying the information via the wavelet power spectrum, and hierarchical clustering of the resultant profile. To evaluate the classification efficiency of our method, we used simulated data sets. We applied it to aCGH data from human and bovine individuals and showed that it successfully detects existing intra-specific variations with additional evolutionary implications.

12.
BACKGROUND: Array-based comparative genomic hybridization (aCGH) enables genome-wide quantitative delineation of genomic imbalances. A high-resolution contig array was developed specifically for chromosome 8q because this chromosome arm is frequently altered in many human cancers. METHODS: A minimal tiling path contig of 702 8q-specific bacterial artificial chromosome (BAC) clones was generated with a novel computational tool (BAC Contig Assembler). BAC clones were amplified by degenerate oligonucleotide-primed (DOP) polymerase chain reaction and subsequently printed onto glass slides. For validation of the array, DNA samples of gastroesophageal and prostate cancer cell lines and chronic myeloid leukemia specimens were used, which were previously characterized by multicolor fluorescence in situ hybridization and conventional CGH. RESULTS: Single and double copy gains were confidently demonstrated with the 8q array. Single copy loss and high-level amplifications were accurately detected and confirmed by bicolor fluorescence in situ hybridization experiments. The 8q array was further tested with paraffin-embedded prostate cancer specimens. In these archival specimens, the copy number changes were confirmed. In fresh and archival samples, additional alterations were disclosed. In comparison with conventional CGH, the resolution of the detected changes was much improved, which was demonstrated by an amplicon of 0.7 Mb and a deletion of 0.6 Mb, both spanned by only six BAC clones. CONCLUSIONS: A comprehensive array is presented, which provides a high-resolution method for mapping copy number alterations on chromosome 8q.

13.
The availability of high resolution array comparative genomic hybridization (CGH) platforms has led to increasing complexities in data analysis. Specifically, defining contiguous regions of alterations or segmentation can be computationally intensive and popular algorithms can take hours to days for the processing of arrays comprised of hundreds of thousands to millions of elements. Additionally, tumors tend to demonstrate subtle copy number alterations due to heterogeneity, ploidy and hybridization effects. Thus, there is a need for fast, sensitive array CGH segmentation and alteration calling algorithms. Here, we describe Fast Algorithm for Calling After Detection of Edges (FACADE), a highly sensitive and easy to use algorithm designed to rapidly segment and call high resolution array data.

14.
MOTIVATION: Array CGH technologies enable the simultaneous measurement of DNA copy number for thousands of sites on a genome. We developed the circular binary segmentation (CBS) algorithm to divide the genome into regions of equal copy number. The algorithm tests for change-points using a maximal t-statistic with a permutation reference distribution to obtain the corresponding P-value. The number of computations required for the maximal test statistic is O(N^2), where N is the number of markers. This makes the full permutation approach computationally prohibitive for the newer arrays that contain tens of thousands of markers and highlights the need for a faster algorithm. RESULTS: We present a hybrid approach to obtain the P-value of the test statistic in linear time. We also introduce a rule for stopping early when there is strong evidence for the presence of a change. We show through simulations that the hybrid approach provides a substantial gain in speed with only a negligible loss in accuracy and that the stopping rule further increases speed. We also present the analyses of array CGH data from breast cancer cell lines to show the impact of the new approaches on the analysis of real data. AVAILABILITY: An R version of the CBS algorithm has been implemented in the "DNAcopy" package of the Bioconductor project. The proposed hybrid method for the P-value is available in version 1.2.1 or higher and the stopping rule for declaring a change early is available in version 1.5.1 or higher.
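
The quadratic cost of the maximal statistic can be seen in a brute-force sketch. The code below is not the DNAcopy implementation: it assumes unit noise variance (so it computes a z-type rather than a t-type statistic), omits the permutation reference distribution entirely, and the function name `max_circular_statistic` is hypothetical. It compares the mean of every arc x[i:j] with the mean of the remaining markers.

```python
import numpy as np


def max_circular_statistic(x):
    """Return (statistic, i, j) maximising the standardised difference between
    the mean of x[i:j] and the mean of the remaining markers.

    Brute force over all pairs i < j, hence O(N^2) evaluations.
    Assumes unit noise variance; no P-value is computed here.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    s = np.concatenate(([0.0], np.cumsum(x)))  # prefix sums, s[k] = sum(x[:k])
    best = (0.0, 0, 0)
    for i in range(n):
        for j in range(i + 1, n + 1):
            k = j - i
            if k == n:
                continue  # the complement would be empty
            inside = (s[j] - s[i]) / k
            outside = (s[n] - s[j] + s[i]) / (n - k)
            z = abs(inside - outside) / np.sqrt(1.0 / k + 1.0 / (n - k))
            if z > best[0]:
                best = (float(z), i, j)
    return best


profile = [0.0] * 10 + [5.0] * 5 + [0.0] * 10
stat, start, end = max_circular_statistic(profile)
```

A full permutation test would repeat this double loop for every shuffled copy of the data, which is what makes the approach prohibitive for large N and motivates the hybrid P-value described in the abstract.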

15.
16.
The statistical analysis of array comparative genomic hybridization (CGH) data has now shifted to the joint assessment of copy number variations at the cohort level. Considering multiple profiles gives the opportunity to correct for systematic biases observed on single profiles, such as probe GC content or the so-called "wave effect." In this article, we extend the segmentation model developed in the univariate case to the joint analysis of multiple CGH profiles. Our contribution is multiple: we propose an integrated model to perform joint segmentation, normalization, and calling for multiple array CGH profiles. This model shows great flexibility, especially in the modeling of the wave effect that gives a likelihood framework to approaches proposed by others. We propose a new dynamic programming algorithm for break point positioning, as well as a model selection criterion based on a modified Bayesian information criterion proposed in the univariate case. The performance of our method is assessed using simulated and real data sets. Our method is implemented in the R package cghseg.
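
Break point positioning by dynamic programming can be illustrated in the simplest setting. The sketch below is the classical univariate least-squares segmentation for a single profile with a fixed number of segments, not the joint multi-profile algorithm of this article or the cghseg implementation; the function name `segment_least_squares` is hypothetical, and the model selection step is left out.

```python
import numpy as np


def segment_least_squares(x, n_segments):
    """Return the sorted break points that split x into n_segments
    piecewise-constant segments with minimal residual sum of squares.
    Requires 1 <= n_segments <= len(x)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    s1 = np.concatenate(([0.0], np.cumsum(x)))
    s2 = np.concatenate(([0.0], np.cumsum(x * x)))

    def cost(i, j):
        # Residual sum of squares of x[i:j] around its own mean.
        return s2[j] - s2[i] - (s1[j] - s1[i]) ** 2 / (j - i)

    # best[m, j]: minimal cost of splitting x[:j] into m segments.
    best = np.full((n_segments + 1, n + 1), np.inf)
    prev = np.zeros((n_segments + 1, n + 1), dtype=int)
    best[0, 0] = 0.0
    for m in range(1, n_segments + 1):
        for j in range(m, n + 1):
            for i in range(m - 1, j):
                c = best[m - 1, i] + cost(i, j)
                if c < best[m, j]:
                    best[m, j] = c
                    prev[m, j] = i
    # Walk back from the last segment to recover the break points.
    breaks = []
    j = n
    for m in range(n_segments, 1, -1):
        j = prev[m, j]
        breaks.append(int(j))
    return sorted(breaks)


breaks = segment_least_squares([0.0] * 5 + [3.0] * 5 + [1.0] * 5, 3)
```

Each returned break point is the index of the first probe of a new segment. The triple loop costs O(K N^2) for K segments, which is one reason faster or pruned variants are of interest for dense arrays.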

17.
18.
Noncrossing quantile regression curve estimation   Total citations: 4 (self-citations: 0, citations by others: 4)
Bondell HD, Reich BJ, Wang H. Biometrika, 2010, 97(4): 825-838
Since quantile regression curves are estimated individually, the quantile curves can cross, leading to an invalid distribution for the response. A simple constrained version of quantile regression is proposed to avoid the crossing problem for both linear and nonparametric quantile curves. A simulation study and a reanalysis of tropical cyclone intensity data show the usefulness of the procedure. Asymptotic properties of the estimator are equivalent to the typical approach under standard conditions, and the proposed estimator reduces to the classical one if there is no crossing. The performance of the constrained estimator has shown significant improvement by adding smoothing and stability across the quantile levels.

19.
Microarray-based comparative genomic hybridization (array-CGH) is a technique by which variations in copy numbers between two genomes can be analyzed using DNA microarrays. Array CGH has been used to survey chromosomal amplifications and deletions in fetal aneuploidies or cancer tissues. Herein we report a user-friendly, MATLAB-based, array CGH analyzing program, Chang Gung comparative genomic hybridization (CGcgh), as a standalone PC version. The analyzed chromosomal data are displayed in a graphic interface, and CGcgh allows users to launch a corresponding G-banding ideogram. The abnormal DNA copy numbers (gains and losses) can be identified automatically using a user-defined window size (default value is 50 probes) and sequential Student's t-tests with sliding windows along the chromosomes. CGcgh has been tested in multiple karyotype-confirmed human samples, including five published cases and trisomies 13, 18, 21 and X from our laboratories, and 18 cases for which microarray data are publicly available. CGcgh can be used to detect the copy number changes in small genomic regions, which are commonly encountered by clinical geneticists. CGcgh works well for the data from cDNA microarray, spotted oligonucleotide microarrays, and Affymetrix Human Mapping Arrays (10K, 100K, 500K Array Sets). The program can be freely downloaded from . Y. S. Lee and A. Chao contributed equally to this work.
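
A sliding-window t-test of the kind described can be sketched as follows. CGcgh itself is a MATLAB program and the abstract does not specify the exact test, so this Python sketch makes its own assumptions: a one-sample t-statistic of the log ratios in each window against zero, with the window moved one probe at a time. The function name `sliding_t_statistics` is hypothetical; only the default window of 50 probes comes from the abstract.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sliding_t_statistics(log_ratios, window=50):
    """One-sample t-statistic (against a mean of zero) for every window of
    `window` consecutive probes along one chromosome.

    Returns an array of length len(log_ratios) - window + 1.
    A window with zero variance yields inf or nan.
    """
    x = np.asarray(log_ratios, dtype=float)
    windows = sliding_window_view(x, window)   # shape: (n - window + 1, window)
    mean = windows.mean(axis=1)
    sd = windows.std(axis=1, ddof=1)           # sample standard deviation
    return mean / (sd / np.sqrt(window))


# Toy chromosome: alternating +/-0.1 noise with a gain of 1.0 on probes 100-149.
ratios = np.where(np.arange(200) % 2 == 0, 0.1, -0.1)
ratios[100:150] += 1.0
t_values = sliding_t_statistics(ratios, window=20)
```

Large positive values of the statistic point to gains and large negative values to losses. In practice the cut-off would be taken from the t distribution with `window - 1` degrees of freedom, with some correction for the many overlapping tests.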

20.
The discovery of an abundance of copy number variants (CNVs; gains and losses of DNA sequences >1 kb) and other structural variants in the human genome is influencing the way research and diagnostic analyses are being designed and interpreted. As such, comprehensive databases with the most relevant information will be critical to fully understand the results and have impact in a diverse range of disciplines ranging from molecular biology to clinical genetics. Here, we describe the development of bioinformatics resources to facilitate these studies. The Database of Genomic Variants (http://projects.tcag.ca/variation/) is a comprehensive catalogue of structural variation in the human genome. The database currently contains 1,267 regions reported to contain copy number variation or inversions in apparently healthy human cases. We describe the current contents of the database and how it can serve as a resource for interpretation of array comparative genomic hybridization (array CGH) and other DNA copy imbalance data. We also present the structure of the database, which was built using a new data modeling methodology termed Cross-Referenced Tables (XRT). This is a generic and easy-to-use platform, which is strong in handling textual data and complex relationships. Web-based presentation tools have been built allowing publication of XRT data to the web immediately along with rapid sharing of files with other databases and genome browsers. We also describe a novel tool named eFISH (electronic fluorescence in situ hybridization) (http://projects.tcag.ca/efish/), a BLAST-based program that was developed to facilitate the choice of appropriate clones for FISH and CGH experiments, as well as interpretation of results in which genomic DNA probes are used in hybridization-based experiments.
