Similar literature
1.
Genomic DNA copy-number alterations (CNAs) are associated with complex diseases, including cancer: CNAs are indeed related to tumoral grade, metastasis, and patient survival. CNAs discovered from array-based comparative genomic hybridization (aCGH) data have been instrumental in identifying disease-related genes and potential therapeutic targets. To be immediately useful in both clinical and basic research scenarios, aCGH data analysis requires accurate methods that do not impose unrealistic biological assumptions and that provide direct answers to the key question, "What is the probability that this gene/region has CNAs?" Current approaches fail, however, to meet these requirements. Here, we introduce reversible jump aCGH (RJaCGH), a new method for identifying CNAs from aCGH; we use a nonhomogeneous hidden Markov model fitted via reversible jump Markov chain Monte Carlo; and we incorporate model uncertainty through Bayesian model averaging. RJaCGH provides an estimate of the probability that a gene/region has CNAs while incorporating interprobe distance and the capability to analyze data on a chromosome or genome-wide basis. RJaCGH outperforms alternative methods, and the performance difference is even larger with noisy data and highly variable interprobe distance, both commonly found features in aCGH data. Furthermore, our probabilistic method allows us to identify minimal common regions of CNAs among samples and can be extended to incorporate expression data. In summary, we provide a rigorous statistical framework for locating genes and chromosomal regions with CNAs with potential applications to cancer and other complex human diseases.

2.
SUMMARY: We present a tool for control-free copy number alteration (CNA) detection using deep-sequencing data, particularly useful for cancer studies. The tool deals with two frequent problems in the analysis of cancer deep-sequencing data: absence of a control sample and possible polyploidy of cancer cells. FREEC (control-FREE Copy number caller) automatically normalizes and segments copy number profiles (CNPs) and calls CNAs. If ploidy is known, FREEC assigns an absolute copy number to each predicted CNA. To normalize raw CNPs, the user can provide a control dataset if available; otherwise GC content is used. We demonstrate that for Illumina single-end, mate-pair or paired-end sequencing, GC-content normalization provides smooth profiles that can be further segmented and analyzed in order to predict CNAs. AVAILABILITY: Source code and sample data are available at http://bioinfo-out.curie.fr/projects/freec/.
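The idea of normalizing a read-count profile by GC content can be illustrated with a minimal sketch. This is not FREEC's actual procedure, which the abstract does not describe in detail; it is a simplified stand-in that divides each window's count by the median count of windows with similar GC content. The function name `gc_normalize` and the parameter `n_bins` are hypothetical.

```python
import math
from statistics import median


def gc_normalize(read_counts, gc_fractions, n_bins=20):
    """Scale each window's read count by the median count of its GC bin.

    Simplified illustration only (assumption): windows are grouped into
    equal-width GC bins, and each count is divided by its bin's median.
    """
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")

    # Map each GC fraction in [0, 1] to a bin index in [0, n_bins - 1].
    bins = [min(int(gc * n_bins), n_bins - 1) for gc in gc_fractions]

    # Collect the counts that fall into each GC bin.
    counts_by_bin = {}
    for b, count in zip(bins, read_counts):
        counts_by_bin.setdefault(b, []).append(count)

    bin_medians = {b: median(cs) for b, cs in counts_by_bin.items()}

    # A ratio near 1.0 suggests the window matches the typical depth
    # for its GC content; bins with zero median yield NaN.
    return [
        count / bin_medians[b] if bin_medians[b] > 0 else math.nan
        for b, count in zip(bins, read_counts)
    ]
```

After this step, windows with very different GC content become comparable, so the resulting ratio profile can be passed to a segmentation method.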

3.
Cancer progression is often driven by an accumulation of genetic changes but also accompanied by increasing genomic instability. These processes lead to a complicated landscape of copy number alterations (CNAs) within individual tumors and great diversity across tumor samples. High resolution array-based comparative genomic hybridization (aCGH) is being used to profile CNAs of ever larger tumor collections, and better computational methods for processing these data sets and identifying potential driver CNAs are needed. Typical studies of aCGH data sets take a pipeline approach, starting with segmentation of profiles, calls of gains and losses, and finally determination of frequent CNAs across samples. A drawback of pipelines is that choices at each step may produce different results, and biases are propagated forward. We present a mathematically robust new method that exploits probe-level correlations in aCGH data to discover subsets of samples that display common CNAs. Our algorithm is related to recent work on maximum-margin clustering. It does not require pre-segmentation of the data and also provides grouping of recurrent CNAs into clusters. We tested our approach on a large cohort of glioblastoma aCGH samples from The Cancer Genome Atlas and recovered almost all CNAs reported in the initial study. We also found additional significant CNAs missed by the original analysis but supported by earlier studies, and we identified significant correlations between CNAs.

4.
Copy number alteration (CNA) profiling of human tumors has revealed recurrent patterns of DNA amplifications and deletions across diverse cancer types. These patterns are suggestive of conserved selection pressures during tumor evolution but cannot be fully explained by known oncogenes and tumor suppressor genes. Using a pan‐cancer analysis of CNA data from patient tumors and experimental systems, here we show that principal component analysis‐defined CNA signatures are predictive of glycolytic phenotypes, including 18F‐fluorodeoxy‐glucose (FDG) avidity of patient tumors, and increased proliferation. The primary CNA signature is enriched for p53 mutations and is associated with glycolysis through coordinate amplification of glycolytic genes and other cancer‐linked metabolic enzymes. A pan‐cancer and cross‐species comparison of CNAs highlighted 26 consistently altered DNA regions, containing 11 enzymes in the glycolysis pathway in addition to known cancer‐driving genes. Furthermore, exogenous expression of hexokinase and enolase enzymes in an experimental immortalization system altered the subsequent copy number status of the corresponding endogenous loci, supporting the hypothesis that these metabolic genes act as drivers within the conserved CNA amplification regions. Taken together, these results demonstrate that metabolic stress acts as a selective pressure underlying the recurrent CNAs observed in human tumors, and further cast genomic instability as an enabling event in tumorigenesis and metabolic evolution.

5.

Background

DNA copy number alterations are frequently observed in ovarian cancer, but it remains a challenge to identify the most relevant alterations and the specific causal genes in those regions.

Methods

We obtained high-resolution 500K SNP array data for 52 ovarian tumors and identified the most statistically significant minimal genomic regions with the most prevalent and highest-level copy number alterations (recurrent CNAs). Within a region of recurrent CNA, comparison of expression levels in tumors with a given CNA to tumors lacking that CNA and to whole normal ovary samples was used to select genes with CNA-specific expression patterns. A public expression array data set of laser capture micro-dissected (LCM) non-malignant fallopian tube epithelia and LCM ovarian serous adenocarcinoma was used to evaluate the effect of cell-type mixture biases.

Results

Fourteen recurrent deletions were detected on chromosomes 4, 6, 9, 12, 13, 15, 16, 17, 18, 22 and most prevalently on X and 8. Copy number and expression data suggest several apoptosis mediators as candidate drivers of the 8p deletions. Sixteen recurrent gains were identified on chromosomes 1, 2, 3, 5, 8, 10, 12, 15, 17, 19, and 20, with the most prevalent gains localized to 8q and 3q. Within the 8q amplicon, PVT1, but not MYC, was strongly over-expressed relative to tumors lacking this CNA and showed over-expression relative to normal ovary. Likewise, the cell polarity regulators PRKCI and ECT2 were identified as putative drivers of two distinct amplicons on 3q. Co-occurrence analyses suggested potential synergistic or antagonistic relationships between recurrent CNAs. Genes within regions of recurrent CNA showed an enrichment of Cancer Census genes, particularly when filtered for CNA-specific expression.

Conclusion

These analyses provide detailed views of ovarian cancer genomic changes and highlight the benefits of using multiple reference sample types for the evaluation of CNA-specific expression changes.

6.

Background  

DNA copy number aberration (CNA) is very important in the pathogenesis of tumors and other diseases. For example, CNAs may result in suppression of anti-oncogenes and activation of oncogenes, which can cause certain types of cancer. High-density single nucleotide polymorphism (SNP) array data are widely used for CNA detection. However, detecting CNAs automatically is nontrivial because the signals obtained from high-density SNP arrays often have a low signal-to-noise ratio (SNR), which might be caused by whole genome amplification, mixtures of normal and tumor cells, experimental noise or other technical limitations. As the SNR falls, many false CNA regions are detected and true CNA regions are missed. Thus, more sophisticated statistical models are needed to make CNA detection from low-SNR signals more robust and reliable.

7.
Tumor formation is in part driven by DNA copy number alterations (CNAs), which can be measured using microarray-based Comparative Genomic Hybridization (aCGH). Multiexperiment analysis of aCGH data from tumors allows discovery of recurrent CNAs that are potentially causal to cancer development. Until now, multiexperiment aCGH data analysis has been dependent on discretization of measurement data to a gain, loss or no-change state. Valuable biological information is lost when a heterogeneous system such as a solid tumor is reduced to these states. We have developed a new approach which inputs nondiscretized aCGH data to identify regions that are significantly aberrant across an entire tumor set. Our method is based on kernel regression and accounts for the strength of a probe's signal, its local genomic environment and the signal distribution across multiple tumors. In an analysis of 89 human breast tumors, our method showed enrichment for known cancer genes in the detected regions and identified aberrations that are strongly associated with breast cancer subtypes and clinical parameters. Furthermore, we identified 18 recurrent aberrant regions in a new dataset of 19 p53-deficient mouse mammary tumors. These regions, combined with gene expression microarray data, point to known cancer genes and novel candidate cancer genes.

8.
X Yuan  J Zhang  L Yang  S Zhang  B Chen  Y Geng  Y Wang 《PloS one》2012,7(7):e41082
Somatic copy number alteration (CNA) is a common phenomenon in cancer genomes. Distinguishing significant consensus events (SCEs) from random background CNAs in a set of subjects has proven to be a valuable tool for studying cancer. In order to identify SCEs with an acceptable type I error rate, better computational approaches should be developed based on reasonable statistics and null distributions. In this article, we propose a new approach named TAGCNA for identifying SCEs in somatic CNAs that may encompass cancer driver genes. TAGCNA employs a peel-off permutation scheme to generate a reasonable null distribution based on a prior step of selecting tag CNA markers from the genome being considered. We demonstrate the statistical power of TAGCNA on simulated ground truth data, and validate its applicability using two publicly available cancer datasets: lung and prostate adenocarcinoma. TAGCNA identifies SCEs that are known to be involved with proto-oncogenes (e.g. EGFR, CDK4) and tumor suppressor genes (e.g. CDKN2A, CDKN2B), and provides many additional SCEs with potential biological relevance in these data. TAGCNA can be used to analyze the significance of CNAs in various cancers. It is implemented in R and is freely available at http://tagcna.sourceforge.net/.

9.
We propose a statistical framework, named genoCN, to simultaneously dissect copy number states and genotypes using high-density SNP (single nucleotide polymorphism) arrays. There are at least two types of genomic DNA copy number differences: copy number variations (CNVs) and copy number aberrations (CNAs). While CNVs are naturally occurring and inheritable, CNAs are acquired somatic alterations most often observed in tumor tissues only. CNVs tend to be short and more sparsely located in the genome compared with CNAs. GenoCN consists of two components, genoCNV and genoCNA, designed for CNV and CNA studies, respectively. In contrast to most existing methods, genoCN is more flexible in that the model parameters are estimated from the data instead of being decided a priori. GenoCNA also incorporates two important strategies for CNA studies. First, the effects of tissue contamination are explicitly modeled. Second, if SNP arrays are performed for both tumor and normal tissues of one individual, the genotype calls from normal tissue are used to study CNAs in tumor tissue. We evaluated genoCN by applications to 162 HapMap individuals and a brain tumor (glioblastoma) dataset and showed that our method can successfully identify both types of copy number differences and produce high-quality genotype calls.

10.
11.
Zhou X  Cole SW  Hu S  Wong DT 《Human genetics》2004,114(5):464-467
Gene copy-number abnormalities (CNAs) are characteristic of solid tumors and are found in association with developmental abnormalities and/or mental retardation. The ultimate impact of CNAs is exerted by the altered expression of encoded genes. We have utilized high-density oligonucleotide arrays from Affymetrix to identify DNA CNAs via their impact on mRNA expression levels. In these studies, we have used three different trisomic cell lines (trisomy 9, trisomy 18, trisomy 21) as models of CNAs and have compared mRNA expression in those trisomic cells with that observed in diploid cell lines of matched tissue origin. Our data clearly show that genes from CNA chromosome regions are substantially over-represented (P<0.000001 by chi-square analysis) in the differentially expressed subset from comparisons of all three trisomic cell lines with normal matching cells. In addition, we have been able to detect the origin of the duplication by a statistical scan for over-expressed genes. These data show that microarray detection of differential mRNA expression can be used to identify significant DNA CNAs.
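An over-representation test of the kind mentioned above can be sketched as a Pearson chi-square test on a 2x2 table: genes on the trisomic chromosome versus elsewhere, crossed with differentially expressed versus not. The abstract does not give the exact table or any correction used, so the layout below and the function name `chi_square_2x2` are assumptions for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for [[a, b], [c, d]].

    Assumed layout, for illustration only:
      a = differentially expressed genes on the trisomic chromosome
      b = differentially expressed genes elsewhere
      c = unchanged genes on the trisomic chromosome
      d = unchanged genes elsewhere
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be positive")
    # Closed form of sum((observed - expected)^2 / expected) for a 2x2 table.
    return n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
```

With one degree of freedom, a statistic above 3.841 corresponds to P < 0.05; the much smaller P-value reported in the abstract would require a far larger statistic.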

12.
Canine Diffuse Large B-cell Lymphoma (cDLBCL) is an aggressive cancer with variable clinical response. Despite recent attempts by gene expression profiling to identify the dog as a potential animal model for human DLBCL, this tumor remains biologically heterogeneous with no biomarkers to predict prognosis. The aim of this work was to identify copy number aberrations (CNAs) by high-resolution array comparative genomic hybridization (aCGH) in 12 dogs with newly diagnosed DLBCL. In a subset of these dogs, the genetic profiles at the end of therapy and at relapse were also assessed. In primary DLBCLs, 90 different genomic imbalances were counted, consisting of 46 gains and 44 losses. Two gains in chr13 were significantly correlated with clinical stage. In addition, specific regions of gains and losses were significantly associated with duration of remission. In primary DLBCLs, individual variability was found; however, 14 recurrent CNAs (>30%) were identified. Losses involving IGK, IGL and IGH were always found, and gains along the length of chr13 and chr31 were often observed (>41%). In these segments, MYC, LDHB, HSF1, KIT and PDGFRα are annotated. At the end of therapy, dogs in remission showed four new CNAs, whereas three new CNAs were observed in dogs at relapse compared with the previous profiles. One ex novo CNA, involving TCR, was present in dogs in remission after therapy, possibly induced by the autologous vaccine. Overall, aCGH identified small CNAs associated with outcome, which, along with future expression studies, may reveal target genes relevant to cDLBCL.

13.
14.
Oral potentially malignant disorders (OPMDs), characterized by the presence of dysplasia and DNA copy number aberrations (CNAs), may reflect chromosomal instability (CIN) and predispose to oral squamous cell carcinoma (OSCC). Early detection of OPMDs with such characteristics may play a crucial role in OSCC prevention. The aim of this study was to explore the relationship between CNAs, histological diagnosis, oral subsite and aneuploidy in OPMDs/OSCCs. Samples from OPMDs and OSCCs were processed by high-resolution DNA flow cytometry (hr DNA-FCM) to determine the relative nuclear DNA content. Additionally, CNAs were obtained for a subset of these samples by genome-wide array comparative genomic hybridization (aCGH) using DNA extracted from either diploid or aneuploid nuclei suspensions sorted by FCM. Our study shows that: i) aneuploidy, global genomic imbalance (measured as the total number of CNAs) and specific focal CNAs occur early in the development of oral cancer and become more frequent at later stages; ii) OPMDs limited to tongue (TNG) mucosa display a higher frequency of aneuploidy compared to OPMDs confined to buccal mucosa (BM) as measured by DNA-FCM; iii) TNG OPMDs/OSCCs show peculiar features of CIN compared to BM OPMDs/OSCCs given the preferential association with total broad and specific focal CNA gains. Follow-up studies are warranted to establish whether the presence of DNA aneuploidy and specific focal or broad CNAs may predict cancer development in non-dysplastic OPMDs.

15.
Recurrent copy number alterations (CNAs) play an important role in cancer genesis. While a number of computational methods have been proposed for identifying such CNAs, their relative merits remain largely unknown in practice, since very few efforts have focused on comparative analysis of the methods. To facilitate studies of recurrent CNA identification in cancer genomes, it is imperative to conduct a comprehensive comparison of the performance and limitations of existing methods. In this paper, six representative methods proposed over the last six years are compared. These include one-stage and two-stage approaches, working with raw intensity ratio data and discretized data respectively. They are based on various techniques such as kernel regression, correlation matrix diagonal segmentation, semi-parametric permutation and cyclic permutation schemes. We explore multiple criteria, including type I error rate, detection power, Receiver Operating Characteristic (ROC) curve and area under the curve (AUC), and computational complexity, to evaluate the performance of the methods under multiple simulation scenarios. We also characterize their performance on two real datasets obtained from lung adenocarcinoma and glioblastoma. This comparison study reveals general characteristics of the existing methods for identifying recurrent CNAs, and further provides new insights into their strengths and weaknesses. It should help accelerate the development of novel and improved methods.

16.
Copy number alterations (CNAs) are common events occurring in leukaemias and solid tumors. Comparative Genome Hybridization (CGH) is currently the gold standard technique to analyze CNAs; however, CGH analysis requires dedicated instruments and is able to perform only low resolution Loss of Heterozygosity (LOH) analyses. Here we present CEQer (Comparative Exome Quantification analyzer), a new graphical, event-driven tool for CNA/allelic-imbalance (AI) coupled analysis of exome sequencing data. By using case-control matched exome data, CEQer performs a comparative digital exonic quantification to generate CNA data and couples this information with exome-wide LOH and allelic imbalance detection. These data are used to build mixed statistical/heuristic models allowing the identification of CNA/AI events. To test our tool, we initially used in silico generated data, then we performed whole-exome sequencing from 20 leukemic specimens and corresponding matched controls and we analyzed the results using CEQer. Taken globally, these analyses showed that the combined use of comparative digital exon quantification and LOH/AI detection generates very accurate CNA data. Therefore, we propose CEQer as an efficient, robust and user-friendly graphical tool for the identification of CNA/AI in the context of whole-exome sequencing data.

17.
Genome copy number is an important source of genetic variation in health and disease. In cancer, Copy Number Alterations (CNAs) can be inferred from short-read sequencing data, enabling genomics-based precision oncology. Emerging Nanopore sequencing technologies offer the potential for broader clinical utility, for example in smaller hospitals, due to lower instrument cost, higher portability, and ease of use. Nonetheless, Nanopore sequencing devices are limited in the number of retrievable sequencing reads/molecules compared to short-read sequencing platforms, limiting CNA inference accuracy. To address this limitation, we targeted the sequencing of short-length DNA molecules loaded at optimized concentration in an effort to increase sequence read/molecule yield from a single nanopore run. We show that short-molecule nanopore sequencing reproducibly returns high read counts and allows high quality CNA inference. We demonstrate the clinical relevance of this approach by accurately inferring CNAs in acute myeloid leukemia samples. The data show that, compared to traditional approaches such as chromosome analysis/cytogenetics, short-molecule nanopore sequencing returns more sensitive, accurate copy number information in a cost-effective and expeditious manner, including for multiplex samples. Our results provide a framework for short-molecule nanopore sequencing with applications in research and medicine, including, but not limited to, CNAs.

18.
Array-based comparative genomic hybridization (aCGH) enables the measurement of DNA copy number across thousands of locations in a genome. The main goals of analyzing aCGH data are to identify the regions of copy number variation (CNV) and to quantify the amount of CNV. Although there are many methods for analyzing single-sample aCGH data, the analysis of multi-sample aCGH data is a relatively new area of research. Further, many of the current approaches for analyzing multi-sample aCGH data do not appropriately utilize the additional information present in the multiple samples. We propose a procedure called the Fused Lasso Latent Feature Model (FLLat) that provides a statistical framework for modeling multi-sample aCGH data and identifying regions of CNV. The procedure involves modeling each sample of aCGH data as a weighted sum of a fixed number of features. Regions of CNV are then identified through an application of the fused lasso penalty to each feature. Some simulation analyses show that FLLat outperforms single-sample methods when the simulated samples share common information. We also propose a method for estimating the false discovery rate. An analysis of an aCGH data set obtained from human breast tumors, focusing on chromosomes 8 and 17, shows that FLLat and Significance Testing of Aberrant Copy number (an alternative, existing approach) identify similar regions of CNV that are consistent with previous findings. However, through the estimated features and their corresponding weights, FLLat is further able to discern specific relationships between the samples, for example, identifying 3 distinct groups of samples based on their patterns of CNV for chromosome 17.
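The model described above, each sample as a weighted sum of a fixed number of features with a fused lasso penalty on each feature, can be written as an optimization problem. The following is one plausible formulation rather than a quotation from the paper: the symbols Y (probes by samples data matrix), B (features), Θ (weights), the tuning parameters λ1 and λ2, and the norm constraint on the weights are all assumptions introduced here.

```latex
% Y: L x S matrix of log-ratios (L probes, S samples)      [assumed notation]
% B: L x J matrix whose columns \beta_{\cdot j} are features
% \Theta: J x S matrix of weights, so that Y \approx B\Theta
\min_{B,\,\Theta}\;
  \lVert Y - B\Theta \rVert_F^2
  + \sum_{j=1}^{J} \Bigl(
      \lambda_1 \sum_{l=1}^{L} \lvert \beta_{lj} \rvert
      + \lambda_2 \sum_{l=2}^{L} \lvert \beta_{lj} - \beta_{l-1,j} \rvert
    \Bigr)
\quad \text{subject to} \quad
  \sum_{s=1}^{S} \theta_{js}^2 \le 1, \qquad j = 1, \dots, J.
```

In this form the λ1 term encourages sparse features, the λ2 term encourages neighbouring probes to share the same value (yielding piecewise-constant regions), and the constraint on the weights is one way to keep the scale of B and Θ identifiable.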

19.
To assess the possible existence of unbalanced chromosomal abnormalities and delineate the characterization of copy number alterations (CNAs) of acute myeloid leukemia-M5 (AML-M5), R-banding karyotype, oligonucleotide array CGH and FISH were performed in 24 patients with AML-M5. A total of 117 CNAs with sizes ranging from 0.004 to 146.263 Mb were recognized in 12 of 24 cases, involving all chromosomes other than chromosomes 1, 4, X and Y. Cryptic CNAs smaller than 5 Mb accounted for 59.8% of all the CNAs. Twelve recurrent chromosomal alterations were mapped. Seven of them had been described in previous AML studies and five were new candidate AML-M5 associated CNAs, including gains of 3q26.2-qter and 13q31.3 as well as losses of 2q24.2, 8p12 and 14q32. Amplification of 3q26.2-qter was the sole large recurrent chromosomal anomaly, and its pathogenic mechanism in AML-M5 was possibly different from that of the classical recurrent 3q21q26 abnormality in AML. A tumor suppressor gene, FOXN3, was singled out from the small recurrent CNA of 14q32; however, deletion of FOXN3 proved to be a common marker of myeloid leukemia rather than a specific marker for the AML-M5 subtype. Moreover, concurrent amplification of MLL and deletion of CDKN2A were noted and might be associated with AML-M5. The number of CNAs did not show a significant association with clinico-biological parameters and CR number of the 22 patients who received chemotherapy. This study provided evidence that array CGH can serve as a complementary platform for routine cytogenetic analysis to identify cryptic alterations in patients with AML-M5. As a subtype of AML, AML-M5 carries both common recurrent CNAs and unique CNAs, which may harbor novel oncogenes or tumor suppressor genes. Clarifying the role of these genes will contribute to the understanding of the leukemogenic network of AML-M5.

20.
《Genomics》2022,114(6):110510
Copy-number aberrations (CNAs) are assessed using FISH analysis in diagnostics of chronic lymphocytic leukemia (CLL), but CNAs can also be extrapolated from Illumina BeadChips developed for genome-wide methylation microarray screening. Increasing numbers of microarray data sets are available from diagnostic samples, making it useful to assess their potential in CNA diagnostics. We benchmarked the limitations of CNA testing from two Illumina BeadChips (EPIC and 450k), using two common analysis packages (conumee and ChAMP), against FISH-based assessment of 11q, 13q, and 17p deletions in 202 CLL samples. Overall, the two packages predicted CNAs with similar accuracy regardless of the microarray type, but with lower accuracy than FISH-based assessment. We showed that the bioinformatics analysis needs to be adjusted to the specific CNA, as no general settings were identified. Altogether, we were able to predict CNAs using methylation microarray data, but with limited accuracy, making FISH-based assessment of deletions the superior diagnostic choice.
