Similar articles
Found 20 similar articles (search time: 31 ms)
1.
2.
We propose a statistical framework, named genoCN, to simultaneously dissect copy number states and genotypes using high-density SNP (single nucleotide polymorphism) arrays. There are at least two types of genomic DNA copy number differences: copy number variations (CNVs) and copy number aberrations (CNAs). While CNVs are naturally occurring and inheritable, CNAs are acquired somatic alterations most often observed in tumor tissues only. CNVs tend to be short and more sparsely located in the genome compared with CNAs. GenoCN consists of two components, genoCNV and genoCNA, designed for CNV and CNA studies, respectively. In contrast to most existing methods, genoCN is more flexible in that the model parameters are estimated from the data instead of being decided a priori. GenoCNA also incorporates two important strategies for CNA studies. First, the effects of tissue contamination are explicitly modeled. Second, if SNP arrays are performed for both tumor and normal tissues of one individual, the genotype calls from normal tissue are used to study CNAs in tumor tissue. We evaluated genoCN by applications to 162 HapMap individuals and a brain tumor (glioblastoma) dataset and showed that our method can successfully identify both types of copy number differences and produce high-quality genotype calls.

3.

Background

Genomic instability in cancer leads to abnormal genome copy number alterations (CNA) as a mechanism underlying tumorigenesis. Using microarrays and other technologies, tumor CNA are detected by comparing tumor sample CN to normal reference sample CN. While advances in microarray technology have improved detection of copy number alterations, the increase in the number of measured signals, noise from array probes, variations in signal-to-noise ratio across batches and disparity across laboratories significantly limit the accurate identification of CNA regions when comparing tumor and normal samples.

Methods

To address these limitations, we designed a novel "Virtual Normal" (VN) algorithm, which constructs an unbiased reference signal directly from test samples within an experiment using any publicly available normal reference set as a baseline, thus eliminating the need for an in-lab normal reference set.

Results

The algorithm was tested using an optimal, paired tumor/normal data set as well as previously uncharacterized pediatric malignant gliomas for which a normal reference set was not available. Using Affymetrix 250K Sty microarrays, we demonstrated improved signal-to-noise ratio and detected significant copy number alterations using the VN algorithm that were validated by independent PCR analysis of the target CNA regions.

Conclusions

We developed and validated an algorithm to provide a virtual normal reference signal directly from tumor samples and minimize noise in the derivation of the raw CN signal. The algorithm reduces the variability of assays performed across different reagent and array batches, methods of sample preservation, multiple personnel, and among different laboratories. This approach may be valuable when matched normal samples are unavailable or the paired normal specimens have been subjected to variations in methods of preservation.

4.
Breast cancer is a disease of the cell cycle, and the dysfunction of cell cycle checkpoints plays a vital role in the occurrence and development of breast cancer. We employed multi-gene fluorescence in situ hybridization (M-FISH) to investigate gene copy number aberrations (CNAs) of 4 genes (Rb1, CHEK2, c-Myc, CCND1) that are involved in the regulation of the cell cycle, in order to analyze the impact of gene aberrations on prognosis in young breast cancer patients. Gene copy number aberrations of these 4 genes were more frequently observed in young breast cancer patients than in the older group. Further, these CNAs were more frequently seen in the Luminal B, Her2 overexpression, and triple-negative breast cancer (TNBC) types in young breast cancer patients. The variations of CCND1, Rb1, and CHEK2 were significantly correlated with poor survival in the young breast cancer patient group, while the amplification of c-Myc was not clearly correlated with poor survival in young breast cancer patients. Thus, gene copy number aberrations (CNAs) of cell cycle-regulated genes can serve as an important tool for prognosis in young breast cancer patients.

5.
To develop a comprehensive overview of copy number aberrations (CNAs) in stage-II/III colorectal cancer (CRC), we characterized 302 tumors from the PETACC-3 clinical trial. Microsatellite-stable (MSS) samples (n = 269) had 66 minimal common CNA regions, with frequent gains on 20q (72.5%), 7 (41.8%), 8q (33.1%) and 13q (51.0%) and losses on 18 (58.6%), 4q (26%) and 21q (21.6%). MSS tumors have significantly more CNAs than microsatellite-instable (MSI) tumors: within the MSI tumors a novel deletion of the tumor suppressor WWOX at 16q23.1 was identified (p<0.01). Focal aberrations identified by the GISTIC method confirmed amplifications of oncogenes including EGFR, ERBB2, CCND1, MET, and MYC, and deletions of tumor suppressors including TP53, APC, and SMAD4, and gene expression was highly concordant with copy number aberration for these genes. Novel amplicons included putative oncogenes such as WNK1 and HNF4A, which also showed high concordance between copy number and expression. Survival analysis associated a specific patient segment characterized by chromosome 20q gains with improved overall survival, which might be due to higher expression of genes such as EEF1B2 and PTK6. The CNA clustering also grouped tumors characterized by a poor prognosis BRAF-mutant-like signature derived from mRNA data from this cohort. We further revealed non-random correlation between CNAs among unlinked loci, including positive correlation between 20q gain and 8q gain, and 20q gain and chromosome 18 loss, consistent with co-selection of these CNAs. These results reinforce the non-random nature of somatic CNAs in stage-II/III CRC and highlight loci and genes that may play an important role in driving the development and outcome of this disease.

6.
The extent of focal chromosomal copy number aberrations (CNAs) in cancer has been uncovered through technical innovations, and this discovery has been critical for the identification of new cancer driver genes in genomics projects such as TCGA and ICGC. Unlike constitutive copy number variations (CNVs), focal CNAs are the result of many selection events during the evolution of cancer genomes. Therefore, it is possible that a single gene in a focal CNA gives the tumor a selective growth advantage. This concept has been instrumental in the discovery of new cancer driver genes. However, focal CNAs lack a consensus definition; therefore, we propose one based on pragmatic considerations. We also describe different strategies to identify focal CNAs and procedures to distinguish them from large CNAs and CNVs.

7.
SUMMARY: We present a tool for control-free copy number alteration (CNA) detection using deep-sequencing data, particularly useful for cancer studies. The tool deals with two frequent problems in the analysis of cancer deep-sequencing data: absence of a control sample and possible polyploidy of cancer cells. FREEC (control-FREE Copy number caller) automatically normalizes and segments copy number profiles (CNPs) and calls CNAs. If ploidy is known, FREEC assigns an absolute copy number to each predicted CNA. To normalize raw CNPs, the user can provide a control dataset if available; otherwise GC content is used. We demonstrate that for Illumina single-end, mate-pair or paired-end sequencing, GC-content normalization provides smooth profiles that can be further segmented and analyzed in order to predict CNAs. AVAILABILITY: Source code and sample data are available at http://bioinfo-out.curie.fr/projects/freec/.
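To make the idea of GC-content normalization in the abstract above concrete, here is a minimal sketch. It is not FREEC's actual procedure: it swaps in a simpler median-per-GC-bin correction, and the function name `gc_normalize`, the `bin_width` parameter and the input layout (one read count and one GC fraction per genomic window) are assumptions made for illustration only.

```python
from collections import defaultdict
from statistics import median


def gc_normalize(read_counts, gc_fractions, bin_width=0.05):
    """Return per-window copy number ratios corrected for GC content.

    Hypothetical helper (not part of FREEC): each window's read count is
    divided by the median read count of all windows in the same GC bin.
    """
    if len(read_counts) != len(gc_fractions):
        raise ValueError("read_counts and gc_fractions must have equal length")

    # Group read counts by GC bin (bin index = floor(gc / bin_width)).
    counts_by_bin = defaultdict(list)
    for count, gc in zip(read_counts, gc_fractions):
        counts_by_bin[int(gc / bin_width)].append(count)

    # Median read count per GC bin serves as the expected coverage.
    expected = {b: median(values) for b, values in counts_by_bin.items()}

    ratios = []
    for count, gc in zip(read_counts, gc_fractions):
        baseline = expected[int(gc / bin_width)]
        # Windows in a bin with zero median coverage cannot be normalized.
        ratios.append(count / baseline if baseline > 0 else float("nan"))
    return ratios
```

A ratio near 1 then suggests the window has typical coverage for its GC content, while a ratio near 2 suggests a relative gain; segmentation and calling would operate on these ratios.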

8.
There is an increasing interest in using single nucleotide polymorphism (SNP) genotyping arrays for profiling chromosomal rearrangements in tumors, as they allow simultaneous detection of copy number and loss of heterozygosity with high resolution. Critical issues such as signal baseline shift due to aneuploidy, normal cell contamination, and the presence of GC content bias have been reported to dramatically alter SNP array signals and complicate accurate identification of aberrations in cancer genomes. To address these issues, we propose a novel Global Parameter Hidden Markov Model (GPHMM) to unravel tangled genotyping data generated from tumor samples. In contrast to other HMM methods, a distinct feature of GPHMM is that the issues mentioned above are quantitatively modeled by global parameters and integrated within the statistical framework. We developed an efficient EM algorithm for parameter estimation. We evaluated performance on three data sets and show that GPHMM can correctly identify chromosomal aberrations in tumor samples containing as few as 10% cancer cells. Furthermore, we demonstrated that the estimation of global parameters in GPHMM provides information about the biological characteristics of tumor samples and the quality of genotyping signal from SNP array experiments, which is helpful for data quality control and outlier detection in cohort studies.
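The effect of normal cell contamination mentioned in the abstract above can be illustrated with simple mixture arithmetic. The sketch below is not the GPHMM model itself: it only computes the idealized B allele frequency and log2 copy ratio expected when tumor and normal cells are mixed, ignoring baseline shift, GC bias and array signal compression. The function names and parameters are hypothetical.

```python
import math


def expected_baf(tumor_fraction, tumor_b_copies, tumor_total_copies,
                 normal_b_copies=1, normal_total_copies=2):
    """Idealized B allele frequency of a tumor/normal cell mixture."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must lie in [0, 1]")
    # Average number of B alleles and of total alleles per cell in the mixture.
    b = tumor_fraction * tumor_b_copies + (1 - tumor_fraction) * normal_b_copies
    total = (tumor_fraction * tumor_total_copies
             + (1 - tumor_fraction) * normal_total_copies)
    if total <= 0:
        raise ValueError("mixture has no copies at this locus")
    return b / total


def expected_log2_ratio(tumor_fraction, tumor_total_copies, normal_total_copies=2):
    """Idealized log2 ratio of mixture copy number to the normal copy number."""
    if not 0.0 <= tumor_fraction <= 1.0:
        raise ValueError("tumor_fraction must lie in [0, 1]")
    mixed = (tumor_fraction * tumor_total_copies
             + (1 - tumor_fraction) * normal_total_copies)
    if mixed <= 0:
        raise ValueError("mixture has no copies at this locus")
    return math.log2(mixed / normal_total_copies)
```

For a hemizygous deletion at a heterozygous SNP where the B allele is retained, a pure tumor would give a B allele frequency of 1, whereas a sample with 10% tumor cells gives about 0.53 under these assumptions, which shows why low tumor content makes aberrations hard to detect.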

9.
We designed a study to investigate genetic relationships between primary tumors of oral squamous cell carcinoma (OSCC) and their lymph node metastases, and to identify genomic copy number aberrations (CNAs) related to lymph node metastasis. For this purpose, we collected a total of 42 tumor samples from 25 patients and analyzed their genomic profiles by array-based comparative genomic hybridization. We then compared the genetic profiles of metastatic primary tumors (MPTs) with their paired lymph node metastases (LNMs), and also those of LNMs with non-metastatic primary tumors (NMPTs). Firstly, we found that although there were some distinctive differences in the patterns of genomic profiles between MPTs and their paired LNMs, the paired samples shared similar genomic aberration patterns in each case. Unsupervised hierarchical clustering analysis grouped together 12 of the 15 MPT-LNM pairs. Furthermore, similarity scores between paired samples were significantly higher than those between non-paired samples. These results suggested that MPTs and their paired LNMs are composed predominantly of genetically clonal tumor cells, while minor populations with different CNAs may also exist in metastatic OSCCs. Secondly, to identify CNAs related to lymph node metastasis, we compared CNAs between grouped samples of MPTs and LNMs, but were unable to find any CNAs that were more common in LNMs. Finally, we hypothesized that subpopulations carrying metastasis-related CNAs might be present in both the MPT and LNM. Accordingly, we compared CNAs between NMPTs and LNMs, and found that gains of 7p, 8q and 17q were more common in the latter than in the former, suggesting that these CNAs may be involved in lymph node metastasis of OSCC. In conclusion, our data suggest that in OSCCs showing metastasis, the primary and metastatic tumors share similar genomic profiles, and that cells in the primary tumor may tend to metastasize after acquiring metastasis-associated CNAs.

10.
Cancer progression is often driven by an accumulation of genetic changes but also accompanied by increasing genomic instability. These processes lead to a complicated landscape of copy number alterations (CNAs) within individual tumors and great diversity across tumor samples. High resolution array-based comparative genomic hybridization (aCGH) is being used to profile CNAs of ever larger tumor collections, and better computational methods for processing these data sets and identifying potential driver CNAs are needed. Typical studies of aCGH data sets take a pipeline approach, starting with segmentation of profiles, calls of gains and losses, and finally determination of frequent CNAs across samples. A drawback of pipelines is that choices at each step may produce different results, and biases are propagated forward. We present a mathematically robust new method that exploits probe-level correlations in aCGH data to discover subsets of samples that display common CNAs. Our algorithm is related to recent work on maximum-margin clustering. It does not require pre-segmentation of the data and also provides grouping of recurrent CNAs into clusters. We tested our approach on a large cohort of glioblastoma aCGH samples from The Cancer Genome Atlas and recovered almost all CNAs reported in the initial study. We also found additional significant CNAs missed by the original analysis but supported by earlier studies, and we identified significant correlations between CNAs.

11.
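Detecting chromosomal copy number variation in tumor cells from SNP data is an active topic in cancer research, and several methods can now infer chromosomal copy number from SNP array data. In some situations, however, the results of these methods deviate from the true copy numbers at a certain error rate, and no existing method studies the pattern of such prediction errors. In this paper we analyze, for two detection methods (GPHMM and ASCAT), the relationship between the information entropy of the detection results and detection accuracy, and find that the two are strongly correlated. Comparing the entropy-accuracy relationship across different tumor cell proportions, we find that as the tumor cell proportion increases, the mean information entropy of the results increases and its variance decreases; at the same time, the mean detection accuracy increases and its variance decreases markedly. These results show that the magnitude of the information entropy reflects the accuracy of the detection results. Finally, taking copy number results at high tumor cell proportions as an example, we examine the accuracy of chromosomal ploidy detection when the aberration types are uniform and the information entropy is low. The results indicate that information entropy can serve as a measure of the credibility of detection results: the higher the entropy, the more credible the result.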
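The abstract above does not state how the information entropy of a detection result is computed. One plausible reading, shown here purely as an assumption, is the Shannon entropy of the length-weighted distribution of called copy number states across the genome. The function name `copy_number_entropy` and the `(length, state)` segment layout are hypothetical.

```python
import math
from collections import defaultdict


def copy_number_entropy(segments):
    """Shannon entropy (in bits) of called copy number states.

    `segments` is an iterable of (length, state) pairs, where `length` is the
    segment size (e.g. in base pairs or probes) and `state` is the called copy
    number. This definition is an assumption, not the paper's own formula.
    """
    # Total genomic length assigned to each copy number state.
    length_by_state = defaultdict(float)
    for length, state in segments:
        if length < 0:
            raise ValueError("segment length must be non-negative")
        length_by_state[state] += length

    total = sum(length_by_state.values())
    if total <= 0:
        raise ValueError("segments must cover a positive total length")

    entropy = 0.0
    for length in length_by_state.values():
        if length > 0:
            p = length / total
            entropy -= p * math.log2(p)
    return entropy
```

Under this definition a genome called entirely as one state has entropy 0, and a genome split evenly between two states has entropy 1 bit, so a result with a single aberration type falls at the low-entropy end discussed in the abstract.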

12.
13.
Breast cancer recurrence (BCR) is a common treatment outcome despite curative-intent primary treatment of non-metastatic breast cancer. Currently used prognostic and predictive factors utilize tumor-based markers, and are not optimal determinants of risk of BCR. Germline-based copy number aberrations (CNAs) have not been evaluated as determinants of predisposition to experience BCR. In this study, we accessed germline DNA from 369 female breast cancer subjects who received curative-intent primary treatment following diagnosis. Of these, 155 experienced BCR and 214 did not, after a median duration of follow-up after breast cancer diagnosis of 6.35 years (range = 0.60–21.78) and 8.60 years (range = 3.08–13.57), respectively. Whole genome CNA genotyping was performed on the Affymetrix SNP array 6.0 platform. CNAs were identified using the SNP-Fast Adaptive States Segmentation Technique 2 algorithm implemented in Nexus Copy Number 6.0. Six samples were removed due to poor quality scores, leaving 363 samples for further analysis. We identified 18,561 CNAs with ≥1 kb as a predefined cut-off for observed aberrations. Univariate survival analyses (log-rank tests) identified seven CNAs (two copy number gains and five copy-neutral losses of heterozygosity, CN-LOHs) showing significant differences (P < 2.01×10^-5) in recurrence-free survival (RFS) probabilities with and without CNAs. We also observed three additional but distinct CN-LOHs showing significant differences in RFS probabilities (P < 2.86×10^-5) when analyses were restricted to stratified cases (luminal A, n = 208) only. After adjusting for tumor stage and grade in multivariate analyses (Cox proportional hazards models), all the CNAs remained strongly associated with the phenotype of BCR. Of these, we confirmed three CNAs at 17q11.2, 11q13.1 and 6q24.1 in representative samples using independent genotyping platforms. Our results suggest further investigations on the potential use of germline DNA variations as prognostic markers in cancer-associated phenotypes.

14.

Background

Tumor single nucleotide polymorphism (SNP) arrays are a common platform for investigating genomic aberrations in cancer and the functionally important altered genes. Raw SNP array signals are usually corrupted by noise and need to be de-convoluted into absolute copy number profiles by analytical methods. Unfortunately, despite the popularity of Affymetrix SNP arrays for tumor studies, methods specifically designed for this platform are still limited. The complicated characteristics of the noise in the signals are one of the difficulties in dissecting tumor Affymetrix SNP array data, as they inevitably blur the distinction between aberrations and create an obstacle to copy number aberration (CNA) identification.

Results

We propose a tool named TAFFYS for comprehensive analysis of tumor Affymetrix SNP array data. TAFFYS introduces a wavelet-based de-noising approach and a copy number-specific signal variance model for suppressing and modelling the noise in signals. A hidden Markov model is then employed for copy number inference. Finally, using the absolute copy number profile, the statistical significance of each aberration region is calculated for each aberration type, including amplification, deletion and loss of heterozygosity (LOH). The results show that the copy number-specific variance model and the wavelet de-noising algorithm fit the Affymetrix SNP array signals well, leading to more accurate estimation for diluted tumor samples (even with only 30% cancer cells) than other existing methods. Further tests also demonstrate good compatibility and extensibility across different Affymetrix SNP array platforms. Application to 35 breast tumor samples shows that TAFFYS can automatically dissect the tumor samples and reveal statistically significant aberration regions where cancer-related genes are located.

Conclusions

TAFFYS provides an efficient and convenient tool for identifying copy number alterations and allelic imbalance and for assessing recurrent aberrations in tumor Affymetrix SNP array data.

15.
Oral potentially malignant disorders (OPMDs) characterized by the presence of dysplasia and DNA copy number aberrations (CNAs) may reflect chromosomal instability (CIN) and predispose to oral squamous cell carcinoma (OSCC). Early detection of OPMDs with such characteristics may play a crucial role in OSCC prevention. The aim of this study was to explore the relationship between CNAs, histological diagnosis, oral subsite and aneuploidy in OPMDs/OSCCs. Samples from OPMDs and OSCCs were processed by high-resolution DNA flow cytometry (hr DNA-FCM) to determine the relative nuclear DNA content. Additionally, CNAs were obtained for a subset of these samples by genome-wide array comparative genomic hybridization (aCGH) using DNA extracted from either diploid or aneuploid nuclei suspensions sorted by FCM. Our study shows that: i) aneuploidy, global genomic imbalance (measured as the total number of CNAs) and specific focal CNAs occur early in the development of oral cancer and become more frequent at later stages; ii) OPMDs limited to tongue (TNG) mucosa display a higher frequency of aneuploidy compared to OPMDs confined to buccal mucosa (BM) as measured by DNA-FCM; iii) TNG OPMDs/OSCCs show peculiar features of CIN compared to BM OPMDs/OSCCs given the preferential association with total broad and specific focal CNA gains. Follow-up studies are warranted to establish whether the presence of DNA aneuploidy and specific focal or broad CNAs may predict cancer development in non-dysplastic OPMDs.

16.
17.
Tumorigenesis is a multi-step process in which normal cells transform into malignant tumors following the accumulation of genetic mutations that enable them to evade the growth control checkpoints that would normally suppress their growth or result in apoptosis. It is therefore important to identify those combinations of mutations that collaborate in cancer development and progression. DNA copy number alterations (CNAs) are one of the ways in which cancer genes are deregulated in tumor cells. We hypothesized that synergistic interactions between cancer genes might be identified by looking for regions of co-occurring gain and/or loss. To this end we developed a scoring framework to separate truly co-occurring aberrations from passenger mutations and dominant single signals present in the data. The resulting regions of high co-occurrence can be investigated for between-region functional interactions. Analysis of high-resolution DNA copy number data from a panel of 95 hematological tumor cell lines correctly identified co-occurring recombinations at the T-cell receptor and immunoglobulin loci in T- and B-cell malignancies, respectively, showing that we can recover truly co-occurring genomic alterations. In addition, our analysis revealed networks of co-occurring genomic losses and gains that are enriched for cancer genes. These networks are also highly enriched for functional relationships between genes. We further examine sub-networks of these networks, core networks, which contain many known cancer genes. The core network for co-occurring DNA losses we find seems to be independent of the canonical cancer genes within the network. Our findings suggest that large-scale, low-intensity copy number alterations may be an important feature of cancer development or maintenance by affecting gene dosage of a large interconnected network of functionally related genes.

18.
Genomic copy number aberrations (CNAs) in gastric cancer have already been extensively characterized by array comparative genomic hybridization (array CGH) analysis. However, involvement of genomic CNAs in the process of submucosal invasion and lymph node metastasis in early gastric cancer is still poorly understood. In this study, to address this issue, we collected a total of 59 tumor samples from 27 patients with submucosal-invasive gastric cancers (SMGC), analyzed their genomic profiles by array CGH, and compared them between paired samples of mucosal (MU) and submucosal (SM) invasion (23 pairs), and SM invasion and lymph node (LN) metastasis (9 pairs). Initially, we hypothesized that acquisition of specific CNA(s) is important for these processes. However, we observed no significant difference in the number of genomic CNAs between paired MU and SM, and between paired SM and LN. Furthermore, we were unable to find any CNAs specifically associated with SM invasion or LN metastasis. Among the 23 cases analyzed, 15 had some similar pattern of genomic profiling between SM and MU. Interestingly, 13 of the 15 cases also showed some differences in genomic profiles. These results suggest that the majority of SMGCs are composed of heterogeneous subpopulations derived from the same clonal origin. Comparison of genomic CNAs between SMGCs with and without LN metastasis revealed that gain of 11q13, 11q14, 11q22, 14q32 and amplification of 17q21 were more frequent in metastatic SMGCs, suggesting that these CNAs are related to LN metastasis of early gastric cancer. In conclusion, our data suggest that generation of genetically distinct subclones, rather than acquisition of specific CNA at MU, is integral to the process of submucosal invasion, and that subclones that acquire gain of 11q13, 11q14, 11q22, 14q32 or amplification of 17q21 are likely to become metastatic.

19.
Telomeres are repetitive sequences (TTAGGG) located at the end of chromosomes. Telomeres progressively shorten with each cell replication cycle, ultimately leading to chromosomal instability and loss of cell viability. Telomere length anomaly appears to be one of the earliest and most prevalent genetic alterations in malignant transformation. Here we aim to estimate telomere length from whole-exome sequencing data in colon tumors and normal colonic mucosa, and to analyze the potential association of telomere length with clinical factors and gene expression in colon cancer. Reads containing at least five repetitions of the telomere sequence (TTAGGG) were extracted from the raw sequences of 42 adjacent normal-tumor paired samples. The number of reads from the tumor sample was normalized to build the Tumor Telomere Length Ratio (TTLR), considered an estimation of telomere length change in the tumor compared to the paired normal tissue. We evaluated the associations between TTLR and clinical factors, gene expression and copy number (CN) aberrations measured in the same tumor samples. Colon tumors showed significantly shorter telomeres than their paired normal samples. No significant association was observed between TTLR and gender, age, tumor location, prognosis, stromal infiltration or molecular subtypes. The functional gene set enrichment analysis showed pathways related to immune response significantly associated with TTLR. By extracting a relative measure of telomere length from whole-exome sequencing data, we have shown that colon tumor cells predominantly shorten telomeres, and that this alteration is associated with expression changes in genes related to immune response and inflammation in tumor cells.
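The read-selection step described in the abstract above can be sketched as follows. Several details are assumptions, since the abstract does not give them: repeats are counted as non-overlapping occurrences anywhere in the read, the reverse-complement motif CCCTAA is also accepted, and the ratio is formed by dividing the tumor's telomeric read fraction by that of the paired normal sample. All function names are hypothetical.

```python
TELOMERE_MOTIF = "TTAGGG"
TELOMERE_MOTIF_RC = "CCCTAA"  # reverse complement; counting it is an assumption


def is_telomeric(read, min_repeats=5):
    """True if the read holds at least `min_repeats` copies of the motif."""
    seq = read.upper()
    # str.count returns the number of non-overlapping occurrences.
    repeats = max(seq.count(TELOMERE_MOTIF), seq.count(TELOMERE_MOTIF_RC))
    return repeats >= min_repeats


def telomeric_fraction(reads, min_repeats=5):
    """Fraction of reads in a sample that qualify as telomeric."""
    reads = list(reads)
    if not reads:
        raise ValueError("sample contains no reads")
    return sum(is_telomeric(r, min_repeats) for r in reads) / len(reads)


def tumor_telomere_length_ratio(tumor_reads, normal_reads, min_repeats=5):
    """Assumed normalization: tumor telomeric fraction over normal fraction."""
    normal = telomeric_fraction(normal_reads, min_repeats)
    if normal == 0:
        raise ValueError("no telomeric reads in the normal sample")
    return telomeric_fraction(tumor_reads, min_repeats) / normal
```

Under this normalization a ratio below 1 would indicate relatively fewer telomeric reads in the tumor than in the paired normal tissue, consistent with the telomere shortening reported in the abstract.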

20.
We describe a bioinformatic tool, Tumor Aberration Prediction Suite (TAPS), for the identification of allele-specific copy numbers in tumor samples using data from Affymetrix SNP arrays. It includes detailed visualization of genomic segment characteristics and iterative pattern recognition for copy number identification, and does not require patient-matched normal samples. TAPS can be used to identify chromosomal aberrations with high sensitivity even when the proportion of tumor cells is as low as 30%. Analysis of cancer samples indicates that TAPS is well suited to investigate samples with aneuploidy and tumor heterogeneity, which are commonly found in many types of solid tumors.
