
1.

Allele-specific disparity in breast cancer

Fatemeh Kaveh Hege Edvardsen Anne-Lise Børresen-Dale Vessela N Kristensen Hiroko K Solvang 《BMC medical genomics》2011,4(1):1-13

Background

Molecular alterations critical to the development of cancer include mutations, copy number alterations (amplifications and deletions), and genomic rearrangements resulting in gene fusions. Massively parallel next-generation sequencing, which enables the discovery of such changes, uses considerable quantities of genomic DNA (> 5 µg), a serious limitation in ever smaller clinical samples. However, commonly available microarray platforms such as array comparative genomic hybridization (array CGH) allow the characterization of gene copy number at single-gene resolution using much smaller amounts of genomic DNA. In this study we evaluate the sensitivity of ultra-dense array CGH platforms developed by Agilent, especially that of the 1 million probe array (1 M array), and their application when whole genome amplification is required because of limited sample quantities.

Methods

We performed array CGH on whole genome amplified and not amplified genomic DNA from MCF-7 breast cancer cells, using 244 K and 1 M Agilent arrays. The ADM-2 algorithm was used to identify micro-copy number alterations that measured less than 1 Mb in genomic length.
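
The size criterion in this Methods paragraph can be illustrated with a minimal Python sketch. It assumes that segments have already been called (for example by ADM-2) and are available as simple records; the `Segment` type, its field names, and the example values are hypothetical and are not the authors' pipeline.

```python
from typing import NamedTuple

MICRO_CNA_MAX_BP = 1_000_000  # "less than 1 Mb", per the abstract


class Segment(NamedTuple):
    # Hypothetical record for one called copy number segment.
    chrom: str
    start: int  # inclusive, in bp
    end: int    # exclusive, in bp
    log2_ratio: float


def micro_cnas(segments, max_bp=MICRO_CNA_MAX_BP):
    """Return the segments whose genomic length is below max_bp."""
    return [s for s in segments if (s.end - s.start) < max_bp]


# Made-up example segments, for illustration only.
segments = [
    Segment("chr1", 1_000_000, 1_400_000, 0.8),    # 400 kb gain
    Segment("chr17", 5_000_000, 9_000_000, -0.6),  # 4 Mb loss
    Segment("chr20", 200_000, 250_000, 1.2),       # 50 kb gain
]
small = micro_cnas(segments)
```

Only the 400 kb and 50 kb segments pass the filter; the 4 Mb loss is treated as a conventional copy number alteration.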

Results

DNA from MCF-7 breast cancer cells was analyzed for micro-copy number alterations (micro-CNAs), defined as measuring less than 1 Mb in genomic length. The 4-fold higher resolution of the 1 M array platform relative to the less dense 244 K array platform led to improved detection of copy number variations (CNVs) and micro-CNAs. The identification of intra-genic breakpoints in areas of DNA copy number gain signaled the possible presence of gene fusion events. However, the ultra-dense platforms, especially the densest 1 M array, detect artifacts inherent to whole genome amplification and should be used only with non-amplified DNA samples.

Conclusions

This is a first report using 1 M array CGH for the discovery of cancer genes and biomarkers. We show the remarkable capacity of this technology to discover CNVs, micro-copy number alterations and even gene fusions. However, these platforms require excellent genomic DNA quality and do not tolerate relatively small imperfections related to the whole genome amplification.

2.

ATAQS: A computational software tool for high throughput transition optimization and validation for selected reaction monitoring mass spectrometry

Mi-Youn K Brusniak Sung-Tat Kwok Mark Christiansen David Campbell Lukas Reiter Paola Picotti Ulrike Kusebauch Hector Ramos Eric W Deutsch Jingchun Chen Robert L Moritz Ruedi Aebersold 《BMC bioinformatics》2011,12(1):1-15

Background

Copy number variants (CNVs), including deletions, amplifications, and other rearrangements, are common in human and cancer genomes. Copy number data from array comparative genome hybridization (aCGH) and next-generation DNA sequencing are widely used to measure copy number variants. Comparison of copy number data from multiple individuals reveals recurrent variants. Typically, the interior of a recurrent CNV is examined for genes or other loci associated with a phenotype. However, in some cases, such as gene truncations and fusion genes, the target of the variant lies at the boundary of the variant.

Results

We introduce Neighborhood Breakpoint Conservation (NBC), an algorithm for identifying rearrangement breakpoints that are highly conserved at the same locus in multiple individuals. NBC detects recurrent breakpoints at varying levels of resolution, including breakpoints whose location is exactly conserved and breakpoints whose location varies within a gene. NBC also identifies pairs of recurrent breakpoints such as those that result from fusion genes. We apply NBC to aCGH data from 36 primary prostate tumors and identify 12 novel rearrangements, one of which is the well-known TMPRSS2-ERG fusion gene. We also apply NBC to 227 glioblastoma tumors and predict 93 novel rearrangements which we further classify as gene truncations, germline structural variants, and fusion genes. A number of these variants involve the protein phosphatase PTPN12 suggesting that deregulation of PTPN12, via a variety of rearrangements, is common in glioblastoma.

Conclusions

We demonstrate that NBC is useful for detection of recurrent breakpoints resulting from copy number variants or other structural variants, and in particular identifies recurrent breakpoints that result in gene truncations or fusion genes. Software is available at http://cs.brown.edu/people/braphael/software.html.

3.

Integration of DNA copy number alterations and transcriptional expression analysis in human gastric cancer

Fan B Dachrut S Coral H Yuen ST Chu KM Law S Zhang L Ji J Leung SY Chen X 《PloS one》2012,7(4):e29824

Background

Genomic instability with frequent DNA copy number alterations is one of the key hallmarks of carcinogenesis. The chromosomal regions with frequent DNA copy number gain and loss in human gastric cancer are still poorly defined. It remains unknown how DNA copy number variations contribute to changes in gene expression profiles, especially at the global level.

Principal Findings

We analyzed DNA copy number alterations in 64 human gastric cancer samples and 8 gastric cancer cell lines using bacterial artificial chromosome (BAC) array-based comparative genomic hybridization (aCGH). Statistical analysis was applied to correlate previously published gene expression data obtained from cDNA microarrays with corresponding DNA copy number variation data to identify candidate oncogenes and tumor suppressor genes. We found that gastric cancer samples showed recurrent DNA copy number variations, including gains at 5p, 8q, 20p, 20q, and losses at 4q, 9p, 18q, 21q. The most frequent regions of amplification were 20q12 (7/72), 20q12–20q13.1 (12/72), 20q13.1–20q13.2 (11/72) and 20q13.2–20q13.3 (6/72). The most frequently deleted region was 9p21 (8/72). Correlating gene expression array data with aCGH identified 321 candidate oncogenes, which were overexpressed and showed frequent DNA copy number gains, and 12 candidate tumor suppressor genes, which were down-regulated and showed frequent DNA copy number losses in human gastric cancers. Three networks of significantly expressed genes in gastric cancer samples were identified by Ingenuity Pathway Analysis.

Conclusions

This study provides insight into DNA copy number variations and their contribution to altered gene expression profiles during human gastric cancer development. It provides novel candidate driver oncogenes or tumor suppressor genes for human gastric cancer, useful pathway maps for the future understanding of the molecular pathogenesis of this malignancy, and the construction of new therapeutic targets.

4.

Modulation of gene expression in drug resistant Leishmania is associated with gene amplification, gene deletion and chromosome aneuploidy

Ubeda JM Légaré D Raymond F Ouameur AA Boisvert S Rigault P Corbeil J Tremblay MJ Olivier M Papadopoulou B Ouellette M 《Genome biology》2008,9(7):R115-16


5.

Functional analysis and comparative genomics of expressed sequence tags from the lycophyte Selaginella moellendorffii

Jing-Ke Weng Milos Tanurdzic Clint Chapple 《BMC genomics》2005,6(1):1-13

Background

DNA microarrays are widely used in gene expression analyses. To increase throughput and minimize costs without reducing gene expression data obtained, we investigated whether four mRNA samples can be analyzed simultaneously by applying four different fluorescent dyes.

Results

Following tests for cross-talk of fluorescence signals, Alexa 488, Alexa 594, Cyanine 3 and Cyanine 5 were selected for hybridizations. For self-hybridizations, a single RNA sample was labelled with all dyes and hybridized on commercial cDNA arrays or on in-house spotted oligonucleotide arrays. Correlation coefficients for all combinations of dyes were above 0.9 on the cDNA array. On the oligonucleotide array they were above 0.8, except combinations with Alexa 488, which were approximately 0.5. Standard deviation of expression differences for replicate spots were similar on the cDNA array for all dye combinations, but on the oligonucleotide array combinations with Alexa 488 showed a higher variation.

Conclusion

In conclusion, the four dyes can be used simultaneously for gene expression experiments on the tested cDNA array, but only three dyes can be used on the tested oligonucleotide array. This was confirmed by hybridizations of control with test samples, as all combinations returned similar numbers of differentially expressed genes with comparable effects on gene expression.

6.

Bayesian estimation of genomic copy number with single nucleotide polymorphism genotyping arrays

Beibei Guo Alejandro Villagran Marina Vannucci Jian Wang Caleb Davis Tsz-Kwong Man Ching Lau Rudy Guerra 《BMC research notes》2010,3(1):1-18

Background

The identification of copy number aberration in the human genome is an important area in cancer research. We develop a model for determining genomic copy numbers using high-density single nucleotide polymorphism genotyping microarrays. The method is based on a Bayesian spatial normal mixture model with an unknown number of components corresponding to true copy numbers. A reversible jump Markov chain Monte Carlo algorithm is used to implement the model and perform posterior inference.

Results

The performance of the algorithm is examined on both simulated and real cancer data, and it is compared with the popular CNAG algorithm for copy number detection.

Conclusions

We demonstrate that our Bayesian mixture model performs at least as well as the hidden Markov model based CNAG algorithm and in certain cases does better. One of the added advantages of our method is the flexibility of modeling normal cell contamination in tumor samples.

7.

Expression profiling with RNA from formalin-fixed, paraffin-embedded material

Andrea Oberli Vlad Popovici Mauro Delorenzi Anna Baltzer Janine Antonov Sybille Matthey Stefan Aebi Hans Jörg Altermatt Rolf Jaggi 《BMC medical genomics》2008,1(1):1-15


8.

Micro-Scale Genomic DNA Copy Number Aberrations as Another Means of Mutagenesis in Breast Cancer

Hann-Hsiang Chao Xiaping He Joel S. Parker Wei Zhao Charles M. Perou 《PloS one》2012,7(12)

Introduction

In breast cancer, the basal-like subtype has high levels of genomic instability relative to other breast cancer subtypes with many basal-like-specific regions of aberration. There is evidence that this genomic instability extends to smaller scale genomic aberrations, as shown by a previously described micro-deletion event in the PTEN gene in the Basal-like SUM149 breast cancer cell line.

Methods

We sought to determine whether small regions of genomic DNA copy number change exist by using a high-density, gene-centric comparative genomic hybridization (CGH) array on cell lines and primary tumors. A custom tiling array for CGH (244,000 probes, 200 bp tiling resolution), focused on previously identified basal-like-specific genes and general cancer genes, was created to identify small regions of genomic change. Tumor genomic DNA from 94 patients and 2 breast cancer cell lines was labeled and hybridized to these arrays. Aberrations were called using SWITCHdna, and the smallest 25% of SWITCHdna-defined genomic segments were called micro-aberrations (<64 contiguous probes, ∼ 15 kb).
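
The "smallest 25% of segments" rule can be sketched in a few lines of Python. This is an illustrative reading of the criterion, not SWITCHdna itself: the function name, the use of segment length in base pairs as the size measure, and the example lengths are all assumptions.

```python
def smallest_quartile(segment_lengths):
    """Return the lengths that fall in the smallest 25% of segments."""
    ordered = sorted(segment_lengths)
    n_micro = len(ordered) // 4  # floor of 25% of the segment count
    return ordered[:n_micro]


# Made-up segment lengths in bp, for illustration only.
lengths_bp = [12_000, 500_000, 3_000, 2_000_000,
              80_000, 9_000, 40_000, 1_500_000]
micro = smallest_quartile(lengths_bp)
```

With eight segments, the two shortest ones are labeled micro-aberrations. A rank-based cut like this adapts to each dataset, which is why the abstract reports the resulting threshold (<64 contiguous probes, about 15 kb) rather than fixing it in advance.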

Results

Our data showed that primary tumor breast cancer genomes frequently contained many small-scale copy number gains and losses, termed micro-aberrations, most of which are undetectable using typical-density genome-wide aCGH arrays. The basal-like subtype exhibited the highest incidence of these events. These micro-aberrations sometimes altered expression of the involved gene. We confirmed the presence of the PTEN micro-amplification in SUM149 and by mRNA-seq showed that this resulted in loss of expression of all exons downstream of this event. Micro-aberrations disproportionately affected the 5′ regions of the affected genes, including the promoter region, and high frequency of micro-aberrations was associated with poor survival.

Conclusion

Using a high-probe-density, gene-centric aCGH microarray, we present evidence of small-scale genomic aberrations that can contribute to gene inactivation. These events may contribute to tumor formation through mechanisms not detected using conventional DNA copy number analyses.

9.

Muscle Research and Gene Ontology: New standards for improved data integration

Erika Feltrin Stefano Campanaro Alexander D Diehl Elisabeth Ehler Georgine Faulkner Jennifer Fordham Chiara Gardin Midori Harris David Hill Ralph Knoell Paolo Laveder Lorenza Mittempergher Alessandra Nori Carlo Reggiani Vincenzo Sorrentino Pompeo Volpe Ivano Zara Giorgio Valle Jennifer Deegan née Clark 《BMC medical genomics》2009,2(1):1-8


10.

Identification of molecular pathways affected by pterostilbene, a natural dimethylether analog of resveratrol

Zhiqiang Pan Ameeta K Agarwal Tao Xu Qin Feng Scott R Baerson Stephen O Duke Agnes M Rimando 《BMC medical genomics》2008,1(1):1-13


11.

tigaR: integrative significance analysis of temporal differential gene expression induced by genomic abnormalities

Viktorian Miok Saskia M Wilting Mark A van de Wiel Annelieke Jaspers Paula I van Noort Ruud H Brakenhoff Peter JF Snijders Renske DM Steenbergen Wessel N van Wieringen 《BMC bioinformatics》2014,15(1)

Background

To determine which changes in the host cell genome are crucial for cervical carcinogenesis, a longitudinal in vitro model system of HPV-transformed keratinocytes was profiled in a genome-wide manner. Four cell lines affected with either HPV16 or HPV18 were assayed at 8 sequential time points for gene expression (mRNA) and gene copy number (DNA) using high-resolution microarrays. Available methods for temporal differential expression analysis are not designed for integrative genomic studies.

Results

Here, we present a method that allows for the identification of differential gene expression associated with DNA copy number changes over time. The temporal variation in gene expression is described by a generalized linear mixed model employing low-rank thin-plate splines. Model parameters are estimated with an empirical Bayes procedure, which exploits integrated nested Laplace approximation for fast computation. Iteratively, posteriors of hyperparameters and model parameters are estimated. The empirical Bayes procedure shrinks multiple dispersion-related parameters. Shrinkage leads to more stable estimates of the model parameters, better control of false positives and improvement of reproducibility. In addition, to make estimates of the DNA copy number more stable, model parameters are also estimated in a multivariate way using triplets of features, imposing a spatial prior for the copy number effect.

Conclusion

With the proposed method for analysis of time-course multilevel molecular data, more profound insight may be gained through the identification of temporal differential expression induced by DNA copy number abnormalities. In particular, in the analysis of an integrative oncogenomics study with a time-course set-up our method finds genes previously reported to be involved in cervical carcinogenesis. Furthermore, the proposed method yields improvements in sensitivity, specificity and reproducibility compared to existing methods. Finally, the proposed method is able to handle count (RNAseq) data from time course experiments as is shown on a real data set.

Electronic supplementary material

The online version of this article (doi:10.1186/1471-2105-15-327) contains supplementary material, which is available to authorized users.

12.

Skeletal muscle alterations and exercise performance decrease in erythropoietin-deficient mice: a comparative study

Laurence Mille-Hamard Veronique L Billat Elodie Henry Blandine Bonnamy Florence Joly Philippe Benech Eric Barrey 《BMC medical genomics》2012,5(1):1-20


13.

Identification of DNA methylation changes associated with human gastric cancer

Jung-Hoon Park Jinah Park Jung Kyoon Choi Jaemyun Lyu Min-Gyun Bae Young-Gun Lee Jae-Bum Bae Dong Yoon Park Han-Kwang Yang Tae-You Kim Young-Joon Kim 《BMC medical genomics》2011,4(1):1-15

Background

Multiple breast cancer gene expression profiles have been developed that appear to provide similar abilities to predict outcome and may outperform clinical-pathologic criteria; however, the extent to which seemingly disparate profiles provide additive prognostic information is not known, nor do we know whether prognostic profiles perform equally across clinically defined breast cancer subtypes. We evaluated whether combining the prognostic powers of standard breast cancer clinical variables with a large set of gene expression signatures could improve on our ability to predict patient outcomes.

Methods

Using clinical-pathological variables and a collection of 323 gene expression "modules", including 115 previously published signatures, we build multivariate Cox proportional hazards models using a dataset of 550 node-negative systemically untreated breast cancer patients. Models predictive of pathological complete response (pCR) to neoadjuvant chemotherapy were also built using this approach.
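
For reference, the generic form of a multivariate Cox proportional hazards model with both covariate types is written out below. The split of the covariate vector into clinical and genomic parts, and the symbols used, are notation introduced here for illustration; the abstract does not give the fitted model.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\beta_{\mathrm{clin}}^{\top} x_{\mathrm{clin}} + \beta_{\mathrm{gen}}^{\top} x_{\mathrm{gen}}\right)
```

Here $h_0(t)$ is the unspecified baseline hazard, $x_{\mathrm{clin}}$ holds the clinical-pathological variables, and $x_{\mathrm{gen}}$ holds the gene expression module scores. Setting $\beta_{\mathrm{gen}} = 0$ or $\beta_{\mathrm{clin}} = 0$ gives the clinical-only and genomic-only models against which a combined model would be compared.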

Results

We identified statistically significant prognostic models for relapse-free survival (RFS) at 7 years for the entire population, and for the subgroups of patients with ER-positive, or Luminal tumors. Furthermore, we found that combined models that included both clinical and genomic parameters improved prognostication compared with models with either clinical or genomic variables alone. Finally, we were able to build statistically significant combined models for pathological complete response (pCR) predictions for the entire population.

Conclusions

Integration of gene expression signatures and clinical-pathological factors is an improved method over either variable type alone. Highly prognostic models could be created when using all patients, and for the subset of patients with lymph node-negative and ER-positive breast cancers. Other variables beyond gene expression and clinical-pathological variables, like gene mutation status or DNA copy number changes, will be needed to build robust prognostic models for ER-negative breast cancer patients. This combined clinical and genomics model approach can also be used to build predictors of therapy responsiveness, and could ultimately be applied to other tumor types.

14.

Array-comparative genomic hybridization reveals loss of SOCS6 is associated with poor prognosis in primary lung squamous cell carcinoma

Sriram KB Larsen JE Savarimuthu Francis SM Wright CM Clarke BE Duhig EE Brown KM Hayward NK Yang IA Bowman RV Fong KM 《PloS one》2012,7(2):e30398

Background

Primary tumor recurrence commonly occurs after surgical resection of lung squamous cell carcinoma (SCC). Little is known about the genes driving SCC recurrence.

Methods

We used array comparative genomic hybridization (aCGH) to identify genes affected by copy number alterations that may be involved in SCC recurrence. Training and test sets of resected primary lung SCC were assembled. aCGH was used to determine genomic copy number in a training set of 62 primary lung SCCs (28 with recurrence and 34 with no evidence of recurrence) and the altered copy number of candidate genes was confirmed by quantitative PCR (qPCR). An independent test set of 72 primary lung SCCs (20 with recurrence and 52 with no evidence of recurrence) was used for biological validation. mRNA expression of candidate genes was studied using qRT-PCR. Candidate gene promoter methylation was evaluated using methylation microarrays and Sequenom EpiTYPER analysis.

Results

18q22.3 loss was identified by aCGH as being significantly associated with recurrence (p = 0.038). Seven genes within 18q22.3 had aCGH copy number loss associated with recurrence but only SOCS6 copy number was both technically replicated by qPCR and biologically validated in the test set. SOCS6 copy number loss correlated with reduced mRNA expression in the study samples and in the samples with copy number loss, there was a trend for increased methylation, albeit non-significant. Overall survival was significantly poorer in patients with SOCS6 loss compared to patients without SOCS6 loss in both the training (30 vs. 43 months, p = 0.023) and test set (27 vs. 43 months, p = 0.010).

Conclusion

Reduced copy number and mRNA expression of SOCS6 are associated with disease recurrence in primary lung SCC and may be useful prognostic biomarkers.

15.

Breaking the waves: improved detection of copy number variation from microarray-based comparative genomic hybridization


Marioni JC Thorne NP Valsesia A Fitzgerald T Redon R Fiegler H Andrews TD Stranger BE Lynch AG Dermitzakis ET Carter NP Tavaré S Hurles ME 《Genome biology》2007,8(10):R228-14

Background

Large-scale high throughput studies using microarray technology have established that copy number variation (CNV) throughout the genome is more frequent than previously thought. Such variation is known to play an important role in the presence and development of phenotypes such as HIV-1 infection and Alzheimer's disease. However, methods for analyzing the complex data produced and identifying regions of CNV are still being refined.

Results

We describe the presence of a genome-wide technical artifact, spatial autocorrelation or 'wave', which occurs in a large dataset used to determine the location of CNV across the genome. By removing this artifact we are able to obtain both a more biologically meaningful clustering of the data and an increase in the number of CNVs identified by current calling methods without a major increase in the number of false positives detected. Moreover, removing this artifact is critical for the development of a novel model-based CNV calling algorithm - CNVmix - that uses cross-sample information to identify regions of the genome where CNVs occur. For regions of CNV that are identified by both CNVmix and current methods, we demonstrate that CNVmix is better able to categorize samples into groups that represent copy number gains or losses.

Conclusion

Removing artifactual 'waves' (which appear to be a general feature of array comparative genomic hybridization (aCGH) datasets) and using cross-sample information when identifying CNVs enables more biological information to be extracted from aCGH experiments designed to investigate copy number variation in normal individuals.

16.

Adipose tissue gene expression analysis reveals changes in inflammatory, mitochondrial respiratory and lipid metabolic pathways in obese insulin-resistant subjects

Jarkko Soronen Pirkka-Pekka Laurila Jussi Naukkarinen Ida Surakka Samuli Ripatti Matti Jauhiainen Vesa M Olkkonen Hannele Yki-Järvinen 《BMC medical genomics》2012,5(1):1-13

Background

To elucidate gene expression associated with copy number changes, we performed a genome-wide copy number and expression microarray analysis of 25 pairs of gastric tissues.

Methods

We applied laser capture microdissection (LCM) to obtain samples for microarray experiments and profiled DNA copy number and gene expression using 244K CGH Microarray and Human Exon 1.0 ST Microarray.

Results

Gain at 8q was detected at the highest frequency (70%), followed by gain at 20q (63%). We also identified molecular genetic divergences for different TNM stages or histological subtypes of gastric cancers. Interestingly, the C20orf11 amplification and gain at 20q13.33 almost separated moderately differentiated (MD) gastric cancers from the poorly differentiated (PD) type. A set of 163 genes showing correlations between gene copy number and expression was selected, and the identified genes were able to discriminate matched adjacent noncancerous samples from gastric cancer samples in an unsupervised two-way hierarchical clustering. Quantitative RT-PCR analysis of 4 of the 163 genes (C20orf11, XPO5, PUF60, and PLOD3) validated the microarray results. Notably, some candidate genes (MCM4 and YWHAZ) and their adjacent genes, such as PRKDC, UBE2V2, ANKRD46, ZNF706, and GRHL2, were concordantly deregulated by genomic aberrations.

Conclusions

Taken together, our results reveal diverse chromosomal region alterations for different TNM-stages or histological subtypes of gastric cancers, which is helpful in researching clinicopathological classification, and highlight several interesting genes as potential biomarkers for gastric cancer.

17.

Transcriptional analysis of highly syntenic regions between Medicago truncatula and Glycine max using tiling microarrays

Li L He H Zhang J Wang X Bai S Stolc V Tongprasit W Young ND Yu O Deng XW 《Genome biology》2008,9(3):R57-13


18.

CNAReporter: a GenePattern pipeline for the generation of clinical reports of genomic alterations

Yuri Kotliarov Serdar Bozdag Hangjiong Cheng Stefan Wuchty Jean-Claude Zenklusen Howard A Fine 《BMC medical genomics》2010,3(1):1-5

Background

Genomic copy number alterations are widely associated with a broad range of human tumors and offer the potential to be used as a diagnostic tool. Especially in the emerging era of personalized medicine, medical informatics tools that allow fast visualization and analysis of the genomic alterations in a patient's genomic profile for diagnostic and potential treatment purposes are gaining importance.

Results

We developed CNAReporter, a software tool that allows users to visualize SNP-specific data obtained from Affymetrix arrays and generate PDF-reports as output. We combined standard algorithms for the analysis of chromosomal alterations, utilizing the widely applied GenePattern framework. As an example, we show genome analyses of two patients with distinctly different CNA profiles using the tool.

Conclusions

Glioma subtypes, characterized by different genomic alterations, are often treated differently but can be difficult to differentiate pathologically. CNAReporter offers a user-friendly way to visualize and analyse genomic changes of any given tumor genomic profile, thereby leading to an accurate diagnosis and patient-specific treatment.

19.

Assessment of algorithms for high throughput detection of genomic copy number variation in oligonucleotide microarray data

Ágnes Baross Allen D Delaney H Irene Li Tarun Nayar Stephane Flibotte Hong Qian Susanna Y Chan Jennifer Asano Adrian Ally Manqiu Cao Patricia Birch Mabel Brown-John Nicole Fernandes Anne Go Giulia Kennedy Sylvie Langlois Patrice Eydoux JM Friedman Marco A Marra 《BMC bioinformatics》2007,8(1):1-18

Background

Genomic deletions and duplications are important in the pathogenesis of diseases, such as cancer and mental retardation, and have recently been shown to occur frequently in unaffected individuals as polymorphisms. Affymetrix GeneChip whole genome sampling analysis (WGSA) combined with 100 K single nucleotide polymorphism (SNP) genotyping arrays is one of several microarray-based approaches that are now being used to detect such structural genomic changes. The popularity of this technology and its associated open source data format have resulted in the development of an increasing number of software packages for the analysis of copy number changes using these SNP arrays.

Results

We evaluated four publicly available software packages for high throughput copy number analysis using synthetic and empirical 100 K SNP array data sets, the latter obtained from 107 mental retardation (MR) patients and their unaffected parents and siblings. We evaluated the software with regard to overall suitability for high-throughput 100 K SNP array data analysis, effectiveness of normalization, scaling with various reference sets, and feature extraction, as well as true and false positive rates of genomic copy number variant (CNV) detection.

Conclusion

We observed considerable variation among the numbers and types of candidate CNVs detected by different analysis approaches, and found that multiple programs were needed to find all real aberrations in our test set. The frequency of false positive deletions was substantial, but could be greatly reduced by using the SNP genotype information to confirm loss of heterozygosity.
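
The idea of using genotypes to confirm a candidate deletion can be sketched as follows. A true single-copy deletion should show essentially no heterozygous SNP calls inside the region. The function name, the genotype encoding, and the 5% tolerance for genotyping error are assumptions made for this sketch; the abstract does not specify the rule the authors applied.

```python
def supports_deletion(genotypes, max_het_fraction=0.05):
    """True if SNP calls in a candidate region are consistent with LOH.

    genotypes: iterable of "AA", "AB", "BB" or "NC" (no call).
    max_het_fraction: hypothetical tolerance for genotyping error.
    """
    called = [g for g in genotypes if g != "NC"]
    if not called:
        return False  # no informative calls, so do not confirm
    het = sum(1 for g in called if g == "AB")
    return het / len(called) <= max_het_fraction


# Made-up genotype calls inside two candidate deletions.
confirmed = supports_deletion(["AA", "BB", "AA", "NC", "BB"])
rejected = supports_deletion(["AA", "AB", "AB", "BB"])
```

The first region has no heterozygous calls and is kept; the second has half of its SNPs called heterozygous, which is hard to reconcile with a one-copy state, so it would be flagged as a likely false positive.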

20.

CNVassoc: Association analysis of CNV data using R

Isaac Subirana Ramon Diaz-Uriarte Gavin Lucas Juan R Gonzalez 《BMC medical genomics》2011,4(1):1-7

Background

Copy number variants (CNV) are a potentially important component of the genetic contribution to risk of common complex diseases. Analysis of the association between CNVs and disease requires that uncertainty in CNV copy-number calls, which can be substantial, be taken into account; failure to consider this uncertainty can lead to biased results. Therefore, there is a need to develop and use appropriate statistical tools. To address this issue, we have developed CNVassoc, an R package for carrying out association analysis of common copy number variants in population-based studies. This package includes functions for testing for association with different classes of response variables (e.g. class status, censored data, counts) under a series of study designs (case-control, cohort, etc) and inheritance models, adjusting for covariates. The package includes functions for inferring copy number (CNV genotype calling), but can also accept copy number data generated by other algorithms (e.g. CANARY, CGHcall, IMPUTE).

Results

Here we present a new R package, CNVassoc, that can deal with different types of CNV arising from different platforms such as MLPA or aCGH. Through a real data example we illustrate that our method is able to incorporate uncertainty in the association process. We also show how our package can be useful when analyzing imputed data, such as imputed SNPs. Through a simulation study we show that CNVassoc outperforms CNVtools in terms of computing time as well as in convergence failure rate.

Conclusions

We provide a package that outperforms the existing ones in terms of modelling flexibility, power, convergence rate, ease of covariate adjustment, and requirements for sample size and signal quality. Therefore, we offer CNVassoc as a method for routine use in CNV association studies.