1.
Tumor samples are typically heterogeneous, containing admixture by normal, non-cancerous cells and one or more subpopulations of cancerous cells. Whole-genome sequencing of a tumor sample yields reads from this mixture, but does not directly reveal the cell of origin for each read. We introduce THetA (Tumor Heterogeneity Analysis), an algorithm that infers the most likely collection of genomes and their proportions in a sample, for the case where copy number aberrations distinguish subpopulations. THetA successfully estimates normal admixture and recovers clonal and subclonal copy number aberrations in real and simulated sequencing data. THetA is available at http://compbio.cs.brown.edu/software/.
2.
《Epigenetics》2013,8(3):225-229
Recent evidence suggests that DNA methylation changes may underlie numerous complex traits and diseases. The advent of commercial, array-based methods to interrogate DNA methylation has led to a profusion of epigenetic studies in the literature. Array-based methods, such as the popular Illumina GoldenGate and Infinium platforms, estimate the proportion of DNA methylated at single-base resolution for thousands of CpG sites across the genome. These arrays generate enormous amounts of data, but few software resources exist for efficient and flexible analysis of these data. We developed a software package called MethLAB (http://genetics.emory.edu/conneely/MethLAB) using R, an open source statistical language that can be edited to suit the needs of the user. MethLAB features a graphical user interface (GUI) with a menu-driven format designed to efficiently read in and manipulate array-based methylation data in a user-friendly manner. MethLAB tests for association between methylation and relevant phenotypes by fitting a separate linear model for each CpG site. These models can incorporate both continuous and categorical phenotypes and covariates, as well as fixed or random batch or chip effects. MethLAB accounts for multiple testing by controlling the false discovery rate (FDR) at a user-specified level. Standard output includes a spreadsheet-ready text file and an array of publication-quality figures. Considering the growing interest in and availability of DNA methylation data, there is a great need for user-friendly open source analytical tools. With MethLAB, we present a timely resource that will allow users with no programming experience to implement flexible and powerful analyses of DNA methylation data.
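The abstract above says FDR is controlled at a user-specified level but does not name the procedure. As a minimal sketch, assuming the Benjamini-Hochberg step-up procedure (an assumption, not confirmed by the abstract), per-CpG p-values could be filtered as follows. The function name `benjamini_hochberg` and its parameters are hypothetical and are not part of MethLAB.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float], fdr: float = 0.05) -> List[bool]:
    """Return one flag per p-value: True if that test is rejected at the given FDR.

    Assumed procedure: Benjamini-Hochberg step-up. The abstract does not say
    which FDR method MethLAB actually uses.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * fdr.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below the cutoff.
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected
```

In use, each entry of `p_values` would be the association p-value from one CpG site's linear model, and `fdr` would be the user-specified level.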
3.
4.
Due to the difficulties in deep sequencing, high-throughput sequencing of ancient DNA has been limited to exceptionally well-preserved ancient materials. The primary factor is microbial attack, commonly observed in buried materials, which causes a drastic increase in the relative proportion of microbial DNA in the extracted DNA. We present a unified strategy in which emulsion PCR is coupled with target enrichment followed by next-generation sequencing. The method made it possible to efficiently obtain non-duplicated reads mapped to target sequences of interest, and it can achieve deep and reliable sequencing of ancient DNA from typical materials, even poorly preserved ones.
5.
6.
Data analysis, not data production, is becoming the bottleneck in gene expression research. Data integration is necessary to cope with an ever-increasing amount of data, to cross-validate noisy data sets, and to gain broad interdisciplinary views of large biological data sets. New Internet resources may help researchers to combine data sets across different gene expression platforms. However, noise and disparities in experimental protocols strongly limit data integration. A detailed review of four selected studies reveals how some of these limitations may be circumvented and illustrates what can be achieved through data integration.
7.
8.
Fabian Ripp Christopher Felix Krombholz Yongchao Liu Mathias Weber Anne Schäfer Bertil Schmidt Rene Köppel Thomas Hankeln 《BMC genomics》2014,15(1)
Background
DNA-based methods like PCR efficiently identify and quantify the taxon composition of complex biological materials, but are limited to detecting species targeted by the choice of the primer assay. We show here how untargeted deep sequencing of foodstuff total genomic DNA, followed by bioinformatic analysis of sequence reads, facilitates highly accurate identification of species from all kingdoms of life, at the same time enabling quantitative measurement of the main ingredients and detection of unanticipated food components.
Results
Sequence data simulation and real-case Illumina sequencing of DNA from reference sausages composed of mammalian (pig, cow, horse, sheep) and avian (chicken, turkey) species are able to quantify material correctly at the 1% discrimination level via a read counting approach. An additional metagenomic step facilitates identification of traces from animal, plant and microbial DNA including unexpected species, which is prospectively important for the detection of allergens and pathogens.
Conclusions
Our data suggest that deep sequencing of total genomic DNA from samples of heterogeneous taxon composition promises to be a valuable screening tool for reference species identification and quantification in biosurveillance applications like food testing, potentially alleviating some of the problems in taxon representation and quantification associated with targeted PCR-based approaches.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-639) contains supplementary material, which is available to authorized users.
9.
David Mosen-Ansorena Naiara Telleria Silvia Veganzones Virginia De la Orden Maria Luisa Maestro Ana M Aransay 《BMC genomics》2014,15(1)
Background
Deviations in the amount of genomic content that arise during tumorigenesis, called copy number alterations, are structural rearrangements that can critically affect gene expression patterns. Additionally, copy number alteration profiles allow insight into cancer discrimination, progression and complexity. On data obtained from high-throughput sequencing, improving quality through GC bias correction and keeping false positives to a minimum help build reliable copy number alteration profiles.
Results
We introduce seqCNA, a parallelized R package for an integral copy number analysis of high-throughput sequencing cancer data. The package includes novel methodology on (i) filtering, reducing false positives, and (ii) GC content correction, improving copy number profile quality, especially under great read coverage and high correlation between GC content and copy number. Adequate analysis steps are automatically chosen based on availability of paired-end mapping, matched normal samples and genome annotation.
Conclusions
seqCNA, available through Bioconductor, provides accurate copy number predictions in tumoural data, thanks to the extensive filtering and better GC bias correction, while providing an integrated and parallelized workflow.
Electronic supplementary material
The online version of this article (doi:10.1186/1471-2164-15-178) contains supplementary material, which is available to authorized users.
10.
Gerard Such-Sanmartín Simone Sidoli Estela Ventura-Espejo Ole N. Jensen 《Biochemical and biophysical research communications》2014
We introduce the computer tool “Know Your Samples” (KYSS) for assessment and visualisation of large scale proteomics datasets, obtained by mass spectrometry (MS) experiments. KYSS facilitates the evaluation of sample preparation protocols, LC peptide separation, and MS and MS/MS performance by monitoring the number of missed cleavages, precursor ion charge states, number of protein identifications and peptide mass error in experiments. KYSS generates several different protein profiles based on protein abundances, and allows for comparative analysis of multiple experiments. KYSS was adapted for blood plasma proteomics and provides concentrations of identified plasma proteins. We demonstrate the utility of the KYSS tool for MS based proteome analysis of blood plasma and for assessment of hydrogel particles for depletion of abundant proteins in plasma. The KYSS software is open source and is freely available at http://kyssproject.github.io/.
11.
Summary
NMR View is a computer program designed for the visualization and analysis of NMR data. It allows the user to interact with a practically unlimited number of 2D, 3D and 4D NMR data files. Any number of spectral windows can be displayed on the screen in any size and location. Automatic peak picking and facilitated peak analysis features are included to aid in the assignment of complex NMR spectra. NMR View provides structure analysis features and data transfer to and from structure generation programs, allowing for a tight coupling between spectral analysis and structure generation. Visual correlation between structures and spectra can be done with the Molecular Data Viewer, a molecular graphics program with bidirectional communication to NMR View. The user interface can be customized and a command language is provided to allow for the automation of various tasks.
Inquiries concerning the availability of NMR View and the Molecular Data Viewer should be sent via email to johnsonb@merck.com or to Bruce A. Johnson, Merck Research Laboratories, RY80Y-103, P.O. Box 2000, Rahway, NJ 07065, U.S.A.
12.
H.-R. Roth G. Dolf G. Stranzinger 《TAG. Theoretical and applied genetics. Theoretische und angewandte Genetik》1987,74(1):42-48
Summary
A statistical approach to the interpretation of data from gene assignment with somatic cell hybrids is presented. The observed data are analysed under a variety of hypotheses. The fit to the hypotheses is compared by means of the likelihood obtained under a given hypothesis. Two of these hypotheses are related to fundamental questions: is a gene responsible for the enzyme observation and if so, is that gene located on a specific chromosome or could it change its position and be sometimes on chromosome j and, in another hybrid line, on chromosome k? The other hypotheses concern the assignment of the gene to just one of the chromosomes.
To improve the traditional data analysis approach we considered additional information: the uncertainties and possible errors of laboratory methods in all our calculations and the length of the donor chromosomes in connection with one specific hypothesis.
This method allows us to account for the reliability of the investigation methods and the nature of the hybrid lines involved. Data can be evaluated at different error probabilities within a realistic range in order to compare and discuss results.
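The idea of comparing single-chromosome assignment hypotheses by likelihood, at a chosen error probability, can be illustrated with a deliberately simplified sketch. It assumes one symmetric error probability per hybrid line and scores only concordance between enzyme and chromosome observations. This is not the authors' model, which also treats further hypotheses and chromosome length, and the names `log_likelihood` and `rank_chromosomes` are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def log_likelihood(enzyme: List[bool], chromosome: List[bool], error: float) -> float:
    """Log-likelihood of the hypothesis 'the gene lies on this chromosome'.

    Simplifying assumption: in each hybrid line the enzyme observation agrees
    with the chromosome's presence with probability 1 - error, else error.
    """
    if not 0.0 < error < 1.0:
        raise ValueError("error must lie strictly between 0 and 1")
    if len(enzyme) != len(chromosome):
        raise ValueError("one observation per hybrid line is required")
    total = 0.0
    for enzyme_seen, chromosome_seen in zip(enzyme, chromosome):
        concordant = enzyme_seen == chromosome_seen
        total += math.log(1.0 - error) if concordant else math.log(error)
    return total


def rank_chromosomes(
    enzyme: List[bool], presence: Dict[str, List[bool]], error: float
) -> List[Tuple[str, float]]:
    """Rank candidate chromosomes from best to worst log-likelihood."""
    scores = {name: log_likelihood(enzyme, seen, error) for name, seen in presence.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy data: five hybrid lines, two candidate chromosomes (labels are illustrative).
enzyme_obs = [True, True, False, False, True]
presence_obs = {
    "j": [True, True, False, False, True],
    "k": [True, False, False, True, True],
}
ranking = rank_chromosomes(enzyme_obs, presence_obs, error=0.05)
```

Re-running `rank_chromosomes` with several values of `error` mirrors the abstract's suggestion to evaluate the data at different error probabilities and compare the results.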