Similar articles
20 similar articles found
1.
TEQC is an R/Bioconductor package for quality assessment of target enrichment experiments. Quality measures comprise specificity and sensitivity of the capture, enrichment, per-target read coverage and its relation to hybridization probe characteristics, coverage uniformity and reproducibility, and read duplicate analysis. Several diagnostic plots allow visual inspection of the data quality. AVAILABILITY AND IMPLEMENTATION: TEQC is implemented in the R language (version >2.12.0) and is available as a Bioconductor package for Linux, Windows and MacOS from www.bioconductor.org.

2.
beadarray: R classes and methods for Illumina bead-based data   Total citations: 2, self-citations: 0, citations by others: 2
The R/Bioconductor package beadarray allows raw data from Illumina experiments to be read and stored in convenient R classes. Users are free to choose between various methods of image processing, background correction and normalization in their analysis rather than using the defaults in Illumina's proprietary software. The package also allows quality assessment to be carried out on the raw data. The data can then be summarized and stored in a format which can be used by other R/Bioconductor packages to perform downstream analyses. Summarized data processed by Illumina's BeadStudio software can also be read and analysed in the same manner. Availability: The beadarray package is available from the Bioconductor web page at www.bioconductor.org. A user's guide and example data sets are provided with the package.

3.
The R453Plus1Toolbox is an R/Bioconductor package for the analysis of 454 Sequencing data. Projects generated with Roche's data analysis software can be imported into R allowing advanced and customized analyses within the R/Bioconductor environment for sequencing data. Several methods were implemented extending the current functionality of Roche's software. These extensions include methods for quality assurance and annotation of detected variants. Further, a pipeline for the detection of structural variants, e.g. balanced chromosomal translocations, is provided. AVAILABILITY: The R453Plus1Toolbox is implemented in R and available at http://www.bioconductor.org/. A vignette outlining typical workflows is included in the package. CONTACT: h.klein@uni-muenster.de SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

4.
This article describes specific procedures for conducting quality assessment of Affymetrix GeneChip(R) soybean genome data and for performing analyses to determine differential gene expression using the open-source R programming environment in conjunction with the open-source Bioconductor software. We describe procedures for extracting those Affymetrix probe set IDs related specifically to the soybean genome on the Affymetrix soybean chip and demonstrate the use of exploratory plots including images of raw probe-level data, boxplots, density plots and M versus A plots. RNA degradation and recommended procedures from Affymetrix for quality control are discussed. An appropriate probe-level model provides an excellent quality assessment tool. To demonstrate this, we discuss and display chip pseudo-images of weights, residuals and signed residuals and additional probe-level modeling plots that may be used to identify aberrant chips. The Robust Multichip Averaging (RMA) procedure was used for background correction, normalization and summarization of the AffyBatch probe-level data to obtain expression level data and to discover differentially expressed genes. Examples of boxplots and MA plots are presented for the expression level data. Volcano plots and heatmaps are used to demonstrate the use of (log) fold changes in conjunction with ordinary and moderated t-statistics for determining interesting genes. We show, with real data, how implementation of functions in R and Bioconductor successfully identified differentially expressed genes that may play a role in soybean resistance to a fungal pathogen, Phakopsora pachyrhizi. Complete source code for performing all quality assessment and statistical procedures may be downloaded from our web source: http://css.ncifcrf.gov/services/download/MicroarraySoybean.zip.
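The M versus A plots mentioned in this abstract are built from two per-gene quantities: M, the log2 ratio of two intensities (the log fold change), and A, their average log2 intensity. The following is a minimal sketch of that computation in Python, not the article's R code; the function name `ma_values` and the toy intensities are assumptions made for illustration.

```python
import math

def ma_values(x, y):
    """Return (M, A) for two positive intensities x and y.

    M is the log2 fold change, A is the mean log2 intensity.
    Hypothetical helper for illustration only.
    """
    lx, ly = math.log2(x), math.log2(y)
    return lx - ly, (lx + ly) / 2

# Toy intensity pairs (invented values, chosen as powers of two)
pairs = [(1024.0, 256.0), (64.0, 64.0), (32.0, 128.0)]
ma = [ma_values(x, y) for x, y in pairs]
```

Plotting M against A for every probe set gives the MA plot; points far from M = 0 are candidates for differential expression, subject to the statistical tests the abstract describes.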

5.
MOTIVATION: Microarray-based expression profiles have become a standard methodology in any high-throughput analysis. Several commercial platforms are available, each with its strengths and weaknesses. The R platform for statistical analysis and graphics is a powerful environment for the analysis of microarray data, because it has many integrated statistical methods available as well as the specialized microarray analysis project Bioconductor. Many packages have been added in the last few years, increasing the range of possible analyses. Here, we report the availability of a package for reading and analyzing data from GE Healthcare Gene Expression Bioarrays within the R environment. AVAILABILITY: The software is implemented in the R language, is open source and available for download free of charge through the Bioconductor (http://www.bioconductor.org) project.

6.
The NCBI Gene Expression Omnibus (GEO) represents the largest public repository of microarray data. However, finding data in GEO can be challenging. We have developed GEOmetadb in an attempt to make querying the GEO metadata both easier and more powerful. All GEO metadata records as well as the relationships between them are parsed and stored in a local MySQL database. A powerful, flexible web search interface with several convenient utilities provides query capabilities not available via NCBI tools. In addition, a Bioconductor package, GEOmetadb, that utilizes a SQLite export of the entire GEOmetadb database is also available, rendering the entire GEO database accessible with the full power of SQL-based queries from within R. AVAILABILITY: The web interface and SQLite databases are available at http://gbnci.abcc.ncifcrf.gov/geo/. The Bioconductor package is available via the Bioconductor project. The corresponding MATLAB implementation is also available at the same website.
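To illustrate what SQL-based querying of metadata records and their relationships can look like, here is a minimal sketch using Python's standard `sqlite3` module on an in-memory database. The three tables, their columns and all accession values are a simplified, assumed toy schema invented for this example; they are not taken from the abstract and should not be read as the actual GEOmetadb layout.

```python
import sqlite3

# In-memory toy database; schema and rows are assumptions for illustration
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE gse (gse TEXT PRIMARY KEY, title TEXT);
CREATE TABLE gsm (gsm TEXT PRIMARY KEY, title TEXT);
CREATE TABLE gse_gsm (gse TEXT, gsm TEXT);
INSERT INTO gse VALUES ('GSE_A', 'toy series A'), ('GSE_B', 'toy series B');
INSERT INTO gsm VALUES ('GSM_1', 's1'), ('GSM_2', 's2'), ('GSM_3', 's3');
INSERT INTO gse_gsm VALUES ('GSE_A', 'GSM_1'), ('GSE_A', 'GSM_2'),
                           ('GSE_B', 'GSM_3');
""")

# Count the samples linked to each series through the relationship table
rows = con.execute("""
SELECT gse.gse, COUNT(gse_gsm.gsm) AS n_samples
FROM gse JOIN gse_gsm ON gse.gse = gse_gsm.gse
GROUP BY gse.gse
ORDER BY gse.gse
""").fetchall()
```

A join across a relationship table like this is the kind of query that a flat keyword search cannot express, which is presumably the advantage the abstract has in mind.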

7.
lumi: a pipeline for processing Illumina microarray   Total citations: 2, self-citations: 0, citations by others: 2
The Illumina microarray is becoming a popular microarray platform. The BeadArray technology from Illumina makes its preprocessing and quality control different from other microarray technologies. Unfortunately, most other analyses have not taken advantage of the unique properties of the BeadArray system, and have just incorporated preprocessing methods originally designed for Affymetrix microarrays. lumi is a Bioconductor package especially designed to process Illumina microarray data. It includes data input, quality control, variance stabilization, normalization and gene annotation portions. Specifically, the lumi package includes a variance-stabilizing transformation (VST) algorithm that takes advantage of the technical replicates available on every Illumina microarray. Different normalization method options and multiple quality control plots are provided in the package. To better annotate the Illumina data, a vendor-independent nucleotide universal identifier (nuID) was devised to identify the probes of the Illumina microarray. The nuID annotation packages and output of lumi processed results can be easily integrated with other Bioconductor packages to construct a statistical data analysis pipeline for Illumina data. Availability: The lumi Bioconductor package, www.bioconductor.org

8.
ABSTRACT: BACKGROUND: Next-generation sequencing technologies have become important tools for genome-wide studies. However, the quality scores that are assigned to each base have been shown to be inaccurate. If the quality scores are used in downstream analyses, these inaccuracies can have a significant impact on the results. RESULTS: Here we present ReQON, a tool that recalibrates the base quality scores from an input BAM file of aligned sequencing data using logistic regression. ReQON also generates diagnostic plots showing the effectiveness of the recalibration. We show that ReQON produces quality scores that are both more accurate, in the sense that they more closely correspond to the probability of a sequencing error, and do a better job of discriminating between sequencing errors and non-errors than the original quality scores. We also compare ReQON to other available recalibration tools and show that ReQON is less biased and performs favorably in terms of quality score accuracy. CONCLUSION: ReQON is an open source software package, written in R and available through Bioconductor, for recalibrating base quality scores for next-generation sequencing data. ReQON produces a new BAM file with more accurate quality scores, which can improve the results of downstream analysis, and produces several diagnostic plots showing the effectiveness of the recalibration.
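The abstract's notion of accuracy, that a quality score should correspond to the probability of a sequencing error, rests on the standard Phred relation Q = -10 log10(p). The sketch below shows that conversion and an empirical quality computed from observed error counts. It is not ReQON's logistic regression procedure; the function names and the counts used in the usage lines are assumptions made for illustration.

```python
import math

def phred_to_prob(q):
    """Error probability implied by a Phred quality score q."""
    return 10 ** (-q / 10)

def prob_to_phred(p):
    """Phred quality score for an error probability p (requires p > 0)."""
    return -10 * math.log10(p)

def empirical_quality(n_errors, n_bases):
    """Phred score of the observed error rate (requires n_errors > 0).

    Hypothetical helper: a well-calibrated reported score should be
    close to this value for the bases that carry it.
    """
    return prob_to_phred(n_errors / n_bases)

# Invented counts: 10 mismatches among 1000 bases reported as Q30
reported_p = phred_to_prob(30)
observed_q = empirical_quality(10, 1000)
```

In this toy case the bases are reported as Q30 (an error probability of about 0.001) but behave like Q20, which is the kind of gap between reported and observed quality that recalibration aims to close.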

9.
High-density single nucleotide polymorphism microarrays (SNP chips) provide information on a subject's genome, such as copy number and genotype (heterozygosity/homozygosity) at a SNP. While fluorescence in situ hybridization and karyotyping reveal many abnormalities, SNP chips provide a higher resolution map of the human genome that can be used to detect, e.g., aneuploidies, microdeletions, microduplications and loss of heterozygosity (LOH). As a variety of diseases are linked to such chromosomal abnormalities, SNP chips promise new insights for these diseases by aiding in the discovery of such regions, and may suggest targets for intervention. The R package SNPchip contains classes and methods useful for storing, visualizing and analyzing high density SNP data. Originally developed from the SNPscan web-tool, SNPchip utilizes S4 classes and extends other open source R tools available at Bioconductor. This has numerous advantages, including the ability to build statistical models for SNP-level data that operate on instances of the class, and to communicate with other R packages that provide additional functionality. AVAILABILITY: The package is available from the Bioconductor web page at www.bioconductor.org. SUPPLEMENTARY INFORMATION: The supplementary material as described in this article (case studies, installation guidelines and R code) is available from http://biostat.jhsph.edu/~iruczins/publications/sm/

10.
MOTIVATION: The IntAct repository is one of the largest and most widely used databases for the curation and storage of molecular interaction data. These datasets need to be analyzed by computational methods. Software packages in the statistical environment R provide powerful tools for conducting such analyses. RESULTS: We introduce Rintact, a Bioconductor package that allows users to transform PSI-MI XML2.5 interaction data files from IntAct into R graph objects. On these, they can use methods from R and Bioconductor for a variety of tasks: determining cohesive subgraphs, computing summary statistics, fitting mathematical models to the data or rendering graphical layouts. Rintact provides a programmatic interface to the IntAct repository and allows the use of the analytic methods provided by R and Bioconductor. AVAILABILITY: Rintact is freely available at http://bioconductor.org

11.
12.
MOTIVATION: Functional analyses based on the association of Gene Ontology (GO) terms to genes in a selected gene list are useful bioinformatic tools and the GOstats package has been widely used to perform such computations. In this paper we report significant improvements and extensions such as support for conditional testing. RESULTS: We discuss the capabilities of GOstats, a Bioconductor package written in R, that allows users to test GO terms for over- or under-representation using either a classical hypergeometric test or a conditional hypergeometric test that uses the relationships among GO terms to decorrelate the results. AVAILABILITY: GOstats is available as an R package from the Bioconductor project: http://bioconductor.org
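The classical hypergeometric test for over-representation mentioned above asks how likely it is to see at least k genes annotated to a GO term in a selected list of n genes, when K of the N genes in the universe carry that term. The following is a minimal sketch of that tail probability in Python; it does not use GOstats, it does not implement the conditional variant, and the function name and the small numbers in the usage line are assumptions made for illustration.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric(N genes, K annotated, n selected).

    Hypothetical helper: sums the probability mass from k up to the
    largest possible overlap, min(K, n).
    """
    total = comb(N, n)
    hi = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, hi + 1))
    return favourable / total

# Toy universe: 10 genes, 4 annotated to the term, 3 selected, 2 of them annotated
p_over = hypergeom_upper_tail(2, 4, 3, 10)
```

A small value of `p_over` would suggest the term is over-represented in the selected list; the under-representation test uses the lower tail in the same way.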

13.
SUMMARY: OTUbase is an R package designed to facilitate the analysis of operational taxonomic unit (OTU) data and sequence classification (taxonomic) data. Currently there are programs that will cluster sequence data into OTUs and/or classify sequence data into known taxonomies. However, there is a need for software that can take the summarized output of these programs and organize it into easily accessed and manipulated formats. OTUbase provides this structure and organization within R, to allow researchers to easily manipulate the data with the rich library of R packages currently available for additional analysis. AVAILABILITY: OTUbase is an R package available through Bioconductor. It can be found at http://www.bioconductor.org/packages/release/bioc/html/OTUbase.html.

14.
MSnbase is an R/Bioconductor package for the analysis of quantitative proteomics experiments that use isobaric tagging. It provides an exploratory data analysis framework for reproducible research, allowing raw data import, quality control, visualization, data processing and quantitation. MSnbase allows direct integration of quantitative proteomics data with additional facilities for statistical analysis provided by the Bioconductor project. AVAILABILITY: MSnbase is implemented in R (version ≥ 2.13.0) and available at the Bioconductor web site (http://www.bioconductor.org/). Vignettes outlining typical workflows, input/output capabilities and detailing underlying infrastructure are included in the package.

15.
SUMMARY: SScore is an R package that facilitates the comparison of gene expression between Affymetrix GeneChips using the S-score algorithm. The S-score algorithm uses probe level data directly to assess differences in gene expression, without requiring a preliminary separate step of probe set expression summary estimation. Therefore, the algorithm avoids introduction of error associated with the expression summary estimation process and has been demonstrated to improve the accuracy of identifying differentially expressed genes. The S-score produces accurate results even when few or no replicates are available. AVAILABILITY: The R package SScore is available from Bioconductor at http://www.bioconductor.org

16.
17.
18.
SUMMARY: The nucleotide sequences of the probes on a microarray can be used for a variety of purposes in the analysis of microarray experiments. We describe software and a paradigm for the creation of data packages for curating, distributing and working with probe sequence data in a uniform manner across types of microarrays. While the implementation is specific to the Bioconductor project, the ideas and general strategies are more general and could be easily adopted by other projects. AVAILABILITY: The R package matchprobes is available under LGPL at http://www.bioconductor.org SUPPLEMENTARY INFORMATION: The package contains documentation in the form of a vignette and manual pages.

19.
Identification of biopolymer motifs represents a key step in the analysis of biological sequences. The MEME Suite is a widely used toolkit for comprehensive analysis of biopolymer motifs; however, these tools are poorly integrated within popular analysis frameworks like the R/Bioconductor project, creating barriers to their use. Here we present memes, an R package that provides a seamless R interface to a selection of popular MEME Suite tools. memes provides a novel “data aware” interface to these tools, enabling rapid and complex discriminative motif analysis workflows. In addition to interfacing with popular MEME Suite tools, memes leverages existing R/Bioconductor data structures to store the multidimensional data returned by MEME Suite tools for rapid data access and manipulation. Finally, memes provides data visualization capabilities to facilitate communication of results. memes is available as a Bioconductor package at https://bioconductor.org/packages/memes, and the source code can be found at github.com/snystrom/memes.

20.
We present MeV+R, an integration of the JAVA MultiExperiment Viewer program with Bioconductor packages. This integration of MultiExperiment Viewer and R is easily extensible to other R packages and provides users with point and click access to traditionally command line driven tools written in R. We demonstrate the ability to use MultiExperiment Viewer as a graphical user interface for Bioconductor applications in microarray data analysis by incorporating three Bioconductor packages, RAMA, BRIDGE and iterativeBMA.
