Similar literature
1.
2.
Sequence biases in large scale gene expression profiling data   Total citations: 1 (self-citations: 2, citations by others: 1)
We present the results of a simple statistical assay that measures the G+C content sensitivity bias of gene expression experiments without requiring a duplicate experiment. We analyse five gene expression profiling methods: Affymetrix GeneChip, Long Serial Analysis of Gene Expression (LongSAGE), LongSAGELite, ‘Classic’ Massively Parallel Signature Sequencing (MPSS) and ‘Signature’ MPSS. We demonstrate that the methods have systematic and random errors leading to different G+C content sensitivities. The relationship between this experimental error and the G+C content of the probe set or tag that identifies each gene influences whether the gene is detected and, if detected, the level of gene expression measured. LongSAGE has the least bias, while Signature MPSS shows a strong bias towards G+C rich tags and Affymetrix data show different biases depending on the data processing method (MAS 5.0, RMA or GC-RMA). The bias in the Affymetrix data primarily impacts genes expressed at lower levels. Despite the larger sampling of the MPSS library, SAGE identifies significantly more genes (60% more RefSeq genes in a single comparison).
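
To make the notion of G+C content sensitivity concrete, the following minimal Python sketch computes the G+C fraction of a tag or probe sequence and averages expression values within G+C bins. It is not the assay from the paper: the function names, the equal-width binning and the `n_bins` parameter are assumptions chosen only for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return gc / len(seq)


def mean_expression_by_gc_bin(tags, n_bins=4):
    """Average expression per equal-width G+C bin.

    `tags` is an iterable of (sequence, expression) pairs (hypothetical
    input layout). Bins with no tags are reported as None.
    """
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for seq, expr in tags:
        # A G+C fraction of exactly 1.0 falls into the last bin.
        idx = min(int(gc_fraction(seq) * n_bins), n_bins - 1)
        sums[idx] += expr
        counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]
```

A trend in the per-bin means across bins would hint at a G+C dependent bias, although a real analysis would also need to separate detection effects from expression-level effects.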

3.
4.
Genotyping and annotation of Affymetrix SNP arrays   Total citations: 1 (self-citations: 0, citations by others: 1)
In this paper we develop a new method for genotyping Affymetrix single nucleotide polymorphism (SNP) arrays. The method is based on (i) using multiple arrays at the same time to determine the genotypes and (ii) a model that relates intensities of individual SNPs to each other. The latter point allows us to annotate SNPs that have poor performance, either because of poor experimental conditions or because for one of the alleles the probes do not behave in a dose–response manner. Generally, our method agrees well with a method developed by Affymetrix. When both methods make a call, they agree in 99.25% of the cases (using standard settings), based on a sample of 113 Affymetrix 10k SNP arrays. In the majority of cases where the two methods disagree, our method makes a genotype call, whereas the method by Affymetrix makes a no call, i.e. the genotype of the SNP is not determined. Visualization indicates that our method is likely to be correct in the majority of these cases. In addition, we demonstrate that our method produces more SNPs that are in concordance with Hardy–Weinberg equilibrium than the method by Affymetrix. Finally, we have validated our method on HapMap data and shown that the performance of our method is comparable to other methods.
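
The Hardy–Weinberg concordance check mentioned above can be illustrated with a standard chi-square goodness-of-fit test on genotype counts. This is a generic textbook sketch, not the procedure used in the paper; the function name and argument names are hypothetical.

```python
def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int) -> float:
    """Chi-square statistic for deviation from Hardy-Weinberg equilibrium.

    Arguments are observed counts of the AA, AB and BB genotypes at one SNP.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotype calls")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count (monomorphic SNPs).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
```

With one degree of freedom, a statistic above about 3.84 corresponds to rejection at the 5% level, so a SNP with counts (25, 50, 25) is in perfect agreement while (50, 0, 50) deviates strongly.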

5.
MOTIVATION: Modern strategies for mapping disease loci require efficient genotyping of a large number of known polymorphic sites in the genome. The sensitive and high-throughput nature of hybridization-based DNA microarray technology provides an ideal platform for such an application by interrogating up to hundreds of thousands of single nucleotide polymorphisms (SNPs) in a single assay. Similar to the development of expression arrays, these genotyping arrays pose many data analytic challenges that are often platform specific. Affymetrix SNP arrays, for example, use multiple sets of short oligonucleotide probes for each known SNP, and require effective statistical methods to combine these probe intensities in order to generate reliable and accurate genotype calls. RESULTS: We developed an integrated multi-SNP, multi-array genotype calling algorithm for Affymetrix SNP arrays, MAMS, that combines single-array multi-SNP (SAMS) and multi-array, single-SNP (MASS) calls to improve the accuracy of genotype calls, without the need for training data or computation-intensive normalization procedures as in other multi-array methods. The algorithm uses resampling techniques and model-based clustering to derive single-array-based genotype calls, which are subsequently refined by competitive genotype calls based on MASS clustering. The resampling scheme caps computation for single-array analysis and hence is readily scalable, which is important in view of expanding numbers of SNPs per array. The MASS update is designed to improve calls for atypical SNPs, harboring allele-imbalanced binding affinities, that are difficult to genotype without information from other arrays. Using a publicly available data set of HapMap samples from Affymetrix, and independent calls by alternative genotyping methods from the HapMap project, we show that our approach performs competitively with existing methods. AVAILABILITY: R functions are available upon request from the authors.

6.
A benchmark for Affymetrix GeneChip expression measures   Total citations: 11 (self-citations: 0, citations by others: 11)

7.
Microarray blob-defect removal improves array analysis   Total citations: 1 (self-citations: 0, citations by others: 1)
MOTIVATION: New generation Affymetrix oligonucleotide microarrays often have blob-like image defects that will require investigators to either repeat their hybridization assays or analyze their data with the defects left in place. We investigated the effect of analyzing a spike-in experiment on Affymetrix ENCODE tiling arrays in the presence of simulated blobs covering between 1 and 9% of the array area. Using two different ChIP-chip tiling array analysis programs (Affymetrix tiling array software, TAS, and model-based analysis of tiling arrays, MAT), we found that even the smallest blob defects significantly decreased the sensitivity and increased the false discovery rate (FDR) of the spike-in target prediction. RESULTS: We introduced a new software tool, the microarray blob remover (MBR), which allows rapid visualization, detection and removal of various blob defects from the .CEL files of different types of Affymetrix microarrays. We show that using MBR significantly improves the sensitivity and FDR of a tiling array analysis compared to leaving the affected probes in the analysis. AVAILABILITY: The MBR software and the sample array .CEL files used in this article are available at: http://liulab.dfci.harvard.edu/Software/MBR/MBR.htm

8.
9.
Zapala MA, Lockhart DJ, Pankratz DG, Garcia AJ, Barlow C, Lockhart DJ. Genome Biology, 2002, 3(6): software0001.1-software00019
Two HTML-based programs were developed to analyze and filter gene-expression data: 'Bullfrog' for Affymetrix oligonucleotide arrays and 'Spot' for custom cDNA arrays. The programs provide intuitive data-filtering tools through an easy-to-use interface. A background subtraction and normalization program for cDNA arrays was also built that provides an informative summary report with data-quality assessments. These programs are freeware to aid in the analysis of gene-expression results and facilitate the search for genes responsible for interesting biological processes and phenotypes.

10.
affy--analysis of Affymetrix GeneChip data at the probe level   Total citations: 32 (self-citations: 0, citations by others: 32)
MOTIVATION: The processing of Affymetrix GeneChip data has been a recent focus for data analysts. Alternatives to the original procedure have been proposed and some of these new methods are widely used. RESULTS: The affy package is an R package of functions and classes for the analysis of oligonucleotide arrays manufactured by Affymetrix. The package is currently in its second release. affy provides the user with extreme flexibility when carrying out an analysis and makes it possible to access and manipulate probe intensity data. In this paper, we present the main classes and functions in the package and demonstrate how they can be used to process probe-level data. We also demonstrate the importance of probe-level analysis when using the Affymetrix GeneChip platform.

11.
12.
13.
Exon arrays provide accurate assessments of gene expression   Total citations: 1 (self-citations: 0, citations by others: 1)
We have developed a strategy for estimating gene expression on Affymetrix Exon arrays. The method includes a probe-specific background correction and a probe selection strategy in which a subset of probes with highly correlated intensities across multiple samples is chosen to summarize gene expression. Our results demonstrate that the proposed background model offers improvements over the default Affymetrix background correction and that Exon arrays may provide more accurate measurements of gene expression than traditional 3' arrays.
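
The idea of keeping only probes whose intensities are highly correlated across samples can be sketched as follows. The abstract does not give the selection rule, so the criterion used here (mean Pearson correlation with all other probes of the same gene above a threshold), the function names and the default `min_mean_corr` value are assumptions for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant probe: treat as uncorrelated
    return sxy / sqrt(sxx * syy)


def select_correlated_probes(intensities, min_mean_corr=0.8):
    """Return probe IDs whose mean correlation with the other probes passes.

    `intensities` maps a probe ID to its intensities across samples
    (hypothetical layout); all lists must have the same length.
    """
    ids = list(intensities)
    kept = []
    for pid in ids:
        corrs = [pearson(intensities[pid], intensities[other])
                 for other in ids if other != pid]
        if corrs and sum(corrs) / len(corrs) >= min_mean_corr:
            kept.append(pid)
    return kept
```

The retained probes could then be summarized (for example by a mean or median of log intensities) to give a gene-level estimate; that summarization step is left out of the sketch.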

14.
15.
16.
MOTIVATION: Affymetrix GeneChip arrays are currently the most widely used microarray technology. Many summarization methods have been developed to provide gene expression levels from Affymetrix probe-level data. Most of the currently popular methods do not provide a measure of uncertainty for the expression level of each gene. The use of probabilistic models can overcome this limitation. A full hierarchical Bayesian approach requires the use of computationally intensive MCMC methods that are impractical for large datasets. An alternative computationally efficient probabilistic model, mgMOS, uses Gamma distributions to model specific and non-specific binding with a latent variable to capture variations in probe affinity. Although promising, the main limitations of this model are that it does not use information from multiple chips and does not account for specific binding to the mismatch (MM) probes. RESULTS: We extend mgMOS to model the binding affinity of probe-pairs across multiple chips and to capture the effect of specific binding to MM probes. The new model, multi-mgMOS, provides improved accuracy, as demonstrated on some benchmark datasets and a real time-course dataset, and is much more computationally efficient than a competing hierarchical Bayesian approach that requires MCMC sampling. We demonstrate how the probabilistic model can be used to estimate credibility intervals for expression levels and their log-ratios between conditions. AVAILABILITY: Both mgMOS and the new model multi-mgMOS have been implemented in an R package, which is available at http://www.bioinf.man.ac.uk/resources/puma.

17.
We present a framework for detecting probes in oligonucleotide microarrays that may add significant error to measurements in hybridization experiments. Four types of so-called degenerate probe behavior are considered: secondary structure formation, self-dimerization, cross-hybridization, and dimerization. The framework uses a well-established model for computing the free energy of nucleic acid sequence hybridization and a novel method for the detection of patterns in hybridization experiment data. Our primary result is the identification of unique patterns in hybridization experiment data that are shown to correlate with each type of degenerate probe behavior. A support function for identifying degenerate probes from a large set of hybridization experiments is given, and some preliminary experimental results are given for the Affymetrix HuGeneFL GeneChip. Finally, we show a strong relationship between the Affymetrix discrimination measure for a probe and the free-energy estimate from theoretical models of hybridization. In particular, probes on the HuGeneFL GeneChip with high free-energy estimates (weak hybridization) almost always have approximately zero discrimination. The framework can be applied to any Affymetrix oligonucleotide array, and the software is made freely available to the community.

18.
19.
20.
This article describes specific procedures for conducting quality assessment of Affymetrix GeneChip® soybean genome data and for performing analyses to determine differential gene expression using the open-source R programming environment in conjunction with the open-source Bioconductor software. We describe procedures for extracting those Affymetrix probe set IDs related specifically to the soybean genome on the Affymetrix soybean chip and demonstrate the use of exploratory plots including images of raw probe-level data, boxplots, density plots and M versus A plots. RNA degradation and recommended procedures from Affymetrix for quality control are discussed. An appropriate probe-level model provides an excellent quality assessment tool. To demonstrate this, we discuss and display chip pseudo-images of weights, residuals and signed residuals and additional probe-level modeling plots that may be used to identify aberrant chips. The Robust Multichip Averaging (RMA) procedure was used for background correction, normalization and summarization of the AffyBatch probe-level data to obtain expression level data and to discover differentially expressed genes. Examples of boxplots and MA plots are presented for the expression level data. Volcano plots and heatmaps are used to demonstrate the use of (log) fold changes in conjunction with ordinary and moderated t-statistics for determining interesting genes. We show, with real data, how implementation of functions in R and Bioconductor successfully identified differentially expressed genes that may play a role in soybean resistance to a fungal pathogen, Phakopsora pachyrhizi. Complete source code for performing all quality assessment and statistical procedures may be downloaded from our web source: http://css.ncifcrf.gov/services/download/MicroarraySoybean.zip.
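
For readers unfamiliar with M versus A plots, the quantities plotted are simple transforms of paired intensities: M is the log2 ratio and A is the mean log2 intensity. The article works in R and Bioconductor; the Python sketch below only illustrates the arithmetic, and the function name and input layout are assumptions.

```python
from math import log2


def ma_values(x, y):
    """Return (M, A) lists for paired positive intensities x and y.

    M = log2(x) - log2(y)        (log fold change)
    A = (log2(x) + log2(y)) / 2  (average log intensity)
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    m = [log2(a) - log2(b) for a, b in zip(x, y)]
    a_mean = [(log2(a) + log2(b)) / 2 for a, b in zip(x, y)]
    return m, a_mean
```

Plotting M against A for two arrays, or for one array against a reference, makes intensity-dependent bias visible as curvature away from the M = 0 line.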
