Similar literature
1.
There is an urgent need for bioinformatic methods that allow integrative analysis of multiple microarray data sets. While previous studies have mainly concentrated on reproducibility of gene expression levels within or between different platforms, we propose a novel meta-analytic method that takes into account the vast amount of available probe-level information to combine the expression changes across different studies. We first show that the comparability of relative expression changes and the consistency of differentially expressed genes between different Affymetrix array generations can be considerably improved by determining the expression changes at the probe-level and by considering the latest information on probe-level sequence matching instead of the probe annotations provided by the manufacturer. With the improved probe-level expression change estimates, data from different generations of Affymetrix arrays can be combined more effectively. This will allow for the full exploitation of existing results when designing and analyzing new experiments.

2.
MOTIVATION: Modern strategies for mapping disease loci require efficient genotyping of a large number of known polymorphic sites in the genome. The sensitive and high-throughput nature of hybridization-based DNA microarray technology provides an ideal platform for such an application by interrogating up to hundreds of thousands of single nucleotide polymorphisms (SNPs) in a single assay. Similar to the development of expression arrays, these genotyping arrays pose many data analytic challenges that are often platform specific. Affymetrix SNP arrays, for example, use multiple sets of short oligonucleotide probes for each known SNP, and require effective statistical methods to combine these probe intensities in order to generate reliable and accurate genotype calls. RESULTS: We developed an integrated multi-SNP, multi-array genotype calling algorithm for Affymetrix SNP arrays, MAMS, that combines single-array multi-SNP (SAMS) and multi-array, single-SNP (MASS) calls to improve the accuracy of genotype calls, without the need for training data or computation-intensive normalization procedures as in other multi-array methods. The algorithm uses resampling techniques and model-based clustering to derive single-array-based genotype calls, which are subsequently refined by competitive genotype calls based on MASS clustering. The resampling scheme caps computation for single-array analysis and hence is readily scalable, which is important in view of expanding numbers of SNPs per array. The MASS update is designed to improve calls for atypical SNPs, harboring allele-imbalanced binding affinities, that are difficult to genotype without information from other arrays. Using a publicly available data set of HapMap samples from Affymetrix, and independent calls by alternative genotyping methods from the HapMap project, we show that our approach performs competitively with existing methods. AVAILABILITY: R functions are available upon request from the authors.

3.
We describe a novel algorithm (ChipStat) for detecting gene-expression changes utilizing probe-level comparisons of replicate Affymetrix oligonucleotide microarray data. A combined detection approach is shown to yield greater sensitivity than a number of widely used methodologies including SAM, dChip and logit-T. Using this approach, we identify alterations in functional pathways during murine neonatal-pubertal mammary development that include the coordinate upregulation of major urinary proteins and the downregulation of loci exhibiting reciprocal imprinting.

4.

Background  

To identify differentially expressed genes across experimental conditions in oligonucleotide microarray experiments, existing statistical methods commonly use a summary of probe-level expression data for each probe set and compare replicates of these values across conditions using a form of the t-test or rank sum test. Here we propose the use of a statistical method that takes advantage of the built-in redundancy architecture of high-density oligonucleotide arrays.
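The conventional approach described in item 4 (summarize each probe set per array, then compare the replicate summaries across conditions with a t-test) can be sketched as follows. This is a minimal illustration, not the method the abstract proposes: the median summary, the Welch form of the t statistic, the function names, and the intensity values are all assumptions chosen for the example.

```python
import math
import statistics


def summarize_probe_set(probe_values):
    """Collapse one array's probe-level values for a probe set into one number.

    The median is an assumed choice; the abstract does not name a summary.
    """
    return statistics.median(probe_values)


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of replicate summaries."""
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean_a - mean_b) / standard_error


# Hypothetical log2 intensities: one inner list per array, one value per probe.
control_arrays = [[7.9, 8.1, 8.0, 8.3], [8.2, 8.0, 8.4, 8.1], [7.8, 8.0, 7.9, 8.2]]
treated_arrays = [[9.1, 9.4, 9.0, 9.3], [9.5, 9.2, 9.6, 9.4], [9.0, 9.1, 9.3, 8.9]]

control = [summarize_probe_set(array) for array in control_arrays]
treated = [summarize_probe_set(array) for array in treated_arrays]
print(welch_t(treated, control))
```

Collapsing each probe set to a single value before testing discards the probe-level replication, which appears to be the redundancy the abstract proposes to exploit instead.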

5.
In the past several years, oligonucleotide microarrays have emerged as a widely used tool for the simultaneous, non-biased measurement of expression levels for thousands of genes. Several challenges exist in successfully utilizing this biotechnology; principal among these is analysis of microarray data. An experiment to measure differential gene expression can consist of a dozen microarrays, each consisting of over a hundred thousand data points. Previously, we have described the use of a novel algorithm for analyzing oligonucleotide microarrays and assessing changes in gene expression. This algorithm describes changes in expression in terms of the statistical significance (S-score) of change, which combines signals detected by multiple probe pairs according to an error model characteristic of oligonucleotide arrays. Software is available that simplifies the application of this algorithm to the analysis of oligonucleotide microarray data. The application of this method to problems of the central nervous system is discussed.

6.
MOTIVATION: Although copy-number aberrations are known to contribute to the diversity of the human DNA and cause various diseases, many aberrations and their phenotypes are still to be explored. The recent development of single-nucleotide polymorphism (SNP) arrays provides researchers with tools for calling genotypes and identifying chromosomal aberrations at an order-of-magnitude greater resolution than possible a few years ago. The fundamental problem in array-based copy-number (CN) analysis is to obtain CN estimates at a single-locus resolution with high accuracy and precision such that downstream segmentation methods are more likely to succeed. RESULTS: We propose a preprocessing method for estimating raw CNs from Affymetrix SNP arrays. Its core utilizes a multichip probe-level model analogous to that for high-density oligonucleotide expression arrays. We extend this model by adding an adjustment for sequence-specific allelic imbalances such as cross-hybridization between allele A and allele B probes. We focus on total CN estimates, which allows us to further constrain the probe-level model to increase the signal-to-noise ratio of CN estimates. Further improvement is obtained by controlling for PCR effects. Each part of the model is fitted robustly. The performance is assessed by quantifying how well raw CNs alone differentiate between one and two copies on Chromosome X (ChrX) at a single-locus resolution (27 kb) up to a 200 kb resolution. The evaluation is done with publicly available HapMap data. AVAILABILITY: The proposed method is available as part of an open-source R package named aroma.affymetrix. Because it is a bounded-memory algorithm, any number of arrays can be analyzed.

7.
Zhu B, Ping G, Shinohara Y, Zhang Y, Baba Y. Genomics, 2005, 85(6): 657-665
As the data generated by microarray technology continue to amass, it is necessary to compare and combine gene expression data from different platforms. To evaluate the performance of cDNA and long oligonucleotide (60-mer) arrays, we generated gene expression profiles for two cancer cell lines and compared the data between the two platforms. All 6182 unique genes represented on both platforms were included in the analysis. A limited correlation (r = 0.4708) was obtained and the difference in measurement of low-expression genes was considered to contribute to the limited correlation. Further restriction of the data set to differentially expressed genes detected in cDNA microarrays (1205 genes) and oligonucleotide arrays (1325 genes) showed modest correlations of 0.7076 and 0.6441 between the two platforms. Quantitative real-time PCR measurements of a set of 10 genes showed better correlation with oligonucleotide arrays. Our results demonstrate that there is substantial variation in the data generated from cDNA and 60-mer oligonucleotide arrays. Although general agreement was observed in measurements of differentially expressed genes, we suggest that data from different platforms could not be directly amassed.
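The cross-platform correlations reported in item 7 rest on matching the genes present on both platforms and correlating their measurements. A minimal sketch of that calculation follows, assuming Pearson's r on log ratios keyed by gene identifier; the abstract does not state which correlation coefficient was used, and the gene names, values, and function names below are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cross_platform_r(platform_a, platform_b):
    """Correlate measurements for the genes that both platforms report."""
    shared = sorted(set(platform_a) & set(platform_b))
    return pearson_r([platform_a[g] for g in shared],
                     [platform_b[g] for g in shared])


# Hypothetical log ratios; GENE4 and GENE5 each appear on one platform only.
cdna = {"GENE1": 1.2, "GENE2": -0.4, "GENE3": 0.1, "GENE4": 2.0}
oligo = {"GENE1": 0.9, "GENE2": -0.1, "GENE3": 0.3, "GENE5": 1.5}
print(round(cross_platform_r(cdna, oligo), 3))
```

Restricting `platform_a` and `platform_b` to differentially expressed genes before calling `cross_platform_r` would mirror the second comparison in the abstract.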

8.
In the past several years, oligonucleotide microarrays have emerged as a widely used tool for the simultaneous, non-biased measurement of expression levels for thousands of genes. Several challenges exist in successfully utilizing this biotechnology; principal among these is analysis of microarray data. An experiment to measure differential gene expression can consist of a dozen microarrays, each consisting of over a hundred thousand data points. Previously, we have described the use of a novel algorithm for analyzing oligonucleotide microarrays and assessing changes in gene expression [J. Mol. Biol. 317 (2002) 225]. This algorithm describes changes in expression in terms of the statistical significance (S-score) of change, which combines signals detected by multiple probe pairs according to an error model characteristic of oligonucleotide arrays. Software is available that simplifies the application of this algorithm to the analysis of oligonucleotide microarray data. The application of this method to problems of the central nervous system is discussed.

9.
Microarrays are used to study gene expression in a variety of biological systems. A number of different platforms have been developed, but few studies exist that have directly compared the performance of one platform with another. The goal of this study was to determine array variation by analyzing the same RNA samples with three different array platforms. Using gene expression responses to benzo[a]pyrene exposure in normal human mammary epithelial cells (NHMECs), we compared the results of gene expression profiling using three microarray platforms: photolithographic oligonucleotide arrays (Affymetrix), spotted oligonucleotide arrays (Amersham), and spotted cDNA arrays (NCI). While most previous reports comparing microarrays have analyzed pre-existing data from different platforms, this comparison study used the same sample assayed on all three platforms, allowing for analysis of variation from each array platform. In general, poor correlation was found between corresponding measurements from each platform. Each platform yielded different gene expression profiles, suggesting that while microarray analysis is a useful discovery tool, further validation is needed to extrapolate results for broad use of the data. Also, microarray variability needs to be taken into consideration, not only in the data analysis but also in specific probe selection for each array type.

10.
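Comparison of several gene chip technologies   Cited by: 5 (self-citations: 0, citations by others: 5)
Owing to its inherent advantages, gene chip technology is set to become the main tool for genome sequence analysis and gene expression studies. However, different laboratories use different technology platforms, ranging from oligonucleotide chips of 25 to 80 bases to cDNA arrays of several hundred to several thousand bases. The existence of these different platforms makes it difficult to use chip data and to standardize analysis across the field. A comparison of the platforms shows that short and long oligonucleotide chip technologies have clear advantages and are expected to become the main technical approaches in the future.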

11.
A probe-level model for analysis of GeneChip gene-expression data is presented that identified more than 10,000 single-feature polymorphisms (SFPs) between two barley genotypes. The method has good sensitivity, as 67% of known single-nucleotide polymorphisms (SNPs) were called as SFPs. This method is applicable to all oligonucleotide microarray data, accounts for SNP effects in gene-expression data and represents an efficient and versatile approach for highly parallel marker identification in large genomes.

12.
MOTIVATION: A major focus of current cancer research is to identify genes that can be used as markers for prognosis and diagnosis, and as targets for therapy. Microarray technology has been applied extensively for this purpose, even though it has been reported that the agreement between microarray platforms is poor. A critical question is: how can we best combine the measurements of matched genes across microarray platforms to develop diagnostic and prognostic tools related to the underlying biology? RESULTS: We introduce a statistical approach within a Bayesian framework to combine the microarray data on matched genes from three investigations of gene expression profiling of B-cell chronic lymphocytic leukemia (CLL) and normal B cells (NBC) using three different microarray platforms: oligonucleotide arrays, cDNA arrays printed on glass slides, and cDNA arrays printed on nylon membranes. Using this approach, we identified a number of genes that were consistently differentially expressed between CLL and NBC samples.

13.
We have conducted a study to compare the variability in measured gene expression levels associated with three types of microarray platforms. Total RNA samples were obtained from liver tissue of four male mice, two each from inbred strains A/J and C57BL/6J. The same four samples were assayed on Affymetrix Mouse Genome Expression Set 430 GeneChips (MOE430A and MOE430B), spotted cDNA microarrays, and spotted oligonucleotide microarrays using eight arrays of each type. Variances associated with measurement error were observed to be comparable across all microarray platforms. The MOE430A GeneChips and cDNA arrays had higher precision across technical replicates than the MOE430B GeneChips and oligonucleotide arrays. The Affymetrix platform showed the greatest range in the magnitude of expression levels followed by the oligonucleotide arrays. We observed good concordance in both estimated expression level and statistical significance of common genes between the Affymetrix MOE430A GeneChip and the oligonucleotide arrays. Despite their apparently high precision, cDNA arrays showed poor concordance with other platforms.

14.
DNA microarray technology has been widely used to simultaneously determine the expression levels of thousands of genes. A variety of approaches have been used, both in the implementation of this technology and in the analysis of the large amount of expression data. However, several practical issues still have not been resolved in a satisfactory manner, and among the most critical is the lack of agreement in the results obtained in different array platforms. In this study, we present a comparison of several microarray platforms [Affymetrix oligonucleotide arrays, custom complementary DNA (cDNA) arrays, and custom oligo arrays printed with oligonucleotides from three different sources] as well as analysis of various methods used for microarray target preparation and the reference design. The results indicate that the pairwise correlations of expression levels between platforms are relatively low overall but that the log ratios of the highly expressed genes are strongly correlated, especially between Affymetrix and cDNA arrays. The microarray measurements were compared with quantitative real-time polymerase chain reaction (QRT-PCR) results for 23 genes, and the varying degrees of agreement for each platform were characterized. We have also developed and tested a double amplification method which allows the use of smaller amounts of starting material. The added round of amplification produced reproducible results as compared to the arrays hybridized with single-round amplified targets. Finally, the reliability of using a universal RNA reference for two-channel microarrays was tested and the results suggest that comparisons of multiple experimental conditions using the same control can be accurate.

15.
16.
C57BL/6J (B6) and DBA/2J (D2) are two of the most commonly used inbred mouse strains in neuroscience research. However, the only currently available mouse genome is based entirely on the B6 strain sequence. Consequently, oligonucleotide microarray probes are based solely on this B6 reference sequence, making their application for gene expression profiling comparisons across mouse strains dubious due to their allelic sequence differences, including single nucleotide polymorphisms (SNPs). The emergence of next-generation sequencing (NGS) and the RNA-Seq application provides a clear alternative to oligonucleotide arrays for detecting differential gene expression without the problems inherent to hybridization-based technologies. Using RNA-Seq, an average of 22 million short sequencing reads were generated per sample for 21 samples (10 B6 and 11 D2), and these reads were aligned to the mouse reference genome, allowing 16,183 Ensembl genes to be queried in striatum for both strains. To determine differential expression, 'digital mRNA counting' is applied based on reads that map to exons. The current study compares RNA-Seq (Illumina GA IIx) with two microarray platforms (Illumina MouseRef-8 v2.0 and Affymetrix MOE 430 2.0) to detect differential striatal gene expression between the B6 and D2 inbred mouse strains. We show that by using stringent data processing requirements differential expression as determined by RNA-Seq is concordant with both the Affymetrix and Illumina platforms in more instances than it is concordant with only a single platform, and that instances of discordance with respect to direction of fold change were rare. Finally, we show that additional information is gained from RNA-Seq compared to hybridization-based techniques as RNA-Seq detects more genes than either microarray platform. The majority of genes differentially expressed in RNA-Seq were only detected as present in RNA-Seq, which is important for studies with smaller effect sizes where the sensitivity of hybridization-based techniques could bias interpretation.

17.
We introduce a statistical model for microarray gene expression data that comprises data calibration, the quantification of differential expression, and the quantification of measurement error. In particular, we derive a transformation h for intensity measurements, and a difference statistic Δh whose variance is approximately constant along the whole intensity range. This forms a basis for statistical inference from microarray data, and provides a rational data pre-processing strategy for multivariate analyses. For the transformation h, the parametric form h(x) = arsinh(a + bx) is derived from a model of the variance-versus-mean dependence for microarray intensity data, using the method of variance stabilizing transformations. For large intensities, h coincides with the logarithmic transformation, and Δh with the log-ratio. The parameters of h together with those of the calibration between experiments are estimated with a robust variant of maximum-likelihood estimation. We demonstrate our approach on data sets from different experimental platforms, including two-colour cDNA arrays and a series of Affymetrix oligonucleotide arrays.
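The claim in item 17 that h(x) = arsinh(a + bx) behaves like a logarithm at large intensities follows from arsinh(z) = ln(z + sqrt(z² + 1)), which approaches ln(2z) as z grows, so the constant ln 2 cancels in the difference Δh. The sketch below checks this numerically. The parameter values and intensities are illustrative assumptions, not estimates from the paper, and a single shared (a, b) stands in for the per-experiment calibration that the abstract mentions.

```python
import math


def h(x, a, b):
    """Variance-stabilizing transform h(x) = arsinh(a + b*x)."""
    return math.asinh(a + b * x)


def delta_h(x1, x2, a, b):
    """Difference statistic: h applied to two intensities, then subtracted."""
    return h(x1, a, b) - h(x2, a, b)


# Assumed, illustrative parameters (not fitted to any data set).
a, b = 5.0, 0.01

# Large intensities: delta_h is close to the natural-log ratio.
high_1, high_2 = 2.0e6, 5.0e5
print(delta_h(high_1, high_2, a, b), math.log(high_1 / high_2))

# Small intensities: h stays finite at x = 0, where the logarithm is undefined.
print(h(0.0, a, b))
```

Unlike the plain logarithm, h is defined for zero and negative arguments, which is one reason a transform of this shape can be applied across the whole intensity range.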

18.
Are data from different gene expression microarray platforms comparable?   Cited by: 8 (self-citations: 0, citations by others: 8)
Many commercial and custom-made microarray formats are routinely used for large-scale gene expression surveys. Here, we sought to determine the level of concordance between microarray platforms by analyzing breast cancer cell lines with in situ synthesized oligonucleotide arrays (Affymetrix HG-U95v2), commercial cDNA microarrays (Agilent Human 1 cDNA), and custom-made cDNA microarrays from a sequence-validated 13K cDNA library. Gene expression data from the commercial platforms showed good correlations across the experiments (r = 0.78-0.86), whereas the correlations between the custom-made and either of the two commercial platforms were lower (r = 0.62-0.76). Discrepant findings were due to clone errors on the custom-made microarrays, old annotations, or unknown causes. Even within a platform, there can be several ways to analyze data that may influence the correlation between platforms. Our results indicate that combining data from different microarray platforms is not straightforward. Variability of the data represents a challenge for developing future diagnostic applications of microarrays.

19.
MOTIVATION: There is a very large and growing level of effort toward improving the platforms, experiment designs, and data analysis methods for microarray expression profiling. Along with a growing richness in the approaches, there is a growing confusion among most scientists as to how to make objective comparisons and choices between them for different applications. There is a need for a standard framework for the microarray community to compare and improve analytical and statistical methods. RESULTS: We report on a microarray data set comprising 204 in-situ synthesized oligonucleotide arrays, each hybridized with two-color cDNA samples derived from 20 different human tissues and cell lines. Design of the approximately 24 000 60mer oligonucleotides that report approximately 2500 known genes on the arrays, and design of the hybridization experiments, were carried out in a way that supports the performance assessment of alternative data processing approaches and of alternative experiment and array designs. We also propose standard figures of merit for success in detecting individual differential expression changes or expression levels, and for detecting similarities and differences in expression patterns across genes and experiments. We expect this data set and the proposed figures of merit will provide a standard framework for much of the microarray community to compare and improve many analytical and statistical methods relevant to microarray data analysis, including image processing, normalization, error modeling, combining of multiple reporters per gene, use of replicate experiments, and sample referencing schemes in measurements based on expression change. AVAILABILITY/SUPPLEMENTARY INFORMATION: Expression data and supplementary information are available at http://www.rii.com/publications/2003/HE_SDS.htm

20.
Three different software packages for the probe-level analysis of high-density oligonucleotide microarray data were compared using an experiment-derived data set that was validated using real-time PCR. The efficiency with which these three programs could identify true positives in this data set was assessed. In addition, estimates of false-positive and false-negative rates were determined. The performance of the programs using very small data sets was also compared, and recommendations for use are suggested.
