Similar literature
 Found 20 similar articles (search time: 31 ms)
1.
Gene expression analysis by means of microarrays is based on the sequence-specific binding of RNA to DNA oligonucleotide probes and its measurement using fluorescent labels. The binding of RNA fragments involving sequences other than the intended target is problematic because it adds a chemical background to the signal, which is not related to the expression level of the target gene. The article presents a molecular signature of specific and nonspecific hybridization with potential consequences for gene expression analysis. We analyzed the signal intensities of perfect match (PM) and mismatch (MM) probes of GeneChip microarrays to specify the effect of specific and nonspecific hybridization. We found that these events give rise to different relations between the PM and MM intensities as a function of the middle base of the PM, namely a triplet-like (C > G ≈ T > A > 0) and a duplet-like (C ≈ T > 0 > G ≈ A) pattern of the PM-MM log-intensity difference upon binding of specific and nonspecific RNA fragments, respectively. The systematic behavior of the intensity difference can be rationalized on the level of base pairings of DNA/RNA oligonucleotide duplexes in the middle of the probe sequence. Nonspecific binding is characterized by the reversal of the central Watson-Crick (WC) pairing for each PM/MM probe pair, whereas specific binding refers to the combination of a WC and a self-complementary (SC) pairing in PM and MM probes, respectively. The Gibbs free energy contribution of WC pairs to duplex stability is asymmetric for purines and pyrimidines of the PM and decreases according to C > G ≈ T > A. SC pairings on average contribute only weakly to duplex stability. The intensity of complementary MM introduces a systematic source of variation which decreases the precision of expression measures based on the MM intensities.
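
The grouping described in this abstract (PM-MM log-intensity difference as a function of the PM middle base) can be illustrated with a minimal sketch. The function name, the input layout, and the choice of log base 10 are assumptions made for illustration and are not taken from the article.

```python
import math
from collections import defaultdict


def mean_log_diff_by_middle_base(probe_pairs):
    """Average log10(PM) - log10(MM) per middle base of the PM probe.

    probe_pairs: iterable of (pm_sequence, pm_intensity, mm_intensity).
    The input layout is hypothetical; intensities must be positive.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for seq, pm, mm in probe_pairs:
        # For a 25-mer this is 0-based index 12, i.e. the 13th base.
        middle = seq[len(seq) // 2]
        sums[middle] += math.log10(pm) - math.log10(mm)
        counts[middle] += 1
    return {base: sums[base] / counts[base] for base in sums}
```

A positive mean for a base means PM exceeds MM on average for probes with that middle base; a negative mean means the MM probe is the brighter one.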

2.
Li C, Hung Wong W. Genome Biology, 2001, 2(8): research0032.1-research0032.11

Background

A model-based analysis of oligonucleotide expression arrays that we developed previously uses a probe-sensitivity index to capture the response characteristic of a specific probe pair and calculates model-based expression indexes (MBEI). Each MBEI has a standard error attached to it as a measure of accuracy. Here we investigate the stability of the probe-sensitivity index across different tissue types, the reproducibility of results in replicate experiments, and the use of MBEI in perfect match (PM)-only arrays.

Results

Probe-sensitivity indexes are stable across tissue types. The target gene's presence in many arrays of an array set allows the probe-sensitivity index to be estimated accurately. We extended the model to obtain expression values for PM-only arrays, and found that the 20-probe PM-only model is comparable to the 10-probe PM/MM difference model in terms of the expression correlations with the original 20-probe PM/MM difference model. The MBEI method is able to extend the reliable detection limit of expression to a lower mRNA concentration. The standard errors of MBEI can be used to construct confidence intervals of fold changes, and the lower confidence bound of fold change is a better ranking statistic for filtering genes. We can assign reliability indexes for genes in a specific cluster of interest in hierarchical clustering by resampling clustering trees. The software dChip, which implements many of these analysis methods, is made available.

Conclusions

The model-based approach reduces the variability of low expression estimates, and provides a natural method of calculating expression values for PM-only arrays. The standard errors attached to expression values can be used to assess the reliability of downstream analysis.
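
The abstract does not restate the model, but the Li and Wong multiplicative form is commonly written as PM_ij - MM_ij = theta_i * phi_j + error, with theta_i the expression index of array i, phi_j the probe-sensitivity index of probe j, and the constraint sum_j phi_j^2 = J. The following is a minimal sketch of fitting that form by alternating least squares. The function name and iteration count are assumptions, and outlier handling and standard errors are omitted, so this is not the dChip implementation.

```python
def fit_mbei(diffs, n_iter=50):
    """Fit diffs[i][j] ~ theta[i] * phi[j] by alternating least squares.

    diffs[i][j] is PM - MM for array i and probe j.
    Returns (theta, phi) with sum(phi_j ** 2) == number of probes.
    Assumes the data are not all zero.
    """
    n_arrays, n_probes = len(diffs), len(diffs[0])
    phi = [1.0] * n_probes  # starting point already satisfies the constraint

    def theta_given(phi_values):
        # Least-squares theta_i for fixed phi.
        phi_ss = sum(p * p for p in phi_values)
        return [sum(d * p for d, p in zip(row, phi_values)) / phi_ss
                for row in diffs]

    for _ in range(n_iter):
        theta = theta_given(phi)
        # Least-squares phi_j for fixed theta.
        theta_ss = sum(t * t for t in theta)
        phi = [sum(diffs[i][j] * theta[i] for i in range(n_arrays)) / theta_ss
               for j in range(n_probes)]
        # Rescale so that sum(phi_j ** 2) equals the number of probes.
        scale = (n_probes / sum(p * p for p in phi)) ** 0.5
        phi = [p * scale for p in phi]
    return theta_given(phi), phi
```

With noise-free rank-one input the procedure recovers theta and phi exactly; with real data the residuals would be inspected to flag outlier probes or arrays.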

3.
Empirical establishment of oligonucleotide probe design criteria   Total citations: 6 (self-citations: 0; citations by others: 6)
Criteria for the design of gene-specific and group-specific oligonucleotide probes were established experimentally via an oligonucleotide array that contained perfect match (PM) and mismatch probes (50-mers and 70-mers) based upon four genes. The effects of probe-target identity, continuous stretch, mismatch position, and hybridization free energy on specificity were tested. Little hybridization was observed at a probe-target identity of ≤85% for both 50-mer and 70-mer probes. PM signal intensities (33 to 48%) were detected at a probe-target identity of 94% for 50-mer oligonucleotides and 43 to 55% for 70-mer probes at a probe-target identity of 96%. When the effects of sequence identity and continuous stretch were considered independently, a stretch probe (>15 bases) contributed an additional 9% of the PM signal intensity compared to a nonstretch probe (≤15 bases) at the same identity level. Cross-hybridization increased as the length of continuous stretch increased. A 35-base stretch for 50-mer probes or a 50-base stretch for 70-mer probes had approximately 55% of the PM signal. Little cross-hybridization was observed for probes with a minimal binding free energy greater than −30 kcal/mol for 50-mer probes or −40 kcal/mol for 70-mer probes. Based on the experimental results, a set of criteria are suggested for the design of gene-specific and group-specific oligonucleotide probes, and the experimentally established criteria should provide valuable information for new software and algorithms for microarray-based studies.

4.

Background  

Affymetrix gene expression arrays incorporate paired perfect match (PM) and mismatch (MM) probes to distinguish true signals from those arising from cross-hybridization events. An MM signal often shows greater intensity than a PM signal; we propose that one underlying cause is the presence of allelic variants arising from single nucleotide polymorphisms (SNPs). To annotate and characterize SNP contributions to anomalous probe binding behavior we have developed a software tool called AffyMAPSDetector.

5.
Empirical Establishment of Oligonucleotide Probe Design Criteria   Total citations: 11 (self-citations: 0; citations by others: 11)
Criteria for the design of gene-specific and group-specific oligonucleotide probes were established experimentally via an oligonucleotide array that contained perfect match (PM) and mismatch probes (50-mers and 70-mers) based upon four genes. The effects of probe-target identity, continuous stretch, mismatch position, and hybridization free energy on specificity were tested. Little hybridization was observed at a probe-target identity of ≤85% for both 50-mer and 70-mer probes. PM signal intensities (33 to 48%) were detected at a probe-target identity of 94% for 50-mer oligonucleotides and 43 to 55% for 70-mer probes at a probe-target identity of 96%. When the effects of sequence identity and continuous stretch were considered independently, a stretch probe (>15 bases) contributed an additional 9% of the PM signal intensity compared to a nonstretch probe (≤15 bases) at the same identity level. Cross-hybridization increased as the length of continuous stretch increased. A 35-base stretch for 50-mer probes or a 50-base stretch for 70-mer probes had approximately 55% of the PM signal. Little cross-hybridization was observed for probes with a minimal binding free energy greater than −30 kcal/mol for 50-mer probes or −40 kcal/mol for 70-mer probes. Based on the experimental results, a set of criteria are suggested for the design of gene-specific and group-specific oligonucleotide probes, and the experimentally established criteria should provide valuable information for new software and algorithms for microarray-based studies.
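
The thresholds reported in this abstract could be combined into a simple screening predicate, sketched below. The abstract does not spell out the final criteria, so joining the three thresholds with a logical AND, as well as the function and parameter names, are assumptions made for illustration only.

```python
# Free-energy cutoffs in kcal/mol by probe length, as reported in the abstract.
FREE_ENERGY_CUTOFF = {50: -30.0, 70: -40.0}


def likely_gene_specific(probe_length, identity_pct, longest_stretch, free_energy):
    """Hypothetical screen of a probe against its closest non-target sequence.

    identity_pct: percent identity to the non-target (<= 85 gave little signal).
    longest_stretch: longest continuous matching run in bases (<= 15 is "nonstretch").
    free_energy: minimal binding free energy to the non-target in kcal/mol.
    Only 50-mer and 70-mer probes are covered; other lengths raise KeyError.
    """
    return (
        identity_pct <= 85.0
        and longest_stretch <= 15
        and free_energy > FREE_ENERGY_CUTOFF[probe_length]
    )
```

For example, a 50-mer with 80% identity, a 12-base stretch, and a free energy of -25 kcal/mol to its closest non-target passes this screen, while the same probe at 94% identity does not.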

6.
7.
MOTIVATION: Oligonucleotide expression arrays exhibit systematic and reproducible variation produced by the multiple distinct probes used to represent a gene. Recently, a gene expression index has been proposed that explicitly models probe effects, and provides improved fits of hybridization intensity for arrays containing perfect match (PM) and mismatch (MM) probe pairs. RESULTS: Here we use a combination of analytical arguments and empirical data to show directly that the estimates provided by model-based expression indexes are superior to those provided by commercial software. The improvement is greatest for genes in which probe effects vary substantially, and modeling the PM and MM intensities separately is superior to using the PM-MM differences. To empirically compare expression indexes, we designed a mixing experiment involving three groups of human fibroblast cells (serum starved, serum stimulated, and a 50:50 mixture of starved/stimulated), with six replicate HuGeneFL arrays in each group. Careful spiking of control genes provides evidence that 88-98% of the genes on the array are detectably transcribed, and that the model-based estimates can accurately detect the presence versus absence of a gene. The use of extensive replication from single RNA sources enables exploration of the technical variability of the array.

8.

Background  

Microarray technology is a high-throughput method for measuring the expression levels of thousands of genes simultaneously. The observed intensities include a contribution from non-specific binding, which is a major drawback of microarray data. The Affymetrix GeneChip includes mismatch (MM) probes with the intention of measuring non-specific binding, but opinions differ regarding the usefulness of MM measures. It should be noted that not all observed intensities are associated with expressed genes; many are associated with unexpressed genes, whose measured values reflect mere noise due to non-specific binding, cross-hybridization, or stray signals. The implicit assumption that all genes are expressed leads to poor performance of microarray data analyses. We assume two functional states of a gene, expressed or unexpressed, and propose a robust method to estimate gene expression states using an order relationship between PM and MM measures.
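
The abstract does not describe the authors' estimator. As one simple way to use the order relationship between PM and MM within a probe set, the sketch below applies a plain one-sided sign test, swapped in here purely for illustration. The function name is hypothetical.

```python
from math import comb


def sign_test_p_value(pm, mm):
    """One-sided sign test that PM tends to exceed MM within a probe set.

    pm, mm: equal-length sequences of intensities for the same probe pairs.
    Returns P(X >= k) for X ~ Binomial(n, 0.5), where k counts pairs with
    PM > MM and n counts the non-tied pairs.
    """
    pairs = [(p, m) for p, m in zip(pm, mm) if p != m]  # ties carry no order information
    n = len(pairs)
    k = sum(1 for p, m in pairs if p > m)
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
```

A small p-value suggests that PM exceeds MM more often than chance would allow, which is compatible with an expressed gene; a value near 0.5 or above gives no such evidence.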

9.
10.

Background  

The preprocessing of gene expression data obtained from several platforms routinely includes the aggregation of multiple raw signal intensities to one expression value. Examples are the computation of a single expression measure based on the perfect match (PM) and mismatch (MM) probes for the Affymetrix technology, the summarization of bead level values to bead summary values for the Illumina technology or the aggregation of replicated measurements in the case of other technologies including real-time quantitative polymerase chain reaction (RT-qPCR) platforms. The summarization of technical replicates is also performed in other "-omics" disciplines like proteomics or metabolomics.

11.
In this paper we report exploratory analyses of high-density oligonucleotide array data from the Affymetrix GeneChip system with the objective of improving upon currently used measures of gene expression. Our analyses make use of three data sets: a small experimental study consisting of five MGU74A mouse GeneChip arrays; part of the data from an extensive spike-in study conducted by Gene Logic and Wyeth's Genetics Institute involving 95 HG-U95A human GeneChip arrays; and part of a dilution study conducted by Gene Logic involving 75 HG-U95A GeneChip arrays. We display some familiar features of the perfect match and mismatch probe (PM and MM) values of these data, and examine the variance-mean relationship with probe-level data from probes believed to be defective, and so delivering noise only. We explain why we need to normalize the arrays to one another using probe level intensities. We then examine the behavior of the PM and MM using spike-in data and assess three commonly used summary measures: Affymetrix's (i) average difference (AvDiff) and (ii) MAS 5.0 signal, and (iii) the Li and Wong multiplicative model-based expression index (MBEI). The exploratory data analyses of the probe level data motivate a new summary measure that is a robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values. We evaluate the four expression summary measures using the dilution study data, assessing their behavior in terms of bias, variance and (for MBEI and RMA) model fit. Finally, we evaluate the algorithms in terms of their ability to detect known levels of differential expression using the spike-in data. We conclude that there is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities.
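
The summarization step of a robust multi-array average can be sketched with Tukey's median polish, one common robust way to fit an additive probe-plus-array model to log-scale PM values. The sketch assumes the input has already been background-adjusted, normalized, and log2-transformed. The function name and the fixed iteration count are assumptions, and the abstract itself does not name the fitting procedure.

```python
from statistics import median


def median_polish_summary(log_pm, n_iter=10):
    """Summarize one probe set across arrays with median polish.

    log_pm[j][i]: log2 PM value of probe j on array i.
    Returns one expression estimate per array (overall level + array effect).
    """
    resid = [row[:] for row in log_pm]
    n_probes, n_arrays = len(resid), len(resid[0])
    overall = 0.0
    probe_eff = [0.0] * n_probes
    array_eff = [0.0] * n_arrays
    for _ in range(n_iter):
        # Sweep probe (row) medians out of the residuals.
        for j in range(n_probes):
            m = median(resid[j])
            probe_eff[j] += m
            resid[j] = [x - m for x in resid[j]]
        m = median(array_eff)
        overall += m
        array_eff = [a - m for a in array_eff]
        # Sweep array (column) medians out of the residuals.
        for i in range(n_arrays):
            m = median(resid[j][i] for j in range(n_probes))
            array_eff[i] += m
            for j in range(n_probes):
                resid[j][i] -= m
        m = median(probe_eff)
        overall += m
        probe_eff = [p - m for p in probe_eff]
    return [overall + a for a in array_eff]
```

Because medians are used in place of means, a single aberrant probe has little influence on the per-array estimates, which is the sense in which the average is robust.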

12.
Hybridization of rRNAs to microarrays is a promising approach for prokaryotic and eukaryotic species identification. Typically, the amount of bound target is measured by fluorescent intensity and it is assumed that the signal intensity is directly related to the target concentration. Using thirteen different eukaryotic LSU rRNA target sequences and 7693 short perfect match oligonucleotide probes, we have assessed current approaches for predicting signal intensities by comparing Gibbs free energy (ΔG°) calculations to experimental results. Our evaluation revealed a poor statistical relationship between predicted and actual intensities. Although signal intensities for a given target varied up to 70-fold, none of the predictors were able to fully explain this variation. Also, no combination of different free energy terms, as assessed by principal component and neural network analyses, provided a reliable predictor of hybridization efficiency. We also examined the effects of single-base pair mismatch (MM) (all possible types and positions) on signal intensities of duplexes. We found that the MM effects differ from those that were predicted from solution-based hybridizations. These results recommend against the application of probe design software tools that use thermodynamic parameters to assess probe quality for species identification. Our results imply that the thermodynamic properties of oligonucleotide hybridization are still far from understood.

13.

Background

High-density oligonucleotide microarrays provide a powerful tool for assessing differential mRNA expression levels. Characterizing the noise resulting from the enzymatic and hybridization steps, called type I noise, is essential for attributing significance measures to the differential expression scores. We introduce scoring functions for expression ratios, and associated quality measures. Both the PM (Perfect Match) probes and PM-MM differentials (MM is the single MisMatch) are considered as raw intensities. We then characterize the log-ratio noise structure using robust estimates of their intensity dependent variance.

Results

We show the relationships between the obtained ratios and their quality measures. The complementarity of PM and PM-MM methods is emphasized by the probe sets' signal-to-noise measures. Using a large set of replicate experiments, we demonstrate that the noise structure in the log-ratios very closely follows a local log-normal distribution for both the PM and PM-MM cases. Therefore, significance relative to the type I noise can be quantified reliably using the local STD. We discuss the intensity dependence of the STD and show that ratio scores >1.25 are significant in the mid- to high-intensity range.

Conclusions

The ratio noise structure inherent to high-density oligonucleotide arrays can be well described in terms of local log-normal ratio distributions with characteristic intensity dependence. Therefore, robust estimates of the local STD of these distributions provide a simple and powerful way for assessing significance (relative to type I noise) in differential gene expression. This approach will be helpful for improving the reliability of predictions from hybridization experiments in general.
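
A robust, intensity-dependent estimate of the local STD of log-ratios between two replicate arrays might be sketched as follows. The equal-count binning and the scaled median absolute deviation (MAD) are assumptions chosen for illustration, since the abstract does not specify the estimator, and the function name is hypothetical.

```python
import math
from statistics import median


def local_std_of_log_ratios(a, b, n_bins=10):
    """Robust local STD of log2 ratios as a function of intensity.

    a, b: positive intensities of the same probe sets on two replicate arrays.
    Points are sorted by mean log2 intensity and cut into about n_bins
    equal-count bins (a shorter final bin is possible).
    Returns a list of (mean_log2_intensity, robust_std), one entry per bin.
    """
    points = sorted(
        ((math.log2(x) + math.log2(y)) / 2, math.log2(x) - math.log2(y))
        for x, y in zip(a, b)
    )
    size = max(1, len(points) // n_bins)
    result = []
    for start in range(0, len(points), size):
        chunk = points[start:start + size]
        ratios = [r for _, r in chunk]
        center = median(ratios)
        mad = median(abs(r - center) for r in ratios)
        # 1.4826 * MAD estimates the standard deviation for normal data.
        result.append((sum(i for i, _ in chunk) / len(chunk), 1.4826 * mad))
    return result
```

A log-ratio observed between two conditions can then be compared with the local STD of the bin matching its intensity to judge whether it stands out from replicate noise.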

14.
Sequence dependence of cross-hybridization on short oligo microarrays   Total citations: 9 (self-citations: 3; citations by others: 6)
One of the critical problems in short oligo microarray technology is how to deal with cross-hybridization that produces spurious data. Little is known about the details of the cross-hybridization effect at the molecular level. Here, we report a free energy analysis of cross-hybridization on short oligo microarrays using data from a spike-in study. Our analysis revealed that cross-hybridization on the arrays is mostly caused by oligo fragments with a run of 10–16 nt complementary to the probes. Mismatches were estimated to be energetically much more costly in cross-hybridization than in gene-specific hybridization, implying that the sources of cross-hybridization must be very different between the two probes of a PM–MM pair. Consequently, it is unreliable to use the MM probe signal to track the cross-hybridizing signal on the corresponding PM probe. Our results also showed that the oligo fragments tend to bind to the 5′ ends of the probes, and are rarely seen at the 3′ ends. These results are useful for microarray design and data analysis.
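
To make the notion of a complementary run concrete, the sketch below finds the longest stretch of a probe that is perfectly complementary to some stretch of a fragment. It assumes both sequences are written 5′ to 3′ in the DNA alphabet (an RNA fragment would need U mapped to T first), and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe, fragment):
    """Length of the longest contiguous run of the probe that can pair
    perfectly (antiparallel) with a contiguous run of the fragment."""
    target = reverse_complement(fragment)  # the sequence the fragment pairs with
    best = 0
    prev = [0] * (len(target) + 1)
    # Longest-common-substring dynamic programme, one row at a time.
    for p in probe:
        cur = [0] * (len(target) + 1)
        for k, t in enumerate(target, start=1):
            if p == t:
                cur[k] = prev[k - 1] + 1
                best = max(best, cur[k])
        prev = cur
    return best
```

Under the finding reported above, fragments whose longest run with a probe falls in the 10–16 nt range would be the main candidates for cross-hybridization.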

15.
16.
17.

Background  

Affymetrix GeneChips are characterized by probe pairs, a perfect match (PM) and a mismatch (MM) probe differing by a single nucleotide. Most data preprocessing algorithms neglect MM signals, as it was shown that MMs cannot be used as estimators of non-specific hybridization as originally proposed by Affymetrix. The aim of this paper is to study in detail, across a large number of experiments, the behavior of the average PM/MM ratio. This is taken as an indicator of the quality of the hybridization and, when compared between different chip series, of the quality of the chip design.

18.
MOTIVATION: The sensitivity and specificity of branched DNA (bDNA) assays are derived in part through the judicious design of the capture and label extender probes. To minimize non-specific hybridization (NSH) events, which elevate assay background, candidate probes must be computer screened for complementarity with generic sequences present in the assay. RESULTS: We present a software application which allows for rapid and flexible design of bDNA probesets for novel targets. It includes an algorithm for estimating the magnitude of NSH contribution to background, a mechanism for removing probes with elevated contributions, a methodology for the simultaneous design of probesets for multiple targets, and a graphical user interface which guides the user through the design steps. AVAILABILITY: The program is available as a commercial package through the Pharmaceutical Drug Discovery program at Chiron Diagnostics.

19.
MOTIVATION: Microarrays are a fast and cost-effective method of performing thousands of DNA hybridization experiments simultaneously. DNA probes are typically used to measure the expression level of specific genes. Because probes greatly vary in the quality of their hybridizations, choosing good probes is a difficult task. If one could accurately choose probes that are likely to hybridize well, then fewer probes would be needed to represent each gene in a gene-expression microarray, and, hence, more genes could be placed on an array of a given physical size. Our goal is to empirically evaluate how successfully three standard machine-learning algorithms (naïve Bayes, decision trees, and artificial neural networks) can be applied to the task of predicting good probes. Fortunately it is relatively easy to get training examples for such a learning task: place various probes on a gene chip, add a sample where the corresponding genes are highly expressed, and then record how well each probe measures the presence of its corresponding gene. With such training examples, it is possible that an accurate predictor of probe quality can be learned. RESULTS: Two of the learning algorithms we investigate (naïve Bayes and neural networks) learn to predict probe quality surprisingly well. For example, in the top ten predicted probes for a given gene not used for training, on average about five rank in the top 2.5% of that gene's hundreds of possible probes. Decision-tree induction and the simple approach of using predicted melting temperature to rank probes perform significantly worse than these two algorithms. The features we use to represent probes are very easily computed and the time taken to score each candidate probe after training is minor. Training the naïve Bayes algorithm takes very little time, and while it takes over 10 times as long to train a neural network, that time is still not very substantial (on the order of a few hours on a desktop workstation).
We also report the information contained in the features we use to describe the probes. We find the fraction of cytosine in the probe to be the most informative feature. We also find, not surprisingly, that the nucleotides in the middle of the probe's sequence are more informative than those at the ends of the sequence.

20.
