20 similar articles found.
1.
This research provides a new way to measure error in microarray data in order to improve gene expression analysis. Microarray data contain many sources of error. To glean information about mRNA expression levels, the true signal must first be separated from noise. This research focuses on the variation that can be captured at the spot level in cDNA microarray images. Variation at other levels, due to differences at the array, dye, and block levels, can be corrected for by a variety of existing normalization procedures. Two signal quality estimates that capture the reliability of each spot printed on a microarray are described. A parametric estimate of within-spot variance, referred to here as σ²_spot, assumes that pixels follow a normal distribution and are spatially correlated. A non-parametric estimate of error, called the mean square prediction error (MSPE), assumes that spots of high quality possess pixels that are similar to their neighbors. This paper provides a framework for using either spot quality measure in downstream analysis, specifically as weights in regression models. Using these spot quality estimates as weights can result in greater statistical efficiency when modeling microarray data.
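The use of a spot quality estimate as a regression weight can be illustrated with a minimal sketch. It assumes inverse-variance weights (1 / σ²_spot) and a simple one-predictor linear model; the function name and the example numbers are hypothetical and are not taken from the paper, which may define its weights and models differently.

```python
def weighted_linear_fit(x, y, spot_variance):
    """Fit y = intercept + slope * x by weighted least squares.

    Each observation is weighted by 1 / spot_variance, so spots with a
    large within-spot variance (low quality) influence the fit less.
    This weighting scheme is an assumption made for illustration.
    """
    if not (len(x) == len(y) == len(spot_variance)):
        raise ValueError("x, y and spot_variance must have the same length")
    if any(v <= 0 for v in spot_variance):
        raise ValueError("spot variances must be positive")

    weights = [1.0 / v for v in spot_variance]
    total_weight = sum(weights)

    # Weighted means of predictor and response.
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total_weight
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total_weight

    # Weighted covariance and variance give the slope in closed form.
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))

    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope
```

With hypothetical data in which one spot is badly off but also has a very large estimated variance, the weighted fit stays close to the trend of the reliable spots, which is the efficiency gain the abstract alludes to.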
3.
4.
5.
Statistical analysis of microarray data: a Bayesian approach (cited 2 times: 0 self-citations, 2 by others)
The potential of microarray data is enormous. It allows us to monitor the expression of thousands of genes simultaneously. A common task with microarrays is to determine which genes are differentially expressed between two samples obtained under two different conditions. Recently, several statistical methods have been proposed to perform such a task when there are replicate samples under each condition. Two major problems arise with microarray data. The first one is that the number of replicates is very small (usually 2-10), leading to noisy point estimates. As a consequence, traditional statistics that are based on the means and standard deviations, e.g. the t-statistic, are not suitable. The second problem is that the number of genes is usually very large (approximately 10,000), and one is faced with an extreme multiple testing problem. Most multiple testing adjustments are relatively conservative, especially when the number of replicates is small. In this paper we present an empirical Bayes analysis that handles both problems very well. Using different parametrizations, we develop four statistics that can be used to test hypotheses about the means and/or variances of the gene expression levels in both one- and two-sample problems. The methods are illustrated using experimental data with prior knowledge. In addition, we present the result of a simulation comparing our methods to well-known statistics and multiple testing adjustments.
6.
Shannon entropy is used to provide an estimate of the number of interpretable components in a principal component analysis. In addition, several ad hoc stopping rules for dimension determination are reviewed and a modification of the broken stick model is presented. The modification incorporates a test for the presence of an "effective degeneracy" among the subspaces spanned by the eigenvectors of the correlation matrix of the data set, then allocates the total variance among subspaces. A summary of the performance of the methods applied to both published microarray data sets and to simulated data is given.
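For orientation, the two ideas named in this abstract can be sketched as follows. The sketch shows the classical, unmodified broken stick rule, not the modification the paper proposes. The entropy function assumes that an "effective number" of components is obtained as exp(H) of the variance shares; the abstract does not state which entropy-based estimate the authors use, so that choice and all function names are hypothetical.

```python
import math


def broken_stick_expectations(p):
    """Expected share of the k-th longest piece when a unit stick is
    broken at random into p pieces: b_k = (1/p) * sum_{i=k..p} 1/i."""
    return [sum(1.0 / i for i in range(k, p + 1)) / p for k in range(1, p + 1)]


def broken_stick_count(eigenvalues):
    """Classical broken stick stopping rule: count leading components whose
    observed variance share exceeds the broken stick expectation."""
    values = sorted(eigenvalues, reverse=True)
    total = sum(values)
    expected = broken_stick_expectations(len(values))
    count = 0
    for value, threshold in zip(values, expected):
        if value / total <= threshold:
            break  # stop at the first component that does not beat chance
        count += 1
    return count


def entropy_effective_count(eigenvalues):
    """exp(Shannon entropy) of the variance shares. Equals p when all p
    eigenvalues are equal and approaches 1 when one eigenvalue dominates.
    Using exp(H) as the estimate is an assumption made for illustration."""
    total = sum(eigenvalues)
    shares = [v / total for v in eigenvalues if v > 0]
    return math.exp(-sum(s * math.log(s) for s in shares))
```

For eigenvalues such as 2.5, 0.3 and 0.2, only the first component exceeds its broken stick expectation (about 0.61 of the total variance for three components), so the classical rule retains one component.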
7.
Use of a three-color cDNA microarray platform to measure and control support-bound probe for improved data quality and reproducibility (cited 2 times: 0 self-citations, 2 by others)
Hessner MJ, Wang X, Khan S, Meyer L, Schlicht M, Tackes J, Datta MW, Jacob HJ, Ghosh S. Nucleic acids research, 2003, 31(11): e60
Construction methodologies for cDNA microarrays lack the ability to determine array integrity prior to hybridization, leaving the array itself a source of uncontrolled experimental variation. We solved this problem through development of a three-color cDNA array platform whereby printed probes are tagged with fluorescein and are compatible with Cy3 and Cy5 target labeling dyes when using confocal laser scanners possessing narrow bandwidths. Here we use this approach to: (i) develop a tracking system to monitor the printing of probe plates at predicted coordinates; (ii) define the quantity of immobilized probe necessary for quality hybridized array data to establish pre-hybridization array selection criteria; (iii) investigate factors that influence probe availability for hybridization; and (iv) explore the feasibility of hybridized data filtering using element fluorescein intensity. A direct and significant relationship (R² = 0.73, P < 0.001) between pre-hybridization average fluorescein intensity and subsequent hybridized replicate consistency was observed, illustrating that data quality can be improved by selecting arrays that meet defined pre-hybridization criteria. Furthermore, we demonstrate that our three-color approach provides a means to filter spots possessing insufficient bound probe from hybridized data sets to further improve data quality. Collectively, this strategy will improve microarray data and increase its utility as a sensitive screening tool.
8.
Hsiao LL, Jensen RV, Yoshida T, Clark KE, Blumenstock JE, Gullans SR. BioTechniques, 2002, 32(2): 330-2, 334, 336
9.
Matthew E Ritchie, Dileepa Diyagama, Jody Neilson, Ryan van Laar, Alexander Dobrovic, Andrew Holloway, Gordon K Smyth. BMC bioinformatics, 2006, 7(1): 261-16
Background
Assessment of array quality is an essential step in the analysis of data from microarray experiments. Once detected, less reliable arrays are typically excluded or "filtered" from further analysis to avoid misleading results.
10.
Background
The imputation of missing values is necessary for the efficient use of DNA microarray data, because many clustering algorithms and some statistical analyses require a complete data set. A few imputation methods for DNA microarray data have been introduced, but the efficiency of the methods was low and the validity of imputed values in these methods had not been fully checked.
11.
The development of microarray technology allows the simultaneous measurement of the expression of many thousands of genes. The information gained offers an unprecedented opportunity to fully characterize biological processes. However, this challenge will only be met if new tools for the efficient integration and interpretation of large datasets are available. One of these tools, pathway analysis, involves looking for consistent but subtle changes in gene expression by incorporating either pathway or functional annotations. We review several methods of pathway analysis and compare the performance of three of them (the binomial distribution, z scores, and gene set enrichment analysis) on two microarray datasets. Pathway analysis is a promising tool to identify the mechanisms that underlie diseases, adaptive physiological compensatory responses and new avenues for investigation.
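Two of the three compared approaches can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the formulation used in the review: the binomial score is taken to be the upper-tail probability of seeing at least the observed number of changed genes in a pathway, and the z score standardizes that count against a hypergeometric null (sampling without replacement). All function and parameter names are hypothetical.

```python
import math


def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of at least k changed
    genes among n pathway genes if each changes with probability p."""
    return sum(math.comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k, n + 1))


def pathway_z_score(changed_in_pathway, pathway_size, changed_total, genes_total):
    """Standardized excess of changed genes in a pathway.

    Expected count and variance come from the hypergeometric distribution,
    i.e. drawing pathway_size genes without replacement from genes_total
    genes of which changed_total are changed.
    """
    rate = changed_total / genes_total
    expected = pathway_size * rate
    variance = (
        pathway_size
        * rate
        * (1 - rate)
        * (1 - (pathway_size - 1) / (genes_total - 1))
    )
    return (changed_in_pathway - expected) / math.sqrt(variance)
```

For example, `pathway_z_score(15, 50, 100, 1000)` compares 15 changed genes in a 50-gene pathway with the 5 expected at a 10% background rate, and `binomial_upper_tail(15, 50, 0.1)` gives the corresponding binomial tail probability. Gene set enrichment analysis is rank-based and is not sketched here.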
12.
Rychlewski L, Kschischo M, Dong L, Schutkowski M, Reimer U. Journal of molecular biology, 2004, 336(2): 307-311
Protein kinases play an important role in cellular signalling. The reliable prediction of their substrates is of high importance for the deciphering of signalling pathways. A recently developed peptide microarray technology for the characterisation of protein kinases delivers data on the individual phosphorylation status of each single member of a large peptide library. These data can be used to approximate the substrate specificity of the investigated kinase. We present an approach to process the collected information using a combination of a weight matrix approach and a nearest neighbor approach. Experiments with the protein-tyrosine kinase Abl are conducted to validate the results. Randomly selected peptides (1433) are used to estimate the substrate preferences of the kinase. The obtained prediction results are compared with standard methods. The new approach is tested further on bona fide Abl phosphorylation sites.
13.
14.
Reijmers TH, Maliepaard C, van den Broeck HC, Kessler RW, Toonen MA, van der Voet H. Journal of bioinformatics and computational biology, 2005, 3(4): 891-913
Both cDNA microarray and spectroscopic data provide indirect information about the chemical compounds present in the biological tissue under consideration. In this paper simple univariate and bivariate measures are used to investigate correlations between both types of high dimensional analyses. A large dataset of 42 hemp samples, on which 3456 cDNA clones and 351 NIR wavelengths have been measured, was analyzed using graphical representations. For this purpose we propose clustered correlation and clustered discrimination images. Large, tissue-related differences are seen to dominate the cDNA-NIR correlation structure, but smaller, more difficult to detect, variety-related differences can be found at specific cDNA clone/NIR wavelength combinations.
15.
16.
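Preparation of a positive control for cDNA microarrays and its application in microarray sensitivity analysis (cited 2 times: 0 self-citations, 2 by others)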
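The cDNA microarray is a high-throughput technology for gene expression profiling, with broad prospects for profiling cellular gene expression under physiological and pathological conditions, discovering new genes, and studying gene function. The choice of positive control and the detection sensitivity of the cDNA microarray are among the key issues for its successful application. Using the firefly luciferase gene, which has little phylogenetic homology with human genes, as source material, a universal positive control probe and a corresponding mRNA reference were prepared for cDNA microarrays used in gene expression profiling of humans and other animals. After the mRNA reference was labeled with Cy3 by reverse transcription and hybridized to the DNA microarray, it hybridized specifically with the luciferase gene cDNA fragment and showed no hybridization with the human β-actin gene, the human G3PDH gene, or λDNA/HindIII. The mRNA reference was then added to HepG2 total RNA at different ratios, fluorescently labeled by reverse transcription, and hybridized to the cDNA microarray. When the mRNA content in the total RNA was at a 1/10^4 dilution (that is, approximately 10^8 mRNA molecules), the cDNA microarray could barely detect a hybridization signal from the labeled mRNA product. Moreover, the signal intensity detected by the cDNA microarray was closely related to the concentration of probe immobilized on the array: the hybridization signal was strongest at a probe concentration of 2 g/L and tended to weaken as the probe concentration decreased. The preparation of a universal positive reference for cDNA microarrays and its use in studying detection sensitivity provide a technical foundation and theoretical basis for applying cDNA microarrays to high-throughput gene expression profiling and to the study of novel gene function in humans and other animals.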
17.
18.
19.
Beisvåg V, Kauffmann A, Malone J, Foy C, Salit M, Schimmel H, Bongcam-Rudloff E, Landegren U, Parkinson H, Huber W, Brazma A, Sandvik AK, Kuiper M. BioTechniques, 2011, 50(1): 27-31
While minimum information about a microarray experiment (MIAME) standards have helped to increase the value of the microarray data deposited into public databases like ArrayExpress and Gene Expression Omnibus (GEO), limited means have been available to assess the quality of these data or to identify the procedures used to normalize and transform raw data. The EMERALD FP6 Coordination Action was designed to deliver approaches to assess and enhance the overall quality of microarray data and to disseminate these approaches to the microarray community through an extensive series of workshops, tutorials, and symposia. Tools were developed for assessing data quality and used to demonstrate how the removal of poor-quality data could improve the power of statistical analyses and facilitate analysis of multiple joint microarray data sets. These quality metrics tools have been disseminated through publications and through the software package arrayQualityMetrics. Within the framework provided by the Ontology of Biomedical Investigations, an ontology was developed to describe data transformations, and a software ontology was developed for gene expression analysis software. In addition, the consortium has advocated for the development and use of external reference standards in microarray hybridizations and created the Molecular Methods (MolMeth) database, which provides a central source for methods and protocols focusing on microarray-based technologies.
20.
Quality by design (QbD) is a current structured approach to designing processes that yield a quality product. Knowledge and process understanding cannot be achieved without proper experimental data; hence requirements for measurement error and frequency of measurement of bioprocess variables have to be defined. In this contribution, a model-based approach is used to investigate impact factors on calculated rates to predict the obtainable information from real-time measurements (= signal quality). Measurement error, biological activity, and averaging window (= period of observation) were identified as the biggest impact factors on signal quality. Moreover, signal quality has been set in context with a quantifiable measure using statistical error testing, which can be used as a benchmark for process analytics and exploitation of data. Results have been validated with data from an E. coli batch process. This approach is useful for judging beforehand which process dynamics can be observed with a given bioprocess setup and sampling strategy.