1.
This article focuses on microarray experiments with two or more factors in which treatment combinations of the factors corresponding to the samples paired together onto arrays are not completely random. A main effect of one (or more) factor(s) is confounded with arrays (the experimental blocks). This is called a split-plot microarray experiment. We utilise an analysis of variance (ANOVA) model to assess differentially expressed genes for between-array and within-array comparisons that are generic under a split-plot microarray experiment. Instead of standard t- or F-test statistics that rely on mean square errors of the ANOVA model, we use a robust method, referred to as 'a pooled percentile estimator', to identify genes that are differentially expressed across different treatment conditions. We illustrate the design and analysis of split-plot microarray experiments based on a case application described by Jin et al. A brief discussion of power and sample size for split-plot microarray experiments is also presented.
2.
3.
Salicru Miquel Vives Sergi Zheng Tian 《IEEE/ACM transactions on computational biology and bioinformatics / IEEE, ACM》2009,6(4):594-604
Cluster analysis has proven to be a useful tool for investigating the association structure among genes in a microarray data set. There is a rich literature on cluster analysis and various techniques have been developed. Such analyses heavily depend on an appropriate (dis)similarity measure. In this paper, we introduce a general clustering approach based on the confidence interval inferential methodology, which is applied to gene expression data of microarray experiments. Emphasis is placed on data with low replication (three or five replicates). The proposed method makes more efficient use of the measured data and avoids the subjective choice of a dissimilarity measure. This new methodology, when applied to real data, provides an easy-to-use bioinformatics solution for the cluster analysis of microarray experiments with replicates (see the Appendix). Even though the method is presented under the framework of microarray experiments, it is a general algorithm that can be used to identify clusters in any situation. The method's performance is evaluated using simulated and publicly available data sets. Our results also clearly show that our method is not an extension of the conventional clustering method based on correlation or Euclidean distance.
4.
In this article we propose two practical types of designs for large time-course, dual-channel microarray experiments. One type consists of several interwoven loops, and the other type combines reference and loop designs. By representing the experiment as a graph, where the timepoints are nodes and the arrays are edges, we demonstrate how the time contrasts between any two timepoints can be estimated, provided that there is a path of edges linking them. In addition, we give a general formula for the variance of such contrasts. The efficiency of the proposed designs is evaluated by estimating the variances of the log-ratios of the comparisons of interest.
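The graph view in this abstract can be illustrated with a minimal Python sketch. It only checks the stated condition that a time contrast is estimable when a path of arrays links the two timepoints. The function name, the edge-list representation, and the example design are assumptions made for illustration; the authors' variance formula is not reproduced here.

```python
from collections import deque


def contrast_is_estimable(arrays, t_from, t_to):
    """Return True if a path of arrays (edges) links the two timepoints (nodes)."""
    # Build an undirected adjacency map: each array pairs two timepoints.
    neighbours = {}
    for a, b in arrays:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    if t_from not in neighbours or t_to not in neighbours:
        return False
    # Breadth-first search from t_from until t_to is reached or nodes run out.
    seen = {t_from}
    queue = deque([t_from])
    while queue:
        node = queue.popleft()
        if node == t_to:
            return True
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append(nxt)
    return False


# Hypothetical design: a loop over timepoints 0..3 plus a disconnected pair (4, 5).
loop_design = [(0, 1), (1, 2), (2, 3), (3, 0), (4, 5)]
print(contrast_is_estimable(loop_design, 0, 2))
print(contrast_is_estimable(loop_design, 0, 4))
```

In this hypothetical design, timepoints 0 and 2 are linked through the loop, while timepoint 4 sits in a separate component, so no contrast between 0 and 4 could be estimated from these arrays.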
5.
David W. Schindler 《Ecosystems》1998,1(4):323-334
The results of bottle and mesocosm experiments were compared with those obtained in whole-ecosystem experiments at the Experimental Lakes Area. Unless they can be cleverly designed to mimic major ecosystem processes and community compositions, smaller-scale experiments often give highly replicable, but spurious, answers. Problems with appropriate scaling are difficult to deduce without direct comparisons with whole-ecosystem experiments. Reasons are many, but include inappropriate spatial scales to include whole communities, in particular predators and nocturnally active animals; temporal scales that are too short to assess accurately the response of slow-responding organisms and biogeochemical processes; and elimination of key littoral–pelagic and catchment–lake interactions. Identical studies of limnological processes in lakes of a large range of sizes reveal that scaling correction is also necessary when extrapolating from small lakes to large ones. Accurate management decisions cannot be made with confidence unless ecosystem scales are studied. Received 26 March 1998; accepted 14 May 1998.
6.
Background
As microarray technology has become mature and popular, the selection and use of a small number of relevant genes for accurate classification of samples has become a hot topic in biostatistics and bioinformatics. However, most of the developed algorithms lack the ability to handle multiple classes, arguably a common application. Here, we propose an extension to an existing regularization algorithm, called Threshold Gradient Descent Regularization (TGDR), to specifically tackle multi-class classification of microarray data. When there are several microarray experiments addressing the same or similar objectives, one option is to use a meta-analysis version of TGDR (Meta-TGDR), which treats the classification task as a combination of classifiers with the same structure/model while allowing the parameters to vary across studies. However, the original Meta-TGDR extension did not offer a solution for prediction on independent samples. Here, we propose an explicit method to estimate the overall coefficients of the biomarkers selected by Meta-TGDR. This extension permits broader applicability and allows a comparison between the predictive performance of Meta-TGDR and TGDR using an independent testing set.
Results
Using real-world applications, we demonstrated that the proposed multi-TGDR framework works well and that the number of selected genes is less than the sum of all individualized binary TGDRs. Additionally, Meta-TGDR and TGDR on the batch-effect adjusted pooled data provided approximately the same results. By adding a bagging procedure in each application, stability and good predictive performance are ensured.
Conclusions
Compared with Meta-TGDR, TGDR requires less computing time and does not require samples of all classes in each study. On the adjusted data, it has approximately the same predictive performance as Meta-TGDR. Thus, it is highly recommended.
7.
Douglas Hayden Peter Lazar David Schoenfeld for The Inflammation and the Host Response to Injury Investigators 《PloS one》2009,4(6)
We propose permutation tests based on the pairwise distances between microarrays to compare location, variability, or equivalence of gene expression between two populations. For these tests the entire microarray or some pre-specified subset of genes is the unit of analysis. The pairwise distances only have to be computed once so the procedure is not computationally intensive despite the high dimensionality of the data. An R software package, permtest, implementing the method is freely available from the Comprehensive R Archive Network at http://cran.r-project.org.
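The idea of a distance-based permutation test, with the pairwise distances computed only once, can be sketched in a few lines of Python. This is not the permtest package or its exact test statistics. The statistic used here (mean between-group distance minus mean within-group distance), the function names, and the toy data are all assumptions chosen for illustration.

```python
import math
import random


def pairwise_distances(samples):
    """Euclidean distance matrix between expression profiles, computed once."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = math.dist(samples[i], samples[j])
    return dist


def location_statistic(dist, labels):
    """Assumed statistic: mean between-group minus mean within-group distance."""
    between, within = [], []
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            (within if labels[i] == labels[j] else between).append(dist[i][j])
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_p_value(samples, labels, n_perm=999, seed=0):
    """One-sided permutation p-value; only the labels are reshuffled."""
    rng = random.Random(seed)
    dist = pairwise_distances(samples)  # reused for every permutation
    observed = location_statistic(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if location_statistic(dist, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


# Toy data: two clearly separated groups of four two-gene profiles.
samples = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
           [10.0, 10.1], [10.2, 9.9], [9.8, 10.0], [10.1, 10.3]]
labels = ["A"] * 4 + ["B"] * 4
p_value = permutation_p_value(samples, labels)
```

Because each permutation only re-indexes the stored distance matrix, the cost per permutation depends on the number of arrays and not on the number of genes, which is the computational point the abstract makes.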
8.
9.
Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expression profiles. Specifically, the often large number of non-informative genes on the microarray adversely affects the performance and efficiency of classification algorithms. Furthermore, the skewed ratio of samples to variables poses a risk of overfitting. Thus, in this context, feature selection methods become crucial to select relevant genes and, hence, improve classification accuracy. In this study, we investigated feature selection methods based on gene expression profiles and protein interactions. We found that in our setup, the addition of protein interaction information did not contribute to any significant improvement of the classification results. Furthermore, we developed a novel feature selection method that relies exclusively on observed gene expression changes in microarray experiments, which we call “relative Signal-to-Noise ratio” (rSNR). More precisely, the rSNR ranks genes based on their specificity to an experimental condition, by comparing intrinsic variation, i.e. variation in gene expression within an experimental condition, with extrinsic variation, i.e. variation in gene expression across experimental conditions. Genes with low variation within an experimental condition of interest and high variation across experimental conditions are ranked higher, and help in improving classification accuracy. We compared different feature selection methods on two time-series microarray datasets and one static microarray dataset. We found that the rSNR performed generally better than the other methods.
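The ranking idea described here, low variation within the condition of interest and high variation across conditions, can be illustrated with a small Python sketch. The abstract does not give the rSNR formula, so the score below is an assumed stand-in: the variance of the condition means divided by the variance within the condition of interest. The function name, the `eps` guard, and the example values are hypothetical.

```python
from statistics import mean, pvariance


def rsnr_like_score(expr_by_condition, condition, eps=1e-9):
    """Assumed score: extrinsic variation divided by intrinsic variation."""
    # Intrinsic: spread of the gene's expression within the condition of interest.
    intrinsic = pvariance(expr_by_condition[condition])
    # Extrinsic: spread of the per-condition mean expression across conditions.
    condition_means = [mean(values) for values in expr_by_condition.values()]
    extrinsic = pvariance(condition_means)
    # eps avoids division by zero when replicates are identical.
    return extrinsic / (intrinsic + eps)


# Hypothetical genes: gene_a is stable within "treated" and differs from "control";
# gene_b is noisy within "treated" and has the same mean in both conditions.
gene_a = {"treated": [5.0, 5.1, 4.9], "control": [1.0, 1.2, 0.8]}
gene_b = {"treated": [5.0, 1.0, 3.0], "control": [2.9, 3.1, 3.0]}

score_a = rsnr_like_score(gene_a, "treated")
score_b = rsnr_like_score(gene_b, "treated")
ranking = sorted([("gene_a", score_a), ("gene_b", score_b)],
                 key=lambda item: item[1], reverse=True)
```

Under this assumed score, gene_a ranks above gene_b, matching the qualitative behaviour described in the abstract. The published method may weight or normalise the two kinds of variation differently.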
10.
Color alteration is one of the major indicators of the maturity and level of ripeness of fruits. A strong relationship also exists between the color and acidity of fruits. Three different spectral analyses have been conducted in the visible region of the measured spectrum to quantify the acidity or pH of B10 carambola. Spectral linearisation, gradient shift, and normalisation analyses within the wavelength range of 550 and 675 nm have been applied on the spectra of intact carambola samples. These two wavelength points are selected because of their best response and strong link to carotenoid and chlorophyll contents in carambola. The spectra are measured through reflectance and interactance measurement techniques. High coefficients of determination (R² > 0.7) generated for all analyses indicate that a strong relationship exists between the presented color analyses and the acidity of the carambolas. Interactance has a better accuracy and precision in measuring the carambola acidity compared with the reflectance technique.
11.
12.
Because of the high operating costs of microarray experiments, determining the number of replicates required to detect a significantly differentially expressed gene under a given multiple-testing procedure is of considerable importance. Calculating the power or the number of replicates required under a multiple-testing procedure provides design guidance for microarray experiments. To this end, a mixture distribution model of expression noise is constructed by permutation resampling. The mixture distribution method suits various microarray data types, whether the noise arises from a single source or from multiple sources. With the model and a chosen multiple-testing procedure, one can determine the number of biological replicates required for a given power, determine the power to detect a significantly differentially expressed gene for a given sample size, or choose the best multiple-testing method. As an example, a single-distribution model of the t-statistic was fitted to an observed microarray dataset of 3 000 genes responsive to stroke in rat and then used to calculate the power of four popular multiple-testing procedures to detect a gene with an expression change D. The results show that the B-procedure had the lowest power to detect a gene with a small change, whereas the BH-procedure had the highest. However, all four procedures had the same power to identify a gene with the largest change. As with a single test, the power of the BH-procedure to detect a small change does not vary as the number of genes increases, whereas the power of the other three procedures declines as the number of genes increases.
13.
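Given the cost of microarray experiments, the primary consideration in designing one is how many replicates are needed to detect a significantly differentially expressed gene. Calculating the number of replicates (sample size) or the power required by a multiple-testing procedure provides important guidance for microarray experimental design. To this end, this paper constructs a mixture distribution model of gene expression noise based on permutation resampling. The method applies to all types of gene expression data, whether the expression noise comes from a single source or from multiple sources. Using the mixture model and a multiple-testing procedure, researchers can obtain the minimum number of biological replicates required in a microarray experiment for a given statistical power; determine the power to detect a significantly differentially expressed gene for a given sample size; or select the best testing method given the sample size and the power. As an example, the method was applied to microarray data on 3000 stroke-related genes in rat, and a single-distribution model (that is, a model with a single source of expression noise) was fitted. Based on this model, we calculated the statistical power of four multiple-testing procedures to identify a gene with an expression change D. The results show that, for detecting a small change D, the B-procedure had the lowest power of the four procedures and the BH-procedure the highest. However, all four procedures had the same power to identify a gene with the largest expression change. As with a conventional single test, the power of the BH-procedure to detect a small change does not vary as the number of genes increases, whereas the power of the other three procedures declines as the number of genes increases.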
14.
15.
16.
17.
18.
Dina Schneidman-Duhovny Michal Hammel John A. Tainer Andrej Sali 《Biophysical journal》2013,105(4):962-974
A major challenge in structural biology is to characterize structures of proteins and their assemblies in solution. At low resolution, such a characterization may be achieved by small angle x-ray scattering (SAXS). Because SAXS analyses often require comparing profiles calculated from many atomic models against those determined by experiment, rapid and accurate profile computation from molecular structures is needed. We developed fast open-source x-ray scattering (FoXS) for profile computation. To match the experimental profile within the experimental noise, FoXS explicitly computes all interatomic distances and implicitly models the first hydration layer of the molecule. For assessing the accuracy of the modeled hydration layer, we performed contrast variation experiments for glucose isomerase and lysozyme, and found that FoXS can accurately represent density changes of this layer. The hydration layer model was also compared with a SAXS profile calculated for the explicit water molecules in the high-resolution structures of glucose isomerase and lysozyme. We tested FoXS on eleven protein, one DNA, and two RNA structures, revealing superior accuracy and speed versus CRYSOL, AquaSAXS, the Zernike polynomials-based method, and Fast-SAXS-pro. In addition, we demonstrated a significant correlation of the SAXS score with the accuracy of a structural model. Moreover, FoXS utility for analyzing heterogeneous samples was demonstrated for intrinsically flexible XLF-XRCC4 filaments and Ligase III-DNA complex. FoXS is extensively used as a standalone web server and as a component of integrative structure determination by programs IMP, Chimera, and BILBOMD, as well as in other applications that require rapidly and accurately calculated SAXS profiles.
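A scattering profile computed from all interatomic distances is commonly written with the Debye formula, I(q) = Σ_i Σ_j f_i f_j sin(q d_ij) / (q d_ij). The Python sketch below evaluates that sum directly. It is not the FoXS implementation: it assumes q-independent form factors, omits the hydration layer entirely, and uses hypothetical coordinates and function names.

```python
import math


def debye_intensity(coords, form_factors, q):
    """Evaluate the Debye sum over all atom pairs at scattering vector q."""
    total = 0.0
    n = len(coords)
    for i in range(n):
        for j in range(n):
            # Interatomic distance d_ij; zero when i == j.
            x = q * math.dist(coords[i], coords[j])
            # sin(x)/x tends to 1 as x tends to 0.
            sinc = 1.0 if x == 0.0 else math.sin(x) / x
            total += form_factors[i] * form_factors[j] * sinc
    return total


# Hypothetical two-atom "molecule" with constant form factors 1 and 2,
# placed one distance unit apart.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
form_factors = [1.0, 2.0]

# Profile on a small grid of q values.
profile = [(q, debye_intensity(coords, form_factors, q))
           for q in (0.0, 0.5, 1.0, math.pi)]
```

At q = 0 the sum reduces to the square of the total form factor. The double loop costs time quadratic in the number of atoms, which is why fast profile computation matters when many atomic models have to be scored against one experimental profile.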
19.
Background
With the explosion in data generated using microarray technology by different investigators working on similar experiments, it is of interest to combine results across multiple studies.
20.