Similar articles
 20 similar articles found (search time: 15 ms)
1.
DNA microarray technology provides useful tools for profiling global gene expression patterns in different cell/tissue samples. One major challenge is the large number of genes relative to the number of samples. The use of all genes can suppress or reduce the performance of a classification rule due to the noise of nondiscriminatory genes. Selection of an optimal subset from the original gene set becomes an important prestep in sample classification. In this study, we propose a family-wise error (FWE) rate approach to selection of discriminatory genes for two-sample or multiple-sample classification. The FWE approach controls the probability of one or more false positives at a prespecified level. A public colon cancer data set is used to evaluate the performance of the proposed approach for two classification methods: k nearest neighbors (k-NN) and support vector machine (SVM). The selected gene sets from the proposed procedure appear to perform better than or comparably to several results reported in the literature using the univariate analysis without performing multivariate search. In addition, we apply the FWE approach to a toxicogenomic data set with nine treatments (a control and eight metals, As, Cd, Ni, Cr, Sb, Pb, Cu, and AsV) for a total of 55 samples for a multisample classification. Two gene sets are considered: the gene set omegaF formed by the ANOVA F-test, and a gene set omegaT formed by the union of one-versus-all t-tests. The predicted accuracies are evaluated using internal and external cross-validation. Using the SVM classification, the overall accuracies for predicting 55 samples into one of the nine treatments are above 80% for internal cross-validation. OmegaF has slightly higher accuracy rates than omegaT. The overall predicted accuracies are above 70% for the external cross-validation; the two gene sets omegaT and omegaF performed equally well.
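
One common way to control the family-wise error rate when screening genes is Holm's step-down procedure. The sketch below illustrates that idea only; it is not necessarily the procedure used in the study, and the function name `select_genes_fwe` and the p-values are hypothetical.

```python
def select_genes_fwe(p_values, alpha=0.05):
    """Return indices of genes retained by Holm's step-down procedure.

    Assumption: p_values holds one per-gene p-value (for example from a
    t-test or F-test) and alpha is the prespecified FWE level.
    """
    m = len(p_values)
    # Visit genes from the smallest p-value to the largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    selected = []
    for rank, idx in enumerate(order):
        # Holm threshold: alpha / (number of hypotheses not yet rejected).
        if p_values[idx] <= alpha / (m - rank):
            selected.append(idx)
        else:
            break  # Stop at the first non-rejection.
    return sorted(selected)


# Hypothetical p-values for five genes.
p_values = [0.0001, 0.04, 0.012, 0.2, 0.003]
selected = select_genes_fwe(p_values, alpha=0.05)
print(selected)
```

The retained indices would then define the gene subset passed to a classifier such as k-NN or SVM.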

2.

Background  

Three-color microarray experiments can be performed to assess drug effects on the genomic scale. The methodology may be useful in shortening the cycle, reducing the cost, and improving the efficiency in drug discovery and development compared with the commonly used dual-color technology. A visualization tool, the hexaMplot, is able to show the interrelations of gene expressions in normal-disease-drug samples in three-color microarray data. However, it is not enough to assess the complicated drug therapeutic effects based on the plot alone. It is important to explore more effective tools so that a deeper insight into gene expression patterns can be gained with three-color microarrays.

3.
J M Nam. Biometrics 1987, 43(3): 701-705
A simple approximate formula for sample sizes for detecting a linear trend in proportions is derived. The formulas for both the uncorrected and corrected Cochran-Armitage test are given. For two binomial proportions these reduce to those given by Casagrande, Pike, and Smith (1978, Biometrics 34, 483-486). Some numerical results of a power study for small sample sizes show that the nominal power corresponding to the approximate sample size is a reasonably good approximation to the actual power.

4.

Background  

Time-course microarray experiments are widely used to study the temporal profiles of gene expression. Storey et al. (2005) developed a method for analyzing time-course microarray studies that can be applied to discovering genes whose expression trajectories change over time within a single biological group, or those that follow different time trajectories among multiple groups. They estimated the expression trajectories of each gene using natural cubic splines under the null (no time-course) and alternative (time-course) hypotheses, and used a goodness-of-fit test statistic to quantify the discrepancy. The null distribution of the statistic was approximated through a bootstrap method. Gene expression levels in microarray data are often correlated in complicated ways. Accurate type I error control adjusting for multiple testing requires the joint null distribution of the test statistics for a large number of genes. For this purpose, permutation methods have been widely used because of their computational ease and intuitive interpretation.
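
The idea of approximating a joint null distribution by permutation can be illustrated with a single-step maxT adjustment for a simple two-group comparison. This is a minimal sketch and not the time-course method described above: the statistic (an absolute difference in group means), the function names, and the toy data are all assumptions chosen for brevity.

```python
import random
from statistics import mean


def mean_diff(values, labels):
    """Absolute difference between the means of group 1 and group 0."""
    g1 = [v for v, l in zip(values, labels) if l == 1]
    g0 = [v for v, l in zip(values, labels) if l == 0]
    return abs(mean(g1) - mean(g0))


def maxt_adjusted_pvalues(expr, labels, n_perm=1000, seed=0):
    """Single-step maxT adjusted p-values.

    expr is a list of per-gene expression lists (one value per sample).
    The same permuted labels are applied to every gene, so the
    correlation between genes is preserved in the null distribution.
    """
    rng = random.Random(seed)
    observed = [mean_diff(gene, labels) for gene in expr]
    exceed = [0] * len(expr)
    perm = list(labels)
    for _ in range(n_perm):
        rng.shuffle(perm)
        # Largest statistic over all genes under this permutation.
        max_stat = max(mean_diff(gene, perm) for gene in expr)
        for g, obs in enumerate(observed):
            if max_stat >= obs:
                exceed[g] += 1
    # Add one to numerator and denominator so no p-value is exactly zero.
    return [(count + 1) / (n_perm + 1) for count in exceed]


# Toy data: gene 0 differs strongly between groups, gene 1 does not.
expr = [
    [0.1, 0.2, 0.0, 0.1, 5.0, 5.1, 4.9, 5.2],
    [1.0, 2.0, 1.5, 1.2, 1.1, 1.9, 1.4, 1.6],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
adjusted = maxt_adjusted_pvalues(expr, labels, n_perm=500, seed=1)
print(adjusted)
```

Comparing each observed statistic with the permutation maximum is what ties the adjustment to the joint, rather than the marginal, null distribution.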

5.
Practical FDR-based sample size calculations in microarray experiments   Cited by: 5 (self-citations: 2, citations by others: 3)
Motivation: Owing to the experimental cost and difficulty in obtaining biological materials, it is essential to consider appropriate sample sizes in microarray studies. With the growing use of the False Discovery Rate (FDR) in microarray analysis, an FDR-based sample size calculation is essential. Method: We describe an approach to explicitly connect the sample size to the FDR and the number of differentially expressed genes to be detected. The method fits parametric models for degree of differential expression using the Expectation–Maximization algorithm. Results: The applicability of the method is illustrated with simulations and studies of a lung microarray dataset. We propose to use a small training set or published data from relevant biological settings to calculate the sample size of an experiment. Availability: Code to implement the method in the statistical package R is available from the authors. Contact: jhu{at}mdanderson.org

6.

Background  

In a time-course microarray experiment, the expression level for each gene is observed across a number of time-points in order to characterize the temporal trajectories of the gene-expression profiles. For many of these experiments, the scientific aim is the identification of genes for which the trajectories depend on an experimental or phenotypic factor. There is an extensive recent body of literature on statistical methodology for addressing this analytical problem. Most of the existing methods are based on estimating the time-course trajectories using parametric or non-parametric mean regression methods. The sensitivity of these regression methods to outliers, an issue that is well documented in the statistical literature, should be of concern when analyzing microarray data.

7.
MOTIVATION: Spotted arrays are often printed with probes in duplicate or triplicate, but current methods for assessing differential expression are not able to make full use of the resulting information. The usual practice is to average the duplicate or triplicate results for each probe before assessing differential expression. This results in the loss of valuable information about genewise variability. RESULTS: A method is proposed for extracting more information from within-array replicate spots in microarray experiments by estimating the strength of the correlation between them. The method involves fitting separate linear models to the expression data for each gene but with a common value for the between-replicate correlation. The method greatly improves the precision with which the genewise variances are estimated and thereby improves inference methods designed to identify differentially expressed genes. The method may be combined with empirical Bayes methods for moderating the genewise variances between genes. The method is validated using data from a microarray experiment involving calibration and ratio control spots in conjunction with spiked-in RNA. Comparing results for calibration and ratio control spots shows that the common correlation method results in substantially better discrimination of differentially expressed genes from those which are not. The spike-in experiment also confirms that the results may be further improved by empirical Bayes smoothing of the variances when the sample size is small. AVAILABILITY: The methodology is implemented in the limma software package for R, available from the CRAN repository http://www.r-project.org

8.
A novel method of assessing muscle function in the common marmoset was developed as part of a multidisciplinary long-term study. The method involved home cage presentation of a weight-pulling task. Over a 4-5 month period, 38 of 42 animals were successfully trained to displace weights of up to 920 g (mean 612 ± 20 g). Performance, following initial training, was stable and independent of gender or body weight.

9.
An experimental strategy for quality control of antibody microarray analyses is proposed. The method utilizes proteins that are prepared for regular antibody microarray experiments. There is no need to use exogenous positive or negative reference markers and no need to determine the absolute concentration of each individual protein in the sample. Validation experiments support the basic principle of the proposed approach. This method can be a useful tool for assessing the outcome accuracy of microarray experiments.

10.
Workman C, Jensen LJ, Jarmer H, Berka R, Gautier L, Nielser HB, Saxild HH, Nielsen C, Brunak S, Knudsen S. Genome Biology 2002, 3(9): research0048.1-research0048.16

Background  

Microarray data are subject to multiple sources of variation, of which biological sources are of interest whereas most others are only confounding. Recent work has identified systematic sources of variation that are intensity-dependent and non-linear in nature. Systematic sources of variation are not limited to the differing properties of the cyanine dyes Cy5 and Cy3 as observed in cDNA arrays, but are the general case for both oligonucleotide microarray (Affymetrix GeneChips) and cDNA microarray data. Current normalization techniques are most often linear and therefore not capable of fully correcting for these effects.

11.
Matsui S, Noma H. Biometrics 2011, 67(4): 1225-1235
Summary: In microarray screening for differentially expressed genes using multiple testing, assessment of power or sample size is of particular importance to ensure that few relevant genes are removed from further consideration prematurely. In this assessment, adequate estimation of the effect sizes of differentially expressed genes is crucial because of its substantial impact on power and sample-size estimates. However, conventional methods using top genes with largest observed effect sizes would be subject to overestimation due to random variation. In this article, we propose a simple estimation method based on hierarchical mixture models with a nonparametric prior distribution to accommodate random variation and possible large diversity of effect sizes across differential genes, separated from nuisance, nondifferential genes. Based on empirical Bayes estimates of effect sizes, the power and false discovery rate (FDR) can be estimated to monitor them simultaneously in gene screening. We also propose a power index that concerns selection of top genes with largest effect sizes, called partial power. This new power index could provide a practical compromise for the difficulty in achieving high levels of usual overall power as confronted in many microarray experiments. Applications to two real datasets from cancer clinical studies are provided.

12.
13.
The fate of a compound following administration can be determined using the disposition method with 14C-labeled substances, which also allows the measurement of metabolism with CO2 as an expired end product. To replace the laborious collection of CO2 as carbonate in washing bottles, a simple instrument was built for continuous 14CO2 measurement. The air from the metabolic cage is led in a thin layer through a chamber fitted to a foot-monitor, the output of which is online for computation. The instrument is sensitive and calibration is easy.

14.

Background  

DNA microarrays are popular tools for measuring gene expression of biological samples. This ever-increasing popularity is ensuring that a large number of microarray studies are conducted, many of which make their data publicly available for mining by other investigators. Under most circumstances, validation of differential expression of genes is performed on a gene-by-gene basis. Thus, it is not possible to generalize validation results to the remaining majority of non-validated genes or to evaluate the overall quality of these studies.

15.
A simple transformation for sets of range sizes   Cited by: 2 (self-citations: 0, citations by others: 2)
Transformation of data to normality may be illuminating and useful statistically. There are two standard families of transformations: power transformations for positive numbers, bounded at the left, and folded transformations for proportions, bounded both at the left and the right. It has been shown that there is no one satisfactory power transformation for range size data. However, such measures are limited to the right as well as the left, and we consider applying folded transformations to them. Seven data sets of range sizes recorded by 10 km squares are studied. Six are British (native and introduced plants, mammals, dragonflies and two breeding bird surveys); the seventh is of Swiss breeding birds. Using these we show that the right hand limit of the distribution can be estimated and the best folded transformation found. In all cases the right hand limit is larger than the range size of the most widespread species and smaller than the notional scope of the survey. In all cases the logit or flog, the logarithmic folded transformation, is satisfactory: in five cases it is the best. It is well known that abundance is approximately (though not exactly) log-normally distributed. The relationship of that to our discovery that range size data are approximately logit-normal is discussed. There is no fully satisfactory explanation for either observation at present.
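
The logarithmic folded transformation can be sketched as a logit applied to the range size expressed as a proportion of the right hand limit. In this illustration the limit is simply supplied by the caller, whereas the study estimates it from the data; the function name `folded_logit` and the example numbers are hypothetical.

```python
import math


def folded_logit(range_size, upper_limit):
    """Logit of a range size scaled by its right hand limit.

    Assumption: upper_limit is already known (or estimated elsewhere)
    and every range size lies strictly between 0 and upper_limit.
    """
    if not 0 < range_size < upper_limit:
        raise ValueError("range_size must lie strictly between 0 and upper_limit")
    p = range_size / upper_limit
    return math.log(p / (1 - p))


# Hypothetical counts of occupied 10 km squares with an assumed limit of 100.
transformed = [folded_logit(x, 100) for x in (25, 50, 75)]
print(transformed)
```

The transformation is symmetric about half the limit, which is why it suits data bounded on both sides.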

16.
Manipulating single molecules and systems of molecules with mechanical force is a powerful technique to examine their physical properties. Applying force requires attachment of the target molecule to larger objects using some sort of molecular tether, such as a strand of DNA. DNA handle attachment often requires difficult manipulations of the target molecule, which can preclude attachment to unstable, hard to obtain, and/or large, complex targets. Here we describe a method for covalent DNA handle attachment to proteins that simply requires the addition of a preprepared reagent to the protein and a short incubation. The handle attachment method developed here provides a facile approach for studying the biomechanics of biological systems.

17.
The value of chromatin diminution (CD) in different species of freshwater cyclopoid copepods can differ significantly. The biological and evolutionary roles of these differences remain unclear. To expand the knowledge on CD distribution and magnitude in this group of copepods, a quick method for its evaluation was required. This study proposes a simple approach for CD assessment in copepods using quantitative real-time PCR (qPCR). The magnitude of changes in the genome size was assessed by comparing fluorescence curves of qPCR fragments of target genes for pre- and post-diminution materials. The method was tested on four cyclopoid copepod species. In Cyclops kolensis, CD was assessed at 95.3 ± 1.2%; in Acanthocyclops vernalis, at 94.6 ± 0.8%; in C. insignis, at 82.3 ± 5.2%; and, for the first time, CD was found in Megacyclops viridis, at 91.1 ± 2.6%. The advantages of our approach are its rapidity, simplicity and minimal requirements of materials studied.

18.
Jinliang Wang. Molecular Ecology 2009, 18(10): 2148-2164
Equations for the effective size (Ne) of a population were derived in terms of the frequencies of a pair of offspring taken at random from the population being sibs sharing the same one or two parents. Based on these equations, a novel method (called sibship assignment method) was proposed to infer Ne from the sibship frequencies estimated from a sibship assignment analysis, using the multilocus genotypes of a sample of offspring taken at random from a single cohort in a population. Comparative analyses of extensive simulated data and some empirical data clearly demonstrated that the sibship assignment method is much more accurate [measured by the root mean squared error, RMSE, of 1/(2Ne)] than other methods such as the heterozygote excess method, the linkage disequilibrium method, and the temporal method. The RMSE of 1/(2Ne) from the sibship assignment method is typically a small fraction of that from other methods. The new method is also more general and flexible than other methods. It can be applied to populations with nonoverlapping generations of both diploid and haplodiploid species under random or nonrandom mating, using either codominant or dominant markers. It can also be applied to the estimation of Ne for a subpopulation with immigration. With some modification, it could be applied to monoecious diploid populations with self-fertilization, and to populations with overlapping generations.

19.
20.
Sample size has long been one of the basic issues since the start of the DNA barcoding initiative and the global biodiversity investigation. As a contribution to resolving this problem, we propose a simple resampling approach to estimate several key sample sizes for a DNA barcoding project. We illustrate our approach using both structured populations simulated under the coalescent and real species of skipper butterflies. We found that sample sizes widely used in DNA barcoding are insufficient to assess the genetic diversity of a species, and that population structure affects the estimation of the sample sizes and hence may bias species identification.
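
One simple form a resampling approach could take is a haplotype accumulation curve: repeatedly draw subsamples of a given size and count how many distinct haplotypes each draw recovers. The sketch below is an assumption-laden illustration, not the authors' procedure; the function name `mean_haplotypes_seen` and the haplotype labels are hypothetical.

```python
import random


def mean_haplotypes_seen(haplotypes, sample_size, n_resamples=1000, seed=0):
    """Average number of distinct haplotypes in random subsamples.

    haplotypes is a list with one haplotype label per sequenced
    individual; subsamples are drawn without replacement.
    """
    rng = random.Random(seed)
    total = 0
    for _ in range(n_resamples):
        draw = rng.sample(haplotypes, sample_size)
        total += len(set(draw))
    return total / n_resamples


# Hypothetical data: 20 individuals carrying 4 haplotypes at uneven frequencies.
haplotypes = ["A"] * 12 + ["B"] * 5 + ["C"] * 2 + ["D"] * 1
curve = {n: mean_haplotypes_seen(haplotypes, n) for n in (1, 5, 10, 20)}
print(curve)
```

Reading off where such a curve flattens gives a rough sense of how many individuals are needed before additional sampling stops revealing new diversity.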
