Similar articles
 Found 20 similar articles; search time: 46 ms
1.
MOTIVATION: Experimental limitations have resulted in the popularity of parametric statistical tests as a method for identifying differentially regulated genes in microarray data sets. However, these tests assume that the data follow a normal distribution. To date, the assumption that replicate expression values for any gene are normally distributed has not been critically addressed for Affymetrix GeneChip data. RESULTS: The normality of the expression values calculated using four different commercial and academic software packages was investigated, using a combination of statistical tests and visualization techniques, on a data set consisting of the same target RNA applied to 59 human Affymetrix U95A GeneChips. For the majority of probe sets obtained from each analysis suite, the expression data showed a good correlation with normality. The exception was a large number of low-expressed genes in the data set produced using Affymetrix Microarray Suite 5.0, which showed a striking non-normal distribution. In summary, our data provide strong support for the application of parametric tests to GeneChip data sets without the need for data transformation.
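
The abstract above does not name the statistical tests it used, so the following is only an illustrative sketch of one way to check the normality of replicate expression values for a single probe set. It computes sample skewness, excess kurtosis, and the Jarque-Bera statistic using the Python standard library; the choice of the Jarque-Bera test, the function names, and the simulated data are all assumptions made for illustration. The p-value relies on the asymptotic chi-square approximation with 2 degrees of freedom, which is rough for a sample as small as 59 replicates.

```python
import math
import random


def jarque_bera(values):
    """Return (skewness, excess_kurtosis, jb_statistic) for one sample."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least 3 values")
    mean = sum(values) / n
    # Central moments with denominator n (the usual Jarque-Bera convention).
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0:
        raise ValueError("values are constant")
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skew = m3 / m2 ** 1.5
    excess_kurt = m4 / m2 ** 2 - 3.0
    jb = n / 6.0 * (skew ** 2 + excess_kurt ** 2 / 4.0)
    return skew, excess_kurt, jb


def jb_p_value(jb):
    """Asymptotic p-value: survival function of chi-square with 2 df."""
    return math.exp(-jb / 2.0)


if __name__ == "__main__":
    # Hypothetical data: 59 simulated replicate values for one probe set.
    rng = random.Random(0)
    replicates = [rng.gauss(8.0, 0.3) for _ in range(59)]
    skew, kurt, jb = jarque_bera(replicates)
    print(f"skewness={skew:.3f} excess kurtosis={kurt:.3f} p={jb_p_value(jb):.3f}")
```

A small p-value would flag a probe set whose replicates depart from normality; in a real analysis the test would be repeated per probe set, which raises its own multiple-testing question.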

2.
3.
Summaries of Affymetrix GeneChip probe level data   Total citations: 9, self-citations: 0, citations by others: 9
High density oligonucleotide array technology is widely used in many areas of biomedical research for quantitative and highly parallel measurements of gene expression. Affymetrix GeneChip arrays are the most popular. In this technology each gene is typically represented by a set of 11–20 pairs of probes. In order to obtain expression measures it is necessary to summarize the probe level data. Using two extensive spike-in studies and a dilution study, we developed a set of tools for assessing the effectiveness of expression measures. We found that the performance of the current version of the default expression measure provided by Affymetrix Microarray Suite can be significantly improved by the use of probe level summaries derived from empirically motivated statistical models. In particular, improvements in the ability to detect differentially expressed genes are demonstrated.

4.
Summary: Gene expression index estimation is an essential step in analyzing multiple-probe microarray data. Various modeling methods have been proposed in this area. Among these, a popular method proposed in Li and Wong (2001) is based on a multiplicative model, which is similar to the additive model discussed in Irizarry et al. (2003a) on the logarithmic scale. Along this line, Hu et al. (2006) proposed data transformation to improve expression index estimation, based on an ad hoc entropy criterion and a naive grid search approach. In this work, we re‐examined this problem using a new profile likelihood‐based transformation estimation approach that is more statistically elegant and computationally efficient. We demonstrate the applicability of the proposed method using a benchmark Affymetrix U95A spiked‐in experiment. Moreover, we introduced a new multivariate expression index and used the empirical study to show its promise in terms of improving model fitting and the power to detect differential expression over the commonly used univariate expression index. As the other important part of this work, we discussed two commonly encountered practical issues in the application of gene expression indices: normalization and the summary statistic used for detecting differential expression. Our empirical study shows somewhat different findings from the MAQC project (MAQC, 2006).
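
For orientation, the two models contrasted in the abstract above are commonly written as follows. The notation is an assumption chosen for illustration (array index $i$, probe index $j$) and is not taken from the abstract itself: $y_{ij}$ denotes a probe-level intensity (the PM$-$MM difference in the original Li and Wong formulation), $\theta_i$ the expression index on array $i$, and $\phi_j$ the affinity of probe $j$.

```latex
% Multiplicative model (Li and Wong, 2001), commonly written as
y_{ij} = \theta_i \, \phi_j + \varepsilon_{ij}

% Additive model on the log scale (Irizarry et al., 2003a), commonly written as
\log_2 y_{ij} = e_i + a_j + \varepsilon_{ij}
```

Ignoring the error terms, taking logarithms of the first model gives $\log \theta_i + \log \phi_j$, which has the same additive form as the second with $e_i = \log_2 \theta_i$ and $a_j = \log_2 \phi_j$. The two models differ mainly in the scale on which the errors are assumed to be additive, which is why a data transformation between the two can be considered.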

5.
6.
Careful analysis of microarray probe design should be an obligatory component of the MicroArray Quality Control (MAQC) project [Patterson et al., 2006; Shi et al., 2006], initiated by the FDA (USA) to provide quality control tools to researchers of gene expression profiles and to translate microarray technology from bench to bedside. The identification and filtering of unreliable probesets are important preprocessing steps before analysis of microarray data. These steps may result in an essential improvement in the selection of differentially expressed genes, gene clustering and construction of co-regulatory expression networks. We revised the genome localization of the Affymetrix U133A&B GeneChip initial (target) probe sequences, and evaluated the impact of erroneous and poorly annotated target sequences on the quality of gene expression data. We found that about 25% of Affymetrix target sequences overlap with interspersed repeats that could cause cross-hybridization effects. In total, discrepancies in target sequence annotation account for up to approximately 30% of 44692 Affymetrix probesets. We introduce a novel quality control algorithm based on target sequence mapping onto the genome and GeneChip expression data analysis. To validate the quality of probesets we used expression data from large, clinically and genetically distinct groups of breast cancers (249 samples). For the first time, we quantitatively evaluated the effect of repeats and other sources of inadequate probe design on the specificity, reliability and discrimination ability of Affymetrix probesets. We propose that only functionally reliable Affymetrix probesets that passed our quality control algorithm (approximately 86%) should be used for gene expression analysis. The target sequence annotation and filtering are available upon request.

7.
MOTIVATION: The power of microarray analyses to detect differential gene expression strongly depends on the statistical and bioinformatical approaches used for data analysis. Moreover, the simultaneous testing of tens of thousands of genes for differential expression raises the 'multiple testing problem', increasing the probability of obtaining false positive test results. To achieve more reliable results, it is, therefore, necessary to apply adjustment procedures to restrict the family-wise type I error rate (FWE) or the false discovery rate. However, for the biologist the statistical power of such procedures often remains abstract, unless validated by an alternative experimental approach. RESULTS: In the present study, we discuss a multiplicity adjustment procedure applied to classical univariate as well as to recently proposed multivariate gene-expression scores. All procedures strictly control the FWE. We demonstrate that the use of multivariate scores leads to a more efficient identification of differentially expressed genes than the widely used MAS5 approach provided by the Affymetrix software tools (Affymetrix Microarray Suite 5 or GeneChip Operating Software). The practical importance of this finding is successfully validated using real time quantitative PCR and data from spike-in experiments. AVAILABILITY: The R-code of the statistical routines can be obtained from the corresponding author. CONTACT: Schuster@imise.uni-leipzig.de
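
The abstract above does not say which multiplicity adjustment it applies, so the sketch below uses two standard procedures that control the family-wise error rate, Bonferroni and Holm's step-down method, purely as generic examples; they should not be read as the authors' procedure. The function names and the p-values are hypothetical.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject hypothesis i when p_i <= alpha / m (controls the FWE)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm's step-down procedure (controls the FWE, never rejects fewer
    hypotheses than Bonferroni at the same alpha)."""
    m = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (rank = k - 1) is compared to alpha / (m - k + 1).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # stop at the first non-rejection
    return reject


if __name__ == "__main__":
    # Hypothetical p-values for three genes.
    p = [0.01, 0.02, 0.03]
    print("Bonferroni:", bonferroni_reject(p))
    print("Holm:      ", holm_reject(p))
```

With tens of thousands of genes, either threshold becomes very small, which is the loss of power the abstract refers to when it says the statistical power of such procedures often remains abstract.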

8.
AMarge
AMarge is a web tool for the automatic quality assessment of Affymetrix GeneChip data. It is essential to have a trustworthy set of chips in order to derive gene expression data for phenotypic analysis, and AMarge provides a complete and rigorous web-accessible tool to fulfill this need. The quality assessment steps include image plots of weights derived from a robust linear model fit of the data, a 3'/5' RNA digestion plot, and Affymetrix Microarray Suite version 5.0 (MAS 5.0) quality standard procedures. Furthermore, robust multi-array average expression values are generated in order to have a start-up expression set for the subsequent analysis. The results of the complete analysis are summarised and returned as an HTML report. AVAILABILITY: The AMarge web interface is accessible at http://nin.crg.es/cgi-binf/AMargeWeb.cgi. A mirror server is also available at http://bioinformatics.istge.it/AMarge-bin/AMargeWeb.cgi. The software implementing all these methods is part of the Bioconductor project (http://www.bioconductor.org).

9.
High-Density Microarray of Small-Subunit Ribosomal DNA Probes   Total citations: 19, self-citations: 3, citations by others: 16
Ribosomal DNA sequence analysis, originally conceived as a way to provide a universal phylogeny for life forms, has proven useful in many areas of biological research. Some of the most promising applications of this approach are presently limited by the rate at which sequences can be analyzed. As a step toward overcoming this limitation, we have investigated the use of photolithography chip technology to perform sequence analyses on amplified small-subunit rRNA genes. The GeneChip (Affymetrix Corporation) contained 31,179 20-mer oligonucleotides that were complementary to a subalignment of sequences in the Ribosomal Database Project (RDP) (B. L. Maidak et al., Nucleic Acids Res. 29:173-174, 2001). The chip and standard Affymetrix software were able to correctly match small-subunit ribosomal DNA amplicons with the corresponding sequences in the RDP database for 15 of 17 bacterial species grown in pure culture. When bacteria collected from an air sample were tested, the method compared favorably with cloning and sequencing amplicons in determining the presence of phylogenetic groups. However, the method could not resolve the individual sequences comprising a complex mixed sample. Given these results and the potential for future enhancement of this technology, it may become widely useful.

10.
High-density microarray of small-subunit ribosomal DNA probes   Total citations: 18, self-citations: 0, citations by others: 18
Ribosomal DNA sequence analysis, originally conceived as a way to provide a universal phylogeny for life forms, has proven useful in many areas of biological research. Some of the most promising applications of this approach are presently limited by the rate at which sequences can be analyzed. As a step toward overcoming this limitation, we have investigated the use of photolithography chip technology to perform sequence analyses on amplified small-subunit rRNA genes. The GeneChip (Affymetrix Corporation) contained 31,179 20-mer oligonucleotides that were complementary to a subalignment of sequences in the Ribosomal Database Project (RDP) (B. L. Maidak et al., Nucleic Acids Res. 29:173-174, 2001). The chip and standard Affymetrix software were able to correctly match small-subunit ribosomal DNA amplicons with the corresponding sequences in the RDP database for 15 of 17 bacterial species grown in pure culture. When bacteria collected from an air sample were tested, the method compared favorably with cloning and sequencing amplicons in determining the presence of phylogenetic groups. However, the method could not resolve the individual sequences comprising a complex mixed sample. Given these results and the potential for future enhancement of this technology, it may become widely useful.

11.
12.
13.
MOTIVATION: A common task in analyzing microarray data is to determine which genes are differentially expressed across two kinds of tissue samples or samples obtained under two experimental conditions. Recently several statistical methods have been proposed to accomplish this goal when there are replicated samples under each condition. However, it may not be clear how these methods compare with each other. Our main goal here is to compare three methods, the t-test, a regression modeling approach (Thomas et al., Genome Res., 11, 1227-1236, 2001) and a mixture model approach (Pan et al., http://www.biostat.umn.edu/cgi-bin/rrs?print+2001,2001a,b), with particular attention to their different modeling assumptions. RESULTS: It is pointed out that all three methods are based on the two-sample t-statistic or a minor variation of it, but they differ in how they associate a statistical significance level with the corresponding statistic, leading to possibly large differences in the resulting significance levels and the numbers of genes detected. In particular, we give an explicit formula for the test statistic used in the regression approach. Using the leukemia data of Golub et al. (Science, 285, 531-537, 1999), we illustrate these points. We also briefly compare the results with those of several other methods, including the empirical Bayesian method of Efron et al. (J. Am. Stat. Assoc., to appear, 2001) and the Significance Analysis of Microarray (SAM) method of Tusher et al. (Proc. Natl Acad. Sci. USA, 98, 5116-5121, 2001).
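
Since the abstract above notes that all three methods build on the two-sample t-statistic, a minimal sketch of that statistic may help. It covers the textbook pooled-variance and Welch (unequal-variance) forms only; the specific variations used by the regression and mixture-model approaches are not given in the abstract and are not reproduced here. The function name and the example values are hypothetical.

```python
import math


def two_sample_t(x, y, equal_var=True):
    """Two-sample t-statistic for one gene measured under two conditions.

    equal_var=True  -> pooled-variance (Student) statistic
    equal_var=False -> Welch statistic (separate variances)
    """
    nx, ny = len(x), len(y)
    if nx < 2 or ny < 2:
        raise ValueError("each group needs at least 2 replicates")
    mean_x = sum(x) / nx
    mean_y = sum(y) / ny
    # Unbiased sample variances.
    var_x = sum((v - mean_x) ** 2 for v in x) / (nx - 1)
    var_y = sum((v - mean_y) ** 2 for v in y) / (ny - 1)
    if equal_var:
        pooled = ((nx - 1) * var_x + (ny - 1) * var_y) / (nx + ny - 2)
        se = math.sqrt(pooled * (1.0 / nx + 1.0 / ny))
    else:
        se = math.sqrt(var_x / nx + var_y / ny)
    if se == 0:
        raise ValueError("zero standard error: values are constant")
    return (mean_x - mean_y) / se


if __name__ == "__main__":
    # Hypothetical log-expression values for one gene in two conditions.
    condition_a = [7.9, 8.1, 8.0, 8.3]
    condition_b = [8.8, 9.1, 8.7, 9.0]
    print(two_sample_t(condition_a, condition_b))
    print(two_sample_t(condition_a, condition_b, equal_var=False))
```

The statistic itself is the shared ingredient; as the abstract stresses, the methods diverge in how a significance level is attached to it, for example through a reference t-distribution versus an empirically estimated null distribution.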

14.
Logit-t employs a logit transformation for normalization, followed by statistical testing at the probe level. Using four publicly available datasets, together providing 2,710 known positive incidences of differential expression and 2,913,813 known negative incidences, the performance of the statistical tests at the p < 0.01 threshold was as follows: Logit-t provided a 75% positive predictive value, compared with 5% for Affymetrix Microarray Suite 5, 6% for dChip perfect match (PM)-only, and 9% for Robust Multi-array Analysis. Logit-t provided 70% sensitivity, Microarray Suite 5 provided 46%, dChip provided 53% and Robust Multi-array Analysis provided 63%.

15.
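The Tremadocian is the lowest stage of the Ordovician System, and graptolites are an important fossil group for high-resolution stratigraphic subdivision and correlation within it. The Jiangnan Slope Belt is one of the facies regions in China with the highest diversity and abundance of planktonic graptolites during the Tremadocian Age of the Early Ordovician. The Nanba section at Yiyang, Hunan, located in this region, preserves a complete upper Tremadocian graptolite succession, and the strata near the Tremadocian-Floian boundary are continuous. Considerable progress has been made in the study of the upper Tremadocian graptolite strata, but the lower Tremadocian strata have not been studied systematically. In recent years, continued collecting of graptolite specimens from this section has led to the recognition of a lower Tremadocian graptolite biozone, the Rhabdinopora flabelliformis parabola Biozone. To date, five graptolite biozones can be recognized in the Tremadocian of the Nanba section; in ascending order these are the Rhabdinopora flabelliformis parabola Biozone, the Adelograptus tenellus Biozone, the Aorograptus victoriae Biozone, the Araneograptus murrayi Biozone and the Hunnegraptus copiosus Biozone. Based on the biozones recognized so far, and with reference to coeval graptolite stratigraphic data from China and abroad, this paper discusses the Tremadocian graptolite biozone sequence of South China in detail and correlates it precisely with coeval strata in China and elsewhere.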

16.
Prostate cancer is one of the most common malignancies. The development and progression of prostate cancer are driven by a series of genetic and epigenetic events including gene amplification that activates oncogenes and chromosomal deletion that inactivates tumor suppressor genes. Whereas gene amplification occurs in human prostate cancer, gene deletion is more common, and a large number of chromosomal regions have been identified to have frequent deletion in prostate cancer, suggesting that tumor suppressor inactivation is more common than oncogene activation in prostatic carcinogenesis (Knuutila et al., 1998, 1999; Dong, 2001). Among the most frequently deleted chromosomal regions in prostate cancer, target genes such as NKX3-1 from 8p21, PTEN from 10q23 and ATBF1 from 16q22 have been identified by different approaches (He et al., 1997; Li et al., 1997; Sun et al., 2005), and deletion of these genes in mouse prostates has been demonstrated to induce and/or promote prostatic carcinogenesis. For example, knockout of Nkx3-1 in mice induces hyperplasia and dysplasia (Bhatia-Gaur et al., 1999; Abdulkadir et al., 2002) and promotes prostatic tumorigenesis (Abate-Shen et al., 2003), while knockout of Pten alone causes prostatic neoplasia (Wang et al., 2003). Therefore, gene deletion plays a causal role in prostatic carcinogenesis (Dong, 2001).

17.
In this paper we report exploratory analyses of high-density oligonucleotide array data from the Affymetrix GeneChip system with the objective of improving upon currently used measures of gene expression. Our analyses make use of three data sets: a small experimental study consisting of five MGU74A mouse GeneChip arrays; part of the data from an extensive spike-in study conducted by Gene Logic and Wyeth's Genetics Institute involving 95 HG-U95A human GeneChip arrays; and part of a dilution study conducted by Gene Logic involving 75 HG-U95A GeneChip arrays. We display some familiar features of the perfect match and mismatch probe (PM and MM) values of these data, and examine the variance-mean relationship with probe-level data from probes believed to be defective, and so delivering noise only. We explain why we need to normalize the arrays to one another using probe level intensities. We then examine the behavior of the PM and MM using spike-in data and assess three commonly used summary measures: Affymetrix's (i) average difference (AvDiff) and (ii) MAS 5.0 signal, and (iii) the Li and Wong multiplicative model-based expression index (MBEI). The exploratory data analyses of the probe level data motivate a new summary measure that is a robust multi-array average (RMA) of background-adjusted, normalized, and log-transformed PM values. We evaluate the four expression summary measures using the dilution study data, assessing their behavior in terms of bias, variance and (for MBEI and RMA) model fit. Finally, we evaluate the algorithms in terms of their ability to detect known levels of differential expression using the spike-in data. We conclude that there is no obvious downside to using RMA and attaching a standard error (SE) to this quantity using a linear model which removes probe-specific affinities.
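
The abstract above describes RMA as a robust average of log-transformed PM values under a linear model with probe-specific affinities, but it does not spell out the fitting procedure. The sketch below uses Tukey's median polish, a common way to fit such an additive probe-plus-array model robustly; treating it as the fitting step is an assumption here, and the background adjustment and normalization steps are deliberately skipped. The function names and the toy intensities are hypothetical.

```python
import math
from statistics import median


def median_polish(table, max_iter=10, tol=1e-8):
    """Tukey's median polish on a probes x arrays table.

    Decomposes table[i][j] = overall + row_eff[i] + col_eff[j] + resid[i][j].
    Returns (overall, row_eff, col_eff, resid).
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [list(row) for row in table]
    overall = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols
    for _ in range(max_iter):
        # Sweep row medians out of the residuals into the row effects.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            resid[i] = [v - row_med[i] for v in resid[i]]
            row_eff[i] += row_med[i]
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        overall += shift
        # Sweep column medians out of the residuals into the column effects.
        col_med = [median([resid[i][j] for i in range(n_rows)])
                   for j in range(n_cols)]
        for j in range(n_cols):
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
            col_eff[j] += col_med[j]
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        overall += shift
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, row_eff, col_eff, resid


def robust_log2_summary(pm):
    """Per-array expression summary for one probe set.

    pm: rows are probes, columns are arrays; values are PM intensities that
    are assumed to be already background-adjusted and normalized.
    """
    log_pm = [[math.log2(v) for v in row] for row in pm]
    overall, _, col_eff, _ = median_polish(log_pm)
    return [overall + c for c in col_eff]


if __name__ == "__main__":
    # Toy probe set: 3 probes (rows) on 3 arrays (columns).
    pm = [[256, 1024, 512], [512, 2048, 1024], [128, 512, 256]]
    print(robust_log2_summary(pm))
```

In this toy table every probe differs from the others by a constant factor across arrays, so the probe effects absorb the affinities and the array effects carry the expression signal, which is the separation the abstract's linear model aims for.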

18.
Severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) is a novel coronavirus that causes the outbreak of coronavirus disease 2019 (COVID-19) (Li et al., 2020a). Viral nucleic acid testing is the standard method for the laboratory diagnosis of COVID-19 (Wu et al., 2020a; Zhu et al., 2020). Currently, a variety of qPCR-based detection kits are used for laboratory-based detection and confirmation of SARS-CoV-2 infection (Corman et al., 2020; Hussein et al., 2020; Ruhan et al., 2020; Veyer et al., 2020). Conventional qPCR involves virus inactivation, nucleic acid extraction, and qPCR amplification procedures. The process is therefore complicated, usually takes longer than 2 h, and requires biosafety laboratories and professional staff. Thus, qPCR is not suitable for use in the field or in medical units. To reduce the operation steps, automatic integrated qPCR detection systems that combine nucleic acid extraction and qPCR amplification in a sealed cartridge were developed to detect viruses in clinical samples (Li et al., 2020b). However, the detection time is still longer than 1 h. Therefore, rapid nucleic acid detection systems are needed to further improve the detection efficiency.

19.
20.
The life-long addition of new neurons has been documented in many regions of the vertebrate and invertebrate brain, including the hippocampus of mammals (Altman and Das, 1965; Eriksson et al., 1998; Jacobs et al., 2000), song control nuclei of birds (Alvarez-Buylla et al., 1990), and olfactory pathway of rodents (Lois and Alvarez-Buylla, 1994), insects (Cayre et al., 1996) and crustaceans (Harzsch and Dawirs, 1996; Sandeman et al., 1998; Harzsch et al., 1999; Schmidt, 2001). The possibility of persistent neurogenesis in the neocortex of primates is also being widely discussed (Gould et al., 1999; Kornack and Rakic, 2001). In these systems, an effort is underway to understand the regulatory mechanisms that control the timing and rate of neurogenesis. Hormonal cycles (Rasika et al., 1994; Harrison et al., 2001), serotonin (Gould, 1999; Brezun and Daszuta, 2000; Beltz et al., 2001), physical activity (Van Praag et al., 1999) and living conditions (Kemperman and Gage, 1999; Sandeman and Sandeman, 2000) influence the rate of neuronal proliferation and survival in a variety of organisms, suggesting that mechanisms controlling life-long neurogenesis are conserved across a range of vertebrate and invertebrate species. The present article extends these findings by demonstrating circadian control of neurogenesis. Data show a diurnal rhythm of neurogenesis among the olfactory projection neurons in the crustacean brain, with peak proliferation during the hours surrounding dusk, the most active period for lobsters. These data raise the possibility that light-controlled rhythms are a primary regulator of neuronal proliferation, and that previously-demonstrated hormonal and activity-driven influences over neurogenesis may be secondary events in a complex circadian control pathway.

