Similar literature
 Found 20 similar articles; search took 15 ms
1.
In response to the rapid development of DNA microarray technologies, many algorithms for selecting differentially expressed genes have been developed, and several comparison studies of these algorithms have been carried out. However, it is not clear how these methods compare with each other, especially when different development tools are used. Here, we considered three commonly used approaches for selecting differentially expressed genes, namely Fold Change, T-test and SAM, using the Bioinformatics Matlab Toolbox and R/BioConductor. We used two datasets, generated with Affymetrix technology, to present the results of the methods and software used in the gene selection process. The results, in terms of sensitivity and specificity, indicate that SAM performs better than Fold Change and T-test when using R/BioConductor, while no practical differences were observed between the three gene selection methods when using the Bioinformatics Matlab Toolbox. In light of these results, the ROC curves show that, on the one hand, R/BioConductor with SAM is favored for microarray gene selection compared to the other methods and, on the other hand, the results of the three gene selection methods using the Bioinformatics Matlab Toolbox remain comparable for the two datasets used.
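The two simplest selection statistics named in this abstract, and the sensitivity/specificity measures used to compare them, can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the genes-by-samples array layout, the assumption that values are already on a log2 scale, and the use of a Welch (unequal-variance) t-statistic are all assumptions made here.

```python
import numpy as np

def log2_fold_change(a, b):
    """Per-gene difference of mean log2 expression between groups b and a.

    a, b: arrays of shape (genes, samples), assumed to be log2-transformed.
    """
    return b.mean(axis=1) - a.mean(axis=1)

def welch_t(a, b):
    """Per-gene Welch t-statistic (unequal variances assumed)."""
    va = a.var(axis=1, ddof=1) / a.shape[1]
    vb = b.var(axis=1, ddof=1) / b.shape[1]
    return (b.mean(axis=1) - a.mean(axis=1)) / np.sqrt(va + vb)

def sensitivity_specificity(selected, truth):
    """Compare a boolean selection against a boolean ground truth per gene."""
    tp = np.sum(selected & truth)
    tn = np.sum(~selected & ~truth)
    fn = np.sum(~selected & truth)
    fp = np.sum(selected & ~truth)
    return tp / (tp + fn), tn / (tn + fp)
```

A gene list would then be obtained by thresholding either statistic (for example, an absolute log2 fold change above some cutoff), and sweeping that cutoff yields the points of a ROC curve.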

2.
Identifying differentially expressed (DE) genes across conditions or treatments is a typical problem in microarray experiments. In time course microarray experiments (under two or more conditions/treatments), it is sometimes of interest to identify two classes of DE genes: those with no time-condition interactions (called parallel DE genes, or PDE), and those with time-condition interactions (nonparallel DE genes, NPDE). Although many methods have been proposed for identifying DE genes in time course experiments, methods for discerning NPDE genes from the general DE genes are still lacking. We propose a functional ANOVA mixed-effect model to model time course gene expression observations. The fixed effect (the mean curve) of the model decomposes bivariate functions of time and treatments (or experimental conditions) as in the classic ANOVA method and provides the associated notions of main effects and interactions. Random effects capture time-dependent correlation structures. In this model, identifying NPDE genes is equivalent to testing the significance of the time-condition interaction, for which an approximate F-test is suggested. We examined the performance of the proposed method on simulated datasets in comparison with some existing methods, and applied the method to a study of the human response to endotoxin stimulation, as well as to a cell cycle expression data set.

3.
Low temperature has become a major abiotic stress factor that can reduce maize yield and cause considerable economic losses. This study was designed to identify key genes and pathways associated with cold resistance in maize. The gene expression profile GSE46704, including 4 plants treated at the control temperature and 4 plants treated at low temperature, was downloaded from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) were identified with the limma package. Then, a protein-protein interaction (PPI) network was constructed and modules were selected using Cytoscape. Moreover, the DEGs were re-matched based on the Zea mays L. gene ID and symbol data from PlantRegMap. Finally, functional and pathway enrichment analyses of the re-matched DEGs were performed with the DAVID online tool. A total of 750 DEGs were screened (387 up-regulated and 363 down-regulated genes). In the PPI network, GRMZM2G070837_P01 and GRMZM2G114578_P01 had higher degrees. In addition, carbohydrate metabolic process, starch and sucrose metabolism, and biosynthesis of secondary metabolites were significantly enriched in the functional and pathway enrichment analyses. GRMZM2G070837_P01 and GRMZM2G114578_P01 might play a critical role in the cold resistance of maize, and carbohydrate metabolic process, starch and sucrose metabolism, and biosynthesis of secondary metabolites might also contribute to it.
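The "degree" of a protein in a PPI network is simply its number of interaction partners. As a small sketch of how hub proteins could be ranked from an edge list (the function name and the placeholder protein labels are hypothetical; the study itself used Cytoscape):

```python
from collections import Counter

def node_degrees(edges):
    """Count interaction partners per node in an undirected edge list."""
    degrees = Counter()
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees

# Placeholder labels, not real maize protein IDs.
example_edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P1", "P4")]
hub, hub_degree = node_degrees(example_edges).most_common(1)[0]
```

Here `hub` is the node with the most partners in the toy network; in a real analysis the same ranking would be read off the network exported from the PPI tool.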

4.
Microarray experiments contribute significantly to the progress in disease treatment by enabling a precise and early diagnosis. One of the major objectives of microarray experiments is to identify differentially expressed genes under various conditions. The statistical methods currently used to analyse microarray data are inadequate, mainly due to the lack of understanding of the distribution of microarray data. We present a nonparametric likelihood ratio (NPLR) test to identify differentially expressed genes using microarray data. The NPLR test is highly robust against extreme values and does not assume the distribution of the parent population. Simulation studies show that the NPLR test is more powerful than some of the commonly used methods, such as the two-sample t-test, the Mann-Whitney U-test and significance analysis of microarrays (SAM). When applied to microarray data, we found that the NPLR test identifies more differentially expressed genes than its competitors. The asymptotic distribution of the NPLR test statistic and the p-value function are presented. The application of the NPLR method is shown, using both synthetic and real-life data. The biological significance of some of the genes detected only by the NPLR method is discussed.

5.
6.
One goal of cluster analysis is to sort characteristics into groups (clusters) so that those in the same group are more highly correlated to each other than they are to those in other groups. An example is the search for groups of genes whose expression of RNA is correlated in a population of patients. These genes would be of greater interest if their common level of RNA expression were additionally predictive of the clinical outcome. This issue arose in the context of a study of trauma patients on whom RNA samples were available. The question of interest was whether there were groups of genes that were behaving similarly, and whether each gene in the cluster would have a similar effect on who would recover. For this, we develop an algorithm to simultaneously assign characteristics (genes) into groups of highly correlated genes that have the same effect on the outcome (recovery). We propose a random effects model where the genes within each group (cluster) equal the sum of a random effect, specific to the observation and cluster, and an independent error term. The outcome variable is a linear combination of the random effects of each cluster. To fit the model, we implement a Markov chain Monte Carlo algorithm based on the likelihood of the observed data. We evaluate the effect of including outcome in the model through simulation studies and describe a strategy for prediction. These methods are applied to trauma data from the Inflammation and Host Response to Injury research program, revealing a clustering of the genes that is informed by the recovery outcome.
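One way to write down the random effects model described in this abstract is given below. The notation is assumed here and is not taken from the paper: $i$ indexes patients, $j$ genes, $c(j)$ is the cluster to which gene $j$ is assigned, $u_{ik}$ is the random effect for patient $i$ and cluster $k$, and $\varepsilon_{ij}$ is the independent error term.

```latex
x_{ij} = u_{i,c(j)} + \varepsilon_{ij},
\qquad
y_i = \sum_{k=1}^{K} \beta_k \, u_{ik}
```

The abstract does not say whether the outcome $y_i$ also carries its own noise term or a link function (recovery may well be binary), so the second equation should be read only as the linear combination the abstract describes.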

7.
8.
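To obtain reliable biological conclusions from gene chip (microarray) experiments, one must rely on an optimized experimental design and sound data analysis. This paper discusses several experimental design issues related to microarray data analysis methods, briefly reviews analysis methods such as differential expression analysis, cluster analysis and functional enrichment analysis together with their recent progress, and introduces some software and its applications.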

9.
Summary: We consider penalized linear regression, especially for “large p, small n” problems, for which the relationships among predictors are described a priori by a network. A class of motivating examples includes modeling a phenotype through gene expression profiles while accounting for coordinated functioning of genes in the form of biological pathways or networks. To incorporate the prior knowledge of the similar effect sizes of neighboring predictors in a network, we propose a grouped penalty based on the Lγ-norm that smoothes the regression coefficients of the predictors over the network. The main feature of the proposed method is its ability to automatically realize grouped variable selection and exploit grouping effects. We also discuss the effects of the choice of γ and of some weights inside the Lγ-norm. Simulation studies demonstrate the superior finite-sample performance of the proposed method as compared to Lasso, elastic net, and a recently proposed network-based method. The new method performs best in variable selection across all simulation set-ups considered. For illustration, the method is applied to a microarray dataset to predict survival times for some glioblastoma patients using a gene expression dataset and a gene network compiled from some Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways.
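To make the idea of a grouped Lγ-norm penalty over network edges concrete, the sketch below evaluates one plausible form of such a penalty: for every edge (i, j) in the network, the weighted Lγ-norm of the pair of coefficients is added up. The exact penalty, its scaling constants and the choice of weights in the paper may differ; the function name, argument names and the form of the weighting are assumptions made for illustration.

```python
import numpy as np

def network_grouped_penalty(beta, edges, weights, gamma=2.0, lam=1.0):
    """Sum over network edges of a weighted L_gamma norm of (beta_i, beta_j).

    beta:    regression coefficients, one per predictor
    edges:   iterable of (i, j) index pairs describing the predictor network
    weights: positive per-predictor weights (hypothetical, e.g. node degrees)
    """
    beta = np.asarray(beta, dtype=float)
    weights = np.asarray(weights, dtype=float)
    total = 0.0
    for i, j in edges:
        pair = abs(beta[i]) ** gamma / weights[i] + abs(beta[j]) ** gamma / weights[j]
        total += pair ** (1.0 / gamma)
    return lam * total
```

Because each edge contributes a norm of the coefficient pair rather than a sum of separate absolute values, the term for an edge vanishes only when both coefficients are zero, which is what gives a penalty of this kind its grouping behavior for neighboring predictors.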

10.
11.
12.
Microarrays have been useful in understanding various biological processes by allowing the simultaneous study of the expression of thousands of genes. However, the analysis of microarray data is a challenging task. One of the key problems in microarray analysis is the classification of unknown expression profiles. Specifically, the often large number of non-informative genes on the microarray adversely affects the performance and efficiency of classification algorithms. Furthermore, the skewed ratio of samples to variables poses a risk of overfitting. Thus, in this context, feature selection methods become crucial to select relevant genes and, hence, improve classification accuracy. In this study, we investigated feature selection methods based on gene expression profiles and protein interactions. We found that in our setup, the addition of protein interaction information did not contribute to any significant improvement of the classification results. Furthermore, we developed a novel feature selection method that relies exclusively on observed gene expression changes in microarray experiments, which we call “relative Signal-to-Noise ratio” (rSNR). More precisely, the rSNR ranks genes based on their specificity to an experimental condition, by comparing intrinsic variation, i.e. variation in gene expression within an experimental condition, with extrinsic variation, i.e. variation in gene expression across experimental conditions. Genes with low variation within an experimental condition of interest and high variation across experimental conditions are ranked higher, and help in improving classification accuracy. We compared different feature selection methods on two time-series microarray datasets and one static microarray dataset. We found that the rSNR performed generally better than the other methods.
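The abstract does not give the rSNR formula, so the following is only a sketch of the general idea it describes: score each gene by the ratio of its variation across conditions to its variation within the condition of interest. The function name, the use of standard deviations, and the small constant `eps` are all assumptions, and this should not be taken as the authors' definition of rSNR.

```python
import numpy as np

def variation_ratio_score(expr_by_condition, condition, eps=1e-8):
    """Hypothetical across/within variation ratio per gene.

    expr_by_condition: dict mapping condition name -> array (genes, samples)
    condition:         the experimental condition of interest
    """
    # Intrinsic variation: spread of each gene within the chosen condition.
    within = expr_by_condition[condition].std(axis=1, ddof=1)
    # Extrinsic variation: spread of each gene's mean across all conditions.
    means = np.stack([x.mean(axis=1) for x in expr_by_condition.values()], axis=1)
    across = means.std(axis=1, ddof=1)
    return across / (within + eps)

# Toy data: gene 0 is stable within A and differs between A and B; gene 1 is not.
toy = {
    "A": np.array([[1.0, 1.2, 0.8], [1.0, 5.0, 9.0]]),
    "B": np.array([[5.0, 5.0, 5.0], [5.0, 5.0, 5.0]]),
}
scores = variation_ratio_score(toy, "A")
ranking = np.argsort(-scores)  # highest score first
```

In the toy data the first gene ranks above the second, matching the abstract's description that genes with low within-condition and high across-condition variation are ranked higher.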

13.
Many changes in gene expression occur in response to water-deficit stress. A challenge is to determine which changes support plant adaptation to conditions of reduced soil water content and which occur in response to lesions in metabolic and cellular functions. Microarray methods are being employed to catalogue all of the changes in gene expression that occur in response to specific water-deficit conditions. Although these methods do not measure the amount or activities of specific proteins that function in the water-deficit response, they do target specific biochemical and cellular events that should be detailed in further work. Potential functions of approx. 130 genes of Arabidopsis thaliana that have been shown to be up-regulated are tabulated here. These point to signalling events, detoxification and other functions involved in the cellular response to water-deficit stress. As microarray techniques are refined, plant stress biologists will be able to characterize changes in gene expression within the whole genome in specific organs and tissues subjected to different levels of water-deficit stress.

14.
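The development of DNA microarray technology provides a more effective tool for gene expression research. Clustering methods are mainly used to analyze such large-scale gene data. Recently, biclustering techniques have been proposed to find submatrices that reveal various biological patterns. Multi-objective optimization algorithms can optimize several conflicting objectives simultaneously and are therefore a good approach to biclustering gene expression matrices. Based on the clonal selection principle, this paper proposes a novel multi-objective immune optimization biclustering algorithm to mine biclusters from microarray data. Experimental results on two real datasets show that the method outperforms other multi-objective evolutionary biclustering algorithms.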

15.
16.
17.
Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductal carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks. This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.

18.
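Supervised clustering analysis of gene chip data   Total citations: 1 (self-citations: 0, citations by others: 1)
With the arrival of the post-genomic era, gene chip (microarray) technology is increasingly applied in functional genomics research. How to analyze quickly and effectively the large amounts of biological data produced by microarray experiments has become an important research task. Supervised clustering analysis is a type of cluster analysis that determines the classification of samples according to prior information or hypotheses about them, builds a discriminant model on that basis, and then uses the model to classify unknown objects. The method has been applied successfully in many areas of biomedical research and has become an important means of analyzing microarray data.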

19.
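Analysis of stigma-specific expressed genes in Brassica napus using gene chips   Total citations: 1 (self-citations: 0, citations by others: 1)
Using wild-type Brassica napus (Ningyou 10) and its mutant FS-M1, which lacks stigma pollination function, as materials, a rapeseed gene expression profiling chip was used to screen for stigma-specific expressed genes in Brassica napus. From the rapeseed expression chip containing 16 540 genes (43 803 probes), 4 410 differentially expressed probes were obtained; some of the differentially expressed genes were selected for real-time quantitative PCR, and the results agreed with the chip results. Among them, 209 probes that were significantly up-regulated in the wild type relative to FS-M1 and had functional annotations were obtained, corresponding to 198 genes. These specifically expressed genes were mainly enriched in hydrolases, transferases, oxidoreductases and transcription factors; the larger gene families involved include cytochrome P450 genes, GDSL lipase/hydrolase genes, ABC transporter genes, myb transcription factor genes, bHLH transcription factor genes, the peroxidase family and receptor kinase genes. These genes are presumed to be related to stigma development and pollination function in Brassica napus.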

20.
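Isolation of anthocyanin biosynthesis-related genes in 'Tsuda' turnip using cDNA microarray   Total citations: 2 (self-citations: 0, citations by others: 2)
Xu Zhiru, Li Yuhua. 《遗传》 (Hereditas), 2006, 28(9): 1101-1106
Anthocyanins are important secondary metabolites of plants and perform a variety of physiological functions in plants. The swollen roots of 'Tsuda' turnip turned red after 48 h of UV-A treatment; using white roots kept in the dark as the control, samples were hybridized with a cDNA microarray prepared from specific gene fragments of a subtractive library. Under UV-A treatment, 81 genes were up-regulated and 47 genes were down-regulated in 'Tsuda' turnip; the up-regulated genes included fragments of genes directly related to anthocyanin biosynthesis, such as cytochrome P450, PAL, F3H, ANS, CHS, DFR and GST. Northern hybridization showed that the expression of PAL, CHS, F3H, DFR and ANS in 'Tsuda' turnip material treated with UV-A for 48 h was clearly higher than in white roots kept in the dark, further verifying the reliability of the microarray hybridization results.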
