Similar literature
 19 similar articles found.
1.
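Yu Hui, Guo Zheng, Li Xia. 《生物信息学》 (Chinese Journal of Bioinformatics), 2003, 1(1): 15-19
We developed OntoFexed, an algorithm that combines Gene Ontology with gene expression profiles to mine characteristic gene functional classes associated with experimental conditions. It uses information gain to evaluate a single functional class, and the Rand Index to evaluate a group of functional classes, by their ability to discriminate differentially expressed genes from non-differentially expressed genes. The algorithm's advantage is that it makes full use of the structural information in GO when searching for characteristic functional classes, and it can report characteristic functional classes at every level of abstraction. We applied OntoFexed to an adenocarcinoma data set and the NCI60 data set and found that it does uncover functional classes related to the experimental conditions, and that the algorithm is fairly robust to its main parameters.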
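The information-gain criterion mentioned in this abstract can be illustrated with a minimal sketch. This is not the OntoFexed implementation; the function names and the boolean-list input format are assumptions chosen for illustration, and the GO hierarchy is ignored.

```python
import math


def entropy(pos, neg):
    """Shannon entropy (in bits) of a two-class count."""
    total = pos + neg
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(in_class, is_de):
    """Information gain of one functional class for the DE / non-DE label.

    in_class[i] is True if gene i is annotated to the class (hypothetical input);
    is_de[i] is True if gene i is differentially expressed.
    """
    n = len(is_de)
    # Entropy of the DE label before splitting on class membership.
    gain = entropy(sum(is_de), n - sum(is_de))
    # Subtract the weighted entropy of the members and of the non-members.
    for member in (True, False):
        labels = [d for c, d in zip(in_class, is_de) if c == member]
        if labels:
            weight = len(labels) / n
            gain -= weight * entropy(sum(labels), len(labels) - sum(labels))
    return gain
```

A class whose members are exactly the differentially expressed genes scores the full label entropy, while a class that splits both groups evenly scores zero.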

2.
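Selecting feature genes for clustering using the Gene Ontology functional classification system   Total citations: 3 (self-citations: 0, citations by others: 3)
Using two gene expression profile data sets, we selected variably expressed genes by the variance of each gene's expression values and clustered the samples with them. We found that taking the top 10% of genes by variance as feature genes is generally enough to cluster disease samples well. For different diseases, the feature genes that carry clustering information show different distribution patterns. On this basis, we further screened the clustering feature genes with the Gene Ontology (GO) functional classification system. By testing whether the variably expressed genes within each Gene Ontology functional class aggregate non-randomly, we identified disease-related functional classes, and then performed cluster analysis with the variably expressed genes in those classes. The results show that further screening variably expressed genes with the functional classification system maintains or improves clustering accuracy and gives the clustering results a clear biological meaning. In addition, several genes possibly related to lymphoma and leukemia were found.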
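The first step described above, keeping the top 10% of genes by expression variance, might look like the following sketch. The dictionary input format and the function name are assumptions; the abstract does not say which variance estimator the authors used, so the population variance here is an arbitrary choice.

```python
import statistics


def top_variance_genes(expression, fraction=0.10):
    """Return the gene names ranked in the top `fraction` by expression variance.

    `expression` maps a gene name to its expression values across samples
    (a hypothetical input format).
    """
    ranked = sorted(
        expression,
        key=lambda gene: statistics.pvariance(expression[gene]),
        reverse=True,
    )
    # Always keep at least one gene so that small inputs still return a feature.
    keep = max(1, round(len(ranked) * fraction))
    return ranked[:keep]
```

The selected genes would then be passed to whatever clustering method is in use; the GO-based screening step described in the abstract is a separate, later filter.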

3.
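An association rule mining algorithm for GO functional classes and differential gene expression   Total citations: 1 (self-citations: 0, citations by others: 1)
To suit the hierarchical structure of the Gene Ontology functional classification system, we modified the association rule mining algorithm Apriori and developed RuleGO, software for mining GO function combinations associated with differential gene expression. RuleGO takes as input the sets of differentially expressed and non-differentially expressed genes from an expression profile and outputs association rules between combined characteristic functional classes and differential expression, which helps explain the underlying causes of differential expression, such as disease pathogenesis and drug mechanisms of action. RuleGO and OntoExpress were applied to colon cancer and adenocarcinoma expression profile data sets. The results show that RuleGO finds more characteristic functional classes associated with differential expression than OntoExpress, and also reveals combined characteristic functional classes that OntoExpress cannot find. The results further show that when the required confidence and support of a rule are set high, generally only combined functional classes meet the requirement. This suggests that single functional classification units viewed from a single angle are not well suited to expression profile analysis, and that combinations of functional classification units may be more meaningful.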
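Support and confidence, the two rule thresholds mentioned above, can be computed for a rule of the form "annotated to every term in a combination implies differentially expressed" as in the sketch below. This is not RuleGO itself: the function name and input format are assumptions, and the sketch ignores the GO hierarchy that the modified Apriori algorithm is designed to handle.

```python
def rule_metrics(gene_terms, de_genes, term_combo):
    """Support and confidence of the rule: term_combo -> differentially expressed.

    `gene_terms` maps each gene to the set of terms annotated to it;
    `de_genes` is the set of differentially expressed genes
    (both are hypothetical input formats).
    """
    combo = set(term_combo)
    # Genes annotated to every term in the combination.
    covered = {gene for gene, terms in gene_terms.items() if combo <= terms}
    hits = covered & set(de_genes)
    support = len(hits) / len(gene_terms)
    confidence = len(hits) / len(covered) if covered else 0.0
    return support, confidence
```

Adding a term to the combination can only shrink the covered set, so support never increases, while confidence can rise; this is consistent with the observation that high confidence thresholds tend to be met only by combinations.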

4.
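Southern Xinjiang, where the Uyghur population is concentrated, is a high-incidence area for cervical cancer. This study used gene chip technology to screen for genes associated with the development of cervical cancer in Uyghur women. First, mRNA was extracted from 5 cervical cancer tissues and 5 uterine leiomyoma tissues (controls) from Uyghur women in Xinjiang and reverse transcribed into cDNA; the leiomyoma cDNA was labeled with Cy3-dUTP and the cervical cancer cDNA with Cy5-dUTP to make hybridization probes. To screen for genes differentially expressed in cervical cancer tissue, the labeled probes were hybridized to Affymetrix gene chips containing 20,000 human genes, the hybridization signals were scanned with a GeneChip Scanner 3000, and the scans were analyzed with chip image analysis software (SAM software). The differentially expressed genes were then subjected to GO (Gene Ontology) analysis and KEGG (Kyoto Encyclopedia of Genes and Genomes) signaling pathway analysis to determine their roles in cervical cancer. The screen identified 2,758 differentially expressed genes in cervical cancer tissue, of which 1,326 were up-regulated and 1,432 down-regulated. GO and KEGG pathway analysis showed that genes with expression differences of two-fold or more were involved in 168 signaling pathways, including cell adhesion molecules, the cell cycle, and the MAPK and mTOR signaling pathways. These results indicate that gene chip technology identified a large number of genes associated with the development of cervical cancer, and that the genes with significant expression differences involve cell adhesion molecules, the cell cycle, mTOR and other signaling pathways.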
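The two-fold cutoff mentioned above can be sketched as a simple filter on the tumor/control intensity ratio. The function name, the input format, and the choice to treat exactly two-fold as passing are assumptions; the study itself reports using SAM software, which this sketch does not reproduce.

```python
import math


def classify_by_fold_change(ratios, threshold=2.0):
    """Split genes into up- and down-regulated lists by a fold-change cutoff.

    `ratios` maps a gene name to its tumor/control intensity ratio
    (for example Cy5/Cy3 in a two-color design); this format is hypothetical.
    """
    cutoff = math.log2(threshold)
    # Working on the log2 scale makes up- and down-regulation symmetric.
    up = [gene for gene, r in ratios.items() if math.log2(r) >= cutoff]
    down = [gene for gene, r in ratios.items() if math.log2(r) <= -cutoff]
    return up, down
```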

5.
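This study aimed to explore gene fusion, single nucleotide polymorphism (SNP) mutation, alternative splicing and other events in hepatocellular carcinoma HepG2 cells treated with the peptide 9R-P201, and to analyze the biological processes and signaling pathways in which the differentially expressed genes participate, in order to clarify how 9R-P201 regulates liver cancer cells at the transcriptome level. Transcriptome sequencing was used to detect differential gene expression in HepG2 cells before and after 9R-P201 treatment; tophat-fusion was used to detect gene fusions, SAMTOOLS to detect SNP sites, and rMATS to identify alternative splicing. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses were used for functional enrichment of the differentially expressed genes. In total, 276 alternative splicing events, 5,557 SNP sites and 45 gene fusion events were detected. A total of 403 significantly differentially expressed genes were obtained, of which 269 were up-regulated and 134 down-regulated. Functional enrichment showed that the differentially expressed genes were significantly enriched in tumor-related biological processes such as cell growth and migration, and took part in several cancer-related signaling pathways. The study shows that after 9R-P201 treatment of HepG2 cells, the differentially expressed genes were significantly associated with tumor-related biological processes and pathways, and many alternative splicing, SNP mutation and gene fusion events occurred, suggesting that this peptide is a promising candidate drug molecule for subsequent interventional therapy of liver cancer.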
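GO and KEGG enrichment is often assessed with a one-sided hypergeometric test, although the abstract does not say which test the authors used. A minimal sketch under that assumption, with a hypothetical function name and argument order:

```python
from math import comb


def enrichment_p_value(n_total, n_in_term, n_de, n_de_in_term):
    """One-sided hypergeometric P(X >= n_de_in_term).

    n_total:      annotated background genes
    n_in_term:    background genes annotated to the GO term or pathway
    n_de:         differentially expressed genes
    n_de_in_term: differentially expressed genes annotated to the term
    """
    upper = min(n_in_term, n_de)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(n_in_term, k) * comb(n_total - n_in_term, n_de - k)
        for k in range(n_de_in_term, upper + 1)
    )
    return tail / comb(n_total, n_de)
```

In practice the p-values from all tested terms would also need a multiple-testing correction, which is omitted here.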

6.
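GESTs (gene expression similarity and taxonomy similarity) is a new method for function prediction that combines gene expression similarity with a measure of functional concept similarity in the Gene Ontology (GO) functional classification system. We extended this prediction algorithm to protein-protein interaction data and proposed several methods for selecting neighbors, within the protein interaction network, of a protein whose function is to be predicted. Unlike other existing protein function prediction methods, the new method automatically selects, during learning, the most suitable and most specific functional classes possible from the classification system, and uses interacting neighbor proteins annotated to nearby functional classes to support the prediction of that specific class. Using yeast protein interaction information from MIPS and a set of gene expression profile data, we evaluated protein function prediction for the biological process branch of GO with three measures designed specifically for the hierarchical structure of GO. The results show that the method can predict fine-grained protein functions on a large scale. In addition, we used the method to predict functions for proteins whose function was unknown in Gene Ontology at the end of 2004; some of these predictions were confirmed in the SGD annotation data released in April 2006.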

7.
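Porcine rotavirus (PoRV) is a major enteric pathogen associated with severe diarrheal disease worldwide and one of the important causes of enteritis and fatal diarrhea in newborn piglets. To better understand the pathogenic mechanism of porcine rotavirus after intestinal infection and the host's antiviral and repair mechanisms, we fed mice culture medium or porcine rotavirus suspension for 5 d and then collected jejunal tissue at 5 d, 10 d, 15 d, 20 d and 25 d for high-throughput transcriptome sequencing. Differentially expressed genes (DEGs) were analyzed by Gene Ontology (GO) functional enrichment and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment, and a subset of DEGs was selected for qRT-PCR validation. Compared with the control group, there were 849 up-regulated and 824 down-regulated DEGs over the 5 time points. Functional enrichment of the DEGs at the 5 time points yielded 15 shared differentially expressed genes. These DEGs are broadly involved in lipid synthesis and metabolism, immunity, cell proliferation, apoptosis and other activities, and are mainly annotated to the PPAR, NOD-like, IL-17 and other signaling pathways. qRT-PCR validation of the epithelial cell proliferation-related genes PBLD, C...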

8.
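To explore the mechanism by which RD, a mixture of alcohol extracts of the plant endophytes R (from ginseng) and D (from Vaccinium, 越橘), promotes growth and enhances disease resistance in rice, gene chip technology was used to analyze the genome of RD-treated rice and detect the genes differentially expressed after treatment. A total of 1,171 differentially expressed genes were detected, of which 671 were up-regulated and 500 down-regulated. Functional annotation of the differentially expressed genes with the GO annotation system showed that they mainly concerned response to stimulus, transcriptional regulation, physiological function, binding, cellular process and metabolic function. Metabolic pathways changed markedly: 39 pathways were up-regulated and 24 were down-regulated. This indicates that the effect of the mixed alcohol extract of endophytes R and D on rice traits is not the isolated action of a single gene but the result of joint action on many fronts and at many levels.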

9.
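NAC (NAM, ATAF1/2 and CUC2) transcription factors are plant-specific transcriptional regulators that play a vital role in organ formation, growth and development, and resistance to abiotic stress. This study used gene chip technology to screen for differentially expressed genes related to abiotic stress resistance between SlNAC10-transgenic Arabidopsis and wild-type Arabidopsis, and validated some of them by real-time quantitative PCR. The chip results showed 4,054 genes with expression differences of two-fold or more, including 15 genes related to abiotic stress and 14 transcription factor genes related to abiotic stress; these genes take part in responses to osmotic stress, high salinity, cold, heat, high light intensity and other stresses. GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis of the genes with two-fold or greater differences found that they were enriched in 13 annotations related to abiotic stress and involved 96 metabolic pathways, including plant hormone signal transduction, arginine and proline metabolism, indole alkaloid biosynthesis and glutathione metabolism. These results indicate that SlNAC10 can directly or indirectly regulate the expression of many downstream genes and improve the plant's ability to withstand abiotic stress.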

10.
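cDNA gene chip technology is widely used in functional genomics and expression profiling across species, but the development and application of gene chips for fish has lagged behind. To screen for functional genes related to flesh quality traits, this study made a first attempt to use a heterologous zebrafish cDNA chip to compare gene expression in the muscle tissue of mandarin fish (鳜鱼) and silver carp (鲢鱼), two species with clearly different flesh quality traits. Total RNA was extracted from the muscle tissue of both species, labeled with biotin, and hybridized to a zebrafish gene chip (Affymetrix) carrying 15,617 cDNA fragments; 375 expressed genes were detected. Compared with silver carp, the genes identified in mandarin fish muscle comprised 180 up-regulated and 195 down-regulated genes. Of the 180 genes up-regulated in mandarin fish muscle, 49 had known functions and 131 had unknown functions. Based on analysis of homologous functional genes in gene databases, we grouped the 49 known up-regulated genes into roughly seven functional categories; the genes related to muscle structure included the myosin heavy chain gene (MYH), genes for connections between muscle fibers, and cytoskeletal structure genes. We also analyzed the functional genes closely related to flesh structure traits and discussed the relationship between the superior flesh structure of mandarin fish and the expression of these functional genes.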

11.
Accurately identifying differentially expressed genes from microarray data is not a trivial task, partly because of poor variance estimates of gene expression signals. Here, after analyzing 380 replicated microarray experiments, we found that probesets have typical, distinct variances that can be estimated based on a large number of microarray experiments. These probeset-specific variances depend at least in part on the function of the probed gene: genes for ribosomal or structural proteins often have a small variance, while genes implicated in stress responses often have large variances. We used these variance estimates to develop a statistical test for differentially expressed genes called EVE (external variance estimation). The EVE algorithm performs better than the t-test and LIMMA on some real-world data, where external information from appropriate databases is available. Thus, EVE helps to maximize the information gained from a typical microarray experiment. Nonetheless, only a large number of replicates can guarantee that nearly all truly differentially expressed genes are identified. However, our simulation studies suggest that even limited numbers of replicates will usually result in good coverage of strongly differentially expressed genes.
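The idea of testing with a variance estimated outside the experiment can be sketched as a plain two-sample z-test. This is an illustrative assumption, not the published EVE algorithm, whose exact test statistic is not given in the abstract; the function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist, mean


def external_variance_z_test(group_a, group_b, external_variance):
    """Two-sided z-test on the difference of group means.

    The variance is supplied from outside the experiment (for example a
    probeset-specific estimate from many earlier arrays) instead of being
    estimated from the few replicates at hand.
    """
    diff = mean(group_a) - mean(group_b)
    # Standard error of the mean difference under a known common variance.
    se = math.sqrt(external_variance * (1 / len(group_a) + 1 / len(group_b)))
    z = diff / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p
```

With only two or three replicates per group, a within-experiment variance estimate is very noisy, which is the situation where borrowing an external estimate is expected to help.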

12.
13.
14.
Microarrays have become an important tool for studying the molecular basis of complex disease traits and fundamental biological processes. A common purpose of microarray experiments is the detection of genes that are differentially expressed under two conditions, such as treatment versus control or wild type versus knockout. We introduce a Laplace mixture model as a long-tailed alternative to the normal distribution when identifying differentially expressed genes in microarray experiments, and provide an extension to asymmetric over- or underexpression. This model permits greater flexibility than models in current use as it has the potential, at least with sufficient data, to accommodate both whole genome and restricted coverage arrays. We also propose likelihood approaches to hyperparameter estimation which are equally applicable in the Normal mixture case. The Laplace model appears to give some improvement in fit to data, though simulation studies show that our method performs similarly to several other statistical approaches to the problem of identification of differential expression.

15.
Commonly accepted intensity-dependent normalization in spotted microarray studies takes account of measurement errors in the differential expression ratio but ignores measurement errors in the total intensity, although the definitions imply the same measurement error components are involved in both statistics. Furthermore, identification of differentially expressed genes is usually considered separately following normalization, which is statistically problematic. By incorporating the measurement errors in both total intensities and differential expression ratios, we propose a measurement-error model for intensity-dependent normalization and identification of differentially expressed genes. This model is also flexible enough to incorporate intra-array and inter-array effects. A Bayesian framework is proposed for the analysis of the proposed measurement-error model to avoid the potential risk of using the common two-step procedure. We also propose a Bayesian identification of differentially expressed genes to control the false discovery rate instead of the ad hoc thresholding of the posterior odds ratio. The simulation study and an application to real microarray data demonstrate promising results.
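The two statistics this abstract refers to are conventionally written as M (the log ratio) and A (the average log intensity) of the two channels. The sketch below uses those standard definitions; the function and argument names are assumptions, and it does not implement the measurement-error model itself.

```python
import math


def ma_values(red, green):
    """Return (M, A) for one spot of a two-color array.

    M = log2(red / green) is the differential expression ratio;
    A = 0.5 * log2(red * green) is the average log intensity.
    """
    log_r = math.log2(red)
    log_g = math.log2(green)
    m = log_r - log_g
    a = 0.5 * (log_r + log_g)
    return m, a
```

Both M and A are built from the same two measured intensities, so any error in `red` or `green` enters both statistics, which is the point the abstract makes about treating A as error-free.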

16.
17.
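cDNA microarray data contain many sources of variation, and this "noise" must be removed before the data are used to detect differentially expressed genes or for other statistical analysis. The log-ratio method (background correction, log-ratio transformation and data normalization) has been widely used in cDNA microarray data analysis, but it has shortcomings that urgently need to be resolved. We therefore propose a non-transformation method that skips the log-ratio transformation and normalizes the data directly after background correction, which effectively removes experimental "noise". The results show that, for detecting differentially expressed genes, the non-transformation method is more robust and more powerful than the conventional log-ratio method, and that the gene detection rate and accuracy are greatly improved.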

18.
MOTIVATION: Gene expression experiments provide a fast and systematic way to identify disease markers relevant to clinical care. In this study, we address the problem of robust identification of differentially expressed genes from microarray data. Differentially expressed genes, or discriminator genes, are genes with significantly different expression in two user-defined groups of microarray experiments. We compare three model-free approaches: (1) nonparametric t-test, (2) Wilcoxon (or Mann-Whitney) rank sum test, and (3) a heuristic method based on high Pearson correlation to a perfectly differentiating gene ('ideal discriminator method'). We systematically assess the performance of each method based on simulated and biological data under varying noise levels and p-value cutoffs. RESULTS: All methods exhibit very low false positive rates and identify a large fraction of the differentially expressed genes in simulated data sets with noise level similar to that of actual data. Overall, the rank sum test appears most conservative, which may be advantageous when the computationally identified genes need to be tested biologically. However, if a more inclusive list of markers is desired, a higher p-value cutoff or the nonparametric t-test may be appropriate. When applied to data from lung tumor and lymphoma data sets, the methods identify biologically relevant differentially expressed genes that allow clear separation of groups in question. Thus the methods described and evaluated here provide a convenient and robust way to identify differentially expressed genes for further biological and clinical analysis.
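Two of the three approaches named above can be sketched in a few lines. The function names are hypothetical, the rank sum test is shown only as its U statistic without a p-value, and the 0/1 indicator used as the "perfectly differentiating gene" is an assumption about how the ideal profile is encoded.

```python
from statistics import mean


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: pairs (a, b) with a > b count 1, ties count 0.5."""
    return sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in group_a
        for b in group_b
    )


def ideal_discriminator_correlation(values, labels):
    """Pearson correlation between expression values and a 0/1 group indicator.

    Assumes neither `values` nor `labels` is constant (otherwise the
    denominator is zero).
    """
    mx, my = mean(values), mean(labels)
    cov = sum((x - mx) * (y - my) for x, y in zip(values, labels))
    sx = sum((x - mx) ** 2 for x in values) ** 0.5
    sy = sum((y - my) ** 2 for y in labels) ** 0.5
    return cov / (sx * sy)
```

A gene whose U statistic is near 0 or near `len(group_a) * len(group_b)`, or whose correlation is near -1 or +1, separates the two groups almost perfectly.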

19.
MOTIVATION: A primary objective of microarray studies is to determine genes which are differentially expressed under various conditions. Parametric tests, such as two-sample t-tests, may be used to identify differentially expressed genes, but they require some assumptions that are not realistic for many practical problems. Non-parametric tests, such as empirical Bayes methods and mixture normal approaches, have been proposed, but the inferences are complicated and the tests may not have as much power as parametric models. RESULTS: We propose a weakly parametric method to model the distributions of summary statistics that are used to detect differentially expressed genes. Standard maximum likelihood methods can be employed to make inferences. For illustration purposes the proposed method is applied to the leukemia data (training part) discussed elsewhere. A simulation study is conducted to evaluate the performance of the proposed method.
