Similar articles
 Found 20 similar articles (search time: 15 ms)
1.
GESTs (gene expression similarity and taxonomy similarity), a gene functional prediction approach previously proposed by us, is based on gene expression similarity and concept similarity of functional classes defined in Gene Ontology (GO). In this paper, we extend this method to protein-protein interaction data by introducing several methods to filter the neighbors in protein interaction networks for a protein of unknown function(s). Unlike other conventional methods, the proposed approach automatically selects the most appropriate functional classes as specific as possible during the learning process, and calls on genes annotated to nearby classes to support the predictions to some small-sized specific classes in GO. Based on the yeast protein-protein interaction information from MIPS and a dataset of gene expression profiles, we assess the performances of our approach for predicting protein functions to “biology process” by three measures particularly designed for functional classes organized in GO. Results show that our method is powerful for widely predicting gene functions with very specific functional terms. Based on the GO database published in December 2004, we predict some proteins whose functions were unknown at that time, and some of the predictions have been confirmed by the new SGD annotation data published in April, 2006.

2.
GESTs (gene expression similarity and taxonomy similarity), a gene functional prediction approach previously proposed by us, is based on gene expression similarity and concept similarity of functional classes defined in Gene Ontology (GO). In this paper, we extend this method to protein-protein interaction data by introducing several methods to filter the neighbors in protein interaction networks for a protein of unknown function(s). Unlike other conventional methods, the proposed approach automatically selects the most appropriate functional classes as specific as possible during the learning process, and calls on genes annotated to nearby classes to support the predictions to some small-sized specific classes in GO. Based on the yeast protein-protein interaction information from MIPS and a dataset of gene expression profiles, we assess the performances of our approach for predicting protein functions to “biology process” by three measures particularly designed for functional classes organized in GO. Results show that our method is powerful for widely predicting gene functions with very specific functional terms. Based on the GO database published in December 2004, we predict some proteins whose functions were unknown at that time, and some of the predictions have been confirmed by the new SGD annotation data published in April, 2006.

3.
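The k-means clustering algorithm is an iterative relocation algorithm widely used in cluster analysis of gene expression data. It usually represents the relationships between genes by a distance measure, which cannot effectively reflect the interdependence among genes. To address this, an information-theoretic k-modes clustering algorithm is proposed that overcomes this shortcoming. In addition, the pseudo-F statistic is introduced: on the one hand, it allows points that partially overlap in space to be classified effectively; on the other hand, it gives the optimal number of clusters, thereby compensating for the weaknesses of k-modes clustering. Together these make the method highly effective and yield better clustering results.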
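To make the role of the pseudo-F statistic concrete, here is a minimal sketch of how such a statistic can score a candidate clustering. It assumes the common Calinski-Harabasz form with squared Euclidean distances; the abstract does not give the exact definition used in the paper, so the function name `pseudo_f` and the distance choice are illustrative assumptions only.

```python
from collections import defaultdict


def pseudo_f(points, labels):
    """Pseudo-F statistic: (between-cluster SS / (k-1)) / (within-cluster SS / (n-k)).

    Assumption: squared Euclidean distance; the paper may define it differently.
    """
    n, dim = len(points), len(points[0])
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    k = len(groups)
    if k < 2 or k >= n:
        raise ValueError("need 2 <= k < n clusters")

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    grand = mean(points)
    between = 0.0
    within = 0.0
    for rows in groups.values():
        centroid = mean(rows)
        between += len(rows) * sqdist(centroid, grand)
        within += sum(sqdist(r, centroid) for r in rows)
    if within == 0.0:
        return float("inf")  # perfectly tight clusters
    return (between / (k - 1)) / (within / (n - k))


# Toy data: two well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = pseudo_f(points, [0, 0, 1, 1])  # clusters follow the real separation
bad = pseudo_f(points, [0, 1, 0, 1])   # clusters cut across it
```

In this kind of scheme, the clustering is repeated for several candidate values of k and the k with the largest pseudo-F value is taken as the number of clusters.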

4.
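PCR-based techniques for analyzing differential gene expression   Cited by: 2 (self-citations: 0, citations by others: 2)
Differential gene expression analysis is a direct and effective way to study the molecular basis of many biological processes. Since the DDRT-PCR technique was established, a series of PCR-based techniques for analyzing differential gene expression, such as SAGE, SSH, RDA, and DNA microarrays, have been developed in succession, providing faster and more sensitive tools for analyzing and cloning differentially expressed genes. This article briefly reviews these methods, compares their advantages and disadvantages, and discusses future directions for techniques in differential gene expression research.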

5.
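With the wide application of DNA chip technology, gene expression data analysis has become one of the research hotspots in the life sciences. This article outlines the types of gene expression clustering techniques, the classification and characteristics of the algorithms, and the visualization and annotation of results; describes some popular and novel algorithms; introduces 17 recent software packages and online web service tools; and discusses trends in the development of software tools.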

6.
A hybrid GA (genetic algorithm)-based clustering (HGACLUS) schema, combining merits of the Simulated Annealing, was described for finding an optimal or near-optimal set of medoids. This schema maximized the clustering success by achieving internal cluster cohesion and external cluster isolation. The performance

7.
8.
Array-based gene expression studies frequently serve to identify genes that are expressed differently under two or more conditions. The actual analysis of the data, however, may be hampered by a number of technical and statistical problems. Possible remedies on the level of computational analysis lie in appropriate preprocessing steps, proper normalization of the data and application of statistical testing procedures in the derivation of differentially expressed genes. This review summarizes methods that are available for these purposes and provides a brief overview of the available software tools.

9.
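A method for mining disease-subtype feature genes based on gene expression profiles   Cited by: 1 (self-citations: 0, citations by others: 1)
This study proposes a method for mining disease-subtype feature genes based on gene expression profiles. Working on filtered gene expression profiles, the method combines unsupervised clustering for identifying disease subtypes with a proposed pattern quality measure that quantifies how well a feature gene discriminates between disease subtypes, and performs feature gene mining in an embedded manner. The method was applied to experimental expression profiles of 2000 genes in 40 colon cancer tissues and 22 normal colon tissues. The results show that the method is feasible for mining disease-subtype feature genes, and that its advantage is that disease subtype partitioning and feature gene identification can be carried out in parallel.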

10.
Assessing reliability of gene clusters from gene expression data   Cited by: 5 (self-citations: 0, citations by others: 5)
The rapid development of microarray technologies has raised many challenging problems in experiment design and data analysis. Although many numerical algorithms have been successfully applied to analyze gene expression data, the effects of variations and uncertainties in measured gene expression levels across samples and experiments have been largely ignored in the literature. In this article, in the context of hierarchical clustering algorithms, we introduce a statistical resampling method to assess the reliability of gene clusters identified from any hierarchical clustering method. Using the clustering trees constructed from the resampled data, we can evaluate the confidence value for each node in the observed clustering tree. A majority-rule consensus tree can be obtained, showing only those clusters that occur in a majority of the resampled trees. We illustrate our proposed methods with applications to two published data sets. Although the methods are discussed in the context of hierarchical clustering methods, they can be applied with other cluster-identification methods for gene expression data to assess the reliability of any gene cluster of interest. Electronic Publication
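The node-confidence idea in this abstract can be sketched as follows. This is only an illustration under stated assumptions: it uses a nonparametric bootstrap over samples (columns), Euclidean distance, and average linkage, none of which the abstract specifies, and the function names `tree_clusters` and `node_confidence` are hypothetical.

```python
import random


def euclid(x, y):
    return sum((a - b) ** 2 for a, b in zip(x, y)) ** 0.5


def tree_clusters(profiles):
    """Average-linkage agglomerative clustering of genes (rows).

    Returns every internal node of the tree as a frozenset of gene indices.
    """
    clusters = [frozenset([i]) for i in range(len(profiles))]
    nodes = set()

    def dist(a, b):
        total = sum(euclid(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of current clusters and merge it.
        _, i, j = min(
            (dist(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        merged = clusters[i] | clusters[j]
        nodes.add(merged)
        clusters = [c for idx, c in enumerate(clusters) if idx not in (i, j)]
        clusters.append(merged)
    return nodes


def node_confidence(profiles, n_resamples=200, seed=0):
    """Fraction of resampled trees in which each observed cluster reappears."""
    rng = random.Random(seed)
    observed = tree_clusters(profiles)
    counts = dict.fromkeys(observed, 0)
    n_samples = len(profiles[0])
    for _ in range(n_resamples):
        # Assumed resampling scheme: draw sample columns with replacement.
        cols = [rng.randrange(n_samples) for _ in range(n_samples)]
        resampled = [[row[c] for c in cols] for row in profiles]
        found = tree_clusters(resampled)
        for node in observed:
            if node in found:
                counts[node] += 1
    return {node: counts[node] / n_resamples for node in observed}


# Toy data: genes 0 and 1 rise together, genes 2 and 3 fall together.
profiles = [
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [10.0, 9.0, 8.0, 7.0],
    [10.2, 9.1, 8.1, 6.9],
]
confidence = node_confidence(profiles)
```

Clusters whose confidence exceeds 0.5 are the ones a majority-rule consensus tree would retain.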

11.
Differential gene expression and the mechanisms of heterosis formation   Cited by: 6 (self-citations: 0, citations by others: 6)
许晨璐  孙晓梅  张守攻 《遗传》2013,35(6):714-726
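Although heterosis, a widespread and important biological phenomenon, has been studied for more than a century, its underlying mechanism has not yet been fully elucidated. Following studies of differences in genome composition and of gene effects, differential gene expression has become a new entry point for exploring the molecular mechanism of heterosis. The aim is to understand the molecular mechanism of heterosis formation, and thereby guide breeding practice, by revealing the regulatory mechanisms behind differential allelic expression in hybrids and differential gene expression between hybrids and their parents. This article outlines the phenomenon of differential allelic expression in hybrids and the mechanisms that produce it, summarizes the additive, dominant, overdominant, and other differential gene expression patterns that hybrids show relative to their parents, and reviews the heterosis-related genes identified by expression profiling as well as the contribution of certain key biochemical and metabolic pathways to heterosis. Because of the complexity of heterosis, however, gene expression studies have not yielded a unified expression pattern, and most heterosis genes cannot be assigned to a single category. Nevertheless, expression profiling is a first step toward dissecting the complex gene expression network underlying heterosis. As expression profiling technologies and bioinformatics continue to advance, a breakthrough in understanding the molecular mechanism of heterosis at the level of gene expression can be expected.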

12.
We study statistical methods to detect cancer genes that are over- or down-expressed in some but not all samples in a disease group. This has proven useful in cancer studies where oncogenes are activated only in a small subset of samples. We propose the outlier robust t-statistic (ORT), which is intuitively motivated from the t-statistic, the most commonly used differential gene expression detection method. Using real and simulation studies, we compare the ORT to the recently proposed cancer outlier profile analysis (Tomlins and others, 2005) and the outlier sum statistic of Tibshirani and Hastie (2006). The proposed method often has more detection power and smaller false discovery rates. Supplementary information can be found at http://www.biostat.umn.edu/~baolin/research/ort.html.

13.
高磊  朱明珠  郭政  李霞 《生物信息学》2006,4(3):105-108
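Using gene expression profile data, the protein interaction network is filtered and optimized by computing the expression correlation coefficient of each pair of interacting proteins. The results show that, with the filtered interaction data, function prediction by the neighbor counting method and the chi-square method improves markedly, and that neighbors farther away from the protein under test also carry information consistent with its function.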
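The filtering step described in this abstract can be sketched as below. This is a minimal illustration, assuming the Pearson correlation coefficient and a fixed cutoff; the abstract states neither the correlation measure nor the threshold, and the function name `filter_interactions`, the default `threshold=0.5`, and the protein identifiers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def filter_interactions(edges, expression, threshold=0.5):
    """Keep interactions whose partners have expression correlation >= threshold.

    edges: iterable of (protein_a, protein_b) pairs.
    expression: dict mapping protein id -> list of expression values.
    Pairs lacking an expression profile are dropped (an assumption of this sketch).
    """
    kept = []
    for a, b in edges:
        if a in expression and b in expression:
            r = pearson(expression[a], expression[b])
            if r >= threshold:
                kept.append((a, b, r))
    return kept


# Hypothetical proteins: P1 and P2 are co-expressed, P1 and P3 are anti-correlated.
expression = {
    "P1": [1.0, 2.0, 3.0, 4.0],
    "P2": [2.0, 4.0, 6.0, 8.5],
    "P3": [4.0, 3.0, 2.0, 1.0],
}
edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4")]
kept = filter_interactions(edges, expression)
```

Function prediction methods such as neighbor counting would then be run on the retained edges only.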

14.
Qin LX  Self SG 《Biometrics》2006,62(2):526-533
Identification of differentially expressed genes and clustering of genes are two important and complementary objectives addressed with gene expression data. For the differential expression question, many "per-gene" analytic methods have been proposed. These methods can generally be characterized as using a regression function to independently model the observations for each gene; various adjustments for multiplicity are then used to interpret the statistical significance of these per-gene regression models over the collection of genes analyzed. Motivated by this common structure of per-gene models, we proposed a new model-based clustering method--the clustering of regression models method, which groups genes that share a similar relationship to the covariate(s). This method provides a unified approach for a family of clustering procedures and can be applied for data collected with various experimental designs. In addition, when combined with per-gene methods for assessing differential expression that employ the same regression modeling structure, an integrated framework for the analysis of microarray data is obtained. The proposed methodology was applied to two microarray data sets, one from a breast cancer study and the other from a yeast cell cycle study.

15.
The risk associated with exposure to hepatotoxic drugs is difficult to quantify. Animal experiments to assess their chronic toxicological impact are time consuming. New quantitative approaches to correlate gene expression changes caused by drug exposure to chronic toxicity are required. This article proposes a mathematical model entitled Toxicologic Prediction Network (TPN) to assess chronic hepatotoxicity based on subchronic hepatic gene expression data in rats. A directed graph accounts for the interactions between the drugs, differentially expressed genes and chronic hepatotoxicity. A knowledge-based mathematical model estimates phenotypical exposure risk such as toxic hepatopathy, diffuse fatty change and hepatocellular adenoma for rats. The network's edges encoding the interaction strength are determined by solving an inversion problem that minimizes the difference between the observed and the predicted relative gene expressions as well as the chronic toxicity data. A realistic case study demonstrates how chronic health risk of three halogenated aromatic hydrocarbons can be inferred from subchronic gene expression data. The advantages of the TPN are further demonstrated through two novel applications: estimation of toxicological impact of new drugs and drug mixtures as well as rigorous determination of the optimal drug formulation to achieve maximum potency with minimum side-effects. Prediction of animal toxicity may be relevant for assessing risk for humans in the future.

16.
Although many numerical clustering algorithms have been applied to gene expression data analysis, the essential step is still biological interpretation by manual inspection. The correlation between genetic co-regulation and affiliation to a common biological process is what biologists expect. Here, we introduce some clustering algorithms that are based on graph structure constituted by biological knowledge. After applying a widely used dataset, we compared the result clusters of two of these algorithms in terms of the homogeneity of clusters and coherence of annotation and matching ratio. The results show that the clusters of knowledge-guided analysis are the kernel parts of the clusters of Gene Ontology (GO)-Cluster software, which contains the genes that are most expression correlative and most consistent with biological functions. Moreover, knowledge-guided analysis seems much more applicable than GO-Cluster in a larger dataset.

17.
Analysis of large-scale gene expression data.   Cited by: 10 (self-citations: 0, citations by others: 10)
DNA microarray technology has resulted in the generation of large complex data sets, such that the bottleneck in biological investigation has shifted from data generation to data analysis. This review discusses some of the algorithms and tools for the analysis and organisation of microarray expression data, including clustering methods, partitioning methods, and methods for correlating expression data to other biological data.

18.
Currently, linear mixed model analyses of expression microarray experiments are performed either in a gene-specific or global mode. The joint analysis provides more flexibility in terms of how parameters are fitted and estimated and tends to be more powerful than the gene-specific analysis. Here we show how to implement the gene-specific linear mixed model analysis as an exact algorithm for the joint linear mixed model analysis. The gene-specific algorithm is exact when the mixed model equations can be partitioned into unrelated components: one for all global fixed and random effects and the others for the gene-specific fixed and random effects for each gene separately. This unrelatedness holds under three conditions: (1) any gene must have the same number of replicates or probes on all arrays, but these numbers can differ among genes; (2) the residual variance of the (transformed) expression data must be homogeneous or constant across genes (other variance components need not be homogeneous); and (3) the number of genes in the experiment is large. When these conditions are violated, the gene-specific algorithm is expected to be nearly exact.

19.
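Current status and development of cluster analysis techniques for gene expression   Cited by: 5 (self-citations: 0, citations by others: 5)
With the completion of genome sequencing for many organisms and the wide application of DNA chip technology, gene expression data analysis has become a research hotspot of the post-genomic era. Cluster analysis groups functionally related genes into classes according to the similarity of their expression profiles, which helps in studying genes of unknown function, and it is currently one of the main computational techniques in gene expression analysis. Many clustering algorithms have been applied to gene expression data, and each has its own strengths and weaknesses owing to differences in focus and underlying principles. Evaluating the validity of the various clustering algorithms and developing new methods suited to gene expression data analysis are now urgent tasks.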

20.
The primitive epithelium of embryonic chicken proventriculus (glandular stomach) differentiates, after day 6 of incubation, into luminal epithelium, which faces the lumen and abundantly secretes mucus, and glandular epithelium, which invaginates into mesenchyme and later expresses embryonic chicken pepsinogen (ECPg). So far it is not well understood how undifferentiated epithelial cells differentiate into these two distinct cell populations. Spasmolytic polypeptide (SP) is known to be expressed in surface mucous cells of mammalian stomach. In order to obtain the differentiation marker for proventricular luminal epithelial cells, we cloned a cDNA encoding chicken SP (cSP). Sequence analysis indicated that cSP has the duplicated cysteine-rich domain characteristic of SP. Examination of the spatial and temporal expression pattern of cSP gene revealed that, during embryogenesis, cSP was expressed in luminal epithelial cells of the proventriculus, gizzard, small intestine, and lung, but not the esophagus. In the proventriculus, cSP mRNA was first detected on day 8 of incubation and was localized to differentiated luminal epithelial cells. By using cSP as a molecular marker, the effects of mesenchyme on the differentiation of epithelium were analyzed in vitro. On the basis of these data, a model is presented concerning the differentiation of proventricular epithelium.
