Similar literature
 Found 20 similar articles; search took 15 ms
1.
2.
MOTIVATION: The biologic significance of results obtained through cluster analyses of gene expression data generated in microarray experiments has been demonstrated in many studies. In this article we focus on the development of a clustering procedure based on the concept of Bayesian model-averaging and a precise statistical model of expression data. RESULTS: We developed a clustering procedure based on the Bayesian infinite mixture model and applied it to clustering gene expression profiles. Clusters of genes with similar expression patterns are identified from the posterior distribution of clusterings defined implicitly by the stochastic data-generation model. The posterior distribution of clusterings is estimated by a Gibbs sampler. We summarized the posterior distribution of clusterings by calculating posterior pairwise probabilities of co-expression and used the complete linkage principle to create clusters. This approach has several advantages over usual clustering procedures. The analysis allows for incorporation of a reasonable probabilistic model for generating data. The method does not require specifying the number of clusters, and the resulting optimal clustering is obtained by averaging over models with all possible numbers of clusters. Expression profiles that are not similar to any other profile are automatically detected, the method incorporates experimental replicates, and it can be extended to accommodate missing data. This approach represents a qualitative shift in the model-based cluster analysis of expression data because it allows for incorporation of uncertainties involved in the model selection in the final assessment of confidence in similarities of expression profiles. We also demonstrated the importance of incorporating the information on experimental variability into the clustering model. 
AVAILABILITY: The MS Windows(TM) based program implementing the Gibbs sampler and supplemental material is available at http://homepages.uc.edu/~medvedm/BioinformaticsSupplement.htm CONTACT: medvedm@email.uc.edu
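
The summarization step described in this abstract (posterior pairwise co-expression probabilities followed by complete linkage) can be illustrated with a minimal sketch. It is not the authors' program: the sampled clusterings, the function names, and the 0.5 distance cutoff are all assumptions made for illustration, and the Gibbs sampler itself is not shown.

```python
from itertools import combinations

def coclustering_probabilities(samples):
    """Fraction of sampled clusterings in which each pair of genes shares a label."""
    n = len(samples[0])
    probs = {}
    for i, j in combinations(range(n), 2):
        together = sum(1 for labels in samples if labels[i] == labels[j])
        probs[(i, j)] = together / len(samples)
    return probs

def complete_linkage(probs, n, threshold):
    """Merge clusters while the complete-linkage distance (1 - p) is <= threshold."""
    def dist(a, b):
        # Complete linkage: the largest pairwise distance between two clusters.
        return max(1.0 - probs[(min(i, j), max(i, j))] for i in a for j in b)

    clusters = [[i] for i in range(n)]
    while len(clusters) > 1:
        d, x, y = min(
            (dist(clusters[x], clusters[y]), x, y)
            for x, y in combinations(range(len(clusters)), 2)
        )
        if d > threshold:  # hypothetical cutoff, not taken from the abstract
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)

# Hypothetical sampler output: four sampled clusterings of five genes.
# Label values are arbitrary; only "same label or not" matters.
samples = [
    [0, 0, 1, 1, 2],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 2],
    [1, 1, 0, 0, 2],
]
probs = coclustering_probabilities(samples)
clusters = complete_linkage(probs, n=5, threshold=0.5)
print(clusters)
```

In this toy input the last gene rarely shares a cluster with any other gene, so it stays a singleton, which mirrors the abstract's point that profiles unlike all others are detected automatically.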

3.
Tumor-specific gene expression patterns with gene expression profiles   Total citations: 1, self-citations: 0, citations by others: 1  
Gene expression profiles of 14 common tumors and their counterpart normal tissues were analyzed with machine learning methods to address the problem of selection of tumor-specific genes and analysis of their differential expressions in tumor tissues. First, a variation of the Relief algorithm, the "RFE_Relief algorithm", was proposed to learn the relations between genes and tissue types. Then, a support vector machine was employed to find the gene subset with the best classification performance for distinguishing cancerous tissues and their counterparts. After tissue-specific genes were removed, cross validation experiments were employed to demonstrate the common deregulated expressions of the selected genes in tumor tissues. The results indicate the existence of a specific expression fingerprint of these genes that is shared in different tumor tissues, and the hallmarks of the expression patterns of these genes in cancerous tissues are summarized at the end of this paper.
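
For readers unfamiliar with Relief-style gene weighting, the following is a minimal sketch of the classic Relief idea only. The abstract's "RFE_Relief" variant is not specified here, so this is not that algorithm. The sketch visits every sample once instead of drawing random samples, and the toy data, labels, and function name are hypothetical.

```python
def relief_weights(X, y):
    """Classic Relief idea: reward features that differ at the nearest miss
    and penalize features that differ at the nearest hit."""
    n, m = len(X), len(X[0])
    # Feature ranges, used to scale differences into [0, 1].
    spans = []
    for f in range(m):
        col = [row[f] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, f):
        return abs(a[f] - b[f]) / spans[f]

    def distance(a, b):
        return sum(diff(a, b, f) for f in range(m))

    weights = [0.0] * m
    for i in range(n):
        hits = [j for j in range(n) if j != i and y[j] == y[i]]
        misses = [j for j in range(n) if y[j] != y[i]]
        hit = min(hits, key=lambda j: distance(X[i], X[j]))    # nearest same-class sample
        miss = min(misses, key=lambda j: distance(X[i], X[j]))  # nearest other-class sample
        for f in range(m):
            weights[f] += (diff(X[i], X[miss], f) - diff(X[i], X[hit], f)) / n
    return weights

# Hypothetical toy data: gene 0 separates the classes, gene 1 is noise.
X = [[1.0, 5.0], [1.2, 1.0], [0.9, 3.0], [3.0, 4.0], [3.2, 2.0], [2.9, 5.0]]
y = ["normal", "normal", "normal", "tumor", "tumor", "tumor"]
weights = relief_weights(X, y)
ranking = sorted(range(len(weights)), key=lambda f: -weights[f])
print(ranking)
```

A classifier such as the support vector machine mentioned above would then be trained on the top-ranked genes; that step is omitted here.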

4.

5.
MOTIVATION: Association pattern discovery (APD) methods have been successfully applied to gene expression data. They find groups of co-regulated genes in which the genes are either up- or down-regulated throughout the identified conditions. These methods, however, fail to identify similarly expressed genes whose expressions change between up- and down-regulation from one condition to another. In order to discover these hidden patterns, we propose the concept of mining co-regulated gene profiles. Co-regulated gene profiles contain two gene sets such that genes within the same set behave identically (up or down) while genes from different sets display contrary behavior. To reduce and group the large number of similar resulting patterns, we propose a new similarity measure that can be applied together with hierarchical clustering methods. RESULTS: We tested our proposed method on two well-known yeast microarray data sets. Our implementation mined the data effectively and discovered patterns of co-regulated genes that are hidden from traditional APD methods. The high content of biologically relevant information in these patterns is demonstrated by the significant enrichment of co-regulated genes with similar functions. Our experimental results show that the Mining Attribute Profile (MAP) method is an efficient tool for the analysis of gene expression data and competitive with bi-clustering techniques.

6.
It is well known that chronic, excessive consumption of alcohol can cause brain damage/structural changes in the regions important for neurocognitive function. Some of the damage is permanent, while some is reversible. Molecular mechanisms underlying alcohol-induced and/or -related brain damage are largely unknown, although it is generally believed that three factors (ethanol, nutritional and hepatic factors) play important roles. Recently, we have been employing a high-throughput proteomics technology to investigate several alcohol-sensitive brain regions from uncomplicated and hepatic cirrhosis-complicated alcoholics to understand the mechanisms of alcohol effects on the CNS at the level of protein expression. The changes of protein expression profiles in the hippocampus of alcoholic subjects were first demonstrated using 2D gel electrophoresis-based proteomics. Protein expression profiles identified in the hippocampus of alcoholic subjects were significantly different from those previously identified by our group in other brain regions of the same alcoholic cases, possibly indicating that these different brain regions react differently to chronic alcohol ingestion at the level of protein expression. Identified changes of protein expression associated with astrocytes and oxidative stress may indicate the possibility that increased levels of CNS ammonia and reactive oxygen species induced by alcoholic mild hepatic damage/dysfunction could cause selective damage in astrocytes of the hippocampus. Although our data did not demonstrate any evidence of direct alcohol effects inducing the alteration of protein expression in association with brain damage, high-throughput neuroproteomics approaches have proved to have the potential to dissect the mechanisms of complex brain disorders. 
Proteomics studies on human hippocampus, an important region for neurocognitive function and psychiatric illnesses (e.g., Alzheimer’s disease, alcoholism and schizophrenia) are still sparse, and further investigation is warranted to understand the underlying mechanisms.

7.

8.
9.
PCP: a program for supervised classification of gene expression profiles   Total citations: 1, self-citations: 0, citations by others: 1  
PCP (Pattern Classification Program) is an open-source machine learning program for supervised classification of patterns (vectors of measurements). The principal use of PCP in bioinformatics is design and evaluation of classifiers for use in clinical diagnostic tests based on measurements of gene expression. PCP implements leading pattern classification and gene selection algorithms and incorporates cross-validation estimation of classifier performance. Importantly, the implementation integrates gene selection and class prediction stages, which is vital for computing reliable performance estimates in small-sample scenarios. Additionally, the program includes automated and efficient model selection (optimization of parameters) for the support vector machine (SVM) classifier. The distribution includes Linux and Windows/Cygwin binaries. The program can easily be ported to other platforms. AVAILABILITY: Free download at http://pcp.sourceforge.net
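
The point about integrating gene selection with class prediction can be sketched as follows. This is not PCP code and does not use PCP's algorithms. It is a hypothetical illustration in which gene selection (by difference of class means) is repeated inside every leave-one-out fold, so the held-out sample never influences which genes are chosen. All names and data are assumptions.

```python
def top_genes(X, y, k):
    """Rank genes by absolute difference of the two class means, using only the given rows."""
    classes = sorted(set(y))

    def mean(c, f):
        vals = [row[f] for row, label in zip(X, y) if label == c]
        return sum(vals) / len(vals)

    scores = [abs(mean(classes[0], f) - mean(classes[1], f)) for f in range(len(X[0]))]
    return sorted(range(len(scores)), key=lambda f: -scores[f])[:k]

def nearest_centroid(X, y, genes, sample):
    """Predict the class whose centroid over the selected genes is closest."""
    best, best_d = None, float("inf")
    for c in sorted(set(y)):
        rows = [row for row, label in zip(X, y) if label == c]
        d = sum((sample[g] - sum(r[g] for r in rows) / len(rows)) ** 2 for g in genes)
        if d < best_d:
            best, best_d = c, d
    return best

def loo_accuracy(X, y, k):
    """Leave-one-out cross-validation with gene selection redone in each fold."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        genes = top_genes(train_X, train_y, k)  # selection sees training rows only
        correct += nearest_centroid(train_X, train_y, genes, X[i]) == y[i]
    return correct / len(X)

# Hypothetical data: 6 samples x 3 genes; only gene 0 differs between classes.
X = [[1.0, 0.3, 0.5], [1.1, 0.7, 0.4], [0.9, 0.5, 0.6],
     [3.0, 0.4, 0.5], [3.1, 0.6, 0.6], [2.9, 0.5, 0.4]]
y = ["A", "A", "A", "B", "B", "B"]
accuracy = loo_accuracy(X, y, k=1)
print(accuracy)
```

Selecting genes once on the full data set and only then cross-validating the classifier would leak information from the held-out samples, which is the source of optimistic estimates in small-sample settings.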

10.
Mining gene expression profiles: expression signatures as cancer phenotypes   Total citations: 6, self-citations: 0, citations by others: 6  
Many examples highlight the power of gene expression profiles, or signatures, to inform an understanding of biological phenotypes. This is perhaps best seen in the context of cancer, where expression signatures have tremendous power to identify new subtypes and to predict clinical outcomes. Although the ability to interpret the meaning of the individual genes in these signatures remains a challenge, this does not diminish the power of the signature to characterize biological states. The use of these signatures as surrogate phenotypes has been particularly important, linking diverse experimental systems that dissect the complexity of biological systems with the in vivo setting in a way that was not previously feasible.

11.
Lyu Yafei, Li Qunhua. BMC Bioinformatics, 2016, 17(1): 51-60
Determining differentially expressed genes (DEGs) between biological samples is the key to understanding how genotype gives rise to phenotype. RNA-seq and microarray are two main technologies for profiling gene expression levels. However, considerable discrepancy has been found between DEGs detected using the two technologies. Integrating data across these two platforms has the potential to improve the power and reliability of DEG detection. We propose a rank-based semi-parametric model to determine DEGs using information across different sources and apply it to the integration of RNA-seq and microarray data. By incorporating both the significance of differential expression and the consistency across platforms, our method effectively detects DEGs with moderate but consistent signals. We demonstrate the effectiveness of our method using simulation studies, MAQC/SEQC data and a synthetic microRNA dataset. Our integration method is not only robust to noise and heterogeneity in the data, but also adaptive to the structure of data. In our simulations and real data studies, our approach shows a higher discriminative power and identifies more biologically relevant DEGs than eBayes, DEseq and some commonly used meta-analysis methods.

12.
We developed PathAct, a novel method for pathway analysis to investigate the biological and clinical implications of gene expression profiles. The advantage of PathAct in comparison with conventional pathway analysis methods is that it can estimate pathway activity levels for individual patients quantitatively in the form of a pathway-by-sample matrix. This matrix can be used for further analysis such as hierarchical clustering and other analysis methods. To evaluate the feasibility of PathAct, comparison with frequently used gene-enrichment analysis methods was conducted using two public microarray datasets. Dataset #1 was that of breast cancer patients, and we investigated pathways associated with triple-negative breast cancer by PathAct, compared with those obtained by gene set enrichment analysis (GSEA). Dataset #2 was another breast cancer dataset with disease-free survival (DFS) of each patient. The contribution of each pathway to prognosis was investigated by our method as well as by the Database for Annotation, Visualization and Integrated Discovery (DAVID) analysis. In dataset #1, four out of the six pathways that satisfied p < 0.05 and FDR < 0.30 by GSEA were also included in those obtained by the PathAct method. For dataset #2, two pathways (“Cell Cycle” and “DNA replication”) out of four pathways by PathAct were commonly identified by DAVID analysis. Thus, we confirmed a good degree of agreement among PathAct and conventional methods. Moreover, several applications of further statistical analyses such as hierarchical cluster analysis by pathway activity, correlation analysis and survival analysis between pathways were conducted.

13.
Exposure of MOLT4 human T-cell leukemia cells to 6-Mercaptopurine (6-MP) and 6-Thioguanine (6-TG) resulted in acquired resistance associated with attenuated expression of the genes encoding concentrative nucleoside transporter 3 (CNT3) and equilibrative nucleoside transporter 2 (ENT2). To identify other alterations at the RNA and DNA levels associated with 6-MP and 6-TG resistance, we compared here the patterns of gene expression and DNA copy number profiles of resistant sublines to those of the parental wild-type cells. The mRNA levels for two nucleoside transporters were down-regulated in both of the thiopurine-resistant sublines. Moreover, both of these cell lines expressed genes encoding the enzymes of purine nucleotide composition and synthesis, including adenylate kinase 3-like 1 and guanosine monophosphate synthetase, at significantly lower levels than wild-type cells. In addition, expression of the mRNA for a specialized DNA polymerase, human terminal transferase encoded by the terminal deoxynucleotidyl transferase (DNTT) gene, was 122- and 93-fold higher in 6-TG- and 6-MP-resistant cells, respectively. The varying responses to 6-MP and 6-TG observed here may help identify novel cellular targets and modalities of resistance to thiopurines, as well as indicating new potential approaches to individualizing therapy with these drugs.

14.
15.
The accumulation of DNA microarray data has now made it possible to use gene expression profiles to analyse expression data. A gene expression profile contains the expression data for a given gene over various samples, and can be contrasted with an expression signature, which contains the expression data for a single sample. Gene expression profiles are most revealing when samples are grouped appropriately, either by standard clinical or pathological categories or by categories discovered through cluster analysis techniques. Expression profiles can exist at various levels of abstraction, yielding information across various tissues or across diseases within a particular tissue. Hypothesis tests may be applied to expression profiles on a large scale to identify candidate genes of interest.
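
The distinction drawn above between a profile (one gene across samples) and a signature (one sample across genes) corresponds to reading a row versus a column of an expression matrix. The sketch below uses a hypothetical matrix with made-up gene names, sample names, and values.

```python
# Hypothetical expression matrix: rows are genes, columns are samples.
genes = ["geneA", "geneB", "geneC"]
samples = ["tumor1", "tumor2", "normal1", "normal2"]
expression = [
    [8.0, 9.0, 2.0, 3.0],
    [5.0, 5.0, 5.0, 5.0],
    [1.0, 2.0, 7.0, 6.0],
]

def expression_profile(gene):
    """One gene across all samples (a row of the matrix)."""
    return dict(zip(samples, expression[genes.index(gene)]))

def expression_signature(sample):
    """All genes within one sample (a column of the matrix)."""
    col = samples.index(sample)
    return {g: row[col] for g, row in zip(genes, expression)}

def grouped_profile(gene, groups):
    """Mean expression of one gene within each named group of samples."""
    profile = expression_profile(gene)
    return {name: sum(profile[s] for s in members) / len(members)
            for name, members in groups.items()}

# Grouping by a clinical category, as the abstract suggests.
groups = {"tumor": ["tumor1", "tumor2"], "normal": ["normal1", "normal2"]}
print(grouped_profile("geneA", groups))
```

Grouping samples before inspecting a profile is what makes the contrast between categories visible; the groups could equally come from a cluster analysis.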

16.
Recent advances in high throughput technologies have generated an abundance of biological information, such as gene expression, protein-protein interaction, and metabolic data. These various types of data capture different aspects of the cellular response to environmental factors. Integrating data from different measurements enhances the ability of modeling frameworks to predict cellular function more accurately and can lead to a more coherent reconstruction of the underlying regulatory network structure. Different techniques, newly developed and borrowed, have been applied for the purpose of extracting this information from experimental data. In this study, we developed a framework to integrate metabolic and gene expression profiles for a hepatocellular system. Specifically, we applied a genetic algorithm and partial least squares analysis to identify important genes relevant to a specific cellular function. We identified genes 1) whose expression levels quantitatively predict a metabolic function and 2) that play a part in regulating a hepatocellular function and reconstructed their role in the metabolic network. The framework 1) preprocesses the gene expression data using statistical techniques, 2) selects genes using a genetic algorithm and couples them to a partial least squares analysis to predict cellular function, and 3) reconstructs, with the assistance of a literature search, the pathways that regulate cellular function, namely intracellular triglyceride and urea synthesis. This provides a framework for identifying cellular pathways that are active as a function of the environment and in turn helps to uncover the interplay between gene and metabolic networks.

17.
The recent appreciation of the role played by endogenous counterregulatory mechanisms in controlling the outcome of the host inflammatory response requires specific analysis of their spatial and temporal profiles. In this study, we have focused on the glucocorticoid-regulated anti-inflammatory mediator annexin 1. Induction of peritonitis in wild-type mice rapidly (4 h) produced the expected signs of inflammation, including marked activation of resident cells (e.g., mast cells), migration of blood-borne leukocytes, mirrored by blood neutrophilia. These changes subsided after 48-96 h. In annexin 1(null) mice, the peritonitis response was exaggerated (approximately 40% at 4 h), with increased granulocyte migration and cytokine production. In blood leukocytes, annexin 1 gene expression was activated at 4, but not 24, h postzymosan, whereas protein levels were increased at both time points. Locally, endothelial and mast cell annexin 1 gene expression was not detectable in basal conditions, whereas it was switched on during the inflammatory response. The significance of annexin 1 system plasticity in the anti-inflammatory properties of dexamethasone was assessed. Clear induction of the annexin 1 gene in response to dexamethasone treatment was evident in the circulating and migrated leukocytes, and in connective tissue mast cells; this was associated with the steroid's failure to inhibit leukocyte trafficking, cytokine synthesis, and mast cell degranulation in the annexin 1(null) mouse. In conclusion, understanding how inflammation is brought under control will help clarify the complex interplay between pro- and anti-inflammatory pathways operating during the host response to injury and infection.

18.
Gait analysis has been widely used to examine the behavioral presentation of numerous neurological disorders. A thorough murine model evaluation of subarachnoid hemorrhage (SAH)-associated gait deficits is missing. This study measures gait deficits using a clinically relevant murine model of SAH to examine associations between gait variability and SAH-associated gene expression. A total of 159 dynamic and static gait parameters from the endovascular perforation murine model for simulating clinical human SAH were determined using the CatWalk system. Eighty gait parameters and the mRNA expression levels of 35 of the 88 SAH-associated genes were differentially regulated in the diseased models. Totals of 42 and 38 gait parameters correlated with the 35 SAH-associated genes positively and negatively with Pearson's correlation coefficients of >0.7 and <−0.7, respectively. p-SP1453 expression in the motor cortex in SAH animal models displays a significant correlation with a subset of gait parameters associated with muscular strength and coordination of limb movements. Our data highlight a strong correlation between gait variability and SAH-associated gene expression. p-SP1453 expression could act as a biomarker to monitor SAH pathological development and a therapeutic target for SAH.
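
The correlation criterion used in this abstract (Pearson's coefficient above 0.7 or below −0.7) can be sketched as below. The expression values, gait parameter values, and function names are hypothetical; only the 0.7 cutoff comes from the abstract.

```python
from math import sqrt

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def classify(r, cutoff=0.7):
    """Label a coefficient using the abstract's thresholds of >0.7 and <-0.7."""
    if r > cutoff:
        return "positive"
    if r < -cutoff:
        return "negative"
    return "weak"

# Hypothetical values for one gene and three gait parameters across five animals.
gene_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
gait_parameters = {
    "param_up": [2.0, 4.0, 6.0, 8.0, 10.0],
    "param_down": [10.0, 8.0, 6.0, 4.0, 2.0],
    "param_noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}
labels = {name: classify(pearson(gene_expression, values))
          for name, values in gait_parameters.items()}
print(labels)
```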

19.
The genotype-phenotype (GP) map consists of developmental and physiological mechanisms mapping genetic onto phenotypic variation. It determines the distribution of heritable phenotypic variance on which selection can act. Comparative studies of morphology as well as of gene regulatory networks show that the GP map itself evolves, yet little is known about the actual evolutionary mechanisms involved. The study of such mechanisms requires exploring the variation in GP maps at the population level, which presently is easier to quantify by statistical genetic methods rather than by regulatory network structures. We focus on the evolution of pleiotropy, a major structural aspect of the GP map. Pleiotropic genes affect multiple traits and underlie genetic covariance between traits, often causing evolutionary constraints. Previous quantitative genetic studies have demonstrated population-level variation in pleiotropy in the form of loci, at which genotypes differ in the genetic covariation between traits. This variation can potentially fuel evolution of the GP map under selection and/or drift. Here, we propose a developmental mechanism underlying population genetic variation in covariance and test its predictions. Specifically, the mechanism predicts that the loci identified as responsible for genetic variation in pleiotropy are involved in trait-specific epistatic interactions. We test this prediction for loci affecting allometric relationships between traits in an advanced intercross between inbred mouse strains. The results consistently support the prediction. We further find a high degree of sign epistasis in these interactions, which we interpret as an indication of adaptive gene complexes within the diverged parental lines.

20.