Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
MOTIVATION: Despite the growing literature devoted to finding differentially expressed genes in assays probing different tissue types, little attention has been paid to the combinatorial nature of feature selection inherent to large, high-dimensional gene expression datasets. New flexible data analysis approaches capable of searching relevant subgroups of genes and experiments are needed to understand multivariate associations of gene expression patterns with observed phenotypes. RESULTS: We present in detail a deterministic algorithm to discover patterns of multivariate gene associations in gene expression data. The patterns discovered are differential with respect to a control dataset. The algorithm is exhaustive and efficient, reporting all existing patterns that fit a given input parameter set while avoiding enumeration of the entire pattern space. The value of the pattern discovery approach is demonstrated by finding a set of genes that differentiate between two types of lymphoma. Moreover, these genes are found to behave consistently in an independent dataset produced in a different laboratory using different arrays, thus validating the genes selected using our algorithm. We show that the genes deemed significant in terms of their multivariate statistics will be missed using other methods. AVAILABILITY: Our set of pattern discovery algorithms including a user interface is distributed as a package called Genes@Work. This package is freely available to non-commercial users and can be downloaded from our website (http://www.research.ibm.com/FunGen).

2.
MOTIVATION: To construct an integrated map of Drosophila segmentation gene expression from partial data taken from individual embryos. RESULTS: Spline and wavelet based registration techniques were developed to register Drosophila segmentation gene expression data. As ground control points for registration we used the locations of extrema on gene expression patterns, represented in 1D. The registration method was characterized by unprecedentedly high accuracy. A method for constructing the integrated pattern of gene expression at cellular resolution was designed. These patterns were constructed for 9 segmentation genes belonging to gap and pair-rule classes.

3.
4.
5.
MOTIVATIONS AND RESULTS: Gene groups that are significantly related to a disease can be detected by conducting a series of gene expression experiments. This work is aimed at discovering special types of gene groups that satisfy the following property. In each group, the member genes are found to be one-to-one contained in pre-determined intervals of gene expression level with a large frequency in one class of cells but are never found unanimously in these intervals in the other class of cells. We call these gene groups emerging patterns, to emphasize the patterns' frequency changes between two classes of cells. We use effective discretization and gene selection methods to obtain the most discriminatory genes. We also use efficient algorithms to derive the patterns from these genes. According to our studies on the ALL/AML dataset and the colon tumor dataset, some patterns, which consist of one or more genes, can reach a high frequency of 90%, or even 100%. In other words, they nearly or fully dominate one class of cells, even though they rarely occur in the other class. The discovered patterns are used to classify new cells with a higher accuracy than other reported methods. Based on these patterns, we also conjecture the possibility of a personalized treatment plan which converts colon tumor cells into normal cells by modulating the expression levels of a few genes.
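
The defining property of an emerging pattern, as described in the abstract above, can be illustrated with a minimal sketch. This is not the authors' algorithm: the names `matches`, `frequency`, and `is_emerging`, the dictionary-based data layout, and the `min_freq` threshold are all hypothetical choices made for illustration. The sketch only checks whether a given pattern is frequent in one class and never fully matched in the other; it does not perform discretization, gene selection, or pattern mining.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]   # closed interval [low, high] of expression level
Pattern = Dict[str, Interval]    # gene name -> required expression interval
Sample = Dict[str, float]        # gene name -> measured expression level


def matches(sample: Sample, pattern: Pattern) -> bool:
    """Return True if every gene in the pattern lies inside its interval."""
    return all(
        gene in sample and low <= sample[gene] <= high
        for gene, (low, high) in pattern.items()
    )


def frequency(samples: List[Sample], pattern: Pattern) -> float:
    """Fraction of samples in which the whole pattern is matched."""
    if not samples:
        return 0.0
    return sum(matches(s, pattern) for s in samples) / len(samples)


def is_emerging(pattern: Pattern, home: List[Sample], other: List[Sample],
                min_freq: float = 0.9) -> bool:
    """Hypothetical check: frequent in `home`, never fully matched in `other`."""
    return frequency(home, pattern) >= min_freq and frequency(other, pattern) == 0.0
```

Under these assumptions, a pattern with several genes can qualify even when each single gene on its own also falls inside its interval in some cells of the other class, because the pattern requires all intervals to be matched in the same cell.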

6.
7.
MOTIVATION: To understand cancer etiology, it is important to explore molecular changes in cellular processes from normal state to cancerous state. Because genes interact with each other during cellular processes, carcinogenesis related genes may form differential co-expression patterns with other genes in different cell states. In this study, we develop a statistical method for identifying differential gene-gene co-expression patterns in different cell states. RESULTS: For efficient pattern recognition, we extend the traditional F-statistic and obtain an Expected Conditional F-statistic (ECF-statistic), which incorporates statistical information of location and correlation. We also propose a statistical method for data transformation. Our approach is applied to a microarray gene expression dataset from a prostate cancer study. For a gene of interest, our method can select other genes that have differential gene-gene co-expression patterns with this gene in different cell states. The 10 most frequently selected genes include hepsin, GSTP1 and AMACR, which have recently been proposed to be associated with prostate carcinogenesis. However, genes GSTP1 and AMACR cannot be identified by studying differential gene expression alone. By using tumor suppressor genes TP53, PTEN and RB1, we identify seven genes that also include hepsin, GSTP1 and AMACR. We show that genes associated with cancer may have differential gene-gene expression patterns with many other genes in different cell states. By discovering such patterns, we may be able to identify carcinogenesis related genes.
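
The idea of differential co-expression in the abstract above can be illustrated with a much simpler stand-in than the ECF-statistic, whose definition is not given here. The sketch below swaps in a plain difference of Pearson correlations between two cell states; the function names `pearson` and `coexpression_shift` and the list-based inputs are assumptions made for illustration only.

```python
from math import sqrt
from typing import Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / (spread_x * spread_y)


def coexpression_shift(a_normal: Sequence[float], b_normal: Sequence[float],
                       a_tumor: Sequence[float], b_tumor: Sequence[float]) -> float:
    """Hypothetical score: correlation in tumor minus correlation in normal.

    Values near 0 suggest a stable relationship between the two genes;
    values near +2 or -2 suggest the relationship reverses between states.
    """
    return pearson(a_tumor, b_tumor) - pearson(a_normal, b_normal)
```

A gene pair can score highly here even when neither gene changes its mean expression between states, which is the kind of signal that differential expression analysis alone would not detect.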

8.
Summary: Gene co-expression has been widely used in the analysis of microarray gene expression data. However, the co-expression patterns between two genes can be mediated by cellular states, as reflected by expression of other genes, single nucleotide polymorphisms, and activity of protein kinases. In this article, we introduce a bivariate conditional normal model for identifying the variables that can mediate the co-expression patterns between two genes. Based on this model, we introduce a likelihood ratio (LR) test and a penalized likelihood procedure for identifying the mediators that affect gene co-expression patterns. We propose an efficient computational algorithm based on iterative reweighted least squares and cyclic coordinate descent and show that, when the tuning parameter in the penalized likelihood is appropriately selected, such a procedure has the oracle property in selecting the variables. We present simulation results to compare with existing methods and show that the LR-based approach can perform similarly to or better than the existing method of liquid association and that the penalized likelihood procedure can be quite effective in selecting the mediators. We apply the proposed method to yeast gene expression data in order to identify the kinases or single nucleotide polymorphisms that mediate the co-expression patterns between genes.

9.
10.
MOTIVATION: Microarrays have become a central tool in biological research. Their applications range from functional annotation to tissue classification and genetic network inference. A key step in the analysis of gene expression data is the identification of groups of genes that manifest similar expression patterns. This translates to the algorithmic problem of clustering genes based on their expression patterns. RESULTS: We present a novel clustering algorithm, called CLICK, and its applications to gene expression analysis. The algorithm utilizes graph-theoretic and statistical techniques to identify tight groups (kernels) of highly similar elements, which are likely to belong to the same true cluster. Several heuristic procedures are then used to expand the kernels into the full clusters. We report on the application of CLICK to a variety of gene expression data sets. In all those applications it outperformed extant algorithms according to several common figures of merit. We also point out that CLICK can be successfully used for the identification of common regulatory motifs in the upstream regions of co-regulated genes. Furthermore, we demonstrate how CLICK can be used to accurately classify tissue samples into disease types, based on their expression profiles. Finally, we present a new Java-based graphical tool, called EXPANDER, for gene expression analysis and visualization, which incorporates CLICK and several other popular clustering algorithms. AVAILABILITY: http://www.cs.tau.ac.il/~rshamir/expander/expander.html

11.
The large variety of clustering algorithms and their variants can be daunting to researchers wishing to explore patterns within their microarray datasets. Furthermore, each clustering method has distinct biases in finding patterns within the data, and clusterings may not be reproducible across different algorithms. A consensus approach utilizing multiple algorithms can show where the various methods agree and expose robust patterns within the data. In this paper, we present a software package, Consense, written for R/Bioconductor, that utilizes such an approach to explore microarray datasets. Consense produces clustering results for each of the clustering methods and produces a report of metrics comparing the individual clusterings. A feature of Consense is identification of genes that cluster consistently with an index gene across methods. Utilizing simulated microarray data, sensitivity of the metrics to the biases of the different clustering algorithms is explored. The framework is easily extensible, allowing this tool to be used with other functional genomic data types, as well as other high-throughput OMICS data types generated from metabolomic and proteomic experiments. It also provides a flexible environment to benchmark new clustering algorithms. Consense is currently available as an installable R/Bioconductor package (http://www.ohsucancer.com/isrdev/consense/).
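
The feature mentioned above, finding genes that cluster consistently with an index gene across methods, can be sketched as a set intersection over several clusterings. This is an illustration in Python, not the Consense API (Consense is an R/Bioconductor package); the function name `consistent_partners` and the gene-to-label dictionaries are hypothetical.

```python
from typing import Dict, Hashable, List, Set


def consistent_partners(clusterings: List[Dict[str, Hashable]],
                        index_gene: str) -> Set[str]:
    """Genes that share a cluster with `index_gene` in every clustering.

    Each clustering maps gene name -> cluster label. Labels only need to be
    comparable within one clustering, not across clusterings.
    """
    partners = None
    for labels in clusterings:
        if index_gene not in labels:
            raise KeyError(f"{index_gene!r} is missing from a clustering")
        target = labels[index_gene]
        same_cluster = {
            gene for gene, label in labels.items()
            if label == target and gene != index_gene
        }
        # Keep only the genes that co-clustered in all methods seen so far.
        partners = same_cluster if partners is None else partners & same_cluster
    return partners if partners is not None else set()
```

Because only co-membership is compared, the sketch is insensitive to how each method happens to number its clusters, which matters when the results come from unrelated algorithms.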

12.
MOTIVATION: Finding differentially expressed genes is a fundamental objective of a microarray experiment. Numerous methods have been proposed to perform this task. Existing methods are based on point estimates of gene expression level obtained from each microarray experiment. This approach discards potentially useful information about measurement error that can be obtained from an appropriate probe-level analysis. Probabilistic probe-level models can be used to measure gene expression and also provide a level of uncertainty in this measurement. This probe-level measurement error provides useful information which can help in the identification of differentially expressed genes. RESULTS: We propose a Bayesian method to incorporate probe-level measurement error into the detection of differentially expressed genes from replicated experiments. A variational approximation is used for efficient parameter estimation. We compare this approximation with MAP and MCMC parameter estimation in terms of computational efficiency and accuracy. The method is used to calculate the probability of positive log-ratio (PPLR) of expression levels between conditions. Using the measurements from a recently developed Affymetrix probe-level model, multi-mgMOS, we test PPLR on a spike-in dataset and a mouse time-course dataset. Results show that the inclusion of probe-level measurement error improves accuracy in detecting differential gene expression. AVAILABILITY: The MAP approximation and variational inference described in this paper have been implemented in an R package pplr. The MCMC method is implemented in Matlab. Both software packages are available from http://umber.sbs.man.ac.uk/resources/puma.
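
To make the PPLR quantity above concrete, here is a minimal sketch under a strong simplifying assumption that is not taken from the abstract: the log expression level in each condition is summarized by an independent Gaussian with a known mean and standard deviation. The difference of two such Gaussians is again Gaussian, so the probability that the log-ratio is positive reduces to a normal CDF. The function name `pplr_gaussian` is hypothetical, and this is not the variational, MAP, or MCMC procedure of the pplr package.

```python
from math import erf, sqrt


def pplr_gaussian(mean_a: float, sd_a: float,
                  mean_b: float, sd_b: float) -> float:
    """P(log-expression in A > log-expression in B) for independent Gaussians.

    Assumption (for illustration only): each condition's log expression is
    Gaussian with the given mean and standard deviation.
    """
    diff_sd = sqrt(sd_a ** 2 + sd_b ** 2)
    if diff_sd == 0:
        raise ValueError("at least one standard deviation must be positive")
    z = (mean_a - mean_b) / diff_sd
    # Standard normal CDF written with the error function.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))
```

Values close to 1 or 0 indicate confident up- or down-regulation, while a larger measurement error in either condition pulls the value toward 0.5, which is how the uncertainty enters the ranking in this simplified picture.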

13.
A nontubulogenic endothelial cell line, NP31, can be transformed by the active form of the Flt-1 kinase (BCR-FLTm1) into Tb3 cells, which show a tubulogenic property only when cultured in Matrigel. By utilizing this strict dependence of NP31 on BCR-FLTm1 and Matrigel for experimental angiogenesis, we performed microarray analyses under several conditions and found 97 genes whose dynamically regulated profiles of gene expression are divided into nine groups, in two major clusters. In one major cluster, gene expression is interdependently regulated by BCR-FLTm1 or Matrigel. The second major cluster contains genes whose expression patterns under BCR-FLTm1 influence are reversed by Matrigel. Based on these gene expression patterns in NP31 driven by BCR-FLTm1 and/or Matrigel, we propose a model in which sequential and alternate stimulation by BCR-FLTm1 and Matrigel induces cooperative regulation of subsets of genes. Microarray analyses of Tb3 under 11 different conditions revealed 5 candidate genes whose gene expression regulation is most closely associated with tubulogenesis.

14.
A key step in Drosophila segmentation is the establishment of periodic patterns of pair-rule gene expression in response to gap gene products. From an examination of the distribution of gap and pair-rule proteins in various mutants, we conclude that the on/off periodicity of pair-rule stripes depends on both the exact concentrations and combinations of gap proteins expressed in different embryonic cells. It has been suggested that the distribution of gap gene products depends on cross-regulatory interactions among these genes. Here we provide evidence that autoregulation also plays an important role in this process since there is a reduction in the levels of Kruppel (Kr) RNA and protein in a Kr null mutant. Once initiated by the gap genes each pair-rule stripe is bell shaped and has ill-defined margins. By the end of the fourteenth nuclear division cycle, the stripes of the pair-rule gene even-skipped (eve) sharpen and polarize, a process that is essential for the precisely localized expression of segment polarity genes. This sharpening process appears to depend on a threshold response of the eve promoter to the combinatorial action of eve and a second pair-rule gene hairy. The eve and hairy expression patterns overlap but are out of register and the cells of maximal overlap form the anterior margin of the polarized eve stripe. We propose that the relative placement of the eve and hairy stripes may be an important factor in the initiation of segment polarity.

15.
Both autonomously functioning thyroid nodules (AFTNs) and cold thyroid nodules (CTNs) are characterized by increased proliferation; however, they have opposite functional activities. Therefore, with the aim to further understand the distinct molecular pathology of each entity and to discover common mechanisms like those leading to increased proliferation in both AFTNs and CTNs, we compared gene expression of AFTNs and CTNs with in vitro model systems (TSH-stimulated and ras-transfected primary cultures (PC)) whose gene expression patterns can be attributed to specific molecular alterations. Since combinations of co-regulated genes are more likely to reveal molecular mechanisms, we used a procedure which groups co-regulated genes within "gene sets". We found a co-regulated gene set in the AFTNs that overlaps with differential expression in TSH-stimulated PCs but not in CTNs or ras-transfected PCs. In addition to thyroid peroxidase and sialyltransferase 1, this set of co-regulated genes comprises metallothioneins and the G-protein-coupled receptor 56. Although their role in the thyroid is unknown so far, their appearance in one group indicates a functional relevance in TSH-TSH receptor-stimulated mechanisms. Furthermore, we identified down-regulated gene sets with concordant expression patterns in AFTNs, CTNs and ras-transfected PCs. However, these expression patterns are not of relevance in the TSH-stimulated PCs. These findings suggest that TSH-stimulated PCs can be used as a model of increased thyroid function (AFTNs), whereas the ras-transfected PCs better reflect the increased proliferation of both AFTNs and CTNs.

16.
17.
Vertebrate innate immunity is the first line of defense against an invading pathogen and has long been assumed to be largely unspecific with respect to parasite/pathogen species. However, recent phenotypic evidence suggests that immunogenetic variation, i.e. allelic variability in genes associated with the immune system, results in host-parasite genotype-by-genotype interactions and thus specific innate immune responses. Immunogenetic variation is common in all vertebrate taxa and this reflects an effective immunological function in complex environments. However, the underlying variability in host gene expression patterns as a response of innate immunity to within-species genetic diversity of macroparasites in vertebrates is unknown. We hypothesized that intra-specific variation among parasite genotypes must be reflected in host gene expression patterns. Here we used high-throughput RNA-sequencing to examine the effect of parasite genotypes on gene expression patterns of a vertebrate host, the three-spined stickleback (Gasterosteus aculeatus). By infecting naïve fish with distinct trematode genotypes of the species Diplostomum pseudospathaceum we show that gene activity of innate immunity in three-spined sticklebacks depended on the identity of an infecting macroparasite genotype. In addition to a suite of genes indicative of a general response against the trematode we also find parasite-strain specific gene expression, in particular in the complement system genes, despite similar infection rates of single clone treatments. The observed discrepancy between infection rates and gene expression indicates the presence of alternative pathways which execute similar functions. This suggests that the innate immune system can induce redundant responses specific to parasite genotypes.

18.
19.
Organisms maintain homeostasis and abate cellular damage by altering gene expression. Coral colonies have been shown to produce unique gene expression patterns in response to different environmental stimuli. In order to understand these induced changes, the natural variation in expression of genetic biomarkers needs to be determined. In this study, an array of genes isolated from scleractinian coral was used to track changes in gene expression within a population of Montastraea faveolata from April to October 2001 in the Florida Keys. The profiles of genes observed in this study can be divided into two groups based on expression over this time period. In spring and early summer, May through July, most of the genes show little deviation from their average level of expression. In August and September, several genes show large deviations from their average level of expression. The physiological and environmental triggers for the observed changes in gene expression have not yet been identified, but the results show that our coral stress gene array can be used to track temporal changes in gene expression in a natural coral population.

20.