Similar literature
 20 similar articles found (search time: 15 ms)
1.
Clustering techniques have been widely used in the analysis of microarray data to group genes with similar expression profiles. The similarity of expression profiles, and hence the results of clustering, greatly depend on how the data have been transformed. We present a method that uses the relative expression changes between pairs of conditions and an angular transformation to define the similarity of gene expression patterns. The pairwise comparisons of experimental conditions can be chosen to reflect the purpose of clustering, allowing control over the definition of similarity between genes. A variational Bayes mixture modeling approach is then used to find clusters within the transformed data. The purpose of microarray data analysis is often to locate groups of genes showing particular patterns of expression change and, within these groups, to locate specific target genes that may warrant further experimental investigation. We show that the angular transformation maps data to a representation from which information, in terms of relative regulation changes, can be automatically mined. This information can then be used to understand the "features" of expression change important to different clusters, allowing potentially interesting clusters to be easily located. Finally, we show how the genes within a cluster can be visualized in terms of their expression pattern and intensity change, allowing potential target genes to be highlighted within the clusters of interest.
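The abstract does not give the exact form of the angular transformation, so the following is only a rough sketch of the general idea under stated assumptions: each gene is represented by its log2 ratios over a chosen set of condition pairs, and two genes are compared by the angle between those change vectors, which ignores the overall magnitude of change. The function names, the choice of log2 ratios, and the sample profiles are all hypothetical and are not taken from the paper.

```python
import math

def relative_changes(profile, pairs):
    """Log2 ratio of expression for each chosen (from, to) pair of conditions."""
    return [math.log2(profile[b] / profile[a]) for a, b in pairs]

def angular_distance(u, v):
    """Angle in radians between two change vectors (0 means same pattern).

    Assumes neither vector is all zeros; a gene with no change at all
    has no defined direction and would need separate handling.
    """
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding
    return math.acos(cosine)

# Hypothetical choice of comparisons: condition 0 vs 1, then 1 vs 2.
pairs = [(0, 1), (1, 2)]
gene_a = [100.0, 200.0, 400.0]   # doubles at each step
gene_b = [10.0, 40.0, 160.0]     # same direction, larger fold changes
gene_c = [100.0, 50.0, 25.0]     # halves at each step

change_a = relative_changes(gene_a, pairs)
change_b = relative_changes(gene_b, pairs)
change_c = relative_changes(gene_c, pairs)
```

Under these assumptions, `gene_a` and `gene_b` are at an angle of roughly zero despite very different intensities, while `gene_a` and `gene_c` are nearly opposite. Choosing a different list of `pairs` changes which contrasts define similarity.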

2.
3.
A nontubulogenic endothelial cell line, NP31, can be transformed by the active form of the Flt-1 kinase (BCR-FLTm1) into Tb3 cells, which show a tubulogenic property only when cultured in Matrigel. By utilizing this strict dependence of NP31 on BCR-FLTm1 and Matrigel for experimental angiogenesis, we performed microarray analyses under several conditions and found 97 genes whose dynamically regulated profiles of gene expression are divided into nine groups, in two major clusters. In one major cluster, gene expression is interdependently regulated by BCR-FLTm1 or Matrigel. The second major cluster contains genes whose expression patterns under BCR-FLTm1 influence are reversed by Matrigel. Based on these gene expression patterns in NP31 driven by BCR-FLTm1 and/or Matrigel, we propose a model in which sequential and alternate stimulation by BCR-FLTm1 and Matrigel induces cooperative regulation of subsets of genes. Microarray analyses of Tb3 under 11 different conditions revealed 5 candidate genes whose gene expression regulation is most closely associated with tubulogenesis.

4.
5.
Time course microarray experiments designed to characterize the dynamic regulation of gene expression in biological systems are becoming increasingly important. One critical issue that arises when examining time course microarray data is the identification of genes that show different temporal expression patterns among biological conditions. Here we propose a Bayesian hierarchical model to incorporate important experimental factors and to account for correlated gene expression measurements over time and over different genes. A new gene selection algorithm is also presented with the model to simultaneously identify genes that show changes in expression among biological conditions, in response to time and other experimental factors of interest. The algorithm performs well in terms of the false positive and false negative rates in simulation studies. The methodology is applied to a mouse model time course experiment to correlate temporal changes in azoxymethane-induced gene expression profiles with colorectal cancer susceptibility.

6.
Clustering methods for microarray gene expression data   (total citations: 1; self-citations: 0; cited by others: 1)
Within the field of genomics, microarray technologies have become a powerful technique for simultaneously monitoring the expression patterns of thousands of genes under different sets of conditions. A main task now is to propose analytical methods to identify groups of genes that manifest similar expression patterns and are activated by similar conditions. The corresponding analysis problem is to cluster multi-condition gene expression data. The purpose of this paper is to present a general view of clustering techniques used in microarray gene expression data analysis.

7.
8.
MOTIVATION: Identifying groups of co-regulated genes by monitoring their expression over various experimental conditions is complicated by the fact that such co-regulation is condition-specific. Ignoring the context-specific nature of co-regulation significantly reduces the ability of clustering procedures to detect co-expressed genes due to additional 'noise' introduced by non-informative measurements. RESULTS: We have developed a novel Bayesian hierarchical model and corresponding computational algorithms for clustering gene expression profiles across diverse experimental conditions and studies that accounts for context-specificity of gene expression patterns. The model is based on the Bayesian infinite mixtures framework and does not require a priori specification of the number of clusters. We demonstrate that explicit modeling of context-specificity results in increased accuracy of the cluster analysis by examining the specificity and sensitivity of clusters in microarray data. We also demonstrate that probabilities of co-expression derived from the posterior distribution of clusterings are valid estimates of statistical significance of created clusters. AVAILABILITY: The open-source package gimm is available at http://eh3.uc.edu/gimm.
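The idea of deriving co-expression probabilities from a posterior distribution of clusterings can be illustrated with a small sketch. It assumes that a sampler has already produced a list of cluster label vectors (one label per gene per sample) and simply counts how often two genes share a label. The function name and the sample data are hypothetical; this is not the gimm package's interface.

```python
def coexpression_probability(samples, i, j):
    """Fraction of sampled clusterings in which genes i and j share a cluster."""
    together = sum(1 for labels in samples if labels[i] == labels[j])
    return together / len(samples)

# Four hypothetical posterior samples over four genes.
# Label values are arbitrary; only equality within a sample matters,
# and the number of clusters may differ from sample to sample.
samples = [
    [0, 0, 1, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 2],
    [0, 1, 1, 1],
]

p_01 = coexpression_probability(samples, 0, 1)
p_23 = coexpression_probability(samples, 2, 3)
p_03 = coexpression_probability(samples, 0, 3)
```

Because only label equality within each sample is used, the estimate is unaffected by label switching between samples, which is one reason pairwise co-clustering frequencies are a convenient summary of mixture-model output.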

9.
Rat spinal cord contusion injury models the histopathology associated with much clinical spinal cord injury (SCI). Studies on altered gene expression after SCI in these models may identify therapeutic targets for reducing secondary injury after the initial trauma and/or enhancing recovery processes. However, complex spatial and temporal alterations after injury could complicate interpretation of changes in gene expression. To test this hypothesis, we selected six genes and studied their temporal and spatial patterns of expression at 1 h, 1, 3 and 7 days after a standardized spinal cord contusion produced by a weight drop device (10 g × 25 mm at T8). Real-time RT-PCR using TaqMan probes was employed to quantify mRNA for proteolipid protein, glyceraldehyde-3-phosphate dehydrogenase, glial fibrillary acidic protein, nestin, and the GluR2 and NR1 subunits of glutamate receptors. We found widely different temporal and spatial patterns of altered gene expression after SCI, including instances of opposing up- and down-regulation at different locations in tissue immediately adjacent to the injury site. We conclude that greater use of the reliable and extremely sensitive technique of quantitative real-time PCR for regional tissue analysis is important for understanding the altered gene expression that occurs after CNS trauma.

10.
11.

Background  

The underlying goal of microarray experiments is to identify gene expression patterns across different experimental conditions. Genes that are contained in a particular pathway or that respond similarly to experimental conditions could be co-expressed and show similar patterns of expression on a microarray. Using any of a variety of clustering methods or gene network analyses, we can partition genes of interest into groups, clusters, or modules based on measures of similarity. Typically, Pearson correlation is used to measure distance (or similarity) before implementing a clustering algorithm. Pearson correlation is quite susceptible to outliers, however, which is an unfortunate characteristic when dealing with microarray data (well known to be typically quite noisy).
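The sensitivity of Pearson correlation to a single outlier can be seen in a small made-up example. The sketch below compares it with Spearman rank correlation, which is used here only as a familiar rank-based contrast and is not claimed to be the measure this article goes on to propose. The data are invented and the rank helper assumes no tied values.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def ranks(values):
    """1-based ranks; assumes all values are distinct (no tie handling)."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = float(rank)
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Invented profiles: gene_y falls steadily as gene_x rises,
# except for one wild measurement in the last condition.
gene_x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gene_y = [6.0, 5.0, 4.0, 3.0, 2.0, 50.0]

r_pearson = pearson(gene_x, gene_y)
r_spearman = spearman(gene_x, gene_y)
```

With these numbers the single outlier is enough to make the Pearson coefficient clearly positive, even though five of the six conditions show a perfectly decreasing relationship, while the rank-based coefficient stays slightly negative.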

12.
You C, Dai X, Li X, Wang L, Chen G, Xiao J, Wu C. Plant Molecular Biology, 2010, 74(6): 617-629
Leucine-rich repeat proteins constitute a large gene family and play important roles in plant growth and development. Among them, Arabidopsis PIRL is a plant-specific class of intracellular Ras-group-related leucine-rich repeat proteins. In this study, we identified eight homologues of PIRLs in rice and designated them as OsIRL proteins. We described the gene structures, chromosome localizations, protein motifs, and phylogenetic relationships of the OsIRL gene family. The expression profiles of OsIRL genes were analyzed throughout the entire rice life cycle, along with light and three hormone stress conditions, using quantitative RT-PCR and microarray data. All OsIRL genes were expressed in at least one experimental stage and exhibited divergent expression patterns, with several genes showing preferential expression at specific stages. OsIRL4 and OsIRL5 showed higher expression levels under light compared to dark. OsIRL4 and OsIRL7 exhibited significant differential expression in response to hormone treatments. Six T-DNA or Tos17 insertion lines for five individual OsIRL genes were identified and examined morphologically. The comprehensive expression profile elucidated in this investigation together with the characterized insertion lines will provide a solid foundation for in-depth dissection of OsIRL functions.

13.
It has been well established that gene expression data contain large amounts of random variation that affects both the analysis and the results of microarray experiments. Typically, microarray data are either tested for differential expression between conditions or grouped on the basis of profiles that are assessed temporally or across genetic or environmental conditions. While testing differential expression relies on levels of certainty to evaluate the relative worth of various analyses, cluster analysis is exploratory in nature and has not had the benefit of any judgment of statistical inference. By using a novel dissimilarity function to ascertain gene expression clusters and conditional randomization of the data space to illuminate distinctions between statistically significant clusters of gene expression patterns, we aim to provide a level of confidence to inferred clusters of gene expression data. We apply both permutation and convex hull approaches for randomization of the data space and show that both methods can provide an effective assessment of gene expression profiles whose coregulation is statistically different from that expected by random chance alone.
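A permutation approach to cluster confidence can be sketched roughly as follows. This is a generic illustration, not the authors' procedure: the abstract does not define their dissimilarity function or their conditional randomization scheme, so the sketch assumes mean pairwise Pearson correlation as the tightness statistic and shuffles each gene's values across conditions independently. All names and data are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def mean_pairwise_correlation(profiles):
    """Average Pearson correlation over all pairs of profiles."""
    pairs = [(i, j) for i in range(len(profiles))
             for j in range(i + 1, len(profiles))]
    return sum(pearson(profiles[i], profiles[j]) for i, j in pairs) / len(pairs)

def permutation_p_value(cluster, n_perm=200, seed=0):
    """Share of shuffled data sets at least as tight as the observed cluster."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = mean_pairwise_correlation(cluster)
    hits = 0
    for _ in range(n_perm):
        # Shuffle each gene's values across conditions independently.
        shuffled = [rng.sample(profile, len(profile)) for profile in cluster]
        if mean_pairwise_correlation(shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # add-one correction avoids p = 0

# Invented cluster of three genes that rise together over eight conditions.
cluster = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    [2.1, 3.9, 6.2, 8.1, 9.8, 12.3, 13.9, 16.2],
    [0.5, 1.1, 1.4, 2.2, 2.4, 3.1, 3.4, 4.1],
]
p_value = permutation_p_value(cluster)
```

A small `p_value` suggests that the coordinated pattern is unlikely under this particular null. The choice of null matters: shuffling within genes preserves each gene's marginal distribution but destroys any shared structure across conditions.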

14.
MOTIVATION: Recent advances in DNA microarray technologies have made it possible to measure the expression levels of thousands of genes simultaneously under different conditions. The data obtained by microarray analyses are called expression profile data. One type of important information underlying the expression profile data is the 'genetic network,' that is, the regulatory network among genes. Graphical Gaussian Modeling (GGM) is a widely utilized method to infer or test relationships among multiple variables. RESULTS: In this study, we developed a method combining cluster analysis with GGM for the inference of the genetic network from the expression profile data. The expression profile data of 2467 Saccharomyces cerevisiae genes measured under 79 different conditions (Eisen et al., Proc. Natl Acad. Sci. USA, 95, 14863-14868, 1998) were used for this study. First, the 2467 genes were classified into 34 clusters by a cluster analysis, as a preprocessing step for GGM. Then, the expression levels of the genes in each cluster were averaged for each condition. The averaged expression profile data of 34 clusters were subjected to GGM, and a partial correlation coefficient matrix was obtained as a model of the genetic network of S. cerevisiae. The accuracy of the inferred network was examined by the agreement of our results with the cumulative results of experimental studies.
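The role of partial correlation in such a network model can be shown with the simplest case of three variables. In general the partial correlation matrix is obtained from the inverse of the full correlation matrix; the sketch below uses only the standard first-order formula for three variables, and the correlation values are invented for illustration rather than taken from the study.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))
    return numerator / denominator

# Invented correlations between three cluster-averaged profiles.
# x and y look strongly related, but both closely follow z.
r_xy, r_xz, r_yz = 0.72, 0.9, 0.8

direct_xy = partial_correlation(r_xy, r_xz, r_yz)
direct_xz = partial_correlation(r_xz, r_xy, r_yz)
```

Here the marginal correlation between x and y is 0.72, yet their partial correlation given z is essentially zero, so a network built on partial correlations would draw no direct edge between them. The x to z relationship, by contrast, remains clearly nonzero after conditioning on y.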

15.
The identification of the genes regulating neural progenitor cell (NPC) functions is of great importance to developmental neuroscience and neural repair. Previously, we combined genetic subtraction and microarray analysis to identify genes enriched in neural progenitor cultures. Here, we apply a strategy to further stratify the neural progenitor genes. In situ hybridization demonstrates expression in the central nervous system germinal zones of 54 clones so identified, making them highly relevant for study in brain and neural progenitor development. Using microarray analysis we find 73 genes enriched in three neural stem cell (NSC)-containing populations generated under different conditions. We use the custom microarray to identify 38 "stemness" genes, with enriched expression in the three NSC conditions and present in both embryonic stem cells and hematopoietic stem cells. However, comparison of expression profiles from these stem cell populations indicates that while there is shared gene expression, the amount of genetic overlap is no more than what would be expected by chance, indicating that different stem cells have largely different gene expression patterns. Taken together, these studies identify many genes not previously associated with neural progenitor cell biology and also provide a rational scheme for stratification of microarray data for functional analysis.

16.

Background

A tremendous amount of effort has been devoted to identifying genes for diagnosis and prognosis of diseases using microarray gene expression data. It has been demonstrated that gene expression data have cluster structure, where the clusters consist of co-regulated genes which tend to have coordinated functions. However, most available statistical methods for gene selection do not take the cluster structure into consideration.

Results

We propose a supervised group Lasso approach that takes into account the cluster structure in gene expression data for gene selection and predictive model building. For gene expression data without biological cluster information, we first divide genes into clusters using the K-means approach and determine the optimal number of clusters using the Gap method. The supervised group Lasso consists of two steps. In the first step, we identify important genes within each cluster using the Lasso method. In the second step, we select important clusters using the group Lasso. Tuning parameters are determined using V-fold cross validation at both steps to allow for further flexibility. Prediction performance is evaluated using leave-one-out cross validation. We apply the proposed method to disease classification and survival analysis with microarray data.

Conclusion

We analyze four microarray data sets using the proposed approach: two cancer data sets with binary cancer occurrence as outcomes and two lymphoma data sets with survival outcomes. The results show that the proposed approach is capable of identifying a small number of influential gene clusters and important genes within those clusters, and has better prediction performance than existing methods.
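The preliminary clustering step described in the Results (dividing genes into clusters with K-means when no biological grouping is available) can be sketched in a few lines. This is a minimal plain-Python version of Lloyd's algorithm with a deterministic initialisation chosen only to keep the example reproducible. The Gap statistic for choosing the number of clusters and the Lasso and group Lasso steps are not shown, and the gene profiles are invented.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    """Basic K-means: returns (labels, centroids).

    Assumption for this sketch: the first k points seed the centroids.
    Practical implementations use random or k-means++ starts.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each gene to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # leave an empty cluster's centroid where it is
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Invented expression profiles: two low-expression and two high-expression genes.
genes = [
    [0.1, 0.2, 0.1],
    [5.0, 5.1, 4.9],
    [0.0, 0.3, 0.2],
    [5.2, 4.8, 5.0],
]
labels, centroids = kmeans(genes, k=2)
```

The resulting `labels` would then define the groups within which genes are selected in the first step, and over which whole clusters are selected in the second step.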

17.
18.
This report demonstrates that the genes in the murine Hox-2 cluster display spatially and temporally dynamic patterns of expression in the transverse plane of the developing CNS. All of the Hox-2 genes exhibit changing patterns of expression that reflect events during the ontogeny of the CNS. The observed expression correlates with the timing and location of the birth of major classes of neurons in the spinal cord. Therefore, it is suggested that the Hox-2 genes act to confer rostrocaudal positional information on each successive class of newly born neurons. This analysis has also revealed a striking dorsal restriction in the patterns of Hox-2 expression in the spinal cord between 12.5 and 14.5 days of gestation, which does not appear to correlate with any morphological structure. The cellular retinol binding protein (CRBP) shows a complementary ventral staining pattern, suggesting that a number of genes are dorsoventrally restricted during the development of the CNS. The expression of Hox-2 genes has also been compared with the Hox-3.1 gene, which exhibits a markedly different dorsoventral pattern of expression. This suggests that, while genes in the different murine Hox clusters may have similar A-P domains of expression, they are responding to different dorsoventral patterning signals in the developing spinal cord.

19.
In contrast to mammals, salamanders have a remarkable ability to regenerate their spinal cord and recover full movement and function after tail amputation. To identify genes that may be associated with this greater regenerative ability, we designed an oligonucleotide microarray and profiled early gene expression during natural spinal cord regeneration in Ambystoma mexicanum. We sampled tissue at five early time points after tail amputation and identified genes that registered significant changes in mRNA abundance during the first 7 days of regeneration. A list of 1036 statistically significant genes was identified. Additional statistical and fold change criteria were applied to identify a smaller list of 360 genes that were used to describe predominant expression patterns and gene functions. Our results show that a diverse injury response is activated in concert with extracellular matrix remodeling mechanisms during the early acute phase of natural spinal cord regeneration. We also report gene expression similarities and differences between our study and studies that have profiled gene expression after spinal cord injury in rat. Our study illustrates the utility of a salamander model for identifying genes and gene functions that may enhance regenerative ability in mammals.

20.
Fuzzy J-Means and VNS methods for clustering genes from microarray data   (total citations: 4; self-citations: 0; cited by others: 4)
MOTIVATION: In the interpretation of gene expression data from a group of microarray experiments that include samples from either different patients or conditions, special consideration must be given to the pleiotropic and epistatic roles of genes, as observed in the variation of gene coexpression patterns. Crisp clustering methods assign each gene to one cluster, thereby omitting information about the multiple roles of genes. RESULTS: Here, we present the application of a local search heuristic, Fuzzy J-Means, embedded into the variable neighborhood search metaheuristic for the clustering of microarray gene expression data. We show that for all the datasets studied this algorithm outperforms the standard Fuzzy C-Means heuristic. Different methods for the utilization of cluster membership information in determining gene coregulation are presented. The clustering and data analyses were performed on simulated datasets as well as experimental cDNA microarray data for breast cancer and human blood from the Stanford Microarray Database. AVAILABILITY: The source code of the clustering software (C programming language) is freely available from Nabil.Belacel@nrc-cnrc.gc.ca
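The contrast between crisp and fuzzy assignment can be illustrated with the membership rule of standard Fuzzy C-Means, the baseline named in this abstract. The sketch below computes only the memberships of one gene for fixed cluster centres. It does not implement Fuzzy J-Means or variable neighborhood search, and the fuzzifier value, centres, and profile are invented.

```python
import math

def fuzzy_memberships(point, centres, m=2.0):
    """Fuzzy C-Means membership of one point in each cluster (sums to 1).

    m > 1 is the fuzzifier; values near 1 approach crisp assignment.
    """
    distances = [math.dist(point, centre) for centre in centres]
    # A point sitting exactly on a centre belongs to it fully
    # (shared equally if several centres coincide with the point).
    zero = [d == 0.0 for d in distances]
    if any(zero):
        return [1.0 / sum(zero) if z else 0.0 for z in zero]
    exponent = 2.0 / (m - 1.0)
    return [
        1.0 / sum((d_i / d_k) ** exponent for d_k in distances)
        for d_i in distances
    ]

# Invented one-dimensional example: centres at 0 and 4, gene at 1.
centres = [[0.0], [4.0]]
gene = [1.0]
memberships = fuzzy_memberships(gene, centres)
```

With distances of 1 and 3 and `m = 2`, the gene receives membership 0.9 in the first cluster and 0.1 in the second, so its weaker association with the second cluster is retained rather than discarded as it would be under crisp clustering.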
