Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
ABSTRACT: BACKGROUND: In the postgenome era, a prediction of response to treatment could lead to better dose selection for patients in radiotherapy. To identify a radiosensitive gene signature and elucidate related signaling pathways, four different microarray experiments were reanalyzed before radiotherapy. RESULTS: Radiosensitivity profiling data using clonogenic assay and gene expression profiling data from four published microarray platforms applied to the NCI-60 cancer cell panel were used. The survival fraction at 2 Gy (SF2, range from 0 to 1) was calculated as a measure of radiosensitivity, and a linear regression model was applied to identify genes or a gene set with a correlation between expression and radiosensitivity (SF2). Radiosensitivity signature genes were identified using significance analysis of microarrays (SAM), and gene set analysis was performed using a global test based on a linear regression model. Using the radiation-related signaling pathway and identified genes, a genetic network was generated. According to SAM, 31 genes were identified as common to all the microarray platforms and therefore constitute a common radiosensitivity signature. In gene set analysis, functions in the cell cycle, DNA replication, and cell junction, including adherence and gap junctions, were related to radiosensitivity. The integrin, VEGF, MAPK, p53, JAK-STAT and Wnt signaling pathways were overrepresented in radiosensitivity. Significant genes including ACTN1, CCND1, HCLS1, ITGB5, PFN2, PTPRC, RAB13, and WAS, which are adhesion-related molecules, were identified by both SAM and gene set analysis and showed interaction in the genetic network with the integrin signaling pathway. CONCLUSIONS: Integration of four different microarray experiments and gene selection using gene set analysis discovered possible target genes and pathways relevant to radiosensitivity. Our results suggested that the identified genes are candidates for radiosensitivity biomarkers and that integrin signaling via adhesion molecules could be a target for radiosensitization.
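
The per-gene regression step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the function name `regress_on_sf2`, the choice to regress expression on SF2 (rather than the reverse), and the numbers are all hypothetical and are not taken from the paper.

```python
from math import sqrt


def regress_on_sf2(expression, sf2):
    """Least-squares fit of expression = intercept + slope * SF2.

    Returns (slope, intercept, pearson_r) for one gene across cell lines.
    Hypothetical helper for illustration; not the authors' code.
    """
    if len(expression) != len(sf2) or len(sf2) < 2:
        raise ValueError("need two equally long series with at least 2 points")
    n = len(sf2)
    mean_x = sum(sf2) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in sf2)
    syy = sum((y - mean_y) ** 2 for y in expression)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sf2, expression))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r


# Made-up SF2 values (fraction surviving 2 Gy) and expression levels.
sf2_values = [0.2, 0.4, 0.6, 0.8]
gene_expression = [1.0, 2.0, 3.0, 4.0]
slope, intercept, r = regress_on_sf2(gene_expression, sf2_values)
print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

A gene whose fit shows a strong correlation with SF2 across cell lines would be a candidate for the signature; a real analysis would also need a significance test and multiple-testing correction, which this sketch omits.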

2.

Background  

Gene set enrichment analysis (GSEA) is a microarray data analysis method that uses predefined gene sets and ranks of genes to identify significant biological changes in microarray data sets. GSEA is especially useful when gene expression changes in a given microarray data set are minimal or moderate.
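
How a ranked gene list and a predefined gene set combine into a single score can be illustrated with a simplified, unweighted running-sum statistic. This is a sketch, not the GSEA software: the function name `enrichment_score` and the gene names are hypothetical, and the weighting by correlation used in the full method is left out.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    Walk down the ranked list, stepping up for genes in the set and down
    for genes outside it; return the largest deviation from zero.
    Simplified illustration only.
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the list without covering it")
    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Genes ordered from most up-regulated to most down-regulated (made up).
ranked = ["g1", "g2", "g3", "g4", "g5", "g6"]
print(enrichment_score(ranked, {"g1", "g2"}))  # set concentrated at the top
print(enrichment_score(ranked, {"g5", "g6"}))  # set concentrated at the bottom
```

Because the score accumulates over the whole set, many small shifts in the same direction can produce a large deviation even when no single gene stands out, which is why the approach suits data sets with minimal or moderate expression changes.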

3.
4.
SurvJamda (Survival prediction by joint analysis of microarray data) is an R package that uses joint analysis of microarray gene expression data to predict patient survival and assess risk. Joint analysis can be performed by merging datasets or by meta-analysis to increase the sample size and improve survival prognosis. The prognostic performance derived from the combined datasets can be assessed to determine which feature selection approach, joint analysis method and bias estimation provide the most robust prognosis for a given set of datasets. AVAILABILITY: The survJamda package is available at the Comprehensive R Archive Network, http://cran.r-project.org. CONTACT: hyasrebi@yahoo.com.

5.
We constructed a 60-mer oligonucleotide microarray on the basis of benzene monooxygenase gene diversity to develop a new technology for simultaneous detection of the functional gene diversity in environmental samples. The diversity of the monooxygenase genes associated with benzene degradation was characterized. A new polymerase chain reaction (PCR) primer set was designed using conserved regions of the benzene monooxygenase gene (BO12 primer) and used for PCR-clone library analysis along with a previously designed RDEG primer, which targeted the different types of benzene monooxygenase gene. We obtained 20 types of amino acid sequences with the BO12 primer and 40 with the RDEG primer. Phylogenetic analysis of the sequences obtained suggested a large diversity of benzene monooxygenase genes. A total of 87 60-mer probes specific for each operational taxonomic unit were designed and spotted on a microarray. When genomic DNAs of single strains were used in microarray hybridization assays, corresponding sequences were successfully detected by the microarray without any false-negative signals. Hybridization with soil DNA samples showed that the microarray was able to detect sequences that were not detected in clone libraries. The constructed microarray can be a useful tool for characterizing monooxygenase gene diversity in benzene degradation.

6.
Interactive semisupervised learning for microarray analysis   Total citations: 3, self-citations: 0, citations by others: 3
Microarray technology has generated vast amounts of gene expression data with distinct patterns. Based on the premise that genes of correlated functions tend to exhibit similar expression patterns, various machine learning methods have been applied to capture these specific patterns in microarray data. However, the discrepancy between the rich expression profiles and the limited knowledge of gene functions has been a major hurdle to the understanding of cellular networks. To bridge this gap so as to properly comprehend and interpret expression data, we introduce relevance feedback to microarray analysis and propose an interactive learning framework to incorporate expert knowledge into the decision module. In order to find a good learning method and solve two intrinsic problems in microarray data, high dimensionality and small sample size, we also propose a semisupervised learning algorithm: kernel discriminant-EM (KDEM). This algorithm efficiently utilizes a large set of unlabeled data to compensate for the insufficiency of a small set of labeled data, and it extends the linear algorithm in discriminant-EM (DEM) to a kernel algorithm to handle nonlinearly separable data in a lower dimensional space. The relevance feedback technique and KDEM together construct an efficient and effective interactive semisupervised learning framework for microarray analysis. Extensive experiments on the yeast cell cycle regulation data set and the Plasmodium falciparum red blood cell cycle data set show the promise of this approach.

7.
Aim: To understand soil benzene monooxygenase gene diversity by clone library construction and microarray profiling. Methods and Results: A primer set was designed, and benzene monooxygenase gene diversity was characterized in two benzene‐amended soils. The dominant sequence types in the clone libraries were distinct between the two soils, and both sequences were assigned to novel clusters. Monooxygenase gene richness and diversity increased after benzene degradation. Oligonucleotide probes for microarray analysis were designed to detect a number of sequenced clones and reported monooxygenase genes. The microarray detected several genes that were not detected in the clone libraries of the same samples. Six probes were detected in more than one soil. Conclusions: The primer set designed in this study successfully detected diverse benzene monooxygenase genes. The level of diversity may have increased because the degradation of benzene differed from soil to soil. Microarrays have great potential in the comprehensive detection of gene richness as well as the elucidation of key genes for degradation. Significance and Impact of the Study: This study introduces a new primer set that may be used to identify diverse benzene monooxygenase genes in the environment; moreover, it demonstrates the potential of microarray technology in the profiling of environmental samples.

8.
Identity gene expression in Proteus mirabilis   Total citations: 1, self-citations: 0, citations by others: 1
Swarming colonies of independent Proteus mirabilis isolates recognize each other as foreign and do not merge together, whereas apposing swarms of clonal isolates merge with each other. Swarms of mutants with deletions in the ids gene cluster do not merge with their parent. Thus, ids genes are involved in the ability of P. mirabilis to distinguish self from nonself. Here we have characterized expression of the ids genes. We show that idsABCDEF genes are transcribed as an operon, and we define the promoter region upstream of idsA by deletion analysis. Expression of the ids operon increased in late logarithmic and early stationary phases and appeared to be bistable. Approaching swarms of nonself populations led to increased ids expression and increased the abundance of ids-expressing cells in the bimodal population. This information on ids gene expression provides a foundation for further understanding the molecular details of self-nonself discrimination in P. mirabilis.

9.
DNA microarray is an important tool for the study of gene activities but the resultant data consisting of thousands of points are error-prone. A serious limitation in microarray analysis is the unreliability of the data generated from low signal intensities. Such data may produce erroneous gene expression ratios and cause unnecessary validation or post-analysis follow-up tasks. In this study, we describe an approach based on normal mixture modeling for determining optimal signal intensity thresholds to identify reliable measurements of the microarray elements and subsequently eliminate false expression ratios. We used univariate and bivariate mixture modeling to segregate the microarray data into two classes, low signal intensity and reliable signal intensity populations, and applied Bayesian decision theory to find the optimal signal thresholds. The bivariate analysis approach was found to be more accurate than the univariate approach; both approaches were superior to a conventional method when validated against a reference set of biological data that consisted of true and false gene expression data. Elimination of unreliable signal intensities in microarray data should contribute to the quality of microarray data including reproducibility and reliability of gene expression ratios.

10.
Rice (Oryza sativa) feeds over half of the global population. A web-based integrated platform for rice microarray annotation and data analysis in various biological contexts is presented, which provides a convenient query for comprehensive annotation compared with similar databases. Coupled with existing rice microarray data, it provides online analysis methods from the perspective of bioinformatics. This comprehensive bioinformatics analysis platform is composed of five modules: data retrieval, microarray annotation, sequence analysis, results visualization and data analysis. The BioChip module facilitates the retrieval of microarray data information via the identifiers “Probe Set ID”, “Locus ID” and “Analysis Name”. The BioAnno module is used to annotate the gene or probe set based on the gene function, the domain information, the KEGG biochemical and regulatory pathways and the potential microRNA which regulates the genes. The BioSeq module lists all of the related sequence information by microarray probe set. The BioView module provides various visual results for the microarray data. The BioAnaly module is used to analyze rice microarray data sets.

11.
MOTIVATION: Time series experiments of cDNA microarrays have been commonly used in various biological studies and conducted under many experimental factors. A popular approach to time series microarray analysis is to compare one gene with another in their expression profiles, and clustering expression sequences is a typical example. On the other hand, a practically important issue in gene expression is to identify the general timing difference that is caused by experimental factors. This type of difference can be extracted by comparing a set of time series expression profiles under one factor with those under another factor, and so it would be difficult to tackle this issue using only current approaches to time series microarray analysis. RESULTS: We have developed a systematic method to capture the timing difference in gene expression under different experimental factors, based on hidden Markov models. Our model outputs a real-valued vector at each state and has a unique state transition diagram. The parameters of our model are trained from a given set of pairwise (generally multiplewise) expression sequences. We evaluated our model using synthetic as well as real microarray datasets. The results of our experiment indicate that our method worked favourably to identify the timing ordering under different experimental factors, for example that gene expression under heat shock tended to start earlier than that under oxidative stress. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

12.
MOTIVATION: Gene set analysis allows formal testing of subtle but coordinated changes in a group of genes, such as those defined by Gene Ontology (GO) or KEGG Pathway databases. We propose a new method for gene set analysis that is based on principal component analysis (PCA) of gene expression values in the gene set. PCA is an effective method for reducing high dimensionality and capturing variations in gene expression values. However, one limitation of PCA is that the latent variable identified by the first PC may be unrelated to outcome. RESULTS: In the proposed supervised PCA (SPCA) model for gene set analysis, the PCs are estimated from a selected subset of genes that are associated with outcome. As outcome information is used in the gene selection step, this method is supervised, and is thus called the Supervised PCA model. Because of the gene selection step, the test statistic in the SPCA model can no longer be approximated well using a t-distribution. We propose a two-component mixture distribution based on Gumbel extreme value distributions to account for the gene selection step. We show the proposed method compares favorably to currently available gene set analysis methods using simulated and real microarray data. SOFTWARE: The R code for the analysis used in this article is available upon request; we are currently working on implementing the proposed method in an R package.

13.
We have evaluated the performance characteristics of three quantitative gene expression technologies and correlated their expression measurements to those of five commercial microarray platforms, based on the MicroArray Quality Control (MAQC) data set. The limit of detection, assay range, precision, accuracy and fold-change correlations were assessed for 997 TaqMan Gene Expression Assays, 205 Standardized RT (Sta)RT-PCR assays and 244 QuantiGene assays. TaqMan is a registered trademark of Roche Molecular Systems, Inc. We observed high correlation between quantitative gene expression values and microarray platform results and found few discordant measurements among all platforms. The main cause of variability was differences in probe sequence and thus target location. A second source of variability was the limited and variable sensitivity of the different microarray platforms for detecting weakly expressed genes, which affected interplatform and intersite reproducibility of differentially expressed genes. From this analysis, we conclude that the MAQC microarray data set has been validated by alternative quantitative gene expression platforms thus supporting the use of microarray platforms for the quantitative characterization of gene expression.

14.
We present a new computational technique (a software implementation, data sets, and supplementary information are available at http://www.enm.bris.ac.uk/lpd/) which enables the probabilistic analysis of cDNA microarray data and we demonstrate its effectiveness in identifying features of biomedical importance. A hierarchical Bayesian model, called Latent Process Decomposition (LPD), is introduced in which each sample in the data set is represented as a combinatorial mixture over a finite set of latent processes, which are expected to correspond to biological processes. Parameters in the model are estimated using efficient variational methods. This type of probabilistic model is most appropriate for the interpretation of measurement data generated by cDNA microarray technology. For determining informative substructure in such data sets, the proposed model has several important advantages over the standard use of dendrograms. First, the ability to objectively assess the optimal number of sample clusters. Second, the ability to represent samples and gene expression levels using a common set of latent variables (dendrograms cluster samples and gene expression values separately, which amounts to two distinct reduced space representations). Third, in contrast to standard cluster models, observations are not assigned to a single cluster and, thus, for example, gene expression levels are modeled via combinations of the latent processes identified by the algorithm. We show this new method compares favorably with alternative cluster analysis methods. To illustrate its potential, we apply the proposed technique to several microarray data sets for cancer. For these data sets it successfully decomposes the data into known subtypes and indicates possible further taxonomic subdivision in addition to highlighting, in a wholly unsupervised manner, the importance of certain genes which are known to be medically significant. To illustrate its wider applicability, we also demonstrate its performance on a microarray data set for yeast.

15.
MOTIVATION: Microarray designs containing millions to hundreds of millions of probes that tile entire genomes are currently being released. Within the next 2 months, our group will release a microarray data set containing over 12,000,000 microarray measurements taken from 37 mouse tissues. A problem that will become increasingly significant in the upcoming era of genome-wide exon-tiling microarray experiments is the removal of cross-hybridization noise. We present a probabilistic generative model for cross-hybridization in microarray data and a corresponding variational learning method for cross-hybridization compensation, GenXHC, that reduces cross-hybridization noise by taking into account multiple sources for each mRNA expression level measurement, as well as prior knowledge of hybridization similarities between the nucleotide sequences of microarray probes and their target cDNAs. RESULTS: The algorithm is applied to a subset of an exon-resolution genome-wide Agilent microarray data set for chromosome 16 of Mus musculus and is found to produce statistically significant reductions in cross-hybridization noise. The denoised data is found to produce enrichment in multiple gene ontology-biological process (GO-BP) functional groups. The algorithm is found to outperform robust multi-array analysis, another method for cross-hybridization compensation.

16.
Gene-set analysis aims to identify differentially expressed gene sets (pathways) by a phenotype in DNA microarray studies. We review here important methodological aspects of gene-set analysis and illustrate them with varying performance of several methods proposed in the literature. We emphasize the importance of distinguishing between ‘self-contained’ versus ‘competitive’ methods, following Goeman and Bühlmann. We also discuss reducing a gene set to its subset, consisting of ‘core members’ that chiefly contribute to the statistical significance of the differential expression of the initial gene set by phenotype. Significance analysis of microarray for gene-set reduction (SAM-GSR) can be used for an analytical reduction of gene sets to their core subsets. We apply SAM-GSR on a microarray dataset for identifying biological gene sets (pathways) whose gene expressions are associated with p53 mutation in cancer cell lines. Codes to implement SAM-GSR in the statistical package R can be downloaded from http://www.ualberta.ca/~yyasui/homepage.html.

17.
SUMMARY: MeSHer uses a simple statistical approach to identify biological concepts in the form of Medical Subject Headings (MeSH terms) obtained from the PubMed database that are significantly overrepresented within the identified gene set relative to those associated with the overall collection of genes on the underlying DNA microarray platform. As a demonstration, we apply this approach to gene lists acquired from a published study of the effects of angiotensin II (Ang II) treatment on cardiac gene expression and demonstrate that this approach can aid in the interpretation of the resulting 'significant' gene set. AVAILABILITY: The software is available at http://www.tm4.org. SUPPLEMENTARY INFORMATION: Results from the analysis of significant genes from the published Ang II study.
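
Testing whether a term is overrepresented in a gene list relative to the whole platform is commonly done with a one-sided hypergeometric test. The abstract does not say which statistic MeSHer uses, so the sketch below is an assumption for illustration; the function name `overrepresentation_p` and all counts are hypothetical.

```python
from math import comb


def overrepresentation_p(k, n, K, N):
    """One-sided hypergeometric p-value P(X >= k).

    N: genes on the platform, K: platform genes annotated with the term,
    n: genes in the selected list, k: listed genes annotated with the term.
    Illustrative only; not taken from the MeSHer software.
    """
    if not (0 <= K <= N and 0 <= n <= N and 0 <= k <= min(n, K)):
        raise ValueError("counts are inconsistent")
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


# Made-up counts: 10 genes on the platform, 3 carry the term, and all
# 3 genes in the selected list carry it.
print(overrepresentation_p(k=3, n=3, K=3, N=10))
```

A small p-value suggests the term appears in the list more often than chance alone would explain; when many terms are tested at once, a multiple-testing correction is needed on top of this.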

18.
MOTIVATION: There is a very large and growing level of effort toward improving the platforms, experiment designs, and data analysis methods for microarray expression profiling. Along with a growing richness in the approaches there is a growing confusion among most scientists as to how to make objective comparisons and choices between them for different applications. There is a need for a standard framework for the microarray community to compare and improve analytical and statistical methods. RESULTS: We report on a microarray data set comprising 204 in-situ synthesized oligonucleotide arrays, each hybridized with two-color cDNA samples derived from 20 different human tissues and cell lines. Design of the approximately 24 000 60mer oligonucleotides that report approximately 2500 known genes on the arrays, and design of the hybridization experiments, were carried out in a way that supports the performance assessment of alternative data processing approaches and of alternative experiment and array designs. We also propose standard figures of merit for success in detecting individual differential expression changes or expression levels, and for detecting similarities and differences in expression patterns across genes and experiments. We expect this data set and the proposed figures of merit will provide a standard framework for much of the microarray community to compare and improve many analytical and statistical methods relevant to microarray data analysis, including image processing, normalization, error modeling, combining of multiple reporters per gene, use of replicate experiments, and sample referencing schemes in measurements based on expression change. AVAILABILITY/SUPPLEMENTARY INFORMATION: Expression data and supplementary information are available at http://www.rii.com/publications/2003/HE_SDS.htm

19.
Prostate cancer is one of the most common male malignant neoplasms; however, its causes are not completely understood. A few recent studies have used gene expression profiling of prostate cancer to identify differentially expressed genes and possible relevant pathways. However, few studies have examined the genetic mechanics of prostate cancer at the pathway level to search for such pathways. We used gene set enrichment analysis and a meta-analysis of six independent studies after standardized microarray preprocessing, which increased concordance between these gene datasets. Based on gene set enrichment analysis, there were 12 down- and 25 up-regulated mixing pathways in more than two tissue datasets, while there were two down- and two up-regulated mixing pathways in three cell datasets. Based on the meta-analysis, there were 46 and nine common pathways in the tissue and cell datasets, respectively. Three up- and 10 down-regulated crossing pathways were detected with combined gene set enrichment analysis and meta-analysis. We found that genes with small changes are difficult to detect by classic univariate statistics; they can more easily be identified by pathway analysis. After standardized microarray preprocessing, we applied gene set enrichment analysis and a meta-analysis to increase the concordance in identifying biological mechanisms involved in prostate cancer. The gene pathways that we identified could provide insight concerning the development of prostate cancer.

20.
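Objective: To develop a whole-genome DNA microarray for Streptococcus suis serotype 2 (SS2) and establish a technical platform for SS2 gene expression profiling. Methods: Based on the SS2 whole-genome sequence, 2194 genes were selected; 2156 of them were amplified by PCR, and the products were purified and spotted to prepare the microarray. The microarray was used for expression profiling, real-time quantitative PCR was used to validate the expression profiling results, and the reliability of the microarray was analyzed. Results: The microarray hybridization data and the real-time quantitative PCR validation showed a high correlation, with a correlation coefficient of r = 0.87. Conclusion: A batch of SS2 whole-genome DNA microarrays was produced, and a DNA microarray-based expression profiling platform was established.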
