Similar literature
 20 similar documents found (search time: 31 ms)
1.
We present a new WWW-based tool for plant gene analysis, the Arabidopsis Co-Expression Tool (ACT), based on a large Arabidopsis thaliana microarray data set obtained from the Nottingham Arabidopsis Stock Centre. The co-expression analysis tool allows users to identify genes whose expression patterns are correlated across selected experiments or the complete data set. Results are accompanied by estimates of the statistical significance of the correlation relationships, expressed as probability (P) and expectation (E) values. Additionally, highly ranked genes on a correlation list can be examined using the novel clique finder tool to determine the sets of genes most likely to be regulated in a similar manner. In combination, these tools offer three levels of analysis: creation of correlation lists of co-expressed genes, refinement of these lists using two-dimensional scatter plots, and dissection into cliques of co-regulated genes. We illustrate the applications of the software by analysing genes encoding functionally related proteins, as well as pathways involved in plant responses to environmental stimuli. These analyses demonstrate novel biological relationships underlying the observed gene co-expression patterns. To demonstrate the ability of the software to develop testable hypotheses on gene function within a defined biological process we have used the example of cell wall biosynthesis genes. The resource is freely available at http://www.arabidopsis.leeds.ac.uk/ACT/
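The first level of analysis described above, building a correlation list for a query gene, can be sketched as follows. This is a minimal illustration that assumes the Pearson correlation coefficient (the abstract does not name the measure) and uses invented gene identifiers and expression values; it is not the ACT implementation.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def correlation_list(query, profiles):
    """Rank every other gene by its correlation with the query, highest first."""
    q = profiles[query]
    scores = [(g, pearson(q, p)) for g, p in profiles.items() if g != query]
    return sorted(scores, key=lambda t: t[1], reverse=True)


# Hypothetical toy data: identifiers and values are invented for illustration.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
    "geneD": [1.0, 3.0, 2.0, 4.0],
}
ranked = correlation_list("geneA", profiles)
for gene, r in ranked:
    print(f"{gene}\t{r:+.2f}")
```

The P and E values mentioned in the abstract would be computed on top of such a list; how ACT derives them is not stated here, so they are left out of the sketch.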

2.
To enhance glioblastoma (GB) marker discovery, we compared gene expression in GB with human normal brain (NB) by accessing the SAGE Genie web site and compared the results with published data. Nine GB and five NB SAGE libraries were analyzed using the Digital Gene Expression Displayer (DGED); the results of DGED were tested by Northern blot analysis and RT-PCR of arbitrarily selected genes. Review of available data from the articles on gene expression profiling by microarray-based hybridization showed as few as 35 overlapping genes with increased expression in GB. Some of them were identified in four articles, but most genes were identified in only three or even two investigations. Some differences were also found between SAGE results of GB analysis. The Digital Gene Expression Displayer approach revealed 676 genes differentially expressed in GB vs. NB, using a cutoff of twofold change and P ≤ 0.05. Differential expression of selected genes obtained by DGED was confirmed by Northern analysis and RT-PCR. Altogether, only 105 of 955 genes presented in published investigations were among the genes obtained by DGED. Comparison of the results obtained by microarrays and SAGE is very complicated because the authors present only the most prominent differentially expressed genes. However, even the available data show quite poor overlap among the genes revealed by microarrays. Some differences between results obtained by SAGE in different investigations can be explained by high dependence on the statistical methods used. For now, the best way to search for molecular tumor markers is to compare all available results and to select only those genes for which significant expression in tumors combined with very low expression in normal tissues was reproduced in several articles. One hundred five differentially expressed genes, common to both methods, can be included in the list of candidates for the molecular typing of GBs. Some genes encoding cell surface or extracellular proteins may be useful for targeting gliomas with antibody-based therapy. The text was submitted by the authors in English.

3.
Recent developments in single-molecule detection techniques, and in particular the introduction of fluorescence correlation spectroscopy (FCS), have led to a number of important applications in biological research. We present a unique approach for gene expression analysis using dual-color cross-correlation. The expression assay is based on gene-specific hybridization of two dye-labeled DNA probes to a selected target gene. Counting the dual-labeled molecules within the solution allows the quantification of the expressed gene copies in absolute numbers. As detection and analysis by FCS can be performed at the level of single molecules, there is no need for any type of amplification. We describe the gene expression assay and present data demonstrating the capacity of this novel technology. In order to prove the gene specificity, we performed experiments with gene-depleted total cDNA. The biological application was demonstrated by quantifying selected high-, medium- and low-abundance genes in cDNA prepared from HL-60 cells.

4.
5.
Seventeen metals were measured in scalp hair samples from cerebral palsy patients (CPPs) and controls. Samples were collected from 95 CPPs and 93 controls. The nitric acid-perchloric acid wet digestion procedure was used for quantification of the selected metals by flame atomic absorption spectrophotometry. The concentrations of Ag, Ca, Cd, Co, Cr, Li, and Mg were significantly higher and those of Cu, Fe, K, Mn, Na, Ni, Pb, and Sb were lower in the hair of CPPs compared with controls. A strong positive correlation was found between Ca and Mg in the hair of controls but not in that of CPPs. Antimony showed a significant negative correlation with Co and Cu in the CPP group but not in the controls. Principal component analysis (PCA) of the data extracted seven factors for CPPs and six factors for controls. Cluster analysis (CA) was also used to support the PCA results. The study indicated a specific source of Mg and Sb in the hair of CPPs.

6.
Lung cancer is the most talked-about cancer in the world. It is also one of the cancers that currently has a high mortality rate. The aim of our research was to find more effective therapeutic targets and prognostic markers for human lung cancer. First, we downloaded gene expression data from the GEO database. We performed weighted co-expression network analysis on the selected genes, constructed scale-free networks and topological overlap matrices, and performed module correlation analysis with the cancer group. We screened the 200 genes with the highest correlation in the cyan module for functional enrichment analysis and protein interaction network construction. Most of them were involved in cell division, tumor necrosis factor-mediated signaling pathways, cellular redox homeostasis, reactive oxygen species biosynthesis, and other processes, and were related to the cell cycle, apoptosis, the HIF-1 signaling pathway, the p53 signaling pathway, the NF-κB signaling pathway, and several cancer disease pathways. Finally, we used the GEPIA website data to perform survival analysis on some of the genes with GS > 0.6 in the cyan module. CBX3, AHCY, MRPL12, TPGB, TUBG1, KIF11, LRRC59, MRPL17, TMEM106B, ZWINT, TRIP13, and HMMR were identified as important prognostic factors for lung cancer patients. In summary, we identified 12 mRNAs associated with lung cancer prognosis. Our study contributes to a deeper understanding of the molecular mechanisms of lung cancer and provides new insights into drug use and prognosis.

7.
Benthic invertebrate data from thirty-nine lakes in south-central Ontario were analyzed to determine the effect of choosing particular data standardizations, resemblance measures, and ordination methods on the resultant multivariate summaries. Logarithmic-transformed, 0–1 scaled, and ranked data were used as standardized variables with resemblance measures of Bray-Curtis, Euclidean distance, cosine distance, correlation, covariance and chi-squared distance. Combinations of these measures and standardizations were used in principal components analysis, principal coordinates analysis, non-metric multidimensional scaling, correspondence analysis, and detrended correspondence analysis. Correspondence analysis and principal components analysis using a correlation coefficient provided the most consistent results irrespective of the choice in data standardization. Other approaches using detrended correspondence analysis, principal components analysis, principal coordinates analysis, and non-metric multidimensional scaling provided less consistent results. These latter three methods produced similar results when the abundance data were replaced with ranks or standardized to a 0–1 range. The log-transformed data produced the least consistent results, whereas ranked data were most consistent. Resemblance measures such as the Bray-Curtis and correlation coefficient provided more consistent solutions than measures such as Euclidean distance or the covariance matrix when different data standardizations were used. The cosine distance based on standardized data provided results comparable to the CA and DCA solutions. Overall, CA proved most robust as it demonstrated high consistency irrespective of the data standardizations. The strong influence of data standardization on the other ordination methods emphasizes the importance of this frequently neglected stage of data analysis.
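To make the interaction between standardization and resemblance measure concrete, the sketch below computes the Bray-Curtis dissimilarity on raw and on 0–1 scaled abundances. The abundance values are invented, and scaling each vector independently is a simplification assumed here; the abstract does not describe the study's exact standardization procedure.

```python
def bray_curtis(x, y):
    """Bray-Curtis dissimilarity: sum of |x_i - y_i| over sum of (x_i + y_i)."""
    total = sum(a + b for a, b in zip(x, y))
    if total == 0:
        raise ValueError("Bray-Curtis is undefined when both vectors are all zero")
    return sum(abs(a - b) for a, b in zip(x, y)) / total


def scale_01(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical abundances of three taxa in two lakes (invented numbers).
lake_a = [10, 0, 5]
lake_b = [4, 2, 0]

raw = bray_curtis(lake_a, lake_b)
scaled = bray_curtis(scale_01(lake_a), scale_01(lake_b))
print(f"raw: {raw:.3f}  0-1 scaled: {scaled:.3f}")
```

The same pair of samples yields a different dissimilarity after scaling, which is the kind of sensitivity the abstract reports for several ordination methods.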

8.
Limited amounts of RNA are a major issue in cDNA microarray analysis, especially when one is dealing with fresh tissue samples. Here we describe a protocol based on template switch and T7 amplification that led to efficient and linear amplification of 1300x. Using a glass array containing 368 genes printed in three or six replicas covering a wide range of expression levels and ratios, we determined the quality and reproducibility of the data obtained from one nonamplified and two independently amplified RNAs (aRNA) derived from normal and tumor samples using replicas with dye exchange (dye-swap measurements). Overall, signal-to-noise ratio improved when we used aRNA (1.45-fold for channel 1 and 2.02-fold for channel 2), increasing by 6% the number of spots with meaningful data. Measurements arising from independent aRNA samples showed strong correlation among themselves (r² = 0.962) and with those from the nonamplified sample (r² = 0.975), indicating the reproducibility and fidelity of the amplification procedure. Measurement differences, i.e., spots with poor correlation between amplified and nonamplified measurements, did not show association with gene sequence, expression intensity, or expression ratio and can, therefore, be compensated for with replication. In conclusion, aRNA can be used routinely in cDNA microarray analysis, leading to improved quality of data with high fidelity and reproducibility.

9.
DNA methylation is an early event in tumorigenesis. Here, by integrative analysis of DNA methylation and gene expression and utilizing machine learning approaches, we introduced potential diagnostic and prognostic methylation signatures for stomach cancer. Differentially-methylated positions (DMPs) and differentially-expressed genes (DEGs) were identified using The Cancer Genome Atlas (TCGA) stomach adenocarcinoma (STAD) data. A total of 256 DMPs consisting of 140 and 116 hyper- and hypomethylated positions were identified between 443 tumour and 27 nontumour STAD samples. Gene expression analysis revealed a total of 2821 DEGs with 1247 upregulated and 1574 downregulated genes. By analysing the impact of cis and trans regulation of methylation on gene expression, a dominant negative correlation between methylation and expression was observed, while for trans regulation, in hypermethylated and hypomethylated genes, there was mainly a negative and positive correlation with gene expression, respectively. To find diagnostic biomarkers, we used 28 hypermethylated probes located in the promoters of 27 downregulated genes. By implementing a feature selection approach, eight probes were selected and then used to build a support vector machine diagnostic model, which had an area under the curve of 0.99 and 0.97 in the training and validation (GSE30601 with 203 tumour and 94 nontumour samples) cohorts, respectively. Using 412 TCGA-STAD samples with both methylation and clinical data, we also identified four prognostic probes by implementing univariate and multivariate Cox regression analysis. In summary, our study introduced potential diagnostic and prognostic biomarkers for STAD, which demand further validation.

10.
To enhance glioblastoma (GB) marker discovery we compared gene expression in GB with human normal brain (NB) by accessing the SAGE Genie web site and compared the obtained results with published data. Nine GB and five NB SAGE libraries were analyzed using the Digital Gene Expression Displayer (DGED); the results of DGED were tested by Northern blot analysis and RT-PCR of arbitrarily selected genes. Review of available data from the articles on gene expression profiling by microarray-based hybridization showed as few as 35 overlapping genes with increased expression in GB. Some of them were identified in four articles, but most genes in only three or even two investigations. Some differences were also found between SAGE results of GB analysis. The Digital Gene Expression Displayer approach revealed 676 genes differentially expressed in GB vs. NB, using a cutoff of twofold change and P ≤ 0.05. Differential expression of selected genes obtained by DGED was confirmed by Northern analysis and RT-PCR. Altogether, only 105 of 955 genes presented in published investigations were among the genes obtained by DGED. Comparison of the results obtained by microarrays and SAGE is very complicated because authors present only the most prominent differentially expressed genes. However, even the available data show quite poor overlap among the genes revealed by microarrays. Some differences between results obtained by SAGE in different investigations can be explained by high dependence on the statistical methods used. For now, the best way to search for molecular tumor markers is to compare all available results and to select only those genes for which significant expression in tumors combined with very low expression in normal tissues was reproduced in several articles. The 105 differentially expressed genes common to both methods can be included in the list of candidates for the molecular typing of GBs. Some genes encoding cell surface or extracellular proteins may be useful for targeting gliomas with antibody-based therapy.

11.
AIM: We investigated the use of non-linear, multidimensional factor analysis for the study of observational data on death from breast cancer. These data were obtained in the context of clinical practice and not in a clinical trial. We looked into the correlations between patient characteristics and time of death and/or disease-free interval. PATIENTS AND METHODS: We first analyzed the characteristics of a population of patients that had died from breast cancer (n = 295), then of a population including patients still alive 7 years after surgery (n = 344). We used correspondence analysis (CA), which is based on chi-squared (χ²) metrics, does not assume linear relationships, and provides graphic overviews. RESULTS: The CA mapped variables (clinical stage, histoprognostic grade, node status, receptor positivity) in a way that fits in well with available knowledge on their importance as prognostic factors. We observed, however, that death occurred during three main periods (1-3, 4-7, and ≥8 years after surgery) defined by different mixes of variables, as if the disease progressed by stage rather than continuously. The CA distinguished long-term survivors (>7 years) from patients who died 8-10 years after surgery. Long-term survivors tended to be node-negative; those who died at 8-10 years tended to be the youngest patients (under 40). CONCLUSIONS: Because correspondence analysis combines the advantages of multidimensional and non-linear methods, it is a valuable exploratory tool for describing multiple correlations within a population before attempting to establish statistical significance of selected variables by more classic methods.

12.
The effective extraction of information from multidimensional data sets derived from phenotyping experiments is a growing challenge in biology. Data visualization tools are important resources that can aid in exploratory data analysis of complex data sets. Phenotyping experiments of model organisms produce data sets in which a large number of phenotypic measures are collected for each individual in a group. A critical initial step in the analysis of such multidimensional data sets is the exploratory analysis of data distribution and correlation. To facilitate the rapid visualization and exploratory analysis of multidimensional complex trait data, we have developed a user-friendly, web-based software tool called Phenostat. Phenostat is composed of a dynamic graphical environment that allows the user to inspect the distribution of multiple variables in a data set simultaneously. Individuals can be selected by directly clicking on the graphs and thus displaying their identity, highlighting corresponding values in all graphs, allowing their inclusion or exclusion from the analysis. Statistical analysis is provided by R package functions. Phenostat is particularly suited for rapid distribution and correlation analysis of subsets of data. An analysis of behavioral and physiologic data stemming from a large mouse phenotyping experiment using Phenostat reveals previously unsuspected correlations. Phenostat is freely available to academic institutions and nonprofit organizations and can be used from our website at .

13.
We compared the changes in the cells in the basal layer of normal mucosa, oral leukoplakia with dysplasia and different grades of oral squamous cell carcinoma (OSCC) using computer aided image analysis of tissue sections. We investigated three morphometric parameters: nuclear area (NA), cell area (CA) and their ratio (NA:CA). NA and NA:CA ratio showed a statistically significant increase from dysplasia to increasing grades of OSCC. Nuclear size was useful for differentiating normal tissue, potentially malignant leukoplakia and OSCC.

14.
DNA microarrays have been widely used in gene expression analysis of biological processes. Due to a lack of sequence information, the applications have been largely restricted to humans and a few model organisms. Presented within this study are results of cross-species hybridization with Affymetrix human high-density oligonucleotide arrays (GeneChip®) using distantly related mammalian species: cattle, pig and dog. Based on the unique feature of the Affymetrix GeneChip® where every gene is represented by multiple probes, we hypothesized that sequence conservation within mammals is high enough to generate sufficient signals from some of the probes for expression analysis. We demonstrated that while overall hybridization signals are low for cross-species hybridization, a few probes of most genes still generated signals equivalent to the same-species hybridization. By masking the poorly hybridized probes electronically, the remaining probes provided reliable data for gene expression analysis. We developed an algorithm to select the reliable probes for analysis utilizing the match/mismatch feature of GeneChip®. When comparing gene expression between two tissues using the selected probes, we found a linear correlation between the cross-species and same-species hybridization. In addition, we validated cross-species hybridization results by quantitative PCR using randomly selected genes. The method shown herein could be applied to both plant and animal research.

15.
MOTIVATION: A common task in microarray data analysis consists of identifying genes associated with a phenotype. When the outcomes of interest are censored time-to-event data, standard approaches assess the effect of genes by fitting univariate survival models. In this paper, we propose a Bayesian variable selection approach, which allows the identification of relevant markers by jointly assessing sets of genes. We consider accelerated failure time (AFT) models with log-normal and log-t distributional assumptions. A data augmentation approach is used to impute the failure times of censored observations and mixture priors are used for the regression coefficients to identify promising subsets of variables. The proposed method provides a unified procedure for the selection of relevant genes and the prediction of survivor functions. RESULTS: We demonstrate the performance of the method on simulated examples and on several microarray datasets. For the simulation study, we consider scenarios with a large number of noisy variables and different degrees of correlation between the relevant and non-relevant (noisy) variables. We are able to identify the correct covariates and obtain good prediction of the survivor functions. For the microarray applications, some of our selected genes are known to be related to the diseases under study and a few are in agreement with findings from other researchers. AVAILABILITY: The Matlab code for implementing the Bayesian variable selection method may be obtained from the corresponding author. CONTACT: mvannucci@stat.tamu.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

16.
Yang HH, Hu Y, Buetow KH, Lee MP. Genomics, 2004, 84(1): 211-217
This study uses a computational approach to analyze coherence of expression of genes in pathways. Microarray data were analyzed with respect to coherent gene expression in a group of genes defined as a pathway in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Our hypothesis is that genes in the same pathway are more likely to be coordinately regulated than a randomly selected gene set. A correlation coefficient for each pair of genes in a pathway was estimated based on gene expression in normal or tumor samples, and statistically significant correlation coefficients were identified. The coherence indicator was defined as the number of gene pairs in the pathway whose correlation coefficients are significant, divided by the total number of gene pairs in the pathway. We defined all genes that appeared in the KEGG pathways as a reference gene set. Our analysis indicated that the mean coherence indicator of pathways is significantly larger than the mean coherence indicator of random gene sets drawn from the reference gene set. Thus, the result supports our hypothesis. The significance of each individual pathway of n genes was evaluated by comparing its coherence indicator with coherence indicators of 1000 random permutation sets of n genes chosen from the reference gene set. We analyzed three data sets: two Affymetrix microarrays and one cDNA microarray. For each of the three data sets, statistically significant pathways were identified among all KEGG pathways. Seven of 96 pathways had a significant coherence indicator in normal tissue and 14 of 96 pathways had a significant coherence indicator in tumor tissue in all three data sets. The increase in the number of pathways with significant coherence indicators may reflect the fact that tumor cells have a higher rate of metabolism than normal cells. Five pathways involved in oxidative phosphorylation, ATP synthesis, protein synthesis, or RNA synthesis were coherent in both normal and tumor tissue, demonstrating that these are essential genes whose high level of expression is required regardless of cell type.
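The coherence indicator and its permutation test, as defined in the abstract above, can be sketched as follows. The abstract does not say how a correlation coefficient was judged significant, so this sketch substitutes a hypothetical fixed cutoff on |r| (`r_cutoff`); the gene names and expression values are likewise invented.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def coherence_indicator(genes, expr, r_cutoff=0.8):
    """Fraction of gene pairs whose correlation counts as significant.

    ASSUMPTION: |r| >= r_cutoff stands in for the significance test,
    which the abstract does not specify.
    """
    pairs = list(combinations(genes, 2))
    hits = sum(1 for g, h in pairs if abs(pearson(expr[g], expr[h])) >= r_cutoff)
    return hits / len(pairs)


def permutation_p_value(pathway, expr, n_sets=1000, seed=0, r_cutoff=0.8):
    """Compare a pathway's indicator with random gene sets of the same size."""
    rng = random.Random(seed)
    reference = sorted(expr)  # reference gene set: every gene with data
    observed = coherence_indicator(pathway, expr, r_cutoff)
    as_extreme = 0
    for _ in range(n_sets):
        random_set = rng.sample(reference, len(pathway))
        if coherence_indicator(random_set, expr, r_cutoff) >= observed:
            as_extreme += 1
    return observed, as_extreme / n_sets


# Hypothetical data: 30 noisy background genes plus a 3-gene coherent "pathway".
data_rng = random.Random(1)
expr = {f"bg{i}": [data_rng.gauss(0, 1) for _ in range(8)] for i in range(30)}
base = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
expr["p1"] = base
expr["p2"] = [2 * v for v in base]
expr["p3"] = [v + 1 for v in base]
pathway = ["p1", "p2", "p3"]

observed, p_value = permutation_p_value(pathway, expr)
print(f"coherence indicator = {observed:.2f}, permutation p = {p_value:.3f}")
```

Drawing random sets from the full reference set, pathway genes included, mirrors the abstract's description of sets "chosen from the reference gene set".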

17.
Ovarian cancer is the most lethal gynaecological cancer, and resistance to platinum-based chemotherapy is the main reason for treatment failure. The aim of the present study was to identify candidate genes involved in ovarian cancer platinum response by analysing genes from the homologous recombination and Fanconi anaemia pathways. Associations between these two functional gene sets were explored in the study, and we performed a random walk algorithm based on a reconstructed gene-gene network, including protein-protein interaction and co-expression relations. Following the random walk, all genes were ranked and GSEA analysis showed that the biological functions focused primarily on autophagy, histone modification and gluconeogenesis. Based on three types of seed nodes, the top two genes were utilized as examples. We selected a total of six candidate genes (FANCA, FANCG, POLD1, KDM1A, BLM and BRCA1) for subsequent verification. The six candidate genes were validated as significant in three independent ovarian cancer data sets with platinum-resistant and platinum-sensitive information. To explore the correlation between biomarkers and clinical prognostic factors, we performed differential analysis and multivariate clinical subgroup analysis for the six candidate genes at both mRNA and protein levels. Each of the six candidate genes and their neighbouring genes with a mutation rate greater than 10% were also analysed by network construction and functional enrichment analysis. In addition, survival analysis for platinum-treated patients was performed in the current study. Finally, the RT-qPCR assay was used to determine the performance of candidate genes in ovarian cancer platinum response. Taken together, this research demonstrated that comprehensive bioinformatics methods could help to understand the molecular mechanism of platinum response and provide new strategies for overcoming platinum resistance in ovarian cancer treatment.

18.
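This study uses breast cancer samples from The Cancer Genome Atlas database as its data set and examines, at the whole-genome level, the changes in gene expression in breast cancer patients from normal tissue to stage I disease. The aim is to find feature genes closely associated with the onset of breast cancer and to establish a pattern-recognition classification method for breast cancer occurrence, providing theoretical support for breast cancer prevention and early diagnosis. Statistical methods including correlation, the t-test, and confidence intervals were combined to build a screening method for feature genes of breast cancer occurrence, yielding 336 feature genes that differ significantly with breast cancer occurrence. Models built with machine learning methods reached a classification accuracy above 98%, higher than that of previous breast cancer studies. KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway analysis identified 8 pathways significantly associated with the genes (P < 0.05), and GO (Gene Ontology) functional enrichment analysis showed 18 functions significantly associated with the genes (P < 0.05). Finally, a brief functional analysis of some of the genes mapped to the 8 pathways illustrated their close relationship at the regulatory level, indicating that the identified feature genes play an important role in the onset of breast cancer, which is important for understanding the pathogenesis of breast cancer and for its early diagnosis.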

19.
20.
Xiao J, Wang X, Hu Z, Tang Z, Xu C. Heredity, 2007, 98(6): 427-435
Segregation analysis is a method of detecting major genes for quantitative traits without using marker information. It serves as an important tool in helping investigators to plan further studies such as quantitative trait loci mapping or more sophisticated genomic analyses. However, current methods of segregation analysis for a single trait typically have low statistical power. We propose a multivariate segregation analysis (MSA) that takes advantage of the correlation structure of multiple quantitative traits to detect major genes. This method not only increases the statistical power, but allows dissection of the genetic architecture underlying the trait complex. In MSA the observed phenotypes of multiple correlated traits are fitted to a multivariate Gaussian mixture model. Model parameters are estimated under the maximum likelihood framework via the expectation-maximization algorithm. The presence of major genes is tested using likelihood ratio test statistics. Pleiotropy is distinguished from close linkage by comparing three possible models using the Bayesian information criterion. Two simulation experiments were performed based on the F2 mating design. In the first, the statistical properties of MSA under varying heritabilities and sample sizes were investigated and the results compared with those obtained from single-trait analysis. In the second simulation the efficacy of MSA in separating pleiotropy from close linkage was demonstrated. Finally, the new method was applied to real data and detected a major gene responsible for both plant height and tiller number in rice.
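As a reading aid for the abstract above, the quantities it names (mixture likelihood, expectation step, likelihood ratio statistic, Bayesian information criterion) can be written in their standard textbook form. The notation is an assumption introduced here, as is the covariance matrix shared across components; the abstract does not give the authors' parameterization.

```latex
% Assumed notation (not from the abstract):
%   y_i   : vector of trait values for individual i, i = 1..n
%   K     : number of major-gene genotype classes (mixture components)
%   pi_k  : mixing proportion of class k;  mu_k : class mean vector
%   Sigma : covariance matrix, assumed common to all classes
%   d     : number of free parameters in the fitted model
\begin{align}
  \ln L(\theta) &= \sum_{i=1}^{n} \ln \sum_{k=1}^{K} \pi_k \,
      \mathcal{N}\!\left(\mathbf{y}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}\right) \\
  w_{ik} &= \frac{\pi_k \, \mathcal{N}\!\left(\mathbf{y}_i \mid \boldsymbol{\mu}_k, \boldsymbol{\Sigma}\right)}
      {\sum_{j=1}^{K} \pi_j \, \mathcal{N}\!\left(\mathbf{y}_i \mid \boldsymbol{\mu}_j, \boldsymbol{\Sigma}\right)}
      && \text{(E-step posterior weight)} \\
  \lambda &= -2\left[\ln L(\hat{\theta}_0) - \ln L(\hat{\theta}_1)\right]
      && \text{(likelihood ratio statistic)} \\
  \mathrm{BIC} &= -2 \ln L(\hat{\theta}) + d \ln n
      && \text{(Bayesian information criterion)}
\end{align}
```

Under this sign convention, the candidate model with the smallest BIC would be preferred when comparing the pleiotropy and close-linkage models.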
