Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
2.
MOTIVATION: Most supervised classification methods are limited by the requirement for more cases than variables. In microarray data the number of variables (genes) far exceeds the number of cases (arrays), and thus filtering and pre-selection of genes is required. We describe the application of Between Group Analysis (BGA) to the analysis of microarray data. A feature of BGA is that it can be used when the number of variables (genes) exceeds the number of cases (arrays). BGA is based on carrying out an ordination of groups of samples, using a standard method such as Correspondence Analysis (COA), rather than an ordination of the individual microarray samples. As such, it can be viewed as a method of carrying out COA with grouped data. RESULTS: We illustrate the power of the method using two cancer data sets. In both cases, we can quickly and accurately classify test samples from any number of specified a priori groups and identify the genes which characterize these groups. We obtained very high rates of correct classification, as determined by jack-knife or validation experiments with training and test sets. The results are comparable to those from other methods in terms of accuracy but the power and flexibility of BGA make it an especially attractive method for the analysis of microarray cancer data.

3.
4.
MOTIVATION: Dilution design (mixed-tissue RNA) has been utilized by some researchers to evaluate and assess the performance of multiple microarray platforms. Current microarray data analysis approaches assume that the quantified signal intensities are linearly related to the expression of the corresponding genes in the sample. However, there are sources of nonlinearity in microarray expression measurements. Studying such nonlinearity in the expression of RNA mixtures provides a new way to analyze gene expression data, and we argue that the nonlinearity can reveal novel information for microarray data analysis. Therefore, we proposed a statistical model, called the proportion model, which is based on linear regression analysis. To approximately quantify the nonlinearity in the dilution design, a new calibration, the beta ratio (BR), was derived from the proportion model. Furthermore, a new adjusted fold change (adj-FC) was proposed to predict the true FC without nonlinearity, in particular for large FC. RESULTS: We applied our method to one microarray dilution dataset. The experimental results indicated that, to some extent, there are global biases compared with the linear assumption for the significant genes. Further analysis of those highly expressed genes with significant nonlinearity revealed some promising results, e.g. a 'poison' effect was discovered for some genes in RNA mixtures. The adj-FCs of those genes with the 'poison' effect indicate that the nonlinearity can also be caused by inherent features of the genes, in addition to signal noise and technical variation. Moreover, when the percentage of overlapping genes (POG) was used as a cross-platform consistency measure, adj-FC outperformed simple fold change in showing that the Affymetrix and Illumina platforms are consistent. AVAILABILITY: The R code which implements all described methods, and some Supplementary material, are freely available from http://www.utdallas.edu/~ying.liu/BetaRatio.htm

5.
MOTIVATION: DNA microarray data analysis has been used previously to identify marker genes which discriminate cancer from normal samples. However, due to the limited sample size of each study, there are few common markers among different studies of the same cancer. With the rapid accumulation of microarray data, it is of great interest to integrate inter-study microarray data to increase sample size, which could lead to the discovery of more reliable markers. RESULTS: We present a novel, simple method of integrating different microarray datasets to identify marker genes and apply the method to prostate cancer datasets. In this study, by applying a new statistical method, referred to as the top-scoring pair (TSP) classifier, we have identified a pair of robust marker genes (HPN and STAT6) by integrating microarray datasets from three different prostate cancer studies. Cross-platform validation shows that the TSP classifier built from the marker gene pair, which simply compares relative expression values, achieves high accuracy, sensitivity and specificity on independent datasets generated using various array platforms. Our findings suggest a new model for the discovery of marker genes from accumulated microarray data and demonstrate how the great wealth of microarray data can be exploited to increase the power of statistical analysis. CONTACT: leixu@jhu.edu.
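
The pair-comparison idea in this abstract can be sketched in a few lines. The following is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the usual TSP score, |P(x_i < x_j | class 0) - P(x_i < x_j | class 1)|, picks the gene pair that maximizes it, and classifies a new sample by the ordering of those two genes alone. The function names, gene indices and toy expression values are all hypothetical.

```python
from itertools import combinations


def prob_less(expr, labels, cls, i, j):
    """Fraction of samples in class `cls` where gene i is expressed below gene j."""
    rows = [row for row, y in zip(expr, labels) if y == cls]
    return sum(row[i] < row[j] for row in rows) / len(rows)


def fit_tsp(expr, labels):
    """Return (i, j, class_if_less) for the highest-scoring gene pair.

    expr is a list of samples, each a list of expression values;
    labels holds 0 or 1 for each sample.
    """
    n_genes = len(expr[0])

    def score(pair):
        i, j = pair
        return abs(prob_less(expr, labels, 0, i, j) - prob_less(expr, labels, 1, i, j))

    i, j = max(combinations(range(n_genes), 2), key=score)
    # The class in which "gene i < gene j" is more frequent is predicted
    # whenever a new sample shows that ordering.
    p0 = prob_less(expr, labels, 0, i, j)
    p1 = prob_less(expr, labels, 1, i, j)
    class_if_less = 0 if p0 > p1 else 1
    return i, j, class_if_less


def predict_tsp(model, sample):
    """Classify one sample by comparing the two selected genes."""
    i, j, class_if_less = model
    return class_if_less if sample[i] < sample[j] else 1 - class_if_less


# Hypothetical toy data: rows are samples, columns are three genes.
expr = [
    [1.0, 5.0, 3.0], [2.0, 6.0, 0.5], [1.0, 4.0, 3.0],  # class 0
    [5.0, 1.0, 3.0], [6.0, 2.0, 0.5], [4.0, 1.0, 3.0],  # class 1
]
labels = [0, 0, 0, 1, 1, 1]
model = fit_tsp(expr, labels)
```

Because the rule depends only on the relative order of two values within a sample, it is unaffected by any monotone rescaling of that sample, which is one plausible reason such a classifier transfers across array platforms.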

6.
SUMMARY: MAPS is a MicroArray Project System for management and interpretation of microarray gene expression experiment information and data. Microarray project information is organized to track experiments and results that are: (1) validated by performing analysis on stored replicate gene expression data; and (2) queried according to the biological classifications of genes deposited on microarray chips.

7.
Highly aggressively proliferating immortalized (HAPI) microglial cells have been used as an in vitro model for investigating key microglial functions including inflammatory, neurotoxic, and phagocytic activities. Through the use of offline strong cation-exchange fractionation followed by inline reversed-phase chromatographic separation and tandem mass spectrometric analysis on a hybrid linear ion trap-Orbitrap instrument, the HAPI microglial proteome was characterized to a depth of 3006 unique protein groups. Upon bioinformatic analysis of the HAPI proteome data set, enrichment was observed for processes relevant to microglial function including those associated with immune system response. This study marks the most comprehensive reference data set generated to date for the rat microglial proteome.

8.
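The gene chip (microarray) is a powerful tool that can simultaneously measure the expression of large numbers of genes in the same tissue. Using 2210 differentially expressed genes in nasopharyngeal carcinoma screened in earlier work, together with the Biocarta signaling pathway resource, we constructed a signaling pathway-based gene interaction network. Through statistical analysis, we further identified a set of genes with a major influence on this gene interaction network (in particular RAN, CEL and RELA). RT-PCR was then used to measure the expression of the candidate genes in nasopharyngeal carcinoma biopsy tissues, and the RAN and CEL genes were found to be highly expressed in as many as 80% of the nasopharyngeal carcinoma tissues. Comparing the network analysis results with those obtained using the ArrayXPath software showed that 40% (32/80) of the genes agreed, which supports the validity and feasibility of the network analysis approach. Finally, a new method for analyzing microarray data was explored and established.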

9.
SpotWhatR is a user-friendly microarray data analysis tool that runs under the widely and freely available R statistical language (http://www.r-project.org) on Windows and Linux operating systems. The aim of SpotWhatR is to help the researcher to analyze microarray data by providing basic tools for data visualization, normalization, determination of differentially expressed genes, summarization by Gene Ontology terms, and clustering analysis. SpotWhatR allows researchers who are not familiar with computational programming to choose the most suitable analysis for their microarray dataset. Along with well-known procedures used in microarray data analysis, we have introduced a stand-alone implementation of the HTself method, especially designed to find differentially expressed genes in low-replication contexts. This approach is more compatible with our local reality than the usual statistical methods. We provide several examples derived from the Blastocladiella emersonii and Xylella fastidiosa Microarray Projects. SpotWhatR is freely available at http://blasto.iq.usp.br/~tkoide/SpotWhatR, in English and Portuguese versions. In addition, the user can choose between "single experiment" and "batch processing" versions.

10.
Aims:  To detect antimicrobial resistance genes in Salmonella isolates from turkey flocks using the microarray technology.
Methods and Results:  A 775 gene probe oligonucleotide microarray was used to detect antimicrobial resistance genes in 34 isolates. All tetracycline-resistant Salmonella harboured tet(A), tet(C) or tet(R), with the exception of one Salmonella serotype Heidelberg isolate. The sul1 gene was detected in 11 of 16 sulfisoxazole-resistant isolates. The aadA, aadA1, aadA2, strA or strB genes were found in aminoglycoside-resistant isolates of Salm. Heidelberg, Salmonella serotype Senftenberg and untypeable Salmonella. The prevalence of mobile genetic elements, such as class I integron and transposon genes, in drug-resistant Salmonella isolates suggested that these elements may contribute to the dissemination of antimicrobial resistance genes in the preharvest poultry environment. Hierarchical clustering analysis demonstrated a close relationship between drug-resistant phenotypes and the corresponding antimicrobial resistance gene profiles.
Conclusions:  Salmonella serotypes isolated from the poultry environment carry multiple genes that can render them resistant to several antimicrobials used in poultry and humans.
Significance and Impact of the Study:  Multiple antimicrobial resistance genes in environmental Salmonella isolates could be identified efficiently by microarray analysis. Hierarchical clustering analysis of the data was also found to be a useful tool for analysing emerging patterns of drug resistance.

11.
ABSTRACT: BACKGROUND: In the postgenome era, a prediction of response to treatment could lead to better dose selection for patients in radiotherapy. To identify a radiosensitive gene signature and elucidate related signaling pathways, four different microarray experiments were reanalyzed before radiotherapy. RESULTS: Radiosensitivity profiling data using clonogenic assay and gene expression profiling data from four published microarray platforms applied to the NCI-60 cancer cell panel were used. The survival fraction at 2 Gy (SF2, range from 0 to 1) was calculated as a measure of radiosensitivity and a linear regression model was applied to identify genes or a gene set with a correlation between expression and radiosensitivity (SF2). Radiosensitivity signature genes were identified using significance analysis of microarrays (SAM) and gene set analysis was performed using a global test based on a linear regression model. Using the radiation-related signaling pathway and identified genes, a genetic network was generated. According to SAM, 31 genes were identified as common to all the microarray platforms and therefore as a common radiosensitivity signature. In gene set analysis, functions in the cell cycle, DNA replication, and cell junction, including adherence and gap junctions, were related to radiosensitivity. The integrin, VEGF, MAPK, p53, JAK-STAT and Wnt signaling pathways were overrepresented in radiosensitivity. Significant genes including ACTN1, CCND1, HCLS1, ITGB5, PFN2, PTPRC, RAB13, and WAS, which are adhesion-related molecules, were identified by both SAM and gene set analysis and showed interaction in the genetic network with the integrin signaling pathway. CONCLUSIONS: Integration of four different microarray experiments and gene selection using gene set analysis discovered possible target genes and pathways relevant to radiosensitivity. Our results suggested that the identified genes are candidates for radiosensitivity biomarkers and that integrin signaling via adhesion molecules could be a target for radiosensitization.

12.
Identifying genes involved in complex neuropsychiatric disorders through classic human genetic approaches has proven difficult. To overcome that barrier, we have developed a translational approach called Convergent Functional Genomics (CFG), which cross-matches animal model microarray gene expression data with human genetic linkage data as well as human postmortem brain data and biological role data, as a Bayesian way of cross-validating findings and reducing uncertainty. Our approach produces a short list of high probability candidate genes out of the hundreds of genes changed in microarray datasets and the hundreds of genes present in a linkage peak chromosomal area. These genes can then be prioritized, pursued, and validated in an individual fashion using: (1) human candidate gene association studies and (2) cell culture and mouse transgenic models. Further bioinformatics analysis of groups of genes identified through CFG leads to insights into pathways and mechanisms that may be involved in the pathophysiology of the illness studied. This simple but powerful approach is likely generalizable to other complex, non-neuropsychiatric disorders, for which good animal models, as well as good human genetic linkage datasets and human target tissue gene expression datasets exist.

13.
14.
MOTIVATION: The identification of changes in gene expression in multifactorial diseases, such as breast cancer, is a major goal of DNA microarray experiments. Here we present a new data mining strategy to better analyze the marginal difference in gene expression between microarray samples. The idea is based on the notion that considering a gene's behavior in a wide variety of experiments can improve the statistical reliability of identifying genes with moderate changes between samples. RESULTS: The availability of a large collection of array samples sharing the same platform in public databases, such as NCBI GEO, enabled us to re-standardize the expression intensity of a gene using its mean and variation in the wide variety of experimental conditions. This approach was evaluated via the re-identification of breast cancer-specific gene expression. It successfully prioritized several genes associated with breast tumor, for which the expression difference between normal and breast cancer cells was marginal and thus would have been difficult to recognize using conventional analysis methods. By maximizing the utility of microarray data in the public database, it provides a valuable tool, particularly for the identification of previously unrecognized disease-related genes. AVAILABILITY: A user-friendly web interface (http://compbio.sookmyung.ac.kr/~lage/) was constructed to provide the present large-scale approach for the analysis of GEO microarray data (GS-LAGE server).
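
The re-standardization step described in this abstract can be illustrated with a small sketch. It assumes, as a simplification that the abstract does not spell out, that re-standardizing means converting an intensity into a z-score against the gene's own mean and standard deviation across a large reference collection. The function name and all numbers below are hypothetical.

```python
from statistics import mean, stdev


def restandardize(value, reference_values):
    """Express `value` in units of the gene's own variation across a reference set.

    reference_values stands in for the gene's intensities across many
    unrelated experiments on the same platform (assumed, not from the source).
    """
    mu = mean(reference_values)
    sd = stdev(reference_values)
    return (value - mu) / sd


# Two hypothetical genes, each observed 0.4 units above its reference mean.
stable_gene_reference = [10.0, 10.2, 9.8, 10.1, 9.9]   # barely varies across experiments
variable_gene_reference = [5.0, 9.0, 7.0, 11.0, 3.0]   # varies widely across experiments

z_stable = restandardize(10.4, stable_gene_reference)
z_variable = restandardize(7.4, variable_gene_reference)
```

Under this assumption, the same raw shift of 0.4 is large for the stable gene and negligible for the variable one, which is how a marginal raw difference could still be prioritized.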

15.
MOTIVATION: Clustering techniques such as k-means and hierarchical clustering are commonly used to analyze DNA microarray derived gene expression data. However, the interactions between processes underlying the cell activity suggest that the complexity of the microarray data structure may not be fully represented with discrete clustering methods. RESULTS: A newly developed software tool called MILVA (microarray latent visualization and analysis) is presented here to investigate microarray data without separating gene expression profiles into discrete classes. The underpinning of the MILVA software is the two-dimensional topographic representation of multidimensional microarray data. On this basis, the interactive MILVA functions allow a continuous exploration of microarray data driven by the direct supervision of the biologist in detecting activity patterns of co-regulated genes. AVAILABILITY: The MILVA software is freely available. The software and the related documentation can be downloaded from http://www.ncrg.aston.ac.uk/Projects/milva. Use 'surrey' as username and '3245' as password to log in. The software is currently available for the Windows platform only.

16.
17.
MOTIVATION: For establishing prognostic predictors of various diseases using DNA microarray analysis technology, it is desirable to selectively find significant genes for constructing the prognostic model, and it is also necessary to eliminate non-specific genes or genes with error before constructing the model. RESULTS: We applied projective adaptive resonance theory (PART) to gene screening for DNA microarray data. Genes selected by PART were subjected to our FNN-SWEEP modeling method for the construction of a cancer class prediction model. The model performance was evaluated through comparison with a conventional screening signal-to-noise (S2N) method or nearest shrunken centroids (NSC) method. The FNN-SWEEP predictor with PART screening could discriminate classes of acute leukemia in blinded data with 97.1% accuracy and classes of lung cancer with 90.0% accuracy, while the predictor with S2N achieved only 85.3% and 70.0%, and the predictor with NSC 88.2% and 90.0%, respectively. The results demonstrate that PART was superior for gene screening. AVAILABILITY: The software is available upon request from the authors. CONTACT: honda@nubio.nagoya-u.ac.jp
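
For orientation, the S2N baseline mentioned in this abstract can be sketched as follows. The abstract does not give a formula, so this assumes the commonly used signal-to-noise statistic, (mean_A - mean_B) / (sd_A + sd_B), computed per gene and ranked by absolute value. It does not reproduce PART or FNN-SWEEP. Gene names and values are hypothetical.

```python
from statistics import mean, stdev


def s2n(class_a, class_b):
    """Signal-to-noise ratio of one gene between two classes (assumed definition)."""
    return (mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))


def rank_genes(expr_a, expr_b):
    """Order gene names from most to least discriminating by |S2N|.

    expr_a and expr_b map gene name -> list of expression values in each class.
    """
    scores = {gene: s2n(expr_a[gene], expr_b[gene]) for gene in expr_a}
    return sorted(scores, key=lambda gene: abs(scores[gene]), reverse=True)


# Hypothetical expression values for three genes in two classes.
expr_a = {"g1": [8.0, 9.0, 10.0], "g2": [5.0, 6.0, 7.0], "g3": [1.0, 2.0, 3.0]}
expr_b = {"g1": [2.0, 3.0, 4.0], "g2": [5.0, 7.0, 6.0], "g3": [3.0, 4.0, 5.0]}

ranking = rank_genes(expr_a, expr_b)
```

A screening step of this kind typically keeps only the top-ranked genes before a classifier is trained on them.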

18.
19.
Fung ES, Ng MK. Bioinformation, 2007, 2(5): 230-234
One of the applications of discriminant analysis on microarray data is to classify patient and normal samples based on gene expression values. The analysis is especially important in medical trials and diagnosis of cancer subtypes. The main contribution of this paper is to propose a simple Fisher-type discriminant method for gene selection in microarray data. In the new algorithm, we calculate a weight for each gene and use the weight values as an indicator to identify the subsets of relevant genes that categorize patient and normal samples. An l(2) - l(1) norm minimization method is applied to the discriminant process to automatically compute the weights of all genes in the samples. The experiments on two microarray data sets have shown that the new algorithm can generate classification results as good as other classification methods, and effectively determine relevant genes for classification purposes. In this study, we demonstrate the gene selection ability and the computational effectiveness of the proposed algorithm. Experimental results are given to illustrate the usefulness of the proposed model.

20.
MOTIVATION: The rapid accumulation of microarray datasets provides unique opportunities to perform systematic functional characterization of the human genome. We designed a graph-based approach to integrate cross-platform microarray data, and extract recurrent expression patterns. A series of microarray datasets can be modeled as a series of co-expression networks, in which we search for frequently occurring network patterns. The integrative approach provides three major advantages over the commonly used microarray analysis methods: (1) it enhances signal-to-noise separation; (2) it identifies functionally related genes without co-expression; and (3) it provides a way to predict gene functions in a context-specific way. RESULTS: We integrate 65 human microarray datasets, comprising 1105 experiments and over 11 million expression measurements. We develop a data mining procedure based on frequent itemset mining and biclustering to systematically discover network patterns that recur in at least five datasets. This resulted in 143,401 potential functional modules. Subsequently, we design a network topology statistic based on graph random walk that effectively captures characteristics of a gene's local functional environment. Function annotations based on this statistic are then assessed using the random forest method, combining six other attributes of the network modules. We assign 1126 functions to 895 genes, 779 known and 116 unknown, with a validation accuracy of 70%. Among our assignments, 20% of genes are assigned multiple functions based on different network environments. AVAILABILITY: http://zhoulab.usc.edu/ContextAnnotation.
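
The idea of modeling each dataset as a co-expression network and then looking for recurring patterns can be sketched in a much simplified form. The version below only counts single edges (gene pairs whose absolute Pearson correlation passes a threshold) and keeps those that recur in a minimum number of datasets. It does not implement the frequent itemset mining, biclustering or random-walk statistic the abstract describes. The 0.9 threshold, the function names and the toy data are assumptions, and the toy support level of 2 stands in for the abstract's "at least five datasets".

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def coexpression_edges(dataset, threshold=0.9):
    """Gene pairs in one dataset whose |correlation| reaches the threshold.

    dataset maps gene name -> list of expression values across experiments.
    """
    return {
        (g1, g2)
        for g1, g2 in combinations(sorted(dataset), 2)
        if abs(pearson(dataset[g1], dataset[g2])) >= threshold
    }


def recurrent_edges(datasets, min_support, threshold=0.9):
    """Edges that appear in at least `min_support` of the per-dataset networks."""
    counts = Counter()
    for dataset in datasets:
        counts.update(coexpression_edges(dataset, threshold))
    return {edge for edge, n in counts.items() if n >= min_support}


# Three hypothetical datasets measuring the same three genes.
datasets = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]},
    {"A": [5, 6, 7, 8], "B": [1, 2, 3, 5], "C": [2, 9, 1, 5]},
    {"A": [1, 2, 3, 4], "B": [4, 1, 3, 2], "C": [2, 4, 6, 8]},
]
recurring = recurrent_edges(datasets, min_support=2)
```

Requiring recurrence across independent datasets is what filters out edges that arise from noise in a single study.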
