Similar articles
 Found 20 similar articles (search time: 31 ms)
1.
2.
3.
NMPP: a user-customized NimbleGen microarray data processing pipeline   Total citations: 1, self-citations: 0, citations by others: 1
The NMPP package is a bundle of user-customized tools based on established algorithms and methods to process self-designed NimbleGen microarray data. It features a command-line-based integrative processing procedure that comprises five major functional components, namely the raw microarray data parsing and integrating module, the array spatial effect smoothing and visualization module, the probe-level multi-array normalization module, the gene expression intensity summarization module and the gene expression status inference module. AVAILABILITY: http://plantgenomics.biology.yale.edu/nmpp

4.
MOTIVATION: Identification of genes expressed in a cell-cycle-specific periodical manner is of great interest for understanding cyclic systems, which play a critical role in many biological processes. However, identifying cell-cycle-regulated genes directly from raw microarray gene expression data is complicated by synchronization loss and thus remains a challenging problem. Decomposing the expression measurements and extracting synchronized expression will better represent the single-cell behavior and improve the accuracy in identifying periodically expressed genes. RESULTS: In this paper, we propose a resynchronization-based algorithm for identifying cell-cycle-related genes. We introduce a synchronization loss model by modeling the gene expression measurements as a superposition of different cell populations growing at different rates. The underlying expression profile is then reconstructed through resynchronization and is further fitted to the measurements in order to identify periodically expressed genes. Results from both simulations and real microarray data show that the proposed scheme is promising for identifying cyclic genes and revealing underlying gene expression profiles. AVAILABILITY: Contact the authors. SUPPLEMENTARY INFORMATION: Supplementary data are available at: http://dsplab.eng.umd.edu/~genomics/syn/

5.
The MUSC DNA Microarray Database   Total citations: 1, self-citations: 0, citations by others: 1
SUMMARY: The Medical University of South Carolina (MUSC) DNA Microarray Database is a web-accessible archive of DNA microarray data. The database was developed using the DNA microarray project/data management system, micro ArrayDB. Annotations for each DNA microarray project and associated cRNA target information are stored in a MySQL relational database and linked to array hybridization data (raw and normalized). At the discretion of investigators, data are placed into the public domain where they can be interrogated and downloaded through a web browser. In addition to serving as an online resource of gene expression data, the MUSC DNA Microarray Database is a model for other academic DNA microarray data repositories. AVAILABILITY: Browsing and downloading of MUSC DNA Microarray Database information can be done after registration at http://proteogenomics.musc.edu/pss/home.php.

6.
Numerous methods are available to compare results of multiple microarray studies. One of the simplest but most effective of these procedures is to examine the overlap of resulting gene lists in a Venn diagram. Venn diagrams are graphical ways of representing interactions among sets to display information that can be read easily. Here we propose a simple but effective web application creating Venn diagrams from two or three gene lists. Each gene in the group list has a link to the related information in NCBI's Entrez Nucleotide database. AVAILABILITY: GeneVenn is available for free at http://mcbc.usm.edu/genevenn/
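
The overlap computation behind a two- or three-list Venn diagram can be sketched in a few lines of Python. This is only an illustration of the set arithmetic, not GeneVenn's actual implementation; the function name `venn_regions` and the example gene symbols are assumptions introduced here.

```python
from itertools import combinations


def venn_regions(gene_lists):
    """Split two or three named gene lists into exclusive Venn regions.

    `gene_lists` maps a list name to an iterable of gene identifiers.
    The result maps a frozenset of list names to the genes that occur in
    exactly those lists and in no others.
    (Hypothetical helper, not part of GeneVenn.)
    """
    if not 2 <= len(gene_lists) <= 3:
        raise ValueError("expected two or three gene lists")
    sets = {name: set(genes) for name, genes in gene_lists.items()}
    names = list(sets)
    regions = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Genes shared by every list in this combination ...
            inside = set.intersection(*(sets[n] for n in combo))
            # ... minus genes that also appear in any remaining list.
            outside = set().union(*(sets[n] for n in names if n not in combo))
            regions[frozenset(combo)] = inside - outside
    return regions


if __name__ == "__main__":
    example = {
        "study_A": ["TP53", "BRCA1", "MYC"],
        "study_B": ["MYC", "EGFR", "TP53"],
        "study_C": ["TP53", "KRAS"],
    }
    for names, genes in venn_regions(example).items():
        print(sorted(names), sorted(genes))
```

Each region is exclusive, so the region sizes add up to the number of distinct genes across all lists, which is the property a Venn diagram relies on.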

7.
8.
MOTIVATION: The BioArray Software Environment (BASE) is a very popular MIAME-compliant, web-based microarray data repository. However, in BASE, as in most other microarray data repositories, experiment annotation and raw data uploading can be very time-consuming, especially for large microarray experiments. RESULTS: We developed KUTE (Karmanos Universal daTabase for microarray Experiments) as a plug-in for BASE 2.0 that addresses these issues. KUTE provides an automatic experiment annotation feature and a completely redesigned data work-flow that dramatically reduce the human-computer interaction time. For instance, in BASE 2.0 a typical Affymetrix experiment involving 100 arrays required 4 h 30 min of user interaction time for experiment annotation, and 45 min for data upload/download. In contrast, for the same experiment, KUTE required only 28 min of user interaction time for experiment annotation, and 3.3 min for data upload/download. AVAILABILITY: http://vortex.cs.wayne.edu/kute/index.html.

9.
ArrayFusion annotates conventional CGH results and various types of microarray data from a range of platforms (cDNA, expression, exon, SNP, array-CGH and ChIP-on-chip) and converts them into standard formats which can be visualized in genome browsers (Affymetrix Integrated Genome Browser and GBrowse in the HapMap Project). Converted files can then be imported simultaneously into a single genome browser to benefit a collective interpretation between different array results. ArrayFusion therefore provides a new type of tool facilitating the integration of CGH and array results to provide new experimental directions. AVAILABILITY: http://microarray.ym.edu.tw/tools/arrayfusion

10.
MOTIVATION: A serious limitation in microarray analysis is the unreliability of the data generated from low signal intensities. Such data may produce erroneous gene expression ratios and cause unnecessary validation or post-analysis follow-up tasks. Therefore, the elimination of unreliable signal intensities will enhance reproducibility and reliability of gene expression ratios produced from microarray data. In this study, we applied fuzzy c-means (FCM) and normal mixture modeling (NMM) based classification methods to separate microarray data into reliable and unreliable signal intensity populations. RESULTS: We compared the results of FCM classification with those of classification based on NMM. Both approaches were validated against reference sets of biological data consisting of only true positives and true negatives. We observed that both methods performed equally well in terms of sensitivity and specificity. Although a comparison of the computation times indicated that the fuzzy approach is computationally more efficient, other considerations support the use of NMM for the reliability analysis of microarray data. AVAILABILITY: The classification approaches described in this paper and sample microarray data are available as Matlab(TM) (The MathWorks Inc., Natick, MA) programs (m-files) and text files, respectively, at http://rc.kfshrc.edu.sa/bssc/staff/MusaAsyali/Downloads.asp. The programs can be run/tested on many different computer platforms where Matlab is available. CONTACT: asyali@kfshrc.edu.sa.

11.
MOTIVATION: The standard L(2)-norm support vector machine (SVM) is a widely used tool for microarray classification. Previous studies have demonstrated its superior performance in terms of classification accuracy. However, a major limitation of the SVM is that it cannot automatically select relevant genes for the classification. The L(1)-norm SVM is a variant of the standard L(2)-norm SVM that constrains the L(1)-norm of the fitted coefficients. Due to the singularity of the L(1)-norm, the L(1)-norm SVM has the property of automatically selecting relevant genes. On the other hand, the L(1)-norm SVM has two drawbacks: (1) the number of selected genes is upper bounded by the size of the training data; (2) when there are several highly correlated genes, the L(1)-norm SVM tends to pick only a few of them, and remove the rest. RESULTS: We propose a hybrid huberized support vector machine (HHSVM). The HHSVM combines the huberized hinge loss function and the elastic-net penalty. By doing so, the HHSVM performs automatic gene selection in a way similar to the L(1)-norm SVM. In addition, the HHSVM encourages highly correlated genes to be selected (or removed) together. We also develop an efficient algorithm to compute the entire solution path of the HHSVM. Numerical results indicate that the HHSVM tends to provide better variable selection results than the L(1)-norm SVM, especially when variables are highly correlated. AVAILABILITY: R code is available at http://www.stat.lsa.umich.edu/~jizhu/code/hhsvm/.
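
To make the combination of a huberized hinge loss with an elastic-net penalty concrete, the following Python sketch evaluates such an objective for a linear classifier. The abstract does not give the formulas, so the parameterization here is an assumption: a hinge loss smoothed by a quadratic piece of width `delta`, plus `lam1 * ||beta||_1 + (lam2 / 2) * ||beta||_2^2`. The names `huberized_hinge`, `elastic_net_penalty`, `hhsvm_objective`, `delta`, `lam1` and `lam2` are illustrative, and the sketch only evaluates the objective; it does not reproduce the authors' solution-path algorithm.

```python
def huberized_hinge(margin, delta=2.0):
    """Huberized hinge loss of a margin y * f(x) (assumed parameterization).

    Zero beyond the margin, quadratic in a band of width `delta` below it,
    and linear further out, so the loss stays differentiable everywhere.
    """
    if margin > 1.0:
        return 0.0
    if margin > 1.0 - delta:
        return (1.0 - margin) ** 2 / (2.0 * delta)
    return 1.0 - margin - delta / 2.0


def elastic_net_penalty(beta, lam1, lam2):
    """L1 term (sparsity) plus squared L2 term (groups correlated features)."""
    l1 = sum(abs(b) for b in beta)
    l2_squared = sum(b * b for b in beta)
    return lam1 * l1 + 0.5 * lam2 * l2_squared


def hhsvm_objective(X, y, beta0, beta, lam1, lam2, delta=2.0):
    """Total loss over samples plus the penalty, for labels y in {-1, +1}."""
    loss = 0.0
    for row, label in zip(X, y):
        score = beta0 + sum(b * x for b, x in zip(beta, row))
        loss += huberized_hinge(label * score, delta)
    return loss + elastic_net_penalty(beta, lam1, lam2)
```

Under this parameterization the quadratic and linear pieces meet at `margin = 1 - delta` with the same value and slope, which is what makes the loss smooth; the L1 term drives coefficients to zero while the L2 term tends to keep correlated genes in or out together.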

12.
SUMMARY: A brief overview of Tree-Maps provides the basis for understanding two new implementations of Tree-Map methods. TreeMapClusterView provides a new way to view microarray gene expression data, and GenePlacer provides a view of gene ontology annotation data. We also discuss the benefits of Tree-Maps to visualize complex hierarchies in functional genomics. AVAILABILITY: Java class files are freely available at http://mendel.mc.duke.edu/bioinformatics/ CONTACT: mccon012@mc.duke.edu SUPPLEMENTARY INFORMATION: For more information on TreeMapClusterView and GenePlacer, see http://mendel.mc.duke.edu/bioinformatics/software/boxclusterview/ and http://mendel.mc.duke.edu/bioinformatics/software/geneplacer/.

13.
MOTIVATION: Our purpose is to develop a statistical modeling approach for cancer biomarker discovery and provide new insights into early cancer detection. We propose the concept of dependence network, apply it for identifying cancer biomarkers, and study the difference between the protein or gene samples from cancer and non-cancer subjects based on mass-spectrometry (MS) and microarray data. RESULTS: Three MS and two gene microarray datasets are studied. Clear differences are observed in the dependence networks for cancer and non-cancer samples. Protein/gene features are examined three at a time through an exhaustive search. Dependence networks are constructed by binding triples identified by the eigenvalue pattern of the dependence model, and are further compared to identify cancer biomarkers. Such dependence-network-based biomarkers show much greater consistency under 10-fold cross-validation than the classification-performance-based biomarkers. Furthermore, the biological relevance of the dependence-network-based biomarkers using microarray data is discussed. The proposed scheme is shown to be promising for cancer diagnosis and prediction. AVAILABILITY: See supplements: http://dsplab.eng.umd.edu/~genomics/dependencenetwork/

14.
SUMMARY: Differential gene expression detection using microarrays has recently received considerable research interest. Many methods have been proposed, including variants of F-statistics, non-parametric approaches and empirical Bayesian methods. The SAM statistic has been shown to perform well in empirical studies, but it is more of an ad hoc shrinkage method. The idea is that for small-sample microarray data, it is often useful to pool information across genes to improve efficiency. Under a Bayesian framework, Smyth formally derived the test statistics with shrinkage using hierarchical models. In this paper we cast differential gene expression detection in the familiar framework of the linear regression model. Commonly used test statistics correspond to using least squares to estimate the regression parameters. Based on the vast literature on linear models, we can naturally consider other alternatives. Here we explore penalized linear regression. We propose the penalized t-/F-statistics for two-class microarray data based on [Formula: see text] penalty. We show that the penalized test statistics make intuitive sense, and through applications we illustrate their good performance. AVAILABILITY: Supplementary information including program codes, more detailed analysis results and R functions for the proposed methods can be found at http://www.biostat.umn.edu/~baolin/research CONTACT: baolin@biostat.umn.edu SUPPLEMENTARY INFORMATION: http://www.biostat.umn.edu/~baolin/research.

15.
The distributed nature of biological knowledge poses a major challenge to the interpretation of genome-scale datasets, including those derived from microarray and proteomic studies. This report describes DAVID, a web-accessible program that integrates functional genomic annotations with intuitive graphical summaries. Lists of gene or protein identifiers are rapidly annotated and summarized according to shared categorical data for Gene Ontology, protein domain, and biochemical pathway membership. DAVID assists in the interpretation of genome-scale datasets by facilitating the transition from data collection to biological meaning.

16.
MOTIVATION: One problem with discriminant analysis of DNA microarray data is that each sample is represented by quite a large number of genes, and many of them are irrelevant, insignificant or redundant to the discriminant problem at hand. Methods for selecting important genes are, therefore, of much significance in microarray data analysis. In the present study, a new criterion, called the LS Bound measure, is proposed to address the gene selection problem. The LS Bound measure is derived from the leave-one-out procedure of LS-SVMs (least squares support vector machines), and as the upper bound for leave-one-out classification results it reflects to some extent the generalization performance of gene subsets. RESULTS: We applied this LS Bound measure for gene selection on two benchmark microarray datasets: colon cancer and leukemia. We also compared the LS Bound measure with other evaluation criteria, including the well-known Fisher's ratio and Mahalanobis class separability measure, and other published gene selection algorithms, including Weighting factor and SVM Recursive Feature Elimination. The strength of the LS Bound measure is that it provides gene subsets leading to more accurate classification results than the filter method while its computational complexity is at the level of the filter method. AVAILABILITY: A companion website can be accessed at http://www.ntu.edu.sg/home5/pg02776030/lsbound/. The website contains: (1) the source code of the gene selection algorithm; (2) the complete set of tables and figures regarding the experimental study; (3) proof of the inequality (9). CONTACT: ekzmao@ntu.edu.sg.
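
As a point of comparison for filter-style criteria such as Fisher's ratio, the Python sketch below scores each gene independently and ranks genes by that score. It does not implement the LS Bound measure. The definition used, squared difference of class means divided by the sum of class sample variances, is one common form of Fisher's ratio and is an assumption here, as are the function names and the toy data.

```python
from statistics import mean, variance


def fisher_ratio(class_a, class_b):
    """Per-gene Fisher's ratio for two classes (assumed definition).

    (mean_a - mean_b)^2 / (var_a + var_b), using sample variances, so each
    class needs at least two measurements.
    """
    spread = variance(class_a) + variance(class_b)
    if spread == 0:
        raise ValueError("both classes have zero variance for this gene")
    return (mean(class_a) - mean(class_b)) ** 2 / spread


def rank_genes(expression, labels):
    """Rank genes from most to least discriminative.

    `expression` maps a gene name to its values across samples;
    `labels` holds 0 or 1 for each sample, in the same order.
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == 0]
        b = [v for v, lab in zip(values, labels) if lab == 1]
        scores[gene] = fisher_ratio(a, b)
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    toy = {
        "gene_1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        "gene_2": [1.0, 5.0, 3.0, 2.0, 4.0, 6.0],
    }
    print(rank_genes(toy, [0, 0, 0, 1, 1, 1]))
```

Because every gene is scored on its own, a filter like this is cheap but blind to interactions between genes, which is the trade-off the abstract contrasts with subset-based criteria.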

17.
MOTIVATION: An important step in analyzing expression profiles from microarray data is to identify genes that can discriminate between distinct classes of samples. Many statistical approaches for assigning significance values to genes have been developed. The Comparative Marker Selection suite consists of three modules that allow users to apply and compare different methods of computing significance for each marker gene, a viewer to assess the results, and a tool to create derivative datasets and marker lists based on user-defined significance criteria. AVAILABILITY: The Comparative Marker Selection application suite is freely available as a GenePattern module. The GenePattern analysis environment is freely available at http://www.broad.mit.edu/genepattern.

18.
19.
MOTIVATION: There is a very large and growing level of effort toward improving the platforms, experiment designs, and data analysis methods for microarray expression profiling. Along with a growing richness in the approaches there is a growing confusion among most scientists as to how to make objective comparisons and choices between them for different applications. There is a need for a standard framework for the microarray community to compare and improve analytical and statistical methods. RESULTS: We report on a microarray data set comprising 204 in-situ synthesized oligonucleotide arrays, each hybridized with two-color cDNA samples derived from 20 different human tissues and cell lines. Design of the approximately 24 000 60mer oligonucleotides that report approximately 2500 known genes on the arrays, and design of the hybridization experiments, were carried out in a way that supports the performance assessment of alternative data processing approaches and of alternative experiment and array designs. We also propose standard figures of merit for success in detecting individual differential expression changes or expression levels, and for detecting similarities and differences in expression patterns across genes and experiments. We expect this data set and the proposed figures of merit will provide a standard framework for much of the microarray community to compare and improve many analytical and statistical methods relevant to microarray data analysis, including image processing, normalization, error modeling, combining of multiple reporters per gene, use of replicate experiments, and sample referencing schemes in measurements based on expression change. AVAILABILITY/SUPPLEMENTARY INFORMATION: Expression data and supplementary information are available at http://www.rii.com/publications/2003/HE_SDS.htm

20.
We present a fast, versatile and adaptive multiscale algorithm for analyzing a wide variety of DNA microarray data. Its primary application is in normalization of array data as well as subsequent identification of 'enriched targets', e.g. differentially expressed genes in expression profiling arrays and enriched sites in ChIP-on-chip experimental data. We show how to accommodate the unique characteristics of ChIP-on-chip data, where the set of 'enriched targets' is large, asymmetric and whose proportion to the whole data varies locally. SUPPLEMENTARY INFORMATION: Supplementary figures, related preprint, free software as well as our raw DNA microarray data with PCR validations are available at http://www.math.umn.edu/~lerman/supp/bioinfo06 as well as Bioinformatics online.
