首页 | 本学科首页   官方微博 | 高级检索  
相似文献
 共查询到20条相似文献,搜索用时 15 毫秒
1.
During the initial development of microarrays, much discussion revolved around the technology itself. The discussion has now shifted to data analysis and data sharing. There is great interest in the sharing of cDNA microarray data, but several issues related to format, quality and validation will need to be resolved before microarray data can be meaningfully integrated into other molecular databases.  相似文献   

2.
3.
Sharing of microarray data has many advantages for the scientific and biomedical community, and should be advocated by neuroscience journals. The goals of sharing are manifold, and include improving analysis and confidence in results, and facilitating global comparisons between experiments, while at the same time, not penalizing those who share. The sharing of microarray data poses unique challenges relative to more generic data such as DNA sequences. These challenges are surmountable, and various sharing formats are possible. Centralized non-commercial databases are being developed to facilitate this process.  相似文献   

4.
The Microarray Gene Expression Data (MGED) society is an international organization established in 1999 for facilitating sharing of functional genomics and proteomics array data. To facilitate microarray data sharing, the MGED society has been working in establishing the relevant data standards. The three main components (which will be described in more detail later) of MGED standards are Minimum Information About a Microarray Experiment (MIAME), a document that outlines the minimum information that should be reported about a microarray experiment to enable its unambiguous interpretation and reproduction; MAGE, which consists of three parts, The Microarray Gene Expression Object Model (MAGE-OM), an XML-based document exchange format (MAGE-ML), which is derived directly from the object model, and the supporting tool kit MAGEstk; and MO, or MGED Ontology, which defines sets of common terms and annotation rules for microarray experiments, enabling unambiguous annotation and efficient queries, data analysis and data exchange without loss of meaning. We discuss here how these standards have been established, how they have evolved, and how they are used.  相似文献   

5.
Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn''t, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication.Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available.First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available.These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let''s learn from those with high rates of sharing to embrace the full potential of our research output.  相似文献   

6.
MOTIVATION: The identification of the change of gene expression in multifactorial diseases, such as breast cancer is a major goal of DNA microarray experiments. Here we present a new data mining strategy to better analyze the marginal difference in gene expression between microarray samples. The idea is based on the notion that the consideration of gene's behavior in a wide variety of experiments can improve the statistical reliability on identifying genes with moderate changes between samples. RESULTS: The availability of a large collection of array samples sharing the same platform in public databases, such as NCBI GEO, enabled us to re-standardize the expression intensity of a gene using its mean and variation in the wide variety of experimental conditions. This approach was evaluated via the re-identification of breast cancer-specific gene expression. It successfully prioritized several genes associated with breast tumor, for which the expression difference between normal and breast cancer cells was marginal and thus would have been difficult to recognize using conventional analysis methods. Maximizing the utility of microarray data in the public database, it provides a valuable tool particularly for the identification of previously unrecognized disease-related genes. AVAILABILITY: A user friendly web-interface (http://compbio.sookmyung.ac.kr/~lage/) was constructed to provide the present large-scale approach for the analysis of GEO microarray data (GS-LAGE server).  相似文献   

7.
Results obtained from expression profilings of renal cell carcinoma using different “ome”‐based approaches and comprehensive data analysis demonstrated that proteome‐based technologies and cDNA microarray analyses complement each other during the discovery phase for disease‐related candidate biomarkers. The integration of the respective data revealed the uniqueness and complementarities of the different technologies. While comparative cDNA microarray analyses though restricted to up‐regulated targets largely revealed genes involved in controlling gene/protein expression (19%) and signal transduction processes (13%), proteomics/PROTEOMEX‐defined candidate biomarkers include enzymes of the cellular metabolism (36%), transport proteins (12%), and cell motility/structural molecules (10%). Candidate biomarkers defined by proteomics and PROTEOMEX are frequently shared, whereas the sharing rate between cDNA microarray and proteome‐based profilings is limited. Putative candidate biomarkers provide insights into their cellular (dys)function and their diagnostic/prognostic value but still warrant further validation in larger patient numbers. Based on the fact that merely three candidate biomarkers were shared by all applied technologies, namely annexin A4, tubulin α‐1A chain, and ubiquitin carboxyl‐terminal hydrolase L1, the analysis at a single hierarchical level of biological regulation seems to provide only limited results thus emphasizing the importance and benefit of performing rather combinatorial screenings which can complement the standard clinical predictors.  相似文献   

8.
The upcoming availability of public microarray repositories and of large compendia of gene expression information opens up a new realm of possibilities for microarray data analysis. An essential challenge is the efficient integration of microarray data generated by different research groups on different array platforms. This review focuses on the problems associated with this integration, which are: (1) the efficient access to and exchange of microarray data; (2) the validation and comparison of data from different platforms (cDNA and short and long oligonucleotides); and (3) the integrated statistical analysis of multiple data sets.  相似文献   

9.
Conception, design, and implementation of cDNA microarray experiments present a variety of bioinformatics challenges for biologists and computational scientists. The multiple stages of data acquisition and analysis have motivated the design of Expresso, a system for microarray experiment management. Salient aspects of Expresso include support for clone replication and randomized placement; automatic gridding, extraction of expression data from each spot, and quality monitoring; flexible methods of combining data from individual spots into information about clones and functional categories; and the use of inductive logic programming for higher-level data analysis and mining. The development of Expresso is occurring in parallel with several generations of microarray experiments aimed at elucidating genomic responses to drought stress in loblolly pine seedlings. The current experimental design incorporates 384 pine cDNAs replicated and randomly placed in two specific microarray layouts. We describe the design of Expresso as well as results of analysis with Expresso that suggest the importance of molecular chaperones and membrane transport proteins in mechanisms conferring successful adaptation to long-term drought stress.  相似文献   

10.
During recent years, microarrays have been firmly established as valuable tools for the discovery of novel biological phenomena. Especially in combination with whole genome sequences, microarray data can help unravel the dynamics of the expressed genome. For filamentous fungi, microarray studies have already been performed with more than 20 different species; these investigations have explored a variety of different aspects of fungal biology. In this review, I will give an overview of some of the key questions that have been addressed using microarray hybridizations with filamentous fungi, with particular focus on the analysis of co-regulated pathways and physically clustered genes, as well as on the use of microarray data to determine a molecular phenotype. Additionally, a number of useful, freely available software tools for the analysis of fungal microarray data will be discussed.  相似文献   

11.
12.
While minimum information about a microarray experiment (MIAME) standards have helped to increase the value of the microarray data deposited into public databases like ArrayExpress and Gene Expression Omnibus (GEO), limited means have been available to assess the quality of this data or to identify the procedures used to normalize and transform raw data. The EMERALD FP6 Coordination Action was designed to deliver approaches to assess and enhance the overall quality of microarray data and to disseminate these approaches to the microarray community through an extensive series of workshops, tutorials, and symposia. Tools were developed for assessing data quality and used to demonstrate how the removal of poor-quality data could improve the power of statistical analyses and facilitate analysis of multiple joint microarray data sets. These quality metrics tools have been disseminated through publications and through the software package arrayQualityMetrics. Within the framework provided by the Ontology of Biomedical Investigations, ontology was developed to describe data transformations, and software ontology was developed for gene expression analysis software. In addition, the consortium has advocated for the development and use of external reference standards in microarray hybridizations and created the Molecular Methods (MolMeth) database, which provides a central source for methods and protocols focusing on microarray-based technologies.  相似文献   

13.
MADE4: an R package for multivariate analysis of gene expression data   总被引:2,自引:0,他引:2  
SUMMARY: MADE4, microarray ade4, is a software package that facilitates multivariate analysis of microarray gene-expression data. MADE4 accepts a wide variety of gene-expression data formats. MADE4 takes advantage of the extensive multivariate statistical and graphical functions in the R package ade4, extending these for application to microarray data. In addition, MADE4 provides new graphical and visualization tools that aid in interpretation of multivariate analysis of microarray data.  相似文献   

14.
Chasing the dream: plant EST microarrays   总被引:12,自引:0,他引:12  
DNA microarray technology is poised to make an important contribution to the field of plant biology. Stimulated by recent funding programs, expressed sequence tag sequencing and microarray production either has begun or is being contemplated for most economically important plant species. Although the DNA microarray technology is still being refined, the basic methods are well established. The real challenges lie in data analysis and data management. To fully realize the value of this technology, centralized databases that are capable of storing microarray expression data and managing information from a variety of sources will be needed. These information resources are under development and will help usher in a new era in plant functional genomics.  相似文献   

15.
An object model and database for functional genomics   总被引:2,自引:0,他引:2  
MOTIVATION: Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. RESULTS: We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. AVAILABILITY: FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.  相似文献   

16.
We describe PerlMAT, a Perl microarray toolkit providing easy to use object-oriented methods for the simplified manipulation, management and analysis of microarray data. The toolkit provides objects for the encapsulation of microarray spots and reporters, several common microarray data file formats and GAL files. In addition, an analysis object provides methods for data processing, and an image object enables the visualisation of microarray data. This important addition to the Perl developer's library will facilitate more widespread use of Perl for microarray application development within the bioinformatics community. The coherent interface and well-documented code enables rapid analysis by even inexperienced Perl developers. AVAILABILITY: Software is available at http://sourceforge.net/projects/perlmat  相似文献   

17.
The microarray gene expression markup language (MAGE-ML) is a widely used XML (eXtensible Markup Language) standard for describing and exchanging information about microarray experiments. It can describe microarray designs, microarray experiment designs, gene expression data and data analysis results. We describe RMAGEML, a new Bioconductor package that provides a link between cDNA microarray data stored in MAGE-ML format and the Bioconductor framework for preprocessing, visualization and analysis of microarray experiments. AVAILABILITY: http://www.bioconductor.org. Open Source.  相似文献   

18.
MOTIVATION: The lack of microarray data management systems and databases is still one of the major problems faced by many life sciences laboratories. While developing the public repository for microarray data ArrayExpress we had to find novel solutions to many non-trivial software engineering problems. Our experience will be both relevant and useful for most bioinformaticians involved in developing information systems for a wide range of high-throughput technologies. RESULTS: ArrayExpress has been online since February 2002, growing exponentially to well over 10,000 hybridizations (as of September 2004). It has been demonstrated that our chosen design and implementation works for databases aimed at storage, access and sharing of high-throughput data. AVAILABILITY: The ArrayExpress database is available at http://www.ebi.ac.uk/arrayexpress/. The software is open source. CONTACT: ugis@ebi.ac.uk.  相似文献   

19.
The number of methods for pre-processing and analysis of gene expression data continues to increase, often making it difficult to select the most appropriate approach. We present a simple procedure for comparative estimation of a variety of methods for microarray data pre-processing and analysis. Our approach is based on the use of real microarray data in which controlled fold changes are introduced into 20% of the data to provide a metric for comparison with the unmodified data. The data modifications can be easily applied to raw data measured with any technological platform and retains all the complex structures and statistical characteristics of the real-world data. The power of the method is illustrated by its application to the quantitative comparison of different methods of normalization and analysis of microarray data. Our results demonstrate that the method of controlled modifications of real experimental data provides a simple tool for assessing the performance of data preprocessing and analysis methods.  相似文献   

20.
To date, the idea that microarray may shed the light on cellular processes by identifying groups of genes that appear to be co-expressed seems to remain a dream. This is partly because that there are some blank (meaning the knowledge is unavailable) or even erroneous areas in the fundamental theory in this field. This paper attempts to present the digest of microarray hybridization system with chemical thermodynamics, theoretically clarifying some misunderstandings and looking for answers to some critical questions around this technology, such as the mechanisms and conditions of quantitative measuring by hybridization reaction, the reasons of inconsistency of the data and the analysis results and the solutions, how to analyze the data, etc. A theoretical model for the next generation of microarray is proposed. We believe that this model is universal, laying the foundation for microarray technology from array design through the data analysis.  相似文献   

设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司  京ICP备09084417号