Similar articles
1.
Sharing of microarray data has many advantages for the scientific and biomedical community, and should be advocated by neuroscience journals. The goals of sharing are manifold, and include improving analysis and confidence in results, and facilitating global comparisons between experiments, while at the same time, not penalizing those who share. The sharing of microarray data poses unique challenges relative to more generic data such as DNA sequences. These challenges are surmountable, and various sharing formats are possible. Centralized non-commercial databases are being developed to facilitate this process.

2.
Making sense of microarray data is a complex process, in which the interpretation of findings will depend on the overall experimental design and judgement of the investigator performing the analysis. As a result, differences in tissue harvesting, microarray types, sample labelling and data analysis procedures make post hoc sharing of microarray data a great challenge. To ensure rapid and meaningful data exchange, we need to create some order out of the existing chaos. In these ground-breaking microarray standardization and data sharing efforts, NIH agencies should take a leading role.

3.
We describe the creation process of the Minimum Information Specification for In Situ Hybridization and Immunohistochemistry Experiments (MISFISHIE). Modeled after the existing minimum information specification for microarray data, we created a new specification for gene expression localization experiments, initially to facilitate data sharing within a consortium. After successful use within the consortium, the specification was circulated to members of the wider biomedical research community for comment and refinement. After a period of acquiring many new suggested requirements, it was necessary to enter a final phase of excluding those requirements that were deemed inappropriate as a minimum requirement for all experiments. The full specification will soon be published as a version 1.0 proposal to the community, upon which a fuller discussion must take place so that the final specification may be achieved with the involvement of the whole community.

4.
The Microarray Gene Expression Data (MGED) society is an international organization established in 1999 for facilitating sharing of functional genomics and proteomics array data. To facilitate microarray data sharing, the MGED society has been working on establishing the relevant data standards. The three main components (which will be described in more detail later) of MGED standards are Minimum Information About a Microarray Experiment (MIAME), a document that outlines the minimum information that should be reported about a microarray experiment to enable its unambiguous interpretation and reproduction; MAGE, which consists of three parts, the Microarray Gene Expression Object Model (MAGE-OM), an XML-based document exchange format (MAGE-ML), which is derived directly from the object model, and the supporting tool kit MAGEstk; and MO, or MGED Ontology, which defines sets of common terms and annotation rules for microarray experiments, enabling unambiguous annotation and efficient queries, data analysis and data exchange without loss of meaning. We discuss here how these standards have been established, how they have evolved, and how they are used.

5.
6.
An object model and database for functional genomics   Total citations: 2, self-citations: 0, citations by others: 2
MOTIVATION: Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. RESULTS: We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM, and two models for proteomics, PEDRo and our own model (Gla-PSI, the Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. AVAILABILITY: FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.

7.
8.
Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn't, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication. Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available. First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available. These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let's learn from those with high rates of sharing to embrace the full potential of our research output.

9.
Confirming microarray data--is it really necessary?   Total citations: 2, self-citations: 0, citations by others: 2
Rockett JC, Hellmann GM. Genomics, 2004, 83(4): 541-549
The generation of corroborative data has become a commonly used approach for ensuring the veracity of microarray data. Indeed, the need to conduct corroborative studies has now become official editorial policy for at least 2 journals, and several more are considering introducing such a policy. The issue of corroborating microarray data is a challenging one: there are good arguments for and against conducting such experiments. However, we believe that the introduction of a fixed requirement to corroborate microarray data, especially if adopted by more journals, is overly burdensome and may, in at least several applications of microarray technology, be inappropriate. We also believe that, in cases in which corroborative studies are deemed essential, a lack of clear guidance leaves researchers unclear as to what constitutes an acceptable corroborative study. Guidelines have already been outlined regarding the details of conducting microarray experiments. We propose that all stakeholders, including journal editorial boards, reviewers, and researchers, should undertake concerted and inclusive efforts to address properly and clarify the specific issue of corroborative data. In this article we highlight some of the thorny and vague areas for discussion surrounding this issue. We also report the results of a poll in which 76 life science journals were asked about their current or intended policies on the inclusion of corroborative studies in papers containing microarray data.

10.
MOTIVATION: The lack of microarray data management systems and databases is still one of the major problems faced by many life sciences laboratories. While developing ArrayExpress, the public repository for microarray data, we had to find novel solutions to many non-trivial software engineering problems. Our experience will be both relevant and useful for most bioinformaticians involved in developing information systems for a wide range of high-throughput technologies. RESULTS: ArrayExpress has been online since February 2002, growing exponentially to well over 10,000 hybridizations (as of September 2004). It has been demonstrated that our chosen design and implementation works for databases aimed at storage, access and sharing of high-throughput data. AVAILABILITY: The ArrayExpress database is available at http://www.ebi.ac.uk/arrayexpress/. The software is open source. CONTACT: ugis@ebi.ac.uk.

11.
RNA-Seq and microarray platforms have emerged as important tools for detecting changes in gene expression and RNA processing in biological samples. We present ExpressionPlot, a software package consisting of a default back end, which prepares raw sequencing or Affymetrix microarray data, and a web-based front end, which offers a biologically centered interface to browse, visualize, and compare different data sets. Download and installation instructions, a user's manual, discussion group, and a prototype are available at .

12.
We initiated the Critical Assessment of Microarray Data Analysis (CAMDA) conference to stimulate and evaluate the development of advanced data analysis techniques for microarrays. A standard data set has been released for this data analysis challenge. The goal of this challenge is to assess the performance of different analytical methods and at the same time to determine how such methods should be evaluated. We hope this effort will catalyze the discussion of microarray data analysis among the research community of biologists, statisticians, mathematicians, and computer scientists. AVAILABILITY: http://camda.duke.edu.

13.
A wide variety of software tools are available to analyze microarray data. To identify the optimum software for any project, it is essential to define specific and essential criteria on which to evaluate the advantages of the key features. In this review we describe the results of our comparison of several software tools. We then conclude with a discussion of the subset of tools that are most commonly used and describe the features that would constitute the “ideal microarray analysis software suite.”

14.
MOTIVATION: The identification of changes in gene expression in multifactorial diseases, such as breast cancer, is a major goal of DNA microarray experiments. Here we present a new data mining strategy to better analyze the marginal difference in gene expression between microarray samples. The idea is based on the notion that considering a gene's behavior in a wide variety of experiments can improve the statistical reliability of identifying genes with moderate changes between samples. RESULTS: The availability of a large collection of array samples sharing the same platform in public databases, such as NCBI GEO, enabled us to re-standardize the expression intensity of a gene using its mean and variation in the wide variety of experimental conditions. This approach was evaluated via the re-identification of breast cancer-specific gene expression. It successfully prioritized several genes associated with breast tumor, for which the expression difference between normal and breast cancer cells was marginal and thus would have been difficult to recognize using conventional analysis methods. Maximizing the utility of microarray data in the public database, it provides a valuable tool particularly for the identification of previously unrecognized disease-related genes. AVAILABILITY: A user friendly web-interface (http://compbio.sookmyung.ac.kr/~lage/) was constructed to provide the present large-scale approach for the analysis of GEO microarray data (GS-LAGE server).
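The abstract above does not give its formula, so the following is only a minimal sketch of one plausible reading of "re-standardize the expression intensity of a gene using its mean and variation": a per-gene z-score against a large reference collection. The function name `restandardize` and all intensity values are hypothetical and are not taken from the GS-LAGE server.

```python
import statistics


def restandardize(value, reference_values):
    """Express one intensity in units of the gene's own spread.

    Assumption: "mean and variation" is read here as the sample mean and
    sample standard deviation of the gene across a reference collection.
    """
    mean = statistics.fmean(reference_values)
    spread = statistics.stdev(reference_values)
    if spread == 0:
        raise ValueError("gene shows no variation in the reference collection")
    return (value - mean) / spread


# Hypothetical log-intensities of one gene across many unrelated experiments.
reference = [8.0, 8.2, 7.9, 8.1, 8.0, 7.8, 8.3, 8.1]

# Hypothetical measurements from the two samples being compared.
normal_sample, tumor_sample = 8.0, 8.6

raw_difference = tumor_sample - normal_sample
scaled_difference = (restandardize(tumor_sample, reference)
                     - restandardize(normal_sample, reference))

print("raw difference:", round(raw_difference, 2))
print("difference in reference standard deviations:", round(scaled_difference, 2))
```

Under this reading, a gene that barely moves across the reference collection yields a large scaled difference even when the raw difference is small, which is the kind of marginal change the abstract says conventional analysis would miss.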

15.
16.
False discovery rate, sensitivity and sample size for microarray studies   Total citations: 10, self-citations: 0, citations by others: 10
MOTIVATION: In microarray data studies most researchers are keenly aware of the potentially high rate of false positives and the need to control it. One key statistical shift is the move away from the well-known P-value to false discovery rate (FDR). Less discussion has perhaps been devoted to the sensitivity or the associated false negative rate (FNR). The purpose of this paper is to explain in simple ways why the shift from P-value to FDR for statistical assessment of microarray data is necessary, to elucidate the determining factors of FDR and, for a two-sample comparative study, to discuss its control via sample size at the design stage. RESULTS: We use a mixture model, involving differentially expressed (DE) and non-DE genes, that captures the most common problem of finding DE genes. Factors determining FDR are (1) the proportion of truly differentially expressed genes, (2) the distribution of the true differences, (3) measurement variability and (4) sample size. Many current small microarray studies are plagued with large FDR, but controlling FDR alone can lead to unacceptably large FNR. In evaluating a design of a microarray study, sensitivity or FNR curves should be computed routinely together with FDR curves. Under certain assumptions, the FDR and FNR curves coincide, thus simplifying the choice of sample size for controlling the FDR and FNR jointly.
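The four factors listed above can be illustrated with a small sketch. It is not the paper's model: it assumes every DE gene has the same standardized difference, a two-sided two-sample z-test with known variance, and the common approximation FDR ≈ π0·α / (π0·α + π1·power), where π1 is the proportion of DE genes and π0 = 1 − π1. The function names and all parameter values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_two_sample_z(effect_size, n_per_group, alpha):
    """Sensitivity of a two-sided two-sample z-test (known variance).

    effect_size is the true mean difference in standard-deviation units.
    """
    z_crit = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(n_per_group / 2)
    return _STD_NORMAL.cdf(shift - z_crit) + _STD_NORMAL.cdf(-shift - z_crit)


def expected_fdr(prop_de, effect_size, n_per_group, alpha):
    """Approximate FDR as expected false positives over expected discoveries."""
    power = power_two_sample_z(effect_size, n_per_group, alpha)
    false_positives = (1 - prop_de) * alpha
    true_positives = prop_de * power
    return false_positives / (false_positives + true_positives)


# Hypothetical design: 5% of genes are DE, each by one standard deviation,
# and every gene is tested at a per-gene threshold of alpha = 0.01.
for n in (3, 5, 10, 20):
    sensitivity = power_two_sample_z(1.0, n, 0.01)
    fdr = expected_fdr(0.05, 1.0, n, 0.01)
    print(f"n per group = {n:2d}  sensitivity = {sensitivity:.3f}  FDR = {fdr:.3f}")
```

With the threshold held fixed, sensitivity rises and the approximate FDR falls as the sample size grows, which is the design-stage trade-off the abstract describes.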

17.
The transfer of scientific data has emerged as a significant challenge, as datasets continue to grow in size and demand for open access sharing increases. Current methods for file transfer do not scale well for large files and can cause long transfer times. In this study we present BioTorrents, a website that allows open access sharing of scientific data and uses the popular BitTorrent peer-to-peer file sharing technology. BioTorrents allows files to be transferred rapidly due to the sharing of bandwidth across multiple institutions and provides more reliable file transfers due to the built-in error checking of the file sharing technology. BioTorrents contains multiple features, including keyword searching, category browsing, RSS feeds, torrent comments, and a discussion forum. BioTorrents is available at http://www.biotorrents.net.
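The "built-in error checking" mentioned above refers to how BitTorrent verifies downloads: the original protocol splits a file into fixed-size pieces and stores a SHA-1 hash of each piece in the torrent metadata, so a damaged piece can be detected and fetched again. The sketch below illustrates only that idea; the function names, the tiny piece size, and the sample data are hypothetical and are not BioTorrents code.

```python
import hashlib
from typing import List


def piece_hashes(data: bytes, piece_length: int) -> List[bytes]:
    """Return the SHA-1 digest of each fixed-size piece of data."""
    return [
        hashlib.sha1(data[start:start + piece_length]).digest()
        for start in range(0, len(data), piece_length)
    ]


def corrupted_pieces(data: bytes, expected: List[bytes], piece_length: int) -> List[int]:
    """Return the indices of pieces whose hash differs from the published one."""
    actual = piece_hashes(data, piece_length)
    return [i for i, (got, want) in enumerate(zip(actual, expected)) if got != want]


# Hypothetical 1024-byte dataset split into 256-byte pieces.
original = bytes(range(256)) * 4
published = piece_hashes(original, 256)

# Simulate a transfer error by flipping one byte inside the second piece.
damaged = bytearray(original)
damaged[300] ^= 0xFF

print("pieces to download again:", corrupted_pieces(bytes(damaged), published, 256))
```

Because each piece is checked on its own, a client only has to request the damaged piece again rather than the whole file, which matters for the large datasets the abstract has in mind.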

18.
Most scientists recognize the importance of sharing data online in an open fashion. Nonetheless, many studies have documented the concerns that accompany data sharing activities, including loss of credit or IP, misuse and the time needed to curate interoperable data. To this end, discussions around data sharing often identify incentives that could potentially ameliorate these disincentivising concerns. Nonetheless, current Open Data discussions often rely on evidence-based studies to identify the disincentives to overcome. This results in highly specific and directed interventions. In contrast, this paper offers a different interpretation of these concerns. To do so, it makes use of the Thomas Theorem which suggests that: “If men define situations as real, they are real in their consequences”. Using empirical evidence from sub-Saharan African (bio)chemistry laboratories, this paper illustrates how individual perceptions of research environments – whether associated with evidence or not – are highly influential in shaping data sharing practices. It concludes with the suggestion that discussions on incentivising data sharing amongst scientific communities need to take a broader set of concerns into account and offer a more creative approach to ameliorating environmental disincentives.

19.
This review focuses on using microarray data on a clonal osteoblast cell model to demonstrate how various current and future bioinformatic tools can be used to understand, at a more global or comprehensible level, how cells grow and differentiate. In this example, BMP2 was used to stimulate growth and differentiation of osteoblasts to a mineralized matrix. A discussion is included on various methods for clustering gene expression data, statistical evaluation of data, and various new tools that can be used to derive deeper insight into a particular biological problem. How these tools can be obtained is also discussed. New tools for biologists to compare their datasets with others, as well as examples of future bioinformatic tools that can be used for developing gene networks and pathways for a given set of data, are included and discussed.

20.
We analyze data sharing practices of astronomers over the past fifteen years. An analysis of URL links embedded in papers published by the American Astronomical Society reveals that the total number of links included in the literature rose dramatically from 1997 until 2005, when it leveled off at around 1500 per year. The analysis also shows that the availability of linked material decays with time: in 2011, 44% of links published a decade earlier, in 2001, were broken. A rough analysis of link types reveals that links to data hosted on astronomers' personal websites become unreachable much faster than links to datasets on curated institutional sites. To gauge astronomers' current data sharing practices and preferences further, we performed in-depth interviews with 12 scientists and online surveys with 173 scientists, all at a large astrophysical research institute in the United States: the Harvard-Smithsonian Center for Astrophysics, in Cambridge, MA. Both the in-depth interviews and the online survey indicate that, in principle, there is no philosophical objection to data-sharing among astronomers at this institution. Key reasons that more data are not presently shared more efficiently in astronomy include: the difficulty of sharing large data sets; overreliance on non-robust, non-reproducible mechanisms for sharing data (e.g. emailing it); unfamiliarity with options that make data-sharing easier (faster) and/or more robust; and, lastly, a sense that other researchers would not want the data to be shared. We conclude with a short discussion of a new effort to implement an easy-to-use, robust system for data sharing in astronomy, at theastrodata.org, and we analyze the uptake of that system to date.
