Similar articles
1.
The microarray gene expression markup language (MAGE-ML) is a widely used XML (eXtensible Markup Language) standard for describing and exchanging information about microarray experiments. It can describe microarray designs, microarray experiment designs, gene expression data and data analysis results. We describe RMAGEML, a new Bioconductor package that provides a link between cDNA microarray data stored in MAGE-ML format and the Bioconductor framework for preprocessing, visualization and analysis of microarray experiments. AVAILABILITY: http://www.bioconductor.org. Open Source.

2.
An object model and database for functional genomics   (total citations: 2; self-citations: 0; citations by others: 2)
MOTIVATION: Large-scale functional genomics analysis is now feasible and presents significant challenges in data analysis, storage and querying. Data standards are required to enable the development of public data repositories and to improve data sharing. There is an established data format for microarrays (microarray gene expression markup language, MAGE-ML) and a draft standard for proteomics (PEDRo). We believe that all types of functional genomics experiments should be annotated in a consistent manner, and we hope to open up new ways of comparing multiple datasets used in functional genomics. RESULTS: We have created a functional genomics experiment object model (FGE-OM), developed from the microarray model, MAGE-OM and two models for proteomics, PEDRo and our own model (Gla-PSI-Glasgow Proposal for the Proteomics Standards Initiative). FGE-OM comprises three namespaces representing (i) the parts of the model common to all functional genomics experiments; (ii) microarray-specific components; and (iii) proteomics-specific components. We believe that FGE-OM should initiate discussion about the contents and structure of the next version of MAGE and the future of proteomics standards. A prototype database called RNA And Protein Abundance Database (RAPAD), based on FGE-OM, has been implemented and populated with data from microbial pathogenesis. AVAILABILITY: FGE-OM and the RAPAD schema are available from http://www.gusdb.org/fge.html, along with a set of more detailed diagrams. RAPAD can be accessed by registration at the site.

3.

Background  

Many proteomics initiatives require seamless bioinformatics integration of a range of analytical steps between sample collection and systems modeling, immediately accessible to the participants involved in the process. Proteomics profiling, from 2D gel electrophoresis to the putative identification of differentially expressed proteins by comparison of mass spectrometry results with reference databases, includes many components of sample processing, not just analysis and interpretation, that are regularly revisited and updated. Updating and disseminating such data requires a suitable data structure. However, no data structure is currently available for storing the data for multiple gels generated through a single proteomic experiment in a single XML file. This paper proposes a data structure based on XML standards to fill the gap between the data generated by proteomics experiments and the storage of those data.

4.
ABSTRACT

Introduction: Protein microarray is a powerful tool for both biological study and clinical research. The most useful features of protein microarrays are their miniaturized size (low reagent and sample consumption), high sensitivity and their capability for parallel/high-throughput analysis. The major focus of this review is functional proteome microarray.

Areas covered: For proteome microarrays, this review discusses some recently constructed proteome microarrays and new concepts used for constructing proteome microarrays and interpreting data in the past few years, such as PAGES, the M-NAPPA strategy, VirD technology, and the first protein microarray database. It also summarizes recent proteomic-scale applications and addresses the limitations and future directions of proteome microarray technology.

Expert opinion: The proteome microarray is a powerful tool for basic biological and clinical research. Improvements in the currently used proteome microarrays are expected, as is the construction of more proteome microarrays for other species using traditional strategies or novel concepts. It is anticipated that the maximum number of features on a single microarray and the number of possible applications will increase, and that the information obtainable from proteome microarray experiments will be more in-depth in the future.

5.
A broad range of mass spectrometers are used in mass spectrometry (MS)-based proteomics research. Each type of instrument possesses a unique design, data system and performance specifications, resulting in strengths and weaknesses for different types of experiments. Unfortunately, the native binary data formats produced by each type of mass spectrometer also differ and are usually proprietary. The diverse, nontransparent nature of the data structure complicates the integration of new instruments into preexisting infrastructure, impedes the analysis, exchange, comparison and publication of results from different experiments and laboratories, and prevents the bioinformatics community from accessing data sets required for software development. Here, we introduce the 'mzXML' format, an open, generic XML (extensible markup language) representation of MS data. We have also developed an accompanying suite of supporting programs. We expect that this format will facilitate data management, interpretation and dissemination in proteomics research.

6.
Shen C, Li L, Chen JY. Proteins, 2006, 64(2): 436-443
Experimental processes to collect and process proteomics data are increasingly complex, and the computational methods to assess the quality and significance of these data remain unsophisticated. These challenges have led to many biological oversights and computational misconceptions. We developed an empirical Bayes model to analyze multiprotein complex (MPC) proteomics data derived from peptide mass spectrometry detections of purified protein complex pull-down experiments. Using our model and two yeast proteomics data sets, we estimated that there should be an average of about 20 true associations per MPC, almost 10 times as high as was previously estimated. For data sets generated to mimic a real proteome, our model achieved on average 80% sensitivity in detecting true associations, as compared with the 3% sensitivity in previous work, while maintaining a comparable false discovery rate of 0.3%. Cross-examination of our results with protein complexes confirmed by various experimental techniques demonstrates that many true associations that cannot be identified by the previous approach are identified by our method.

7.

Background  

Many proteomics initiatives require integration of all information under uniform criteria, from sample collection and data display to publication of experimental results. Integrating and exchanging these data, which differ in format and structure, poses a great challenge. XML technology shows promise for this task because of its simplicity and flexibility. Nasopharyngeal carcinoma (NPC) is one of the most common cancers in southern China and Southeast Asia, and its incidence shows marked geographic and racial differences. Although some cancer proteome databases now exist, there is still no NPC proteome database.

8.
Adipocytes are central players in energy metabolism and the obesity epidemic, yet their protein composition remains largely unexplored. We investigated the adipocyte proteome by combining high accuracy, high sensitivity protein identification technology with subcellular fractionation of nuclei, mitochondria, membrane, and cytosol of 3T3-L1 adipocytes. We identified 3,287 proteins while essentially eliminating false positives, making this one of the largest high confidence proteomes reported to date. Comprehensive bioinformatics analysis revealed that the adipocyte proteome, despite its specialized role, is very complex. Comparison with microarray data showed that the mRNA abundance of detected versus non-detected proteins differed by less than 2-fold and that proteomics covered as large a proportion of the insulin signaling pathway. We used the Endeavour gene prioritization algorithm to associate a number of factors with vesicle transport in response to insulin stimulation, a key function of adipocytes. Our data and analysis can serve as a model for cellular proteomics. The adipocyte proteome is available as supplemental material and from the Max-Planck Unified Proteome database.

9.
10.
Results obtained from expression profilings of renal cell carcinoma using different “ome”‐based approaches and comprehensive data analysis demonstrated that proteome‐based technologies and cDNA microarray analyses complement each other during the discovery phase for disease‐related candidate biomarkers. The integration of the respective data revealed the uniqueness and complementarities of the different technologies. While comparative cDNA microarray analyses though restricted to up‐regulated targets largely revealed genes involved in controlling gene/protein expression (19%) and signal transduction processes (13%), proteomics/PROTEOMEX‐defined candidate biomarkers include enzymes of the cellular metabolism (36%), transport proteins (12%), and cell motility/structural molecules (10%). Candidate biomarkers defined by proteomics and PROTEOMEX are frequently shared, whereas the sharing rate between cDNA microarray and proteome‐based profilings is limited. Putative candidate biomarkers provide insights into their cellular (dys)function and their diagnostic/prognostic value but still warrant further validation in larger patient numbers. Based on the fact that merely three candidate biomarkers were shared by all applied technologies, namely annexin A4, tubulin α‐1A chain, and ubiquitin carboxyl‐terminal hydrolase L1, the analysis at a single hierarchical level of biological regulation seems to provide only limited results thus emphasizing the importance and benefit of performing rather combinatorial screenings which can complement the standard clinical predictors.

11.
The Microarray Gene Expression Data (MGED) society is an international organization established in 1999 to facilitate the sharing of functional genomics and proteomics array data. To facilitate microarray data sharing, the MGED society has been working on establishing the relevant data standards. The three main components (which will be described in more detail later) of MGED standards are Minimum Information About a Microarray Experiment (MIAME), a document that outlines the minimum information that should be reported about a microarray experiment to enable its unambiguous interpretation and reproduction; MAGE, which consists of three parts: the Microarray Gene Expression Object Model (MAGE-OM), an XML-based document exchange format (MAGE-ML), which is derived directly from the object model, and the supporting tool kit MAGEstk; and MO, or MGED Ontology, which defines sets of common terms and annotation rules for microarray experiments, enabling unambiguous annotation and efficient queries, data analysis and data exchange without loss of meaning. We discuss here how these standards have been established, how they have evolved, and how they are used.

12.
Orchard S, Ping P. Proteomics, 2006, 6(16): 4436-4438
This meeting was convened with the aim of bringing together representatives from scientific journals, granting authorities, software and instrumentation manufacturers, data producers and database providers to discuss the implementation and adoption of the HUPO-PSI data standards and how these can be best used to support the publication and dissemination of proteomics data. The current status of data formats and reporting requirements was reviewed and the attendees agreed that the use of data standards was essential as the field of proteomics grows and matures.

13.
14.
15.
XML, bioinformatics and data integration   (total citations: 15; self-citations: 0; citations by others: 15)
Motivation: The eXtensible Markup Language (XML) is an emerging standard for structuring documents, notably for the World Wide Web. In this paper, the authors present XML and examine its use as a data language for bioinformatics. In particular, XML is compared to other languages, and some of the potential uses of XML in bioinformatics applications are presented. The authors propose to adopt XML for data interchange between databases and other sources of data. Finally the discussion is illustrated by a test case of a pedigree data model in XML. Contact: Emmanuel.Barillot@infobiogen.fr

16.
The global analysis of proteins is now feasible due to improvements in techniques such as two-dimensional gel electrophoresis (2-DE), mass spectrometry, yeast two-hybrid systems and the development of bioinformatics applications. The experiments form the basis of proteomics, and present significant challenges in data analysis, storage and querying. We argue that a standard format for proteome data is required to enable the storage, exchange and subsequent re-analysis of large datasets. We describe the criteria that must be met for the development of a standard for proteomics. We have developed a model to represent data from 2-DE experiments, including difference gel electrophoresis along with image analysis and statistical analysis across multiple gels. This part of proteomics analysis is not represented in current proposals for proteomics standards. We are working with the Proteomics Standards Initiative to develop a model encompassing biological sample origin, experimental protocols, a number of separation techniques and mass spectrometry. The standard format will facilitate the development of central repositories of data, enabling results to be verified or re-analysed, and the correlation of results produced by different research groups using a variety of laboratory techniques.

17.
Proteomics has been applied with great potential to elucidate molecular mechanisms in plants. This is especially valid in the case of non‐model crops whose genome has not been sequenced yet, or is not well annotated. Plantains are a kind of cooking banana that is economically very important in Africa, India, and Latin America. The aim of this work was to characterize the fruit proteome of common dessert bananas and plantains and to identify proteins that are only encoded by the plantain genome. We present the first plantain fruit proteome. All data are available via ProteomeXchange with identifier PXD005589. Using our in‐house workflow, we found 37 alleles to be unique for plantain covered by 59 peptides. Although we do not have access (yet) to whole‐genome sequencing data from triploid banana cultivars, we show that proteomics is an easily accessible complementary alternative to detect different allele specific SNPs/SAAPs. These unique alleles might contribute toward the differences in the metabolism between dessert bananas and plantains. This dataset will stimulate further analysis by the scientific community, boost plantain research, and facilitate plantain breeding.

18.
19.
20.
The spring workshop of the HUPO-PSI convened in Siena to further progress the data standards which are already making an impact on data exchange and deposition in the field of proteomics. Separate work groups pushed forward existing XML standards for the exchange of Molecular Interaction data (PSI-MI, MIF) and Mass Spectrometry data (PSI-MS, mzData) whilst significant progress was made on PSI-MS' mzIdent, which will allow the capture of data from analytical tools such as peak list search engines. A new focus for PSI (GPS, gel electrophoresis) was explored; as was the need for a common representation of protein modifications by all workers in the field of proteomics and beyond. All these efforts are contextualised by the work of the General Proteomics Standards workgroup; which in addition to the MIAPE reporting guidelines, is continually evolving an object model (PSI-OM) from which will be derived the general standard XML format for exchanging data between researchers, and for submission to repositories or journals.
