Similar Literature
20 similar records retrieved (search time: 15 ms)
1.
One important aim within systems biology is to integrate disparate pieces of information, leading to the discovery of higher-level knowledge about important functionality within living organisms. This makes standards for data representation, and technology for data exchange and integration, key points for development within the area. In this article we focus on recent developments in the field. We compare the recent updates to the three standard representations for data exchange: SBML, PSI MI and BioPAX. In addition, we give an overview of the tools available for these three standards and discuss how these developments support data exchange and integration.
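To give a concrete sense of what working with one of these exchange formats looks like, here is a minimal sketch that reads and inspects an SBML model using the python-libsbml bindings; the file name model.xml is a placeholder for any SBML document.

```python
# A minimal sketch of reading an SBML model with the python-libsbml
# bindings (pip install python-libsbml); "model.xml" is a placeholder.
import libsbml

document = libsbml.readSBML("model.xml")
if document.getNumErrors() > 0:
    # Report any syntactic or schema-level problems found by libsbml.
    document.printErrors()
else:
    model = document.getModel()
    print(f"Model '{model.getId()}' with {model.getNumSpecies()} species "
          f"and {model.getNumReactions()} reactions")
    for i in range(model.getNumSpecies()):
        species = model.getSpecies(i)
        print(f"  species {species.getId()} in compartment "
              f"{species.getCompartment()}")
```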

2.
MOTIVATION: The generation of large amounts of microarray data and the need to share these data bring challenges for both data management and annotation, and highlight the need for standards. MIAME specifies the minimum information needed to describe a microarray experiment, and the Microarray Gene Expression Object Model (MAGE-OM) and the resulting MAGE-ML provide a mechanism to standardize data representation for data exchange; however, a common terminology for data annotation is needed to support these standards. RESULTS: Here we describe the MGED Ontology (MO), developed by the Ontology Working Group of the Microarray Gene Expression Data (MGED) Society. The MO provides terms for annotating all aspects of a microarray experiment, from the design of the experiment and array layout, through the preparation of the biological sample, to the protocols used to hybridize the RNA and analyze the data. The MO was developed to provide terms for annotating experiments in line with the MIAME guidelines, i.e. to provide the semantics to describe a microarray experiment according to the concepts specified in MIAME. The MO does not attempt to incorporate terms from existing ontologies, e.g. those that deal with anatomical parts or developmental stages, but provides a framework to reference terms in other ontologies and therefore facilitates the use of ontologies in microarray data annotation. AVAILABILITY: The MGED Ontology version 1.2.0 is available as a file in both DAML and OWL formats at http://mged.sourceforge.net/ontologies/index.php. Release notes and annotation examples are provided. The MO is also provided via the NCICB's Enterprise Vocabulary System (http://nciterms.nci.nih.gov/NCIBrowser/Dictionary.do). CONTACT: Stoeckrt@pcbi.upenn.edu SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
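As an illustration of how an OWL release of such an ontology can be consumed programmatically, the sketch below lists labeled classes using rdflib; the local file name mged-ontology.owl is a placeholder for a downloaded copy of the ontology.

```python
# A minimal sketch of listing classes in an OWL ontology file with
# rdflib (pip install rdflib); "mged-ontology.owl" is a placeholder
# for a locally downloaded copy.
from rdflib import Graph
from rdflib.namespace import OWL, RDF, RDFS

g = Graph()
g.parse("mged-ontology.owl", format="xml")  # OWL serialized as RDF/XML

for cls in g.subjects(RDF.type, OWL.Class):
    label = g.value(cls, RDFS.label)
    comment = g.value(cls, RDFS.comment)
    if label is not None:
        print(label, "-", comment)
```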

3.
The increasing volume of proteomics data generated by ever more high-throughput methodologies has led to a growing need for methods by which such data can be accurately described, stored and exchanged between experimental researchers and data repositories. Work by the Proteomics Standards Initiative of the Human Proteome Organisation has laid the foundation for the development of standards by which experimental design can be described and data exchange facilitated. The progress of these efforts, and the direct benefits already accruing from them, were described at a plenary session of the 3rd Annual HUPO Congress. Parallel sessions allowed the three work groups to present their progress to interested parties and to collect feedback from groups already implementing the available formats.

4.
Cary MP, Bader GD, Sander C. FEBS Letters 2005, 579(8):1815-1820
Pathway information is vital for successful quantitative modeling of biological systems. The almost 170 online pathway databases vary widely in coverage and representation of biological processes, making their use extremely difficult. Future pathway information systems for querying, visualization and analysis must support standard exchange formats to successfully integrate data on a large scale. Such integrated systems will greatly facilitate the constructive cycle of computational model building and experimental verification that lies at the heart of systems biology.

5.
6.
High-throughput technologies are generating large amounts of complex data that have to be stored in databases, communicated to various data analysis tools and interpreted by scientists. Data representation and communication standards are needed to implement these steps efficiently. Here we give a classification of various standards related to systems biology and discuss several aspects of standardization in the life sciences in general. Why are some standards more successful than others? What are the prerequisites for a standard to succeed, and what are the possible pitfalls?

7.
The generation of proteomic data is becoming ever more high throughput. Both the technologies and the experimental designs used to generate and analyze data are becoming increasingly complex. The need for methods by which such data can be accurately described, stored and exchanged between experimenters and data repositories has been recognized. Work by the Proteomics Standards Initiative of the Human Proteome Organization has laid the foundation for the development of standards by which experimental design can be described and data exchange facilitated. The Minimum Information About a Proteomics Experiment data model describes both the scope and purpose of a proteomics experiment, and encompasses the development of more specific interchange formats such as the mzData model for mass spectrometry. The eXtensible Markup Language MI (XML-MI) data interchange format, which allows the exchange of molecular interaction data, has already been published, and major databases within this field are supplying data downloads in this format.
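Because the MI interchange format is plain XML, such downloads can be processed with standard tooling. Below is a small sketch that pulls interactor names out of a document; the file name is a placeholder, and the namespace and element paths assume the PSI-MI 2.5 schema.

```python
# A minimal sketch of extracting interactor names from a PSI-MI 2.5
# XML file with the standard library; "interactions.xml" is a
# placeholder, and the paths assume the PSI-MI 2.5 schema.
import xml.etree.ElementTree as ET

NS = {"mi": "net:sf:psidev:mi"}  # PSI-MI 2.5 XML namespace

tree = ET.parse("interactions.xml")
for interactor in tree.iterfind(".//mi:interactor", NS):
    label = interactor.find("mi:names/mi:shortLabel", NS)
    if label is not None:
        print(label.text)
```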

8.

9.
The Human Proteome Organization's Proteomics Standards Initiative (PSI) promotes the development of exchange standards to improve data integration and interoperability. PSI specifies the suitable level of detail required when reporting a proteomics experiment (via the Minimum Information About a Proteomics Experiment), and provides extensible markup language (XML) exchange formats and dedicated controlled vocabularies (CVs) that must be combined to generate a standard-compliant document. The framework presented here tackles the issue of checking that experimental data reported using a specific format, CVs and public bio-ontologies (e.g. Gene Ontology, NCBI taxonomy) comply with the Minimum Information About a Proteomics Experiment recommendations. The semantic validator not only checks the XML syntax but also enforces rules regarding the use of ontology classes and CV terms, by checking that the terms exist in the resource and that they are used in the correct location of a document. Moreover, this framework is extremely fast, even on sizable data files, and flexible, as it can be adapted to any standard by customizing the parameters it requires: an XML Schema Definition, one or more CVs or ontologies, and a mapping file describing in a formal way how the semantic resources and the format are interrelated. As such, the validator provides a general solution to a common problem in data exchange: how to validate the correct usage of a data standard beyond simple XML Schema Definition validation. The framework source code and its various applications can be found at http://psidev.info/validator.
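The core of such semantic validation can be pictured as a set-membership check layered on top of schema validation. Below is a highly simplified sketch, assuming a document whose cvParam elements carry accession attributes (as in the PSI mass-spectrometry formats); the file name and the allowed-term set are placeholders, and a real validator also checks where in the document each term occurs.

```python
# A highly simplified sketch of CV-term validation: check that every
# cvParam accession used in a document comes from an allowed set of
# controlled-vocabulary terms. File name and terms are placeholders.
import xml.etree.ElementTree as ET

# In practice this set would be loaded from the CV's OBO/OWL release.
allowed_accessions = {"MS:1000031", "MS:1000041", "MS:1000584"}

tree = ET.parse("experiment.xml")
errors = []
for elem in tree.iter():
    # Tags may carry a namespace prefix like "{...}cvParam".
    if elem.tag.endswith("cvParam"):
        accession = elem.get("accession")
        if accession is not None and accession not in allowed_accessions:
            errors.append(f"unknown CV term: {accession}")

print("\n".join(errors) if errors else "all CV terms recognized")
```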

10.
The Computational Modeling in Biology Network (COMBINE) is an initiative to coordinate the development of the various community standards and formats in computational systems biology and related fields. This report summarizes the activities pursued at the first annual COMBINE meeting, held in Edinburgh on October 6-9, 2010, and the first HARMONY hackathon, held in New York on April 18-22, 2011. The first of these meetings hosted 81 attendees. Discussions covered the official COMBINE standards (BioPAX, SBGN and SBML) as well as emerging efforts and interoperability between the different formats. The second meeting, oriented towards software developers, welcomed 59 participants and witnessed many technical discussions, development of improved standards support in community software systems, and conversion between the standards. Both meetings were resounding successes and showed that the field is now mature enough to develop representation formats and related standards in a coordinated manner.

11.
12.

Background  

Flow cytometry technology is widely used in both health care and research. The rapid expansion of flow cytometry applications has outpaced the development of data storage and analysis tools. Collaborative efforts to close this gap include building common vocabularies and ontologies, designing generic data models, and defining data exchange formats. The Minimum Information about a Flow Cytometry Experiment (MIFlowCyt) standard was recently adopted by the International Society for Advancement of Cytometry. This standard guides researchers on the information that should be included in peer-reviewed publications, but it is insufficient for data exchange and integration between computational systems. The Functional Genomics Experiment (FuGE) model formalizes common aspects of comprehensive and high-throughput experiments across different biological technologies. We have extended the FuGE object model to accommodate flow cytometry data and metadata.
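To illustrate the difference between a minimum-information guideline and machine-checkable metadata, here is a small sketch that checks experiment metadata against a required-field list; the field names are a hypothetical subset chosen for illustration, not the actual MIFlowCyt checklist.

```python
# An illustrative sketch of checking experiment metadata against a
# minimum-information checklist. The required fields below are a
# hypothetical subset, not the actual MIFlowCyt item list.
REQUIRED_FIELDS = [
    "experiment_purpose",
    "primary_contact",
    "sample_description",
    "instrument_manufacturer",
    "fluorescence_reagent",
]

def missing_fields(metadata: dict) -> list[str]:
    """Return the required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not metadata.get(f)]

experiment = {
    "experiment_purpose": "Immunophenotyping of PBMC samples",
    "primary_contact": "jane.doe@example.org",
    "instrument_manufacturer": "ExampleCytometrics",
}
print("missing:", missing_fields(experiment))
```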

13.
Lars Vogt. Zoomorphology 2009, 128(3):201-217
Due to a lack of common data standards, the communicability and comparability of biological data across various levels of organization and taxonomic groups is continuously decreasing. However, the interdependence between molecular and higher levels of organization is of growing interest and calls for co-operation between biologists from different methodological and theoretical backgrounds. A general data standard in biology would greatly facilitate such co-operation. This article examines the role that defined and formalized vocabularies (i.e., ontologies) could play in developing such a data standard. I suggest basic criteria for developing data standards, distinguishing content, concept, nomenclatural and format standards, and discuss the role of databases and their use of bio-ontologies in current data-standardization activities in biology. General principles of ontology development are introduced, including foundational ontology properties (e.g., class–subclass, parthood) and how concepts are defined. After addressing problems specific to morphological data, the notion of a general structure concept for morphology is introduced, along with why it is required for developing a morphological ontology. The necessity for a general morphological ontology to be taxon-independent and free of homology assumptions is discussed, as is how such an ontology can solve the problems of morphology. The article concludes with an outlook on how the use of ontologies will likely establish some sort of general data standard in biology, and why the development of a set of commonly used foundational ontology properties and the use of globally unique identifiers for all classes defined in ontologies are crucial for its success.
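To make the foundational relations concrete, the sketch below encodes class–subclass (is_a) and parthood (part_of) edges as a small graph and answers transitive queries over them; the anatomical terms are made-up examples, not drawn from any real ontology.

```python
# A small sketch of foundational ontology relations: is_a
# (class-subclass) and part_of (parthood) edges stored as a graph,
# with a transitive "ancestors" query. Terms are made-up examples.
from collections import defaultdict

edges = defaultdict(list)  # term -> [(relation, parent_term), ...]

def add(term, relation, parent):
    edges[term].append((relation, parent))

add("compound_eye", "is_a", "eye")
add("eye", "is_a", "sense_organ")
add("compound_eye", "part_of", "head")
add("head", "part_of", "body")

def ancestors(term, relation):
    """All terms reachable from `term` along one relation type."""
    found, stack = set(), [term]
    while stack:
        for rel, parent in edges[stack.pop()]:
            if rel == relation and parent not in found:
                found.add(parent)
                stack.append(parent)
    return found

print(ancestors("compound_eye", "is_a"))     # {'eye', 'sense_organ'}
print(ancestors("compound_eye", "part_of"))  # {'head', 'body'}
```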

14.
The Functional Genomics Experiment data model (FuGE) has been developed to facilitate convergence of data standards for high-throughput, comprehensive analyses in biology. FuGE models the components of an experimental activity that are common across different technologies, including protocols, samples and data. FuGE provides a foundation for describing entire laboratory workflows and for the development of new data formats. The Microarray Gene Expression Data society and the Proteomics Standards Initiative have committed to using FuGE as the basis for defining their respective standards, and other standards groups, including the Metabolomics Standards Initiative, are evaluating FuGE in their development efforts. Adoption of FuGE by multiple standards bodies will enable uniform reporting of common parts of functional genomics workflows, simplify data-integration efforts and ease the burden on researchers seeking to fulfill multiple minimum reporting requirements. Such advances are important for transparent data management and mining in functional genomics and systems biology.
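As a rough illustration of the kind of common components FuGE factors out, the sketch below models protocols, materials and data with plain dataclasses and links them through a protocol application; the class and field names are simplified stand-ins, not the actual FuGE schema.

```python
# A rough, simplified sketch of FuGE-style common components: a
# protocol applied to input materials yields output data. Class and
# field names are illustrative stand-ins, not the actual FuGE schema.
from dataclasses import dataclass, field

@dataclass
class Protocol:
    name: str
    parameters: dict = field(default_factory=dict)

@dataclass
class Material:
    name: str
    characteristics: dict = field(default_factory=dict)

@dataclass
class Data:
    name: str
    uri: str

@dataclass
class ProtocolApplication:
    protocol: Protocol
    inputs: list[Material]
    outputs: list[Data]

hyb = Protocol("hybridization", {"temperature_c": 42})
sample = Material("liver RNA", {"organism": "Mus musculus"})
raw = Data("raw intensities", "file:///data/array_001.cel")
step = ProtocolApplication(hyb, [sample], [raw])
print(f"{step.protocol.name}: {step.inputs[0].name} -> {step.outputs[0].name}")
```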

15.
BACKGROUND: Ongoing innovation in phylogenetics and evolutionary biology has been accompanied by a proliferation of software tools, data formats, analytical techniques and web servers. This brings with it the challenge of integrating phylogenetic and other related biological data found in a wide variety of formats, and underlines the need for reusable software that can read, manipulate and transform this information into the various forms required to build computational pipelines. RESULTS: We built a Python software library for working with phylogenetic data that is tightly integrated with Biopython, a broad-ranging toolkit for computational biology. Our library, Bio.Phylo, is highly interoperable with existing libraries, tools and standards, and is capable of parsing common file formats for phylogenetic trees, performing basic transformations and manipulations, attaching rich annotations, and visualizing trees. We unified the modules for working with the standard file formats Newick, NEXUS and phyloXML behind a consistent and simple API, providing a common set of functionality independent of the data source. CONCLUSIONS: Bio.Phylo meets a growing need in bioinformatics for working with heterogeneous types of phylogenetic data. By supporting interoperability with multiple file formats and leveraging existing Biopython features, this library simplifies the construction of phylogenetic workflows. We also provide examples of the benefits of building a community around a shared open-source project. Bio.Phylo is included with Biopython, available through the Biopython website, http://biopython.org.
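The unified API can be shown in a few lines: the example below parses a Newick string, draws it as ASCII art, and re-serializes it as phyloXML (the tree itself is a toy example).

```python
# Reading, visualizing and converting a tree with Bio.Phylo (bundled
# with Biopython). The Newick string is a toy example.
from io import StringIO
from Bio import Phylo

tree = Phylo.read(StringIO("((A:1,B:2):1,(C:1,D:1):2);"), "newick")
Phylo.draw_ascii(tree)                 # quick terminal visualization
print("leaves:", tree.count_terminals())

# The same API writes any supported format, here phyloXML.
Phylo.write(tree, "example.xml", "phyloxml")
```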

16.

Background  

The generation of large amounts of microarray data presents challenges for data collection, annotation, exchange and analysis. Although there are now widely accepted formats, minimum standards for data content and ontologies for microarray data, only a few groups are using them together to build and populate large-scale databases. Structured environments for data management are crucial for making full use of these data.

17.
It is widely predicted that cost and efficiency gains in sequencing will usher in an era of personal genomics and personalized, predictive, preventive, and participatory medicine within a decade. I review the computational challenges ahead and propose general and specific directions for research and development. There is an urgent need to develop semantic ontologies that span genomics, molecular systems biology, and medical data. Although the development of such ontologies would be costly and difficult, the benefits will far outweigh the costs. I argue that availability of such ontologies would allow a revolution in web services for personal genomics and medicine.

18.
An increase in studies using restriction site-associated DNA sequencing (RADseq) methods has led to a need for both the development and assessment of novel bioinformatic tools that aid in the generation and analysis of these data. Here, we report the availability of AftrRAD, a bioinformatic pipeline that efficiently assembles and genotypes RADseq data, and outputs these data in various formats for downstream analyses. We use simulated and experimental data sets to evaluate AftrRAD's ability to perform accurate de novo assembly of loci, and we compare its performance with two other commonly used programs, stacks and pyrad. We demonstrate that AftrRAD is able to accurately assemble loci, while accounting for indel variation among alleles, in a more computationally efficient manner than currently available programs. AftrRAD run times are not strongly affected by the number of samples in the data set, making this program a useful tool when multicore systems are not available for parallel processing, or when data sets include large numbers of samples.

19.
20.
The "4D Biology Workshop for Health and Disease", held on 16-17th of March 2010 in Brussels, aimed at finding the best organising principles for large-scale proteomics, interactomics and structural genomics/biology initiatives, and setting the vision for future high-throughput research and large-scale data gathering in biological and medical science. Major conclusions of the workshop include the following. (i) Development of new technologies and approaches to data analysis is crucial. Biophysical methods should be developed that span a broad range of time/spatial resolution and characterise structures and kinetics of interactions. Mathematics, physics, computational and engineering tools need to be used more in biology and new tools need to be developed. (ii) Database efforts need to focus on improved definitions of ontologies and standards so that system-scale data and associated metadata can be understood and shared efficiently. (iii) Research infrastructures should play a key role in fostering multidisciplinary research, maximising knowledge exchange between disciplines and facilitating access to diverse technologies. (iv) Understanding disease on a molecular level is crucial. System approaches may represent a new paradigm in the search for biomarkers and new targets in human disease. (v) Appropriate education and training should be provided to help efficient exchange of knowledge between theoreticians, experimental biologists and clinicians. These conclusions provide a strong basis for creating major possibilities in advancing research and clinical applications towards personalised medicine.  相似文献   
