Similar Articles
20 similar articles found.
1.
The Feeding Experiments End-user Database (FEED) is a research tool developed by the Mammalian Feeding Working Group at the National Evolutionary Synthesis Center that permits synthetic, evolutionary analyses of the physiology of mammalian feeding. The tasks of the Working Group are to compile physiologic data sets into a uniform digital format stored at a central source, develop a standardized terminology for describing and organizing the data, and carry out a set of novel analyses using FEED. FEED contains raw physiologic data linked to extensive metadata. It serves as an archive for a large number of existing data sets and a repository for future data sets. The metadata are stored as text and images that describe experimental protocols, research subjects, and anatomical information. The metadata incorporate controlled vocabularies to allow consistent use of the terms used to describe and organize the physiologic data. The planned analyses address long-standing questions concerning the phylogenetic distribution of phenotypes involving muscle anatomy and feeding physiology among mammals, the presence and nature of motor pattern conservation in the mammalian feeding muscles, and the extent to which suckling constrains the evolution of feeding behavior in adult mammals. We expect FEED to be a growing digital archive that will facilitate new research into understanding the evolution of feeding anatomy.

2.
Modern society depends on the use of many diverse materials. Effectively managing these materials is becoming increasingly important and complex, from the analysis of supply chains, to quantifying their environmental impacts, to understanding future resource availability. Material stocks and flows data enable such analyses, but currently exist mainly as discrete packages, with highly varied type, scope, and structure. These factors constitute a powerful barrier to holistic integration and thus universal analysis of existing and yet to be published material stocks and flows data. We present the Unified Materials Information System (UMIS) to overcome this barrier by enabling material stocks and flows data to be comprehensively integrated across space, time, materials, and data type independent of their disaggregation, without loss of information, and avoiding double counting. UMIS can therefore be applied to structure diverse material stocks and flows data and their metadata across material systems analysis methods such as material flow analysis (MFA), input‐output analysis, and life cycle assessment. UMIS uniquely labels and visualizes processes and flows in UMIS diagrams; therefore, material stocks and flows data visualized in UMIS diagrams can be individually referenced in databases and computational models. Applications of UMIS to restructure existing material stocks and flows data represented by block flow diagrams, system dynamics diagrams, Sankey diagrams, matrices, and derived using the economy‐wide MFA classification system are presented to exemplify use. UMIS advances the capabilities with which complex quantitative material systems analysis, archiving, and computation of material stocks and flows data can be performed.
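The abstract does not reproduce UMIS's actual labelling scheme, so the record below is only a hedged sketch of the general idea: give every flow a unique, citable identifier that pins down its origin and destination processes, material, region and year, so the same physical flow is never counted twice when data sets are merged. The field names and label format are assumptions, not UMIS conventions.

```python
# Hedged sketch only: field names and the label format are invented and are
# not the UMIS specification described in the paper.
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowRecord:
    flow_id: str               # unique label, citable from databases and models
    origin_process: str
    destination_process: str
    material: str
    region: str
    year: int
    value: float
    unit: str

steel_to_construction = FlowRecord(
    flow_id="GLO:2015:steel:manufacturing->construction",  # invented format
    origin_process="manufacturing",
    destination_process="construction",
    material="steel",
    region="GLO",
    year=2015,
    value=430.0,               # hypothetical quantity
    unit="Mt",
)
print(steel_to_construction.flow_id)
```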

3.
4.
Much biodiversity data is collected worldwide, but it remains challenging to assemble the scattered knowledge for assessing biodiversity status and trends. The concept of Essential Biodiversity Variables (EBVs) was introduced to structure biodiversity monitoring globally, and to harmonize and standardize biodiversity data from disparate sources to capture a minimum set of critical variables required to study, report and manage biodiversity change. Here, we assess the challenges of a ‘Big Data’ approach to building global EBV data products across taxa and spatiotemporal scales, focusing on species distribution and abundance. The majority of currently available data on species distributions derives from incidentally reported observations or from surveys where presence‐only or presence–absence data are sampled repeatedly with standardized protocols. Most abundance data come from opportunistic population counts or from population time series using standardized protocols (e.g. repeated surveys of the same population from single or multiple sites). Enormous complexity exists in integrating these heterogeneous, multi‐source data sets across space, time, taxa and different sampling methods. Integration of such data into global EBV data products requires correcting biases introduced by imperfect detection and varying sampling effort, dealing with different spatial resolution and extents, harmonizing measurement units from different data sources or sampling methods, applying statistical tools and models for spatial inter‐ or extrapolation, and quantifying sources of uncertainty and errors in data and models. To support the development of EBVs by the Group on Earth Observations Biodiversity Observation Network (GEO BON), we identify 11 key workflow steps that will operationalize the process of building EBV data products within and across research infrastructures worldwide. These workflow steps take multiple sequential activities into account, including identification and aggregation of various raw data sources, data quality control, taxonomic name matching and statistical modelling of integrated data. We illustrate these steps with concrete examples from existing citizen science and professional monitoring projects, including eBird, the Tropical Ecology Assessment and Monitoring network, the Living Planet Index and the Baltic Sea zooplankton monitoring. The identified workflow steps are applicable to both terrestrial and aquatic systems and a broad range of spatial, temporal and taxonomic scales. They depend on clear, findable and accessible metadata, and we provide an overview of current data and metadata standards. Several challenges remain to be solved for building global EBV data products: (i) developing tools and models for combining heterogeneous, multi‐source data sets and filling data gaps in geographic, temporal and taxonomic coverage, (ii) integrating emerging methods and technologies for data collection such as citizen science, sensor networks, DNA‐based techniques and satellite remote sensing, (iii) solving major technical issues related to data product structure, data storage, execution of workflows and the production process/cycle as well as approaching technical interoperability among research infrastructures, (iv) allowing semantic interoperability by developing and adopting standards and tools for capturing consistent data and metadata, and (v) ensuring legal interoperability by endorsing open data or data that are free from restrictions on use, modification and sharing. 
Addressing these challenges is critical for biodiversity research and for assessing progress towards conservation policy targets and sustainable development goals.
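One of the workflow steps mentioned above is taxonomic name matching. As a hedged illustration only (real pipelines match names against taxonomic backbones served by dedicated services, and the GEO BON workflow is not specified as code in the abstract), the sketch below resolves a misspelled name against a toy reference list with Python's standard-library fuzzy matcher.

```python
# Hedged sketch of one workflow step (taxonomic name matching); the reference
# list and matching rule are placeholders, not the GEO BON pipeline.
from difflib import get_close_matches

backbone = ["Salmo salar", "Salmo trutta", "Gadus morhua"]  # toy reference list

def match_name(raw_name, cutoff=0.85):
    """Return the accepted name closest to a raw record, or None if no match."""
    hits = get_close_matches(raw_name, backbone, n=1, cutoff=cutoff)
    return hits[0] if hits else None

print(match_name("Salmo salarr"))   # misspelling resolved to 'Salmo salar'
print(match_name("Homo sapiens"))   # not in the toy backbone -> None
```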

5.
6.
Existing on-line databases for dendrochronology are not flexible in terms of user permissions, tree-ring data formats, metadata administration and language. This is why we developed the Digital Collaboratory for Cultural Dendrochronology (DCCD). This TRiDaS-based multi-lingual database allows users to control data access, to perform queries, to upload and download (meta)data in a variety of digital formats, and to edit metadata online. The content of the DCCD conforms to EU best practices regarding the long-term preservation of digital research data.

7.
8.
Aim and Background: We describe a successful implementation of a departmental incident learning system (ILS) across a regionally expanding academic radiation oncology department, dovetailing with a structured integration of the safety and quality program across clinical sites. Materials and Methods: Over 6 years between 2011 and 2017, a long-standing departmental ILS was deployed to 4 clinical locations beyond the primary clinical location where it had been established. We queried all events reported to the ILS during this period and analyzed trends in reporting by clinical site. The chi-square test was used to determine whether differences over time in the rate of reporting were statistically significant. We describe a synchronous development of a common safety and quality program over the same period. Results: There was an overall increase in the number of event reports from each location over the time period from 2011 to 2017. The percentage increase in reported events from the first year of implementation to 2017 was 457% in site 1, 166.7% in site 2, 194.3% in site 3, 1025% in site 4, and 633.3% in site 5, with an overall increase of 677.7%. A statistically significant increase in the rate of reporting was seen from the first year of implementation to 2017 (p < 0.001 for all sites). Conclusions: We observed significant increases in event reporting over a 6-year period across 5 regional sites within a large academic radiation oncology department, during which time we expanded and enhanced our safety and quality program, including regional integration. Implementing an ILS and structuring a safety and quality program together result in the successful integration of the ILS into existing departmental infrastructure.
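The abstract names the test but not the underlying counts, so the numbers below are purely hypothetical; the sketch only shows how a chi-square test of a change in reporting rate between two periods can be run.

```python
# Hypothetical counts: the paper's actual event and denominator numbers are
# not given in the abstract. Rows are periods, columns are
# [events reported, courses with no reported event].
from scipy.stats import chi2_contingency

table = [
    [40, 1960],   # first year of implementation (hypothetical)
    [270, 1730],  # 2017 (hypothetical)
]

chi2, p, dof, _ = chi2_contingency(table)
print(f"chi2 = {chi2:.1f}, dof = {dof}, p = {p:.3g}")
```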

9.
Data and knowledge mobilisation are significant challenges in ecology and resource management, with the journey from data collection through to management action often left incomplete due to difficulties sharing information across diverse and dispersed communities. This disconnect between science and management must be resolved if we are to successfully tackle the increasing impact of human activity on our ecosystems. Across their North Atlantic range, Atlantic salmon (Salmo salar L.) populations are in steep decline in many areas and urgent actions are required to curb this decline. Being commercially important, this species has been subject to intense research, but management action often suffers from a lack of both access to this knowledge resource and support for its integration into effective management strategies. To respond to this challenge, the science and management communities must place higher priority on mobilising existing and emerging knowledge sources to inform current and future resource use and mitigation strategies. This approach requires a more complete picture of the current salmon ecology data and knowledge landscape, new mechanisms to enable data mobilisation and re-use, and new research to describe and parameterise the responses of wild populations to habitat changes. Here we present a unique interface for registering and linking data resources relevant to the Atlantic salmon life cycle that can address the data mobilisation aspect of these challenges. The Salmon Ecosystem Data Hub is a salmon-specific metadata catalogue, natively interoperable with many existing data portals, which creates a low-resistance pathway to maximise visibility of data relevant to Atlantic salmon. This includes the capacity to annotate datasets with life-stage domains and variable classes, thereby permitting dispersed data to be formally contextualised and integrated to support hypotheses specific to scenario-based modelling and decision-making. The alignment and mobilisation of data within the Salmon Ecosystem Data Hub will help advance the development of appropriate environmentally driven forecast models and an ecosystem-based management approach for Atlantic salmon that optimises future management strategies.

10.
High-performance computing faces considerable change as the Internet and the Grid mature. Applications that once were tightly-coupled and monolithic are now decentralized, with collaborating components spread across diverse computational elements. Such distributed systems most commonly communicate through the exchange of structured data. Definition and translation of metadata is incorporated in all systems that exchange structured data. We observe that the manipulation of this metadata can be decomposed into three separate steps: discovery, binding of program objects to the metadata, and marshaling of data to and from wire formats. We have designed a method of representing message formats in XML, using datatypes available in the XML Schema specification. We have implemented a tool, XMIT, that uses such metadata and exploits this decomposition in order to provide flexible run-time metadata definition facilities for an efficient binary communication mechanism. We also demonstrate that the use of XMIT makes possible such flexibility at little performance cost.
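XMIT's actual schema and wire format are not given in the abstract, so the sketch below is only an illustration of the three-step decomposition it describes (discover the format metadata, bind program objects to it, marshal to a binary wire format). The XML layout, the type mapping and the packing rules are assumptions, not XMIT's.

```python
# Hedged sketch: the message description, the XML Schema type-to-binary
# mapping, and the packing rules are invented for illustration only.
import struct
import xml.etree.ElementTree as ET

FORMAT_XML = """
<message name="SensorReading">
  <field name="id"    type="xs:int"/>
  <field name="value" type="xs:double"/>
</message>
"""

# Map XML Schema datatypes to struct codes (an assumption, not XMIT's mapping).
TYPE_CODES = {"xs:int": "i", "xs:double": "d"}

def bind(format_xml):
    """Steps 1-2: discover the format metadata and bind a packing layout to it."""
    root = ET.fromstring(format_xml)
    names = [f.attrib["name"] for f in root.findall("field")]
    codes = "".join(TYPE_CODES[f.attrib["type"]] for f in root.findall("field"))
    return names, struct.Struct("!" + codes)   # network byte order

def marshal(record, names, packer):
    """Step 3: marshal a record to the binary wire format."""
    return packer.pack(*(record[n] for n in names))

names, packer = bind(FORMAT_XML)
wire = marshal({"id": 7, "value": 3.14}, names, packer)
print(len(wire), "bytes on the wire")
```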

11.
Quality control for plant metabolomics: reporting MSI-compliant studies
The Metabolomics Standards Initiative (MSI) has recently released documents describing minimum parameters for reporting metabolomics experiments, in order to validate metabolomic studies and to facilitate data exchange. The reporting parameters encompassed by MSI include the biological study design, sample preparation, data acquisition, data processing, data analysis and interpretation relative to the biological hypotheses being evaluated. Herein we exemplify how such metadata can be reported by using a small case study – the metabolite profiling by GC-TOF mass spectrometry of Arabidopsis thaliana leaves from a knockout allele of the gene At1g08510 in the Wassilewskija ecotype. Pitfalls in quality control are highlighted that can invalidate results even if MSI reporting standards are fulfilled, including reliable compound identification and integration of unknown metabolites. Standardized data processing methods are proposed for consistent data storage and dissemination via databases.

12.

Background

The 1980s marked the occasion when Geographical Information System (GIS) technology was broadly introduced into the geo-spatial community through the establishment of a strong GIS industry. This technology quickly disseminated across many countries, and has now become established as an important research, planning and commercial tool for a wider community that includes organisations in the public and private health sectors. The broad acceptance of GIS technology and the nature of its functionality have meant that numerous datasets have been created over the past three decades. Most of these datasets have been created independently, and without any structured documentation systems in place. However, search and retrieval systems can only work if there is a mechanism for the existence of datasets to be discovered, and this is where proper metadata creation and management can greatly help. This situation must be addressed through support mechanisms such as Web-based portal technologies, metadata editor tools, automation, metadata standards and guidelines and collaborative efforts with relevant individuals and organisations. Engagement with data developers or administrators should also include a strategy of identifying the benefits associated with metadata creation and publication.

Findings

The establishment of numerous Spatial Data Infrastructures (SDIs), and other Internet resources, is a testament to the recognition of the importance of supporting good data management and sharing practices across the geographic information community. These resources extend to health informatics in support of research, public services and teaching and learning. This paper identifies many of these resources available to the UK academic health informatics community. It also reveals the reluctance of many spatial data creators across the wider UK academic community to use these resources to create and publish metadata, or deposit their data in repositories for sharing. The Go-Geo! service is introduced as an SDI developed to provide UK academia with the necessary resources to address the concerns surrounding metadata creation and data sharing. The Go-Geo! portal, Geodoc metadata editor tool, ShareGeo spatial data repository, and a range of other support resources, are described in detail.

Conclusions

This paper describes a variety of resources available for the health research and public health sector to use for managing and sharing their data. The Go-Geo! service is one resource which offers an SDI for the eclectic range of disciplines using GIS in UK academia, including health informatics. The benefits of data management and sharing are immense, and in these times of cost constraints, these resources can be seen as ways to achieve cost savings which can be reinvested in further research.

13.
The development of the Functional Genomics Investigation Ontology (FuGO) is a collaborative, international effort that will provide a resource for annotating functional genomics investigations, including the study design, protocols and instrumentation used, the data generated and the types of analysis performed on the data. FuGO will contain both terms that are universal to all functional genomics investigations and those that are domain specific. In this way, the ontology will serve as the "semantic glue" to provide a common understanding of data from across these disparate data sources. In addition, FuGO will reference out to existing mature ontologies to avoid the need to duplicate these resources, and will do so in such a way as to enable their ease of use in annotation. This project is in the early stages of development; the paper will describe efforts to initiate the project, the scope and organization of the project, the work accomplished to date, and the challenges encountered, as well as future plans.

14.
BACKGROUND: Personalised medicine provides patients with treatments that are specific to their genetic profiles. It requires efficient data sharing of disparate data types across a variety of scientific disciplines, such as molecular biology, pathology, radiology and clinical practice. Personalised medicine aims to offer the safest and most effective therapeutic strategy based on the gene variations of each subject. In particular, this is valid in oncology, where knowledge about genetic mutations has already led to new therapies. Current molecular biology techniques (microarrays, proteomics, epigenetic technology and improved DNA sequencing technology) enable better characterisation of cancer tumours. The vast amounts of data, however, coupled with the use of different terms - or semantic heterogeneity - in each discipline makes the retrieval and integration of information difficult. RESULTS: Existing software infrastructures for data-sharing in the cancer domain, such as caGrid, support access to distributed information. caGrid follows a service-oriented model-driven architecture. Each data source in caGrid is associated with metadata at increasing levels of abstraction, including syntactic, structural, reference and domain metadata. The domain metadata consists of ontology-based annotations associated with the structural information of each data source. However, caGrid's current querying functionality is given at the structural metadata level, without capitalising on the ontology-based annotations. This paper presents the design of and theoretical foundations for distributed ontology-based queries over cancer research data. Concept-based queries are reformulated to the target query language, where join conditions between multiple data sources are found by exploiting the semantic annotations. The system has been implemented, as a proof of concept, over the caGrid infrastructure. The approach is applicable to other model-driven architectures. A graphical user interface has been developed, supporting ontology-based queries over caGrid data sources. An extensive evaluation of the query reformulation technique is included. CONCLUSIONS: To support personalised medicine in oncology, it is crucial to retrieve and integrate molecular, pathology, radiology and clinical data in an efficient manner. The semantic heterogeneity of the data makes this a challenging task. Ontologies provide a formal framework to support querying and integration. This paper provides an ontology-based solution for querying distributed databases over service-oriented, model-driven infrastructures.
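This is not caGrid's API or metadata model; it is only a hedged sketch of the reformulation idea the abstract describes: a concept-based query is turned into a target-language query whose join condition comes from ontology annotations shared by two data sources. The concept IRIs, source names and column names are invented for illustration.

```python
# Hedged sketch: invented annotations mapping concepts to (source, table,
# column). The join condition is derived from a concept shared by two sources.
annotations = {
    "NCIt:Gene":       [("pathology_db", "specimen", "gene_symbol"),
                        ("genomics_db",  "variant",  "gene")],
    "NCIt:TumorGrade": [("pathology_db", "specimen", "grade")],
}

def reformulate(select_concepts, join_concept):
    """Build a SQL query whose join condition comes from a shared concept."""
    left, right = annotations[join_concept][:2]
    cols = []
    for concept in select_concepts:
        _, table, column = annotations[concept][0]
        cols.append(f"{table}.{column}")
    return (f"SELECT {', '.join(cols)} "
            f"FROM {left[1]} JOIN {right[1]} "
            f"ON {left[1]}.{left[2]} = {right[1]}.{right[2]}")

print(reformulate(["NCIt:TumorGrade", "NCIt:Gene"], "NCIt:Gene"))
```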

15.
16.
The planet is experiencing an ongoing global biodiversity crisis. Measuring the magnitude and rate of change more effectively requires access to organized, easily discoverable, and digitally-formatted biodiversity data, both legacy and new, from across the globe. Assembling this coherent digital representation of biodiversity requires the integration of data that have historically been analog, dispersed, and heterogeneous. The Integrated Publishing Toolkit (IPT) is a software package developed to support biodiversity dataset publication in a common format. The IPT’s two primary functions are to 1) encode existing species occurrence datasets and checklists, such as records from natural history collections or observations, in the Darwin Core standard to enhance interoperability of data, and 2) publish and archive data and metadata for broad use in a Darwin Core Archive, a set of files following a standard format. Here we discuss the key need for the IPT, how it has developed in response to community input, and how it continues to evolve to streamline and enhance the interoperability, discoverability, and mobilization of new data types beyond basic Darwin Core records. We close with a discussion of how the IPT has impacted the biodiversity research community, how it enhances data publishing in more traditional journal venues, new features implemented in the latest version of the IPT, and future plans for further enhancements.
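The snippet below is not the IPT itself (the IPT is a web application); it is only a hedged sketch of what an occurrence table encoded with standard Darwin Core terms looks like before a tool such as the IPT packages it, together with metadata and a meta.xml descriptor, into a Darwin Core Archive. The record values are hypothetical; the column names are genuine Darwin Core terms.

```python
# Hedged sketch: write a minimal tab-separated occurrence table using
# Darwin Core term names. The single record is hypothetical.
import csv

dwc_terms = ["occurrenceID", "basisOfRecord", "scientificName",
             "eventDate", "decimalLatitude", "decimalLongitude", "countryCode"]

records = [
    {"occurrenceID": "urn:catalog:example:12345",        # hypothetical record
     "basisOfRecord": "PreservedSpecimen",
     "scientificName": "Salmo salar",
     "eventDate": "1998-06-14",
     "decimalLatitude": "60.39",
     "decimalLongitude": "5.32",
     "countryCode": "NO"},
]

with open("occurrence.txt", "w", newline="", encoding="utf-8") as fh:
    writer = csv.DictWriter(fh, fieldnames=dwc_terms, delimiter="\t")
    writer.writeheader()
    writer.writerows(records)
```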

17.
Comparative statistical analyses often require data harmonization, yet the social sciences do not have clear operationalization frameworks that guide and homogenize variable coding decisions across disciplines. When faced with a need to harmonize variables, researchers often look for guidance from various international studies that employ output harmonization, such as the Comparative Survey of Election Studies, which offer recoding structures for the same variable (e.g. marital status). More problematically, there are no agreed documentation standards or journal requirements for reporting variable harmonization to facilitate a transparent replication process. We propose a conceptual and data-driven digital solution that creates harmonization documentation standards for publication and scholarly citation: QuickCharmStats 1.1. It is free and open-source software that allows for the organizing, documenting and publishing of data harmonization projects. QuickCharmStats starts at the conceptual level and its workflow ends with a variable recoding syntax. It is therefore flexible enough to reflect a variety of theoretical justifications for variable harmonization. Using the socio-demographic variable ‘marital status’, we demonstrate how the CharmStats workflow collates metadata while being guided by the scientific standards of transparency and replication. It encourages researchers to publish their harmonization work by providing those who complete the peer review process with a permanent identifier. Those who contribute original data harmonization work to their discipline can now be credited through citations. Finally, we propose peer-review standards for harmonization documentation, describe a route to online publishing, and provide a referencing format to cite harmonization projects. Although CharmStats products are designed for social scientists, our adherence to the scientific method ensures our products can be used by researchers across the sciences.
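CharmStats emits its own documentation and recoding syntax, which is not reproduced in the abstract; the sketch below only illustrates, with invented source codings, what output harmonization of 'marital status' into a single target coding looks like.

```python
# Hedged sketch: the source codings and the target scheme are invented;
# they are not CharmStats output or any particular survey's codebook.
TARGET = {1: "married/partnered", 2: "never married", 3: "previously married"}

RECODE = {
    "survey_A": {1: 1, 2: 1, 3: 2, 4: 3, 5: 3},    # e.g. 'cohabiting' -> 1
    "survey_B": {"M": 1, "S": 2, "D": 3, "W": 3},
}

def harmonize(source, value):
    """Return the harmonized marital-status code, or None if unmappable."""
    return RECODE[source].get(value)

# Both source codes map to the same harmonized category.
assert harmonize("survey_A", 2) == harmonize("survey_B", "M") == 1
print(TARGET[harmonize("survey_B", "W")])
```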

18.
Soil respiration, the flux of CO2 from the soil to the atmosphere, represents a major flux in the global carbon cycle. Our ability to predict this flux remains limited because of multiple controlling mechanisms that interact over different temporal and spatial scales. However, new advances in measurement and analyses present an opportunity for the scientific community to improve the understanding of the mechanisms that regulate soil respiration. In this paper, we address several recent advancements in soil respiration research, from experimental measurements and data analysis to new considerations for model-data integration. We focus on the links within the soil–plant–atmosphere continuum at short (i.e., diel) and medium (i.e., seasonal to yearly) temporal scales. First, we bring attention to the importance of identifying sources of soil CO2 production and highlight the application of automated soil respiration measurements and isotope approaches. Second, we discuss the need for quality assurance and quality control for applications in time series analysis. Third, we review perspectives on emerging ideas for model development and model-data integration in soil respiration research. Finally, we call for stronger interactions between modelers and experimentalists as a way to improve our understanding of soil respiration and overall terrestrial carbon cycling.

19.
We introduce and make publicly available a large corpus of digitized primary source human rights documents which are published annually by monitoring agencies that include Amnesty International, Human Rights Watch, the Lawyers Committee for Human Rights, and the United States Department of State. In addition to the digitized text, we also make available and describe document-term matrices, which are datasets that systematically organize the word counts from each unique document by each unique term within the corpus of human rights documents. To contextualize the importance of this corpus, we describe the development of coding procedures in the human rights community and several existing categorical indicators that have been created by human coding of the human rights documents contained in the corpus. We then discuss how the new human rights corpus and the existing human rights datasets can be used with a variety of statistical analyses and machine learning algorithms to help scholars understand how human rights practices and reporting have evolved over time. We close with a discussion of our plans for dataset maintenance, updating, and availability.
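The released corpus and its matrices are not reproduced here; as a hedged illustration, the snippet below only shows how a document-term matrix of word counts is built from raw text. The two example sentences are placeholders, not passages from the corpus.

```python
# Hedged sketch: build a document-term matrix (documents x terms, word counts)
# from placeholder texts standing in for digitized report documents.
from sklearn.feature_extraction.text import CountVectorizer

docs = [
    "arbitrary detention was reported in several provinces",
    "reports of arbitrary arrests and detention continued",
]

vectorizer = CountVectorizer()
dtm = vectorizer.fit_transform(docs)           # sparse matrix of counts
print(vectorizer.get_feature_names_out())      # the unique terms (columns)
print(dtm.toarray())                           # one row of counts per document
```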

20.
The consistency of the species abundance distribution across diverse communities has attracted widespread attention. In this paper, I argue that the consistency of pattern arises because diverse ecological mechanisms share a common symmetry with regard to measurement scale. By symmetry, I mean that different ecological processes preserve the same measure of information and lose all other information in the aggregation of various perturbations. I frame these explanations of symmetry, measurement, and aggregation in terms of a recently developed extension to the theory of maximum entropy. I show that the natural measurement scale for the species abundance distribution is log-linear: the information in observations at small population sizes scales logarithmically and, as population size increases, the scaling of information grades from logarithmic to linear. Such log-linear scaling leads naturally to a gamma distribution for species abundance, which matches well with the observed patterns. Much of the variation between samples can be explained by the magnitude at which the measurement scale grades from logarithmic to linear. This measurement approach can be applied to the similar problem of allelic diversity in population genetics and to a wide variety of other patterns in biology.
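The abstract states that log-linear information scaling leads to a gamma form for species abundance but does not reproduce the parameterization; for reference, the following is only the standard two-parameter gamma density for abundance n, with generic shape and rate parameters rather than the paper's fitted quantities.

```latex
% Standard gamma density for abundance n > 0; alpha and beta are generic
% shape and rate parameters, not values derived in the paper.
\[
  p(n) \;=\; \frac{\beta^{\alpha}}{\Gamma(\alpha)}\, n^{\alpha - 1} e^{-\beta n},
  \qquad n > 0,\ \alpha > 0,\ \beta > 0 .
\]
```

Informally, the log-linear scale means that information in counts grows roughly like log n at small abundances and like n at large abundances; the abstract attributes much of the between-sample variation to where that crossover occurs.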
