Similar Literature
20 similar records found.
1.
Data support knowledge development and theory advances in ecology and evolution. We are increasingly reusing data within our teams and projects and through the global, openly archived datasets of others. Metadata can be challenging to write and interpret, but it is always crucial for reuse. The value of metadata cannot be overstated, even as a relatively independent research object, because it describes the work that has been done in a structured format. We advance a new perspective and classify methods for metadata curation and development with tables. Tables with templates can be used effectively to capture all components of an experiment or project in a single, easy‐to‐read file familiar to most scientists. If coupled with the R programming language, metadata from tables can then be rapidly and reproducibly converted to publication formats, including extensible markup language (XML) files suitable for data repositories. Tables can also be used to summarize existing metadata and to store metadata across many datasets. A case study is provided, and the added benefits of developing tables for metadata a priori are shown to ensure a more streamlined publishing process for many data repositories used in ecology, evolution, and the environmental sciences. In ecology and evolution, researchers are often highly tabular thinkers, from experimental data collection in the lab and/or field, and representing metadata as a table will provide novel research and reuse insights.
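As a hedged illustration of the table-to-repository workflow sketched in this abstract (the authors work in R; the snippet below is a Python stand-in, and its output is simplified, EML-flavoured XML rather than schema-valid EML):

```python
# Convert a tabular metadata file into simple, EML-flavoured XML.
# A sketch only: column names and XML element names are illustrative assumptions.
import csv
import io
import xml.etree.ElementTree as ET

metadata_table = io.StringIO(
    "name,definition,unit\n"
    "plot_id,Unique identifier of the field plot,dimensionless\n"
    "biomass,Above-ground dry biomass,g_per_m2\n"
)

dataset = ET.Element("dataset")
attribute_list = ET.SubElement(dataset, "attributeList")
for row in csv.DictReader(metadata_table):
    attr = ET.SubElement(attribute_list, "attribute")
    for column, value in row.items():
        ET.SubElement(attr, column).text = value  # one child element per table column

print(ET.tostring(dataset, encoding="unicode"))
```

The same table could equally be summarized or merged with other projects' metadata before conversion, which is the reuse benefit the abstract emphasizes.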

2.
In the ecological sciences, the role of metadata (i.e. key information about a dataset) in making existing datasets visible and discoverable has become increasingly important. Within the EU-funded WISER project (Water bodies in Europe: Integrative Systems to assess Ecological status and Recovery), we designed a metadatabase to allow scientists to find the optimal data for their analyses. An online questionnaire helped to collect metadata from the data providers, and an online query tool (http://www.wiser.eu/results/meta-database/) facilitated data evaluation. The WISER metadatabase currently holds information on 114 datasets (22 river, 71 lake, 1 general freshwater and 20 coastal/transitional datasets), which can also be accessed by external scientists. We evaluate whether commonly used metadata standards (e.g. Darwin Core, ISO 19115, CSDGM, EML) are suitable for purposes as specific as WISER's and suggest at least linking to standard metadata fields. Furthermore, we discuss whether simple metadata documentation is enough for others to reuse a dataset and why there is still reluctance to publish both metadata and primary research data (e.g. time and financial constraints, misuse of data, abandoning intellectual property rights). We emphasise that metadata publication has major advantages, as it makes datasets detectable by other scientists and generally makes a scientist's work more visible.

3.
Dendrochronological data formats in general offer limited space for recording associated metadata. Such information is often recorded separately from the actual time series, and often only on paper. TRiDaBASE has been developed to improve metadata administration. It is a relational Microsoft Access database that allows users to register digital metadata according to TRiDaS, to generate TRiDaS XML for uploading to TRiDaS-based analytical systems and repositories, and to ingest TRiDaS XML created elsewhere for local querying and analyses.
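A minimal sketch of the kind of metadata-to-XML generation such a tool performs. The nesting loosely follows the TRiDaS project/object/element hierarchy, but the element names and record fields here are simplified illustrations, not the normative TRiDaS schema:

```python
# Turn one locally registered dendro metadata record into TRiDaS-like XML.
# Entity nesting is illustrative; consult the TRiDaS specification for the real schema.
import xml.etree.ElementTree as ET

record = {"project": "Church roof survey", "object": "Beam 12",
          "element": "Core A", "taxon": "Quercus sp.", "firstYear": "1642"}

tridas = ET.Element("tridas")
project = ET.SubElement(tridas, "project")
ET.SubElement(project, "title").text = record["project"]
obj = ET.SubElement(project, "object")
ET.SubElement(obj, "title").text = record["object"]
elem = ET.SubElement(obj, "element")
ET.SubElement(elem, "title").text = record["element"]
ET.SubElement(elem, "taxon").text = record["taxon"]
ET.SubElement(elem, "firstYear").text = record["firstYear"]

print(ET.tostring(tridas, encoding="unicode"))
```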

4.
Biodiversity metadata support the query, management, and use of the actual data sets they describe. The progress of metadata standard development in China is analyzed, and metadata required and/or produced under the Convention on Biological Diversity are reviewed. A biodiversity metadata standard was developed based on the characteristics of biodiversity data and in line with the framework of international metadata standards. The content of biodiversity metadata is divided into two levels. The first level consists of the metadata entities and elements necessary to uniquely identify a biodiversity data set, and is named Core Metadata. The second level comprises the metadata entities and elements necessary to describe all aspects of a biodiversity data set. The standard for core biodiversity metadata is presented in this paper; it is composed of 51 elements belonging to 6 categories (entities): inventory information, collection information, information on the content of the data set, management information, access information, and metadata management information. The name, definition, condition, data type, and field length of the metadata elements in these six categories (entities) are also described.
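A hedged sketch of how the two-level structure might be represented in code; the six entities follow the abstract, but the individual element names are invented placeholders rather than the 51 elements of the standard:

```python
# Skeleton of the core biodiversity metadata: six entities, each holding a
# subset of elements. Element names below are placeholders, not the standard's.
CORE_ENTITIES = {
    "inventory_information":           ["dataset_title", "inventory_method"],
    "collection_information":          ["collector", "collection_date"],
    "dataset_content_information":     ["taxonomic_coverage", "spatial_coverage"],
    "management_information":          ["custodian", "update_frequency"],
    "access_information":              ["access_constraints", "distribution_format"],
    "metadata_management_information": ["metadata_contact", "metadata_date"],
}

def missing_core_elements(record):
    """Report which core elements a metadata record still lacks, per entity."""
    return {entity: [e for e in elements if not record.get(e)]
            for entity, elements in CORE_ENTITIES.items()}

record = {"dataset_title": "Wetland bird survey 2003", "collector": "X. Li"}
for entity, gaps in missing_core_elements(record).items():
    if gaps:
        print(f"{entity}: missing {', '.join(gaps)}")
```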

5.
The Human Proteome Organisation's Proteomics Standards Initiative has developed the GelML (gel electrophoresis markup language) data exchange format for representing gel electrophoresis experiments performed in proteomics investigations. The format closely follows the reporting guidelines for gel electrophoresis, which are part of the Minimum Information About a Proteomics Experiment (MIAPE) set of modules. GelML supports the capture of metadata (such as experimental protocols) and data (such as gel images) resulting from gel electrophoresis so that laboratories can be compliant with the MIAPE Gel Electrophoresis guidelines, while allowing such data sets to be exchanged or downloaded from public repositories. The format is sufficiently flexible to capture data from a broad range of experimental processes, and complements other PSI formats for MS data and the results of protein and peptide identifications to capture entire gel‐based proteome workflows. GelML has resulted from the open standardisation process of PSI consisting of both public consultation and anonymous review of the specifications.

6.
The Feeding Experiments End-user Database (FEED) is a research tool developed by the Mammalian Feeding Working Group at the National Evolutionary Synthesis Center that permits synthetic, evolutionary analyses of the physiology of mammalian feeding. The tasks of the Working Group are to compile physiologic data sets into a uniform digital format stored at a central source, develop a standardized terminology for describing and organizing the data, and carry out a set of novel analyses using FEED. FEED contains raw physiologic data linked to extensive metadata. It serves as an archive for a large number of existing data sets and a repository for future data sets. The metadata are stored as text and images that describe experimental protocols, research subjects, and anatomical information. The metadata incorporate controlled vocabularies to allow consistent use of the terms used to describe and organize the physiologic data. The planned analyses address long-standing questions concerning the phylogenetic distribution of phenotypes involving muscle anatomy and feeding physiology among mammals, the presence and nature of motor pattern conservation in the mammalian feeding muscles, and the extent to which suckling constrains the evolution of feeding behavior in adult mammals. We expect FEED to be a growing digital archive that will facilitate new research into understanding the evolution of feeding anatomy.
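A minimal sketch of the controlled-vocabulary checking that such an archive depends on; the vocabularies and field names below are invented examples, not FEED's actual terminology:

```python
# Validate experiment metadata against controlled vocabularies before ingest.
# The terms below are invented examples, not FEED's controlled vocabulary.
CONTROLLED_VOCABULARIES = {
    "behavior": {"chewing", "suckling", "swallowing"},
    "muscle": {"masseter", "temporalis", "digastric"},
}

def validate(metadata):
    """Return a list of metadata values that are not controlled terms."""
    errors = []
    for field, allowed in CONTROLLED_VOCABULARIES.items():
        value = metadata.get(field)
        if value is not None and value not in allowed:
            errors.append(f"{field}={value!r} is not a controlled term")
    return errors

print(validate({"behavior": "chewing", "muscle": "masticatory"}))
```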

7.
The rise of smartphones and web services made possible the large-scale collection of personal metadata. Information about individuals' location, phone call logs, or web searches is collected and used intensively by organizations and big data researchers. Metadata has, however, yet to realize its full potential. Privacy and legal concerns, as well as the lack of technical solutions for personal metadata management, are preventing metadata from being shared and reconciled under the control of the individual. This lack of access and control is furthermore fueling growing concerns, as it prevents individuals from understanding and managing the risks associated with the collection and use of their data. Our contribution is two-fold: (1) we describe openPDS, a personal metadata management framework that allows individuals to collect, store, and give fine-grained access to their metadata to third parties. It has been implemented in two field studies; (2) we introduce and analyze SafeAnswers, a new and practical way of protecting the privacy of metadata at an individual level. SafeAnswers turns a hard anonymization problem into a more tractable security one. It allows services to ask questions whose answers are calculated against the metadata instead of trying to anonymize individuals' metadata. The dimensionality of the data shared with the services is reduced from high-dimensional metadata to low-dimensional answers that are less likely to be re-identifiable and to contain sensitive information. These answers can then be shared directly, individually or in aggregate. openPDS and SafeAnswers provide a new way of dynamically protecting personal metadata, thereby supporting the creation of smart data-driven services and data science research.
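A toy sketch of the SafeAnswers idea: a question is evaluated against the metadata inside the personal data store, and only a low-dimensional answer leaves it. The records and the question are invented for illustration; this is not openPDS code:

```python
# Toy SafeAnswers-style evaluation: raw location metadata never leaves the
# personal data store; only the computed, low-dimensional answer is returned.
location_metadata = [
    {"lat": 48.8566, "lon": 2.3522, "hour": 9},
    {"lat": 48.8600, "lon": 2.3400, "hour": 13},
    {"lat": 48.8530, "lon": 2.3499, "hour": 21},
]

def answer_nights_in_city(records, city_bbox):
    """Question: 'How many night-time records fall inside this bounding box?'"""
    lat_min, lat_max, lon_min, lon_max = city_bbox
    return sum(1 for r in records
               if (r["hour"] >= 20 or r["hour"] < 6)
               and lat_min <= r["lat"] <= lat_max
               and lon_min <= r["lon"] <= lon_max)

# The service receives a single integer, not the trajectory itself.
print(answer_nights_in_city(location_metadata, (48.80, 48.90, 2.25, 2.42)))
```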

8.
  1. Metadata plays an essential role in the long‐term preservation, reuse, and interoperability of data. Nevertheless, creating useful metadata can be difficult enough, and weakly enough incentivized, that many datasets are accompanied by little or no metadata. One key challenge is, therefore, how to make metadata creation easier and more valuable. We present a solution that involves creating domain‐specific metadata schemes that are as complex as necessary and as simple as possible. These goals are achieved by co‐development between a metadata expert and the researchers (i.e., the data creators). The final product is a bespoke metadata scheme into which researchers can enter information (and validate it) via the simplest of interfaces: a web browser application and a spreadsheet.
  2. We provide the R package dmdScheme (version 0.9.22, 2019) for creating a template domain‐specific scheme. We describe how to create a domain‐specific scheme from this template, including the iterative co‐development process, the simple methods for using the scheme, and the simple methods for quality assessment, improvement, and validation.
  3. The process of developing a metadata scheme following the outlined approach was successful, resulting in a metadata scheme that is used for the data generated in our research group. The validation quickly identifies forgotten metadata as well as inconsistent metadata, thereby improving the quality of the metadata. Multiple output formats are available, including XML.
  4. Making the provision of metadata easier while also ensuring high quality must be a priority for data curation initiatives. We show how both objectives are achieved by close collaboration between metadata experts and researchers to create domain‐specific schemes. A near‐future priority is to provide methods to interface domain‐specific schemes with general metadata schemes, such as the Ecological Metadata Language, to increase interoperability.

The article describes a methodology for developing, entering, and validating domain‐specific metadata schemes that is suitable for use by non‐metadata specialists. The approach uses an R package as the backend for processing the metadata, uses spreadsheets to enter the metadata, and provides a server‐based approach to distribute and use the developed metadata schemes.
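A hedged Python sketch of the spreadsheet-entry-plus-validation workflow (the published tooling is the R package dmdScheme; the scheme fields and validation rules below are invented for illustration):

```python
# Validate spreadsheet-style metadata rows against a tiny domain-specific scheme.
# The scheme itself (fields, required flags, allowed values) is an invented example.
import csv
import io

SCHEME = {
    "experiment_name": {"required": True},
    "temperature_C":   {"required": True, "numeric": True},
    "treatment":       {"required": False, "allowed": {"control", "warmed"}},
}

rows = list(csv.DictReader(io.StringIO(
    "experiment_name,temperature_C,treatment\n"
    "microcosm_1,20.5,control\n"
    ",not_a_number,heated\n"
)))

def validate(row, line_no):
    problems = []
    for field, rule in SCHEME.items():
        value = row.get(field, "")
        if rule.get("required") and not value:
            problems.append(f"row {line_no}: '{field}' is required")
        if value and rule.get("numeric"):
            try:
                float(value)
            except ValueError:
                problems.append(f"row {line_no}: '{field}' must be numeric")
        if value and "allowed" in rule and value not in rule["allowed"]:
            problems.append(f"row {line_no}: '{field}' has unknown value {value!r}")
    return problems

for i, row in enumerate(rows, start=2):  # header is line 1 of the spreadsheet
    for problem in validate(row, i):
        print(problem)
```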

9.
董仁才, 王韬, 张永霖, 张雪琦, 李欢欢. 《生态学报》(Acta Ecologica Sinica), 2018, 38(11): 3775-3783
As China vigorously promotes urban sustainable development and the construction of national sustainable development experimental zones, a key question is which assessment methods and data should be used to evaluate urban sustainable development capacity. Metadata theory and technology, which have emerged in recent years, are regarded as an effective approach to quality control of assessment data. Based on the characteristics of the indicator systems currently used in China to assess urban sustainable development capacity, and through in-depth analysis of each indicator's data sources, acquisition means, and applicable methods, we propose a concrete, software-engineering-oriented approach to developing a metadata management system for urban sustainable development capacity assessment, helping sustainable development experimental zones acquire and manage the data required for assessment efficiently. Using the metadata specification established in the 12th Five-Year National Key Technology R&D Program project "Research and Demonstration of Key Technologies for Urban Sustainable Development Capacity Assessment and Information Management", the following 14 items were taken as the key metadata items for assessment data: data release date, data release form, spatial extent, temporal extent (start time and end time), statistical frequency, data security classification, data lineage description, online resource link address, and data-reporting organization information (organization name, contact person, telephone, organization address, and e-mail address). These items were used to trace the corresponding assessment data. A quantitative data-quality scoring method was then used to compare data quality before and after applying metadata tracing; when the assessed indicators were supported by metadata, their scores for data reliability, comparability, and sustainability all improved markedly. The study concludes that metadata theory offers clear advantages for controlling and safeguarding the quality of urban sustainable development capacity assessment data, and that developing a dedicated metadata management system for such assessment can effectively improve the overall quality of assessment data.
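A minimal sketch of how the key metadata items could be tracked per indicator; the English field names are translations from the abstract, and the completeness score is an invented stand-in for the quantitative data-quality scoring used in the study:

```python
# Key metadata items for an assessment indicator (field names translated from the
# abstract); the completeness score is an illustrative stand-in, not the study's method.
KEY_METADATA_ITEMS = [
    "data_release_date", "data_release_form", "spatial_extent",
    "temporal_extent_start", "temporal_extent_end", "statistical_frequency",
    "data_security_classification", "lineage_description", "online_resource_url",
    "organization_name", "contact_person", "telephone", "address", "email",
]

def completeness_score(indicator_metadata):
    """Fraction of the 14 key metadata items that are filled in."""
    filled = sum(1 for item in KEY_METADATA_ITEMS if indicator_metadata.get(item))
    return filled / len(KEY_METADATA_ITEMS)

indicator = {"data_release_date": "2015-06-30", "spatial_extent": "city boundary",
             "organization_name": "Municipal Statistics Bureau"}
print(f"metadata completeness: {completeness_score(indicator):.0%}")
```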

10.
As modern computer systems face the challenge of large data volumes, filesystems have to deal with very large numbers of files. This amplifies concerns about metadata operations as well as data operations. Most filesystems manage file metadata by constructing in-memory data structures, such as directory entries (dentries) and inodes. We found inefficiencies in the metadata management of existing filesystems, such as the path traversal mechanism. In this article, we optimize metadata operations by (1) looking up the dentry cache (dcache) hash table in a backward manner. To adopt the backward-finding mechanism, we devise rename and permission-granting mechanisms. We also propose (2) compacting the metadata into dentry structures for in-memory space efficiency. We evaluate our optimized metadata management mechanisms with several benchmarks, including a real-world workload. These optimizations reduce dcache lookup latency by up to 40% and improve overall throughput by up to 72% in a real-world benchmark.
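An illustrative, in-memory model of forward versus backward dcache lookup; this is a toy simulation for intuition, not the authors' kernel implementation:

```python
# Toy dentry cache: a conventional lookup hashes (parent, component) for every
# path component; a backward lookup starts from the final component and verifies
# the ancestor chain instead. Purely illustrative.
class Dentry:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # parent Dentry, None for the root

root = Dentry("/")
dcache = {}  # (id(parent), name) -> Dentry

def insert(path):
    node = root
    for comp in path.strip("/").split("/"):
        node = dcache.setdefault((id(node), comp), Dentry(comp, node))
    return node

def forward_lookup(path):
    node = root
    for comp in path.strip("/").split("/"):
        node = dcache.get((id(node), comp))
        if node is None:
            return None
    return node

by_name = {}  # final-component index used by the backward lookup
def reindex():
    by_name.clear()
    for d in dcache.values():
        by_name.setdefault(d.name, []).append(d)

def backward_lookup(path):
    comps = path.strip("/").split("/")
    for cand in by_name.get(comps[-1], []):
        node, ok = cand, True
        for comp in reversed(comps[:-1]):  # walk parents toward the root
            node = node.parent
            if node is None or node.name != comp:
                ok = False
                break
        if ok and node is not None and node.parent is root:
            return cand
    return None

insert("/usr/share/doc/readme")
reindex()
assert forward_lookup("/usr/share/doc/readme") is backward_lookup("/usr/share/doc/readme")
```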

11.
12.
Metadata describe the ancillary information needed for data preservation, independent interpretation, comparison across heterogeneous datasets, and quality assessment and quality control (QA/QC). Environmental observations are vastly diverse in type and structure, can be taken across a wide range of spatiotemporal scales in a variety of measurement settings and approaches, and are saved in multiple formats. Thus, well-organized, consistent metadata are required to produce usable data products from the diverse environmental observations collected across field sites. However, existing metadata reporting protocols do not support the complex data synthesis and model-data integration needs of interdisciplinary earth system research. We developed a metadata reporting framework (FRAMES) to enable management and synthesis of observational data that are essential in advancing a predictive understanding of earth systems. FRAMES utilizes best practices for data and metadata organization, enabling consistent data reporting and compatibility with a variety of standardized data protocols. We used an iterative, scientist-centered design process to develop FRAMES, resulting in a data reporting format that incorporates existing field practices to maximize data-entry efficiency. FRAMES has a modular organization that streamlines metadata reporting and can be expanded to incorporate additional data types. With the FRAMES multi-scale measurement position hierarchy, data can be reported at the observed spatial resolutions and then easily aggregated and linked across measurement types to support model-data integration. FRAMES is in early use by both data originators (persons generating data) and consumers (persons using data and metadata). In this paper, we describe FRAMES, identify lessons learned, and discuss areas of future development.
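A hedged sketch of reporting observations against a multi-scale measurement-position hierarchy and aggregating them upward; the level names (site/plot/sensor) are illustrative assumptions, not the published FRAMES vocabulary:

```python
# Observations reported at sensor resolution, then aggregated to coarser levels
# of a position hierarchy. Field names and values are invented for illustration.
from collections import defaultdict

observations = [
    {"site": "SiteA", "plot": "P1", "sensor": "S1", "variable": "soil_temp", "value": 12.1},
    {"site": "SiteA", "plot": "P1", "sensor": "S2", "variable": "soil_temp", "value": 12.7},
    {"site": "SiteA", "plot": "P2", "sensor": "S3", "variable": "soil_temp", "value": 11.4},
]

def aggregate(level):
    """Mean value per position at the chosen level of the hierarchy."""
    levels = ["site", "plot", "sensor"]
    keys = levels[:levels.index(level) + 1]
    groups = defaultdict(list)
    for obs in observations:
        groups[tuple(obs[k] for k in keys)].append(obs["value"])
    return {k: sum(v) / len(v) for k, v in groups.items()}

print(aggregate("plot"))  # sensor data rolled up to plot scale
print(aggregate("site"))  # ...and to site scale for model-data integration
```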

13.
Tree-ring research and collaboration are currently being hampered by the lack of a suitable data-transfer standard for both data and metadata. This paper highlights the issues currently being faced and proposes a solution that, if adopted by the global dendro community, will open up the possibility of exciting new research collaborations. The solution consists of a data model for dendrochronological data and metadata, and an eXtensible Markup Language (XML) schema as a technical vehicle to exchange this data and metadata. The technology and structure of the standard enable future versions to be developed that will satisfy evolving requirements whilst remaining backwards compatible.

14.
Many communities use standard, structured documentation that is machine-readable, i.e. metadata, to make discovery, access, use, and understanding of scientific datasets possible. Organizations and communities have also developed recommendations for metadata content that is required or suggested for their data developers and users. These recommendations are typically specific to the metadata representations (dialects) used by the community. By considering the conceptual content of the recommendations, quantitative analysis and comparison of the completeness of multiple metadata dialects becomes possible. This is a study of the completeness of EML and CSDGM metadata records from DataONE in terms of the LTER recommendation for Completeness. The goal of the study is to quantitatively measure the completeness of metadata records and to determine if metadata developed by LTER is more complete with respect to the recommendation than other collections in EML and in CSDGM. We conclude that the LTER records are broadly more complete than the other EML collections, but similar in completeness to the CSDGM collections.
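A minimal sketch of dialect-independent completeness scoring: records from different dialects are mapped to a shared concept list (a hypothetical stand-in for the LTER Completeness recommendation) and scored per collection. The collections and values are invented, not the study's results:

```python
# Score metadata records against a shared concept list, then average per collection.
# The concept list and the example collections are invented for illustration.
RECOMMENDED_CONCEPTS = {
    "title", "abstract", "keywords", "spatial_extent", "temporal_extent",
    "contact", "methods", "attribute_definitions", "usage_rights",
}

def completeness(record_concepts):
    """Fraction of recommended concepts present in one metadata record."""
    return len(RECOMMENDED_CONCEPTS & set(record_concepts)) / len(RECOMMENDED_CONCEPTS)

collections = {
    "collection_A (EML)":  [{"title", "abstract", "contact", "methods", "keywords"}],
    "collection_B (EML)":  [{"title", "contact"}, {"title", "abstract"}],
    "collection_C (CSDGM)": [{"title", "abstract", "spatial_extent", "contact"}],
}

for name, records in collections.items():
    mean = sum(completeness(r) for r in records) / len(records)
    print(f"{name}: mean completeness {mean:.0%}")
```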

15.
High-performance computing faces considerable change as the Internet and the Grid mature. Applications that once were tightly coupled and monolithic are now decentralized, with collaborating components spread across diverse computational elements. Such distributed systems most commonly communicate through the exchange of structured data. Definition and translation of metadata are incorporated in all systems that exchange structured data. We observe that the manipulation of this metadata can be decomposed into three separate steps: discovery, binding of program objects to the metadata, and marshaling of data to and from wire formats. We have designed a method of representing message formats in XML, using datatypes available in the XML Schema specification. We have implemented a tool, XMIT, that uses such metadata and exploits this decomposition in order to provide flexible run-time metadata definition facilities for an efficient binary communication mechanism. We also demonstrate that the use of XMIT makes such flexibility possible at little performance cost.
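A hedged sketch of metadata-driven marshaling in the spirit of XMIT: a message format described in XML (the element names here are illustrative, not the actual XMIT schema) drives packing and unpacking of a binary wire format:

```python
# A message-format description in XML is compiled into a struct layout that is
# then used to marshal and unmarshal records. Element names are illustrative.
import struct
import xml.etree.ElementTree as ET

FORMAT_XML = """
<message name="sample">
  <field name="id"    type="int32"/>
  <field name="value" type="float64"/>
</message>
"""

STRUCT_CODES = {"int32": "i", "float64": "d"}

def compile_format(xml_text):
    fields = [(f.get("name"), f.get("type"))
              for f in ET.fromstring(xml_text).findall("field")]
    fmt = "<" + "".join(STRUCT_CODES[t] for _, t in fields)  # little-endian layout
    return fields, struct.Struct(fmt)

def marshal(record, fields, packer):
    return packer.pack(*(record[name] for name, _ in fields))

def unmarshal(buf, fields, packer):
    return dict(zip((name for name, _ in fields), packer.unpack(buf)))

fields, packer = compile_format(FORMAT_XML)
wire = marshal({"id": 7, "value": 3.14}, fields, packer)
print(unmarshal(wire, fields, packer))
```

Binding and marshaling are driven entirely by the metadata, so the wire layout can be redefined at run time without changing the communication code.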

16.
Existing online databases for dendrochronology are not flexible in terms of user permissions, tree-ring data formats, metadata administration, and language. This is why we developed the Digital Collaboratory for Cultural Dendrochronology (DCCD). This TRiDaS-based multilingual database allows users to control data access, to perform queries, to upload and download (meta)data in a variety of digital formats, and to edit metadata online. The content of the DCCD conforms to EU best practices regarding the long-term preservation of digital research data.

17.
Celsius: a community resource for Affymetrix microarray data

18.

Background  

Phyloinformatic analyses involve large amounts of data and metadata of complex structure. Collecting, processing, analyzing, visualizing and summarizing these data and metadata should be done in steps that can be automated and reproduced. This requires flexible, modular toolkits that can represent, manipulate and persist phylogenetic data and metadata as objects with programmable interfaces.

19.
20.