Similar articles
20 similar articles found
1.
Research data management (RDM) requires standards, policies, and guidelines. Findable, accessible, interoperable, and reusable (FAIR) data management is critical for sustainable research. Therefore, collaborative approaches for managing FAIR-structured data are becoming increasingly important for long-term, sustainable RDM. However, they are applied only hesitantly in bioengineering. One reason may be the interdisciplinary character of the research field. In addition, bioengineering, as the application of principles of biology and tools of process engineering, often has to meet different criteria. As a consequence, RDM is complicated by the fact that researchers from different scientific institutions must meet the criteria of their home institution, which can lead to additional conflicts. Therefore, centrally provided general repositories implementing a collaborative approach that enables data storage from the outset are needed. In a biotechnology research network with over 20 tandem projects, it was demonstrated how FAIR-RDM can be implemented through a collaborative approach and the use of a data structure. In addition, the importance of a structure within a repository was demonstrated to keep biotechnology research data available throughout the entire data lifecycle.

2.
The need for open, reproducible science is of growing concern in the twenty-first century, with multiple initiatives like the widely supported FAIR principles advocating for data to be Findable, Accessible, Interoperable and Reusable. Plant ecological and evolutionary studies are not exempt from the need to ensure that the data upon which their findings are based are accessible and allow for replication in accordance with the FAIR principles. However, it is common that the collection and curation of herbarium specimens, a foundational aspect of studies involving plants, is neglected by authors. Without publicly available specimens, huge numbers of studies that rely on the field identification of plants are fundamentally not reproducible. We argue that the collection and public availability of herbarium specimens is not only good botanical practice but is also fundamental in ensuring that plant ecological and evolutionary studies are replicable, and thus scientifically sound. Data repositories that adhere to the FAIR principles must make sure that the original data are traceable to and re-examinable at their empirical source. In order to secure replicability and adherence to the FAIR principles, substantial changes need to be brought about to restore the practice of collecting and curating specimens, to educate students about their importance, and to properly fund the herbaria which house them.

3.
Given the rapid growth of artificial intelligence (AI) applications in radiotherapy and the related transformations toward the data-driven healthcare domain, this article summarizes the need for and use of the FAIR (Findable, Accessible, Interoperable, Reusable) data principles in radiotherapy. This work introduces the FAIR data concept and presents practical and relevant use cases and the future role of the different parties involved. The goal of this article is to provide guidance and potential applications of FAIR to various radiotherapy stakeholders, focusing on the central role of medical physicists.

4.
5.
PDBx/mmCIF, Protein Data Bank Exchange (PDBx) macromolecular Crystallographic Information Framework (mmCIF), has become the data standard for structural biology. With its early roots in the domain of small-molecule crystallography, PDBx/mmCIF provides an extensible data representation that is used for deposition, archiving, remediation, and public dissemination of experimentally determined three-dimensional (3D) structures of biological macromolecules by the Worldwide Protein Data Bank (wwPDB, wwpdb.org). Extensions of PDBx/mmCIF are similarly used for computed structure models by ModelArchive (modelarchive.org), integrative/hybrid structures by PDB-Dev (pdb-dev.wwpdb.org), small angle scattering data by Small Angle Scattering Biological Data Bank SASBDB (sasbdb.org), and for models generated with the AlphaFold 2.0 deep learning software suite (alphafold.ebi.ac.uk). Community-driven development of PDBx/mmCIF spans three decades, involving contributions from researchers, software and methods developers in structural sciences, data repository providers, scientific publishers, and professional societies. Having a semantically rich and extensible data framework for representing a wide range of structural biology experimental and computational results, combined with expertly curated 3D biostructure data sets in public repositories, accelerates the pace of scientific discovery. Herein, we describe the architecture of the PDBx/mmCIF data standard, tools used to maintain representations of the data standard, governance, and processes by which data content standards are extended, plus community tools/software libraries available for processing and checking the integrity of PDBx/mmCIF data. Use cases exemplify how the members of the Worldwide Protein Data Bank have used PDBx/mmCIF as the foundation for their pipeline for delivering Findable, Accessible, Interoperable, and Reusable (FAIR) data to many millions of users worldwide.

6.
Researchers require infrastructures that ensure a maximum of accessibility, stability and reliability to facilitate working with and sharing research data. Such infrastructures are increasingly being summarized under the term Research Data Repositories (RDR). The project re3data.org (Registry of Research Data Repositories) began indexing research data repositories in 2012 and offers researchers, funding organizations, libraries and publishers an overview of the heterogeneous research data repository landscape. As of July 2013, re3data.org lists 400 research data repositories and counting; 288 of these are described in detail using the re3data.org vocabulary. Information icons help researchers to easily identify an adequate repository for the storage and reuse of their data. This article describes the heterogeneous RDR landscape and presents a typology of institutional, disciplinary, multidisciplinary and project-specific RDR. Further, the article outlines the features of re3data.org and shows how this registry helps to identify appropriate repositories for storage and search of research data.

7.
Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn't, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication. Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%–35% in 2007–2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available. First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available. These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let's learn from those with high rates of sharing to embrace the full potential of our research output.

8.
Data independent acquisition (DIA) proteomics techniques have matured enormously in recent years, thanks to multiple technical developments in, for example, instrumentation and data analysis approaches. However, many improvements are still possible for DIA data in the area of the FAIR (Findability, Accessibility, Interoperability and Reusability) data principles. These include more tailored data sharing practices and open data standards, since public databases and data standards for proteomics were mostly designed with DDA data in mind. Here we first describe the current state of the art in the context of FAIR data for proteomics in general, and for DIA approaches in particular. To improve the current situation for DIA data, we make the following recommendations for the future: (i) develop an open data standard for spectral libraries; (ii) make the availability of the spectral libraries used in DIA experiments mandatory in ProteomeXchange resources; (iii) improve the support for DIA data in the data standards developed by the Proteomics Standards Initiative; and (iv) improve the support for DIA datasets in ProteomeXchange resources, including more tailored metadata requirements.

9.
Paul Ginsparg. The EMBO Journal, 2016, 35(24): 2620-2625
Twenty-five years ago, in August 1991, I spent a couple of afternoons at Los Alamos National Laboratory writing some simple software that enabled a small group of physicists to share drafts of their articles via automated email transactions with a central repository. Within a few years, the site migrated to the nascent WorldWideWeb as arXiv.org, and experienced both expansion in coverage and heavy growth in usage that continues to this day. In 1998, I gave a talk to a group of biologists, including David Lipman, Pat Brown, and Michael Eisen, at a meeting at Cold Spring Harbor Laboratory (CSHL) to describe the sharing of articles “pre-publication” by physicists. The talk was met with some enthusiasm and prompted the “e-biomed” proposal in the following spring by then NIH director Harold Varmus. He encouraged the creation of an NIH-run electronic archive for all biomedical research articles, including both a preprint server and an archive of published peer-reviewed articles, which generated significant discussion.

10.
Microarray technology has become an integral part of biomedical research, and an increasing number of datasets are becoming available through public repositories. However, re-use of these datasets is severely hindered by unstructured, missing or incorrect biological sample information, as well as by the wide variety of preprocessing methods in use. The inSilicoDb R/Bioconductor package is a command-line front-end to the InSilico DB, a web-based database currently containing 86 104 expert-curated human Affymetrix expression profiles compiled from 1937 GEO repository series. The use of this package builds on the Bioconductor project's focus on reproducibility by enabling a clear workflow in which not only analysis, but also the retrieval of verified data is supported.

11.

Background  

Standardization of analytical approaches and reporting methods via community-wide collaboration can work synergistically with web-tool development to result in rapid community-driven expansion of online data repositories suitable for data mining and meta-analysis. In metabolomics, the inter-laboratory reproducibility of gas-chromatography/mass-spectrometry (GC/MS) makes it an obvious target for such development. While a number of web-tools offer access to datasets and/or tools for raw data processing and statistical analysis, none of these systems are currently set up to act as a public repository by easily accepting, processing and presenting publicly submitted GC/MS metabolomics datasets for public re-analysis.

12.
Background: Research in bioinformatics generates tools and datasets at a very fast rate. Meanwhile, a lot of effort is going into making these resources findable and reusable to improve resource discovery by researchers in the course of their work. Purpose: This paper proposes a semi-automated tool to assess a resource according to the Findability, Accessibility, Interoperability and Reusability (FAIR) criteria. The aim is to create a portal that presents the assessment score together with a report that researchers can use to gauge a resource. Method: Our system uses internet searches to automate the process of generating FAIR scores. The process is semi-automated in that if a particular property of the FAIR scores has not been captured by AutoFAIR, a user is able to amend and supply the information to complete the assessment. Results: We compare our results against FAIRshake, which was used as the benchmark tool for comparing the assessments. The results show that AutoFAIR was able to match the FAIR criteria in FAIRshake with minimal intervention from the user. Conclusions: We show that AutoFAIR can be a good repository for storing metadata about tools and datasets, together with comprehensive reports detailing the assessments of the resources. Moreover, AutoFAIR is also able to score workflows, giving an overall indication of the FAIRness of the resources used in a scientific study.

13.

Background

The 1980s marked the occasion when Geographical Information System (GIS) technology was broadly introduced into the geo-spatial community through the establishment of a strong GIS industry. This technology quickly disseminated across many countries, and has now become established as an important research, planning and commercial tool for a wider community that includes organisations in the public and private health sectors. The broad acceptance of GIS technology and the nature of its functionality have meant that numerous datasets have been created over the past three decades. Most of these datasets have been created independently, and without any structured documentation systems in place. However, search and retrieval systems can only work if there is a mechanism for a dataset's existence to be discovered, and this is where proper metadata creation and management can greatly help. This situation must be addressed through support mechanisms such as Web-based portal technologies, metadata editor tools, automation, metadata standards and guidelines, and collaborative efforts with relevant individuals and organisations. Engagement with data developers or administrators should also include a strategy of identifying the benefits associated with metadata creation and publication.

Findings

The establishment of numerous Spatial Data Infrastructures (SDIs), and other Internet resources, is a testament to the recognition of the importance of supporting good data management and sharing practices across the geographic information community. These resources extend to health informatics in support of research, public services and teaching and learning. This paper identifies many of these resources available to the UK academic health informatics community. It also reveals the reluctance of many spatial data creators across the wider UK academic community to use these resources to create and publish metadata, or deposit their data in repositories for sharing. The Go-Geo! service is introduced as an SDI developed to provide UK academia with the necessary resources to address the concerns surrounding metadata creation and data sharing. The Go-Geo! portal, Geodoc metadata editor tool, ShareGeo spatial data repository, and a range of other support resources, are described in detail.

Conclusions

This paper describes a variety of resources available for the health research and public health sector to use for managing and sharing their data. The Go-Geo! service is one resource which offers an SDI for the eclectic range of disciplines using GIS in UK academia, including health informatics. The benefits of data management and sharing are immense, and in these times of cost constraints, these resources can be seen as solutions to find cost savings which can be reinvested in more research.

14.
Background: High resolution melting (HRM) is an emerging method for interrogating and characterizing DNA samples. An important aspect of this technology is data analysis. Traditional HRM curves can be difficult to interpret, and the method has been criticized for lack of statistical interrogation and arbitrary interpretation of results. Methods: Here we report the basic principles and first applications of a new statistical approach to HRM analysis addressing these concerns. Our method allows automated genotyping of unknown samples coupled with formal statistical information on the likelihood that an unknown sample is of a known genotype (by discriminant analysis or “supervised learning”). It can also determine the assortment of alleles present (by cluster analysis or “unsupervised learning”) without a priori knowledge of the genotypes present. Conclusion: The new algorithms provide highly sensitive and specific auto-calling of genotypes from HRM data in both supervised and unsupervised analysis modes. The method is based on pure statistical interrogation of the data set with a high degree of standardization. The hypothesis-free unsupervised mode offers various possibilities for de novo HRM applications such as mutation discovery.

15.
Over the last decade, there have been significant changes in data sharing policies and in the data sharing environment faced by life science researchers. Using data from a 2013 survey of over 1600 life science researchers, we analyze the effects of sharing policies of funding agencies and journals. We also examine the effects of new sharing infrastructure and tools (i.e., third party repositories and online supplements). We find that recently enacted data sharing policies and new sharing infrastructure and tools have had a sizable effect on encouraging data sharing. In particular, third party repositories and online supplements as well as data sharing requirements of funding agencies, particularly the NIH and the National Human Genome Research Institute, were perceived by scientists to have had a large effect on facilitating data sharing. In addition, we found a high degree of compliance with these new policies, although noncompliance resulted in few formal or informal sanctions. Despite the overall effectiveness of data sharing policies, some significant gaps remain: about one third of grant reviewers placed no weight on data sharing plans in their reviews, and a similar percentage ignored the requirements of material transfer agreements. These patterns suggest that although most of these new policies have been effective, there is still room for policy improvement.

16.
To investigate the origin of Koreans, we examined the 12-locus Y-chromosome short tandem repeat (Y-STR) variation in a sample of 310 unrelated males from three localities (Gochang, Andong and Geoje) in Korea and statistically analyzed four previously published Y-STR databases (n = 1655) of the Korean population. The median joining network of 9-locus Y-STR haplotypes inferred as haplogroup O2b-SRY+465 showed a “star cluster” indicative of a population expansion from a centrally positioned haplotype. The central haplotype in the “star cluster” was the most frequently occurring Y-STR haplotype among the Korean male gene pool (6%, 127 of 1965; 10,14,12,13,14,16,13,13,23 for loci DYS391, DYS389I, DYS439, DYS438, DYS437, DYS19, DYS392, DYS393, and DYS390), and it was shared among all seven datasets. Based on the “star cluster” pattern from both our data (41%, 128 of 310) and those previously published (34%, 563 of 1655), we suggest that the most frequent Y-STR haplotype among the Korean male gene pool seems to be the Korean modal (ancestral) haplotype. Further studies with additional Y-STR and Y-SNP data from East Asian populations as well as the Korean population are needed to provide a genetic clue for the “star cluster” (O2b-SRY+465) associated with the ethnohistoric events of the Koreans.
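The sample sizes and percentages quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the helper name `share` and the variable names are hypothetical, and the counts are taken directly from the abstract text.

```python
def share(count: int, total: int) -> float:
    """Return count/total as a percentage, rounded to one decimal place."""
    return round(100 * count / total, 1)

# Counts as quoted in the abstract (variable names are illustrative only).
new_sample = 310      # unrelated males typed in the study
published = 1655      # males in the four previously published databases
pooled = new_sample + published

print(pooled)                    # 1965, the pooled denominator in the abstract
print(share(127, pooled))        # 6.5, which the abstract reports as 6%
print(share(128, new_sample))    # 41.3, reported as 41%
print(share(563, published))     # 34.0, reported as 34%
```

The pooled denominator of 1965 is simply the sum of the new sample and the published databases, which is consistent with the "seven datasets" (three localities plus four databases) mentioned in the abstract.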

17.
In most human foraging societies, the meat of large animals is widely shared. Many assume that people follow this practice because it helps to reduce the risk inherent in big game hunting. In principle, a hunter can offset the chance of many hungry days by exchanging some of the meat earned from a successful strike for shares in future kills made by other hunters. If hunting and its associated risks of failure have great antiquity, then meat sharing might have been the evolutionary foundation for many other distinctively human patterns of social exchange. Here we use previously unpublished data from the Tanzanian Hadza to test hypotheses drawn from a simple version of this argument. Results indicate that Hadza meat sharing does not fit the expectations of risk-reduction reciprocity. We comment on some variations of the “sharing as exchange” argument; then elaborate an alternative based partly on the observation that a successful hunter does not control the distribution of his kill. Instead of family provisioning, his goal may be to enhance his status as a desirable neighbor. If correct, this alternative argument has implications for the evolution of men's work.

18.
We examined the phylogeography of three south-east Australian trees (Eucalyptus delegatensis, Eucalyptus obliqua, and Eucalyptus regnans) with different tolerances in terms of cold, drought, fire and soil, to explore whether species with different ecologies share major phylogeographic patterns. A second aim of this study was to examine geographic patterns of chloroplast DNA (cpDNA) haplotype sharing among the three study species. Trees of E. delegatensis (n = 120), E. obliqua (n = 265) and E. regnans (n = 270) were genotyped with five cpDNA microsatellite markers. The species shared major phylogeographic disjunctions, and common patterns at proposed glacial refugia (generally high haplotype diversity) and areas thought to have been treeless during the Last Glacial Maximum (LGM) (low diversity). Inter-specific sharing of haplotypes was extensive, and fixation of shared, regional haplotypes was more frequent in areas postulated as having been treeless at the LGM. Despite ecological differences, chloroplast microsatellite data suggest the three species have responded to past climatic changes in a similar way, by persisting in multiple, generally common refugia. We suggest that the natural ability of eucalypt species to hybridise with others with quite different or broader ecological tolerances may provide an “insurance policy” for response to rapidly changing abiotic conditions.

19.
Sharing of research data has begun to gain traction in many areas of the sciences in the past few years because of changing expectations from the scientific community, funding agencies, and academic journals. National Science Foundation (NSF) requirements for a data management plan (DMP) went into effect in 2011, with the intent of facilitating the dissemination and sharing of research results. Many projects that were funded during 2011 and 2012 should now have implemented the elements of the data management plans required for their grant proposals. In this paper we define ‘data sharing’ and present a protocol for assessing whether data have been shared and how effective the sharing was. We then evaluate the data sharing practices of researchers funded by the NSF at Oregon State University in two ways: by attempting to discover project-level research data using the associated DMP as a starting point, and by examining data sharing associated with journal articles that acknowledge NSF support. Sharing at both the project level and the journal article level was not carried out in the majority of cases, and when sharing was accomplished, the shared data were often of questionable usability due to access, documentation, and formatting issues. We close the article by offering recommendations for how data producers, journal publishers, data repositories, and funding agencies can facilitate the process of sharing data in a meaningful way.

20.
Two alternative “strategies” will not coexist in a population unless on average they are equally successful. The most likely way for such an equilibrium to be maintained is through something equivalent to frequency-dependent selection. Females of the digger wasp Sphex ichneumoneus (Sphecidae) nest in underground burrows. They usually dig and provision these by themselves but occasionally a nest is jointly occupied. The two wasps fight whenever they meet and in the end only one of the two females lays an egg in the shared nest. Two models based on the theory of mixed evolutionarily stable strategies were developed and tested on comprehensive field data from two North American populations of these wasps. The first model proposes two strategies called founding and joining. Founders start burrows alone, but they are more successful when they are joined by a joiner. At equilibrium founders and joiners are equally successful, which amounts to an amicable, sharing relationship. The predictions of this amicable model are decisively rejected by the data. The second model proposes two strategies called digging and entering. Diggers dig their own burrows but they often have to abandon these burrows because of temporary unsuitability. Enterers move in later, thereby exploiting abandoned burrows as a valuable resource. They do not distinguish an abandoned burrow from one that is still occupied. Therefore sharing of burrows arises as an unfortunate by-product of selection for entering abandoned burrows, and Model 2 is not an amicable model. Its quantitative predictions are impressively fulfilled in one population, though not in another population. This is one of the only examples yet known of a mixed evolutionarily stable strategy in nature. Yet the word strategy itself can confuse, and this paper tries the experiment of substituting “decision”, defined as a moment at which the animal commits future time to a course of action.
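The equilibrium condition stated at the start of this abstract (two strategies coexist only where they are equally successful) can be sketched numerically. The example below is a toy illustration and not the authors' model: the payoff functions, their numbers, and the function names are all hypothetical assumptions, chosen so that the payoff of "entering" falls as enterers become more common (frequency dependence).

```python
def equilibrium_frequency(payoff_a, payoff_b, lo=0.0, hi=1.0, tol=1e-9):
    """Find by bisection the frequency p at which payoff_a(p) == payoff_b(p).

    Assumes the payoff difference changes sign between lo and hi.
    """
    def diff(p):
        return payoff_a(p) - payoff_b(p)

    if diff(lo) * diff(hi) > 0:
        raise ValueError("payoff difference does not change sign on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(lo) * diff(mid) <= 0:
            hi = mid  # the crossing lies in the lower half
        else:
            lo = mid  # the crossing lies in the upper half
    return (lo + hi) / 2

# Hypothetical payoffs; p is the frequency of the "entering" decision.
def digging_payoff(p):
    return 1.0               # assumed constant, for illustration only

def entering_payoff(p):
    return 2.0 - 2.0 * p     # assumed to decline as enterers become common

p_star = equilibrium_frequency(digging_payoff, entering_payoff)
print(round(p_star, 6))      # 0.5 for these assumed payoffs
```

With these assumed payoffs, entering pays better when rare and worse when common, so the mixture is pulled back toward the point where both decisions are equally successful.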
