Similar Articles
20 similar articles found (search time: 46 ms).
1.

Purpose

This paper introduces the new EcoSpold data format for life cycle inventory (LCI).

Methods

A short historical retrospective on data formats in the life cycle assessment (LCA) field is given. The guiding principles for the revision and implementation are explained. Some technical basics of the data format are described, and changes from the previous data format are explained.

Results

The EcoSpold 2 data format caters for new requirements that have arisen in the LCA field in recent years.

Conclusions

The new data format is the basis for the Ecoinvent v3 database, but since it is an open data format, it is expected to be adopted by other LCI databases as well. Several new concepts in the EcoSpold 2 data format open up new possibilities for LCA practitioners and expand the application of the datasets to fields beyond LCA (e.g., material flow analysis, energy balancing).

2.

Background

Non-negative matrix factorization (NMF) has been introduced as an important method for mining biological data. Though packages implemented in R and other programming languages currently exist, they either provide only a few optimization algorithms or focus on a specific application field. A complete NMF package that lets the bioinformatics community perform a wide range of data mining tasks on biological data does not yet exist.
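For orientation: NMF approximates a non-negative matrix V (e.g., genes x samples) by a product of two low-rank non-negative matrices W and H. A minimal numpy sketch of the classic Lee-Seung multiplicative updates, not code from the toolbox itself:

```python
import numpy as np

def nmf(V, k, iters=200, eps=1e-9):
    """Factor V (m x n, non-negative) into W (m x k) and H (k x n)
    using Lee-Seung multiplicative updates for the Frobenius norm."""
    m, n = V.shape
    rng = np.random.default_rng(0)
    W = rng.random((m, k))
    H = rng.random((k, n))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # update H, W fixed
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # update W, H fixed
    return W, H

# toy usage: 20 "genes" x 10 "samples"
V = np.random.default_rng(1).random((20, 10))
W, H = nmf(V, k=3)
print(np.linalg.norm(V - W @ H))  # reconstruction error
```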

Results

We provide a convenient MATLAB toolbox containing both the implementations of various NMF techniques and a variety of NMF-based data mining approaches for analyzing biological data. Data mining approaches implemented within the toolbox include data clustering and bi-clustering, feature extraction and selection, sample classification, missing values imputation, data visualization, and statistical comparison.

Conclusions

A series of analyses, such as molecular pattern discovery, biological process identification, dimension reduction, disease prediction, visualization, and statistical comparison, can be performed using this toolbox.

3.

Background

Data from biological samples and medical evaluations plays an essential part in clinical decision making. This data is equally important in clinical studies, and it is critical to have an infrastructure that ensures its quality is preserved throughout its entire lifetime. We are running a 5-year longitudinal clinical study, KOL-Örestad, with the objective of identifying new COPD (Chronic Obstructive Pulmonary Disease) biomarkers in blood. In the study, clinical data and blood samples are collected from both private and public health-care institutions and stored at our research center in databases and biobanks, respectively. The blood is analyzed by mass spectrometry, and the results of this analysis are then linked to the clinical data.

Method

We built an infrastructure that allows us to efficiently collect and analyze the data. We chose REDCap as the EDC (Electronic Data Capture) tool for the study because of its short setup time, ease of use, and flexibility. REDCap lets users easily design data collection modules based on existing templates. In addition, it provides two ways to import batches of data: through a web API (Application Programming Interface) and by uploading CSV (Comma-Separated Values) files.
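As an illustration of the batch-import route, here is a minimal sketch of posting one CSV record to a REDCap instance through its web API; the URL, token, and field names below are hypothetical placeholders:

```python
import requests

# hypothetical REDCap instance and API token -- replace with your own
REDCAP_URL = "https://redcap.example.org/api/"
API_TOKEN = "0123456789ABCDEF"

# one batch of clinical-chemistry values; column names are invented
csv_payload = (
    "record_id,albumin_g_l,crp_mg_l\n"
    "1001,42.1,3.7\n"
)

# import the CSV as a flat record set via the record-import endpoint
response = requests.post(REDCAP_URL, data={
    "token": API_TOKEN,
    "content": "record",
    "action": "import",
    "format": "csv",
    "type": "flat",
    "data": csv_payload,
    "returnContent": "count",
})
print(response.status_code, response.text)
```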

Results

We created a software tool, DART (Data Rapid Translation), that translates our biomarker data into a format that fits REDCap's CSV templates. In addition, DART is configurable to work with many other data formats. We use DART to import our clinical chemistry data into the REDCap database.

Conclusion

We have shown that a powerful and internationally adopted EDC tool such as REDCap can be extended so that it can be used efficiently in proteomic studies. In our study, we accomplish this by using DART to translate our clinical chemistry data to a format that fits the templates of REDCap.

4.

Introduction

Data sharing is being increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.

Objectives

(i) Review the data sharing policies of the journals publishing the most metabolomics papers associated with open data, and (ii) compare these journals' policies with those of the journals that publish the most metabolomics papers overall.

Methods

A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.

Results

Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.

Conclusion

Further efforts are required to improve data sharing in metabolomics.

5.

Background

This paper examines how the adoption of a subject-specific library service has changed the way in which its users interact with a digital library. The LitMiner text-analysis application was developed to enable biologists to explore gene relationships in the published literature. The application features a suite of interfaces that enable users to search PubMed as well as local databases, to view document abstracts, to filter terms, to select gene name aliases, and to visualize the co-occurrences of genes in the literature. At each of these stages, LitMiner offers the functionality of a digital library. Documents that are accessible online are identified by an icon. Users can also order documents from their institution's library collection from within the application. In so doing, LitMiner aims to integrate digital library services into the research process of its users.
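The gene co-occurrence counting at the heart of such a visualization can be sketched in a few lines of Python; the abstracts and gene list below are toy examples, and the substring matching is deliberately naive (LitMiner's alias handling is more sophisticated):

```python
from collections import Counter
from itertools import combinations

# toy abstracts and a toy gene list
abstracts = [
    "TP53 and MDM2 interact to regulate apoptosis.",
    "MDM2 overexpression suppresses TP53 activity; BRCA1 is unaffected.",
    "BRCA1 and TP53 mutations co-occur in some tumours.",
]
genes = {"TP53", "MDM2", "BRCA1"}

# count every unordered gene pair that appears in the same abstract
cooccurrence = Counter()
for text in abstracts:
    found = sorted(g for g in genes if g in text)  # naive matching
    for pair in combinations(found, 2):
        cooccurrence[pair] += 1

for (a, b), n in cooccurrence.most_common():
    print(f"{a} - {b}: {n} abstract(s)")
```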

Methods

Case study

Results

This integration of digital library services into the research process of biologists results in increased access to the published literature.

Conclusion

In order to make better use of their collections, digital libraries should customize their services to suit the research needs of their patrons.

6.

Introduction

New platforms are emerging that enable more data providers to publish life cycle inventory data.

Background

Providing datasets that are not complete LCA models results in fragments that are difficult for practitioners to integrate and use for LCA modeling. Additionally, when a proxy is used to supply a technosphere input that the process authors did not originally intend, most LCA software requires modifying the original process.

Results

The use of a bridge process, which is a process created to link two existing processes, is proposed as a solution.
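As a purely illustrative sketch (not the EcoSpold schema or any particular software's data model), a bridge process can be thought of as a small record that references the two existing processes and carries the linkage metadata:

```python
from dataclasses import dataclass

@dataclass
class BridgeProcess:
    """Hypothetical minimal bridge process: a one-input, one-output
    process that links an existing provider to an existing consumer
    without modifying either dataset."""
    name: str
    provider_id: str          # identifier of the upstream process
    consumer_id: str          # identifier of the downstream process
    conversion_factor: float  # e.g., unit or reference-flow conversion
    data_quality_note: str    # makes the linkage quality explicit

bridge = BridgeProcess(
    name="electricity, grid mix -> widget production",
    provider_id="uuid-provider",
    consumer_id="uuid-consumer",
    conversion_factor=1.0,
    data_quality_note="proxy link; geography mismatch (US -> EU)",
)
print(bridge)
```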

Discussion

Benefits of bridge processes include increased model transparency; easier dataset sharing and integration without compromising the integrity and independence of the original datasets; a structure for making the data quality of process linkages explicit; and greater model flexibility when multiple bridges are provided. A drawback is that they add processes to existing LCA models, which increases model size.

Conclusions

Bridge processes can enable users to integrate new datasets without modifying them to link to background databases or to other processes they have available. They may not be the ideal long-term solution, but they work within the existing LCA data model.

7.

Introduction

Untargeted metabolomics is a powerful tool for biological discoveries. Significant advances in computational approaches have been made to analyze the complex raw data, yet it is not clear how exhaustive and reliable the results of these analyses are.

Objectives

Assessment of the quality of raw data processing in untargeted metabolomics.

Methods

Five published untargeted metabolomics studies were reanalyzed.

Results

For each study, we report the omission of at least 50 relevant compounds from the original results, as well as examples of representative mistakes.

Conclusion

Incomplete raw data processing reveals the unexplored potential of current and legacy data.

8.

Introduction

Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages at each step of the processing workflow.

Objectives

Merge into a single platform the steps required for metabolomics data processing.

Methods

KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.

Results

The approach includes the key steps of metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction, and annotation.
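A minimal pandas sketch of three of these steps, as a generic illustration rather than KniMet's own KNIME nodes (the 50% missingness cutoff and half-minimum imputation are common conventions in MS data processing, not necessarily KniMet's defaults):

```python
import numpy as np
import pandas as pd

# toy feature table: rows = samples, columns = metabolite features
data = pd.DataFrame({
    "f1": [1.0, 1.2, np.nan, 0.9],
    "f2": [np.nan, np.nan, np.nan, 5.0],
    "f3": [3.0, 2.8, 3.1, 2.9],
})

# 1) feature filtering: drop features missing in >50% of samples
kept = data.loc[:, data.isna().mean() <= 0.5]

# 2) missing value imputation: half the feature minimum
imputed = kept.apply(lambda col: col.fillna(col.min() / 2))

# 3) normalization: scale each sample to unit total intensity
normalized = imputed.div(imputed.sum(axis=1), axis=0)

print(normalized)
```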

Conclusion

KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.

9.

Background

Matlab is one of the most advanced development tools for engineering practice. From our point of view, the most important component is the Image Processing Toolbox, which offers many built-in functions, including mathematical morphology, as well as implementations of many artificial neural networks. Matlab is a very popular platform for creating specialized programs for image analysis, including in pathology. Based on the latest version of the Matlab Builder Java toolbox, it is possible to create software that serves as a remote system for image analysis in pathology over an internet connection. The internet platform can be realized with JavaServer Pages (JSP) using a Tomcat server as the servlet container.

Methods

In the presented software implementation, we propose remote image analysis realized by Matlab algorithms. These algorithms can be compiled into an executable JAR file with the help of the Matlab Builder Java toolbox. Each Matlab function must be declared with a set of input data, an output structure with numerical results, and a Matlab web figure. Any function prepared in this manner can be used as a Java function in JavaServer Pages (JSP). The graphical user interface that provides the input data and displays the results (including in graphical form) must be implemented in JSP. Additionally, data storage to a database can be implemented within the Matlab algorithm, with the help of the Matlab Database Toolbox, directly alongside the image processing. The complete JSP page can be run by a Tomcat server.

Results

The proposed tool for remote image analysis was tested on the Computerized Analysis of Medical Images (CAMI) software developed by the author. The user provides an image and case information (diagnosis, staining, image parameters, etc.). When an analysis is initialized, the input data and image are sent to the servlet on Tomcat. When the analysis is done, the client obtains the graphical results as an image with the recognized cells marked, as well as the quantitative output. Additionally, the results are stored in a server database. The internet platform was tested on a PC server (Intel Core2 Duo T9600 2.8 GHz, 4 GB RAM) with 768x576-pixel, 1.28 MB TIFF images referring to meningioma tumours (x400, Ki-67/MIB-1). The time consumption was as follows: local analysis by CAMI on the server took 3.5 seconds; remote analysis took 26 seconds, of which 22 seconds were spent on data transfer over the internet connection. With a JPG image (102 KB), the time was reduced to 14 seconds.

Conclusions

The results confirm that the designed remote platform can be useful for pathology image analysis. The time consumption depends mainly on the image size and the speed of the internet connection. The presented implementation can be used for many types of analysis with different stainings, tissues, morphometric approaches, etc. A significant remaining problem is implementing the JSP page in a multithreaded form that can be used in parallel by many users. The presented platform for image analysis in pathology can be especially useful for small laboratories without their own image analysis systems.

10.

Introduction

Untargeted and targeted analyses are two classes of metabolic study. Both strategies have been advanced by high-resolution mass spectrometers coupled with chromatography, which offer high mass sensitivity and accuracy. However, state-of-the-art methods for mass spectrometric data sets do not always quantify metabolites of interest in a targeted assay efficiently and accurately.

Objectives

TarMet can quantify targeted metabolites as well as their isotopologues through a reactive and user-friendly graphical user interface.

Methods

TarMet accepts vendor-neutral data files (NetCDF, mzXML, and mzML) as input. It extracts ion chromatograms, detects peak positions and bounds, and confirms metabolites via their isotope patterns. It can integrate peak areas for all isotopologues automatically.
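Extracting an ion chromatogram from centroided scans reduces to summing intensities within a mass tolerance around the target m/z; a minimal numpy sketch with toy data, not TarMet's implementation:

```python
import numpy as np

def eic(scans, target_mz, ppm=10.0):
    """Extracted ion chromatogram: for each scan (mz array,
    intensity array), sum intensity within +/- ppm of target_mz."""
    tol = target_mz * ppm * 1e-6
    return np.array([
        inten[np.abs(mz - target_mz) <= tol].sum()
        for mz, inten in scans
    ])

# toy centroided scans: (m/z values, intensities); isotopologues are
# spaced by ~1.00336 Da, the 13C-12C mass difference
scans = [
    (np.array([180.0634, 181.0668]), np.array([1000.0, 110.0])),
    (np.array([180.0634, 181.0668]), np.array([5000.0, 540.0])),
]
print(eic(scans, 180.0634))  # M+0 trace of a hypothetical ion
print(eic(scans, 181.0668))  # M+1 isotopologue trace
```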

Results

TarMet detects more isotopologues and quantifies them better than state-of-the-art methods, and it handles isotope tracer assays well.

Conclusion

TarMet is a better tool for targeted metabolic and stable isotope tracer analyses.

11.

Background

Centrifugation is an indispensable procedure for plasma sample preparation, but the applied conditions can vary between labs.

Aim

Determine whether routinely used plasma centrifugation protocols (1500×g for 10 min; 3000×g for 5 min) influence non-targeted metabolomic analyses.

Methods

Nuclear magnetic resonance (NMR) spectroscopy and high-resolution mass spectrometry (HRMS) data were evaluated with sparse partial least squares discriminant analyses and compared with cell count measurements.

Results

Besides significant differences in platelet count, we identified substantial alterations in NMR and HRMS data related to the different centrifugation protocols.

Conclusion

Even minor differences in plasma centrifugation can significantly influence metabolomic patterns and potentially bias metabolomics studies.

12.

Introduction

Mass spectrometry imaging (MSI) experiments result in complex multi-dimensional datasets, which require specialist data analysis tools.

Objectives

We have developed massPix—an R package for analysing and interpreting data from MSI of lipids in tissue.

Methods

massPix produces single ion images, performs multivariate statistics and provides putative lipid annotations based on accurate mass matching against generated lipid libraries.

Results

Classification of tissue regions with high spectral similarity can be carried out by principal components analysis (PCA) or k-means clustering.
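A generic sketch of this idea, clustering the pixel spectra of a toy data cube with scikit-learn's k-means (illustrative only, not massPix code):

```python
import numpy as np
from sklearn.cluster import KMeans

# toy MSI data cube: 8 x 8 pixels, 50 m/z bins per pixel spectrum
rng = np.random.default_rng(0)
cube = rng.random((8, 8, 50))
cube[:4, :, 10] += 5.0   # upper tissue region enriched in one ion

# cluster pixel spectra; reshape back to an image of region labels
pixels = cube.reshape(-1, cube.shape[-1])
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pixels)
print(labels.reshape(8, 8))
```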

Conclusion

massPix is an open-source tool for the analysis and statistical interpretation of MSI data, and is particularly useful for lipidomics applications.

13.

Background

Next-generation sequencing determines DNA bases, and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format or its compressed binary version (BAM). SAMtools is a typical tool for dealing with SAM/BAM files. It has various functions, including detection of variants, visualization of alignments, indexing, extraction of parts of the data and loci, and conversion of file formats. It is written in C and executes quickly. However, SAMtools requires additional implementation work, for example with OpenMP (Open Multi-Processing) libraries, to be used in parallel. As next-generation sequencing data accumulate, a simple parallelization program that supports cloud and PC cluster environments is required.
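For readers unfamiliar with the format, a SAM alignment line is simply eleven mandatory tab-separated fields plus optional tags; a minimal Python parser for the mandatory fields:

```python
# the 11 mandatory tab-separated SAM fields
# (QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL)
FIELDS = ("qname", "flag", "rname", "pos", "mapq",
          "cigar", "rnext", "pnext", "tlen", "seq", "qual")

def parse_sam_line(line):
    cols = line.rstrip("\n").split("\t")
    rec = dict(zip(FIELDS, cols[:11]))  # optional tags ignored here
    for key in ("flag", "pos", "mapq", "pnext", "tlen"):
        rec[key] = int(rec[key])
    return rec

rec = parse_sam_line(
    "read1\t99\tchr1\t10000\t60\t100M\t=\t10200\t300\t"
    + "A" * 100 + "\t" + "I" * 100
)
# bit 16 of FLAG marks a reverse-strand alignment
print(rec["rname"], rec["pos"], (rec["flag"] & 16) == 0)
```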

Results

We have developed cljam using the Clojure programming language, which simplifies parallel programming, to handle SAM/BAM data. Cljam can run in a Java runtime environment (e.g., Windows, Linux, Mac OS X) with Clojure.

Conclusions

Cljam can process and analyze SAM/BAM files in parallel and at high speed. The execution time with cljam is almost the same as with SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.

14.

Background and aims

To analyse root system architectures (RSAs) from captured images, a variety of manual (e.g., Data Analysis of Root Tracings, DART), semi-automated, and fully automated software packages have been developed. These tools offer complementary approaches to studying RSAs, and the use of the Root System Markup Language (RSML) to store RSA data makes it easier to compare measurements obtained with different (semi-)automated root imaging platforms. The throughput of the data analysis process using exported RSA data, however, would benefit greatly from batch analysis in a generic data analysis environment such as R.
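Since RSML is XML-based, trait extraction can be prototyped in any language; a simplified Python sketch (the element layout below is abbreviated and omits RSML namespaces, metadata, and attributes, so treat it as an assumption about the format rather than a faithful example file):

```python
import xml.etree.ElementTree as ET

# heavily simplified stand-in for an RSML document
RSML = """<rsml><scene><plant>
  <root id="r1"><geometry><polyline>
    <point x="0" y="0"/><point x="0" y="3"/><point x="1" y="7"/>
  </polyline></geometry></root>
</plant></scene></rsml>"""

tree = ET.fromstring(RSML)
for r in tree.iter("root"):
    pts = [(float(p.get("x")), float(p.get("y")))
           for p in r.iter("point")]
    # root length = sum of distances between consecutive polyline points
    length = sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
                 for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    print(r.get("id"), round(length, 3))
```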

Methods

We developed an R package (archiDART) with five functions. It computes global RSA traits, root growth rates, root growth directions and trajectories, and lateral root distribution from DART-generated and/or RSML files. It also has specific plotting functions designed to visualise the dynamics of root system growth.

Results

The results demonstrated the ability of the package's functions to compute relevant traits for three contrasting RSAs (Brachypodium distachyon [L.] P. Beauv., Hevea brasiliensis Müll. Arg., and Solanum lycopersicum L.).

Conclusions

This work extends the DART software package and other image analysis tools supporting the RSML format, enabling users to easily calculate a number of RSA traits in a generic data analysis environment.

15.

Background

In recent years, the visualization of biomagnetic measurement data by so-called pseudo current density maps, or Hosaka-Cohen (HC) transformations, has become popular.

Methods

The physical basis of these intuitive maps is clarified by means of analytically solvable problems.
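The HC transformation is commonly written as the in-plane gradient of the measured normal field component Bz rotated by 90 degrees, c = (dBz/dy, -dBz/dx); a numpy sketch under that common definition (not taken from this paper's derivations):

```python
import numpy as np

def pseudo_current(bz, dx=1.0, dy=1.0):
    """Hosaka-Cohen style pseudo-current map from the normal field
    component Bz sampled on a regular grid:
        c_x =  dBz/dy,   c_y = -dBz/dx
    i.e. the in-plane gradient of Bz rotated by 90 degrees."""
    dbz_dy, dbz_dx = np.gradient(bz, dy, dx)  # axis 0 = y, axis 1 = x
    return dbz_dy, -dbz_dx

# toy field: a smooth bump; the pseudo-current circulates around it
y, x = np.mgrid[-5:5:64j, -5:5:64j]
bz = np.exp(-(x**2 + y**2))
cx, cy = pseudo_current(bz, dx=x[0, 1] - x[0, 0], dy=y[1, 0] - y[0, 0])
print(cx.shape, cy.shape)
```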

Results

Examples in magnetocardiography, magnetoencephalography and magnetoneurography demonstrate the usefulness of this method.

Conclusion

Hardware realizations of the HC-transformation and some similar transformations are discussed which could advantageously support cross-platform comparability of biomagnetic measurements.

16.

Introduction

The Metabolomics Workbench Data Repository is a public repository of mass spectrometry and nuclear magnetic resonance data and metadata derived from a wide variety of metabolomics studies. The data and metadata for each study are deposited, stored, and accessed via files in the domain-specific ‘mwTab’ flat file format.

Objectives

In order to improve the accessibility, reusability, and interoperability of the data and metadata stored in ‘mwTab’ formatted files, we implemented a Python library and package. This Python package, named ‘mwtab’, is a parser for the domain-specific ‘mwTab’ flat file format, which provides facilities for reading, accessing, and writing ‘mwTab’ formatted files. Furthermore, the package provides facilities to validate both the format and required metadata elements of a given ‘mwTab’ formatted file.

Methods

To develop the ‘mwtab’ package, we followed the official ‘mwTab’ format specification. We used Git for version control, along with the Python unit-testing framework and a continuous integration service to run the tests on multiple versions of Python. Package documentation was developed using the Sphinx documentation generator.

Results

The ‘mwtab’ package provides both Python programmatic library interfaces and command-line interfaces for reading, writing, and validating ‘mwTab’ formatted files. Data and associated metadata are stored within Python dictionary- and list-based data structures, enabling straightforward, ‘pythonic’ access and manipulation of data and metadata. Also, the package provides facilities to convert ‘mwTab’ files into a JSON formatted equivalent, enabling easy reusability of the data by all modern programming languages that implement JSON parsers. The ‘mwtab’ package implements its metadata validation functionality based on a pre-defined JSON schema that can be easily specialized for specific types of metabolomics studies. The library also provides a command-line interface for interconversion between ‘mwTab’ and JSONized formats in raw text and a variety of compressed binary file formats.
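A short usage sketch along the lines of the interfaces described above; the function and attribute names follow the package's documentation but should be treated as indicative, so check the ‘User Guide’ for the current API:

```python
import mwtab

# read one 'mwTab' formatted analysis file (local path or URL);
# read_files() is a generator because one source may hold many files
for mwfile in mwtab.read_files("ST000017_AN000035.txt"):
    print(mwfile.study_id, mwfile.analysis_id)  # metadata access
    print(list(mwfile.keys()))                  # section names

    # convert to the JSON representation for use from other languages
    with open("ST000017_AN000035.json", "w") as outfile:
        mwfile.write(outfile, file_format="json")
```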

Conclusions

The ‘mwtab’ package is an easy-to-use Python package that provides FAIRer utilization of the Metabolomics Workbench Data Repository. The source code is freely available on GitHub and via the Python Package Index. Documentation includes a ‘User Guide’, ‘Tutorial’, and ‘API Reference’. The GitHub repository also provides ‘mwtab’ package unit-tests via a continuous integration service.

17.

Introduction

Collecting feces is easy, and it offers direct access to endogenous and microbial metabolites.

Objectives

In the absence of consensus about fecal sample preparation, especially across animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.

Methods

The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.

Results

A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a filtration step (10 kDa) was developed.

Conclusion

The workflow generated repeatable and informative fingerprints for robust metabolome characterization.

18.

Background

Standardized schemas, databases, and public data repositories are needed for studies of malaria vectors, which encompass a remarkably diverse array of designs and rapidly generate large data volumes, often in resource-limited tropical settings lacking specialized software or informatics support.

Results

Data from the majority of mosquito studies conformed to a generic schema, with data collection forms recording the experimental design, the sorting of collections, details of sample pooling or subdivision, and additional observations. Generically applicable forms with standardized attribute definitions enabled rigorous, consistent data and sample management with generic software and minimal expertise. Use of the forms now spans 20 experiments, 8 projects, and 15 users at 3 research and control institutes in 3 African countries, resulting in 11 peer-reviewed publications.

Conclusion

We have designed a generic data schema that can be used to develop paper-based or electronic data collection forms, depending on the availability of resources. We have developed paper-based data collection forms that can capture data from the majority of entomological studies across multiple study areas using standardized data formats. Data recorded on these forms can be entered into, and linked within, any relational database software. These informatics tools are recommended because they save medical entomologists time, improve data quality, and ensure that data collected and shared across multiple studies are in standardized formats, thereby increasing research outputs.
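As a hypothetical illustration of such a schema in relational form (the table and column names below are invented for this sketch, not taken from the study's forms):

```python
import sqlite3

# minimal relational version of a generic entomology schema:
# experiments own collections; collections are sorted into pools
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE experiment (
    experiment_id INTEGER PRIMARY KEY,
    design        TEXT NOT NULL           -- experimental design notes
);
CREATE TABLE collection (
    collection_id INTEGER PRIMARY KEY,
    experiment_id INTEGER NOT NULL REFERENCES experiment,
    site          TEXT,
    collected_on  TEXT                    -- ISO 8601 date
);
CREATE TABLE sample_pool (
    pool_id        INTEGER PRIMARY KEY,
    collection_id  INTEGER NOT NULL REFERENCES collection,
    mosquito_count INTEGER,
    observation    TEXT
);
""")
con.execute("INSERT INTO experiment VALUES (1, 'CDC light-trap survey')")
con.execute("INSERT INTO collection VALUES (1, 1, 'village A', '2013-06-01')")
con.execute("INSERT INTO sample_pool VALUES (1, 1, 25, 'An. gambiae, fed')")
print(con.execute("SELECT COUNT(*) FROM sample_pool").fetchone())
```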

19.

Background

Sequence alignment data is often ordered by coordinate (the ID of the reference sequence plus the position on that sequence where the fragment was mapped) when stored in BAM files, as this simplifies the extraction of variants between the mapped data and the reference, or of variants within the mapped data. In this order, paired reads are usually separated in the file, which complicates other applications, such as duplicate marking or conversion to the FastQ format, that require access to the full information of each pair.

Results

In this paper we introduce biobambam, a set of tools based on the efficient collation of alignments in BAM files by read name. The collation algorithm avoids time- and space-consuming sorting of alignments by read name where possible, without using more than a specified amount of main memory. Using this algorithm, tasks such as duplicate marking in BAM files and conversion of BAM files to the FastQ format can be performed very efficiently with limited resources. We also make the collation algorithm available as an API for other projects; this API is part of the libmaus package.
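The core idea of collation by read name without a full sort can be sketched as a streaming hash join: hold the first-seen mate in memory and emit the pair when its partner arrives. Real tools such as biobambam additionally bound the memory used and spill to disk, which this toy sketch omits:

```python
# minimal streaming collation of a name-unsorted alignment stream
def collate_pairs(records):
    pending = {}                      # read name -> first-seen mate
    for name, data in records:
        mate = pending.pop(name, None)
        if mate is None:
            pending[name] = data      # wait for the partner
        else:
            yield name, mate, data    # partner arrived: emit the pair

stream = [("r1", "A"), ("r2", "C"), ("r1", "B"), ("r2", "D")]
for name, first, second in collate_pairs(stream):
    print(name, first, second)
```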

Conclusions

In comparison with previous approaches to problems involving the collation of alignments by read name, such as BAM-to-FastQ conversion or duplicate-marking utilities, our approach can often perform an equivalent task more efficiently in terms of the required main memory and run time. Our BAM-to-FastQ conversion is faster than all widely known alternatives, including Picard and bamUtil. Our duplicate marking is about as fast as its closest competitor, bamUtil, on small data sets and faster than all known alternatives on large and complex data sets.

20.

Introduction

Metabolomics is a well-established tool in systems biology, especially in the top-down approach. Metabolomics experiments often result in discovery studies that provide intriguing biological hypotheses but rarely offer mechanistic explanations of the findings. In this light, the interpretation of metabolomics data can be boosted by deploying systems biology approaches.

Objectives

This review aims to provide an overview of systems biology approaches that are relevant to metabolomics and to discuss some successful applications of these methods.

Methods

We review the most recent applications of systems biology tools in the field of metabolomics, such as network inference and analysis, metabolic modelling, and pathway analysis.

Results

We offer an ample overview of systems biology tools that can be applied to metabolomics problems. The characteristics and application results of these tools are also discussed in a comparative manner.

Conclusions

Systems biology-enhanced analysis of metabolomics data can provide insights into the molecular mechanisms originating the observed metabolic profiles and enhance the scientific impact of metabolomics studies.
