Found 20 similar articles (search time: 15 ms)
1.
Alex B Grover C Haddow B Kabadjov M Klein E Matthews M Tobin R Wang X 《Genome biology》2008,9(Z2):S10
Background:
The tasks in BioCreative II were designed to approximate some of the laborious work involved in curating biomedical research papers. The approach to these tasks taken by the University of Edinburgh team was to adapt and extend the existing natural language processing (NLP) system that we have developed as part of a commercial curation assistant. Although this paper concentrates on using NLP to assist with curation, the system can be equally employed to extract types of information from the literature that is immediately relevant to biologists in general.
Results:
Our system was among the highest performing on the interaction subtasks, and competitive performance on the gene mention task was achieved with minimal development effort. For the gene normalization task, a string matching technique that can be quickly applied to new domains was shown to perform close to average.
Conclusion:
The technologies being developed were shown to be readily adapted to the BioCreative II tasks. Although high performance may be obtained on individual tasks such as gene mention recognition and normalization, and document classification, tasks in which a number of components must be combined, such as detection and normalization of interacting protein pairs, are still challenging for NLP systems.
2.
Background:
The goal of text mining is to make the information conveyed in scientific publications accessible to structured search and automatic analysis. Two important subtasks of text mining are entity mention normalization (to identify biomedical objects in text) and extraction of qualified relationships between those objects. We describe a method for identifying genes and relationships between proteins.
Results:
We present solutions to gene mention normalization and extraction of protein-protein interactions. For the first task, we identify genes by using background knowledge on each gene, namely annotations related to function, location, disease, and so on. Our approach currently achieves an f-measure of 86.4% on the BioCreative II gene normalization data. For the extraction of protein-protein interactions, we pursue an approach that builds on classical sequence analysis: motifs derived from multiple sequence alignments. The method achieves an f-measure of 24.4% (micro-average) in the BioCreative II interaction pair subtask.
Conclusion:
For gene mention normalization, our approach outperforms strategies that utilize only the matching of gene names against dictionaries, without invoking further knowledge on each gene. Motifs derived from alignments of sentences are successful at identifying protein interactions in text; the approach we present in this report is fully automated and performs similarly to systems that require human intervention at one or more stages.
Availability:
Our methods for gene, protein, and species identification, and extraction of protein-protein interactions are available as part of the BioCreative Meta Services (BCMS), see http://bcms.bioinfo.cnio.es/.
3.
Baumgartner WA Lu Z Johnson HL Caporaso JG Paquette J Lindemann A White EK Medvedeva O Cohen KB Hunter L 《Genome biology》2008,9(Z2):S9
Background:
Reliable information extraction applications have been a long sought goal of the biomedical text mining community, a goal that if reached would provide valuable tools to benchside biologists in their increasingly difficult task of assimilating the knowledge contained in the biomedical literature. We present an integrated approach to concept recognition in biomedical text. Concept recognition provides key information that has been largely missing from previous biomedical information extraction efforts, namely direct links to well defined knowledge resources that explicitly cement the concept's semantics. The BioCreative II tasks discussed in this special issue have provided a unique opportunity to demonstrate the effectiveness of concept recognition in the field of biomedical language processing.
Results:
Through the modular construction of a protein interaction relation extraction system, we present several use cases of concept recognition in biomedical text, and relate these use cases to potential uses by the benchside biologist.
Conclusion:
Current information extraction technologies are approaching performance standards at which concept recognition can begin to deliver high quality data to the benchside biologist. Our system is available as part of the BioCreative Meta-Server project and on the internet http://bionlp.sourceforge.net.
4.
Krallinger M Morgan A Smith L Leitner F Tanabe L Wilbur J Hirschman L Valencia A 《Genome biology》2008,9(Z2):S1
Background:
Genome sciences have experienced an increasing demand for efficient text-processing tools that can extract biologically relevant information from the growing amount of published literature. In response, a range of text-mining and information-extraction tools have recently been developed specifically for the biological domain. Such tools are only useful if they are designed to meet real-life tasks and if their performance can be estimated and compared. The BioCreative challenge (Critical Assessment of Information Extraction in Biology) consists of a collaborative initiative to provide a common evaluation framework for monitoring and assessing the state-of-the-art of text-mining systems applied to biologically relevant problems.
Results:
The Second BioCreative assessment (2006 to 2007) attracted 44 teams from 13 countries worldwide, with the aim of evaluating current information-extraction/text-mining technologies developed for one or more of the three tasks defined for this challenge evaluation. These tasks included the recognition of gene mentions in abstracts (gene mention task); the extraction of a list of unique identifiers for human genes mentioned in abstracts (gene normalization task); and finally the extraction of physical protein-protein interaction annotation-relevant information (protein-protein interaction task). The 'gold standard' data used for evaluating submissions for the third task was provided by the interaction databases MINT (Molecular Interaction Database) and IntAct.
Conclusion:
The Second BioCreative assessment almost doubled the number of participants for each individual task when compared with the first BioCreative assessment. An overall improvement in terms of balanced precision and recall was observed for the best submissions for the gene mention (F score 0.87); for the gene normalization task, the best results were comparable (F score 0.81) compared with results obtained for similar tasks posed at the first BioCreative challenge. In case of the protein-protein interaction task, the importance and difficulties of experimentally confirmed annotation extraction from full-text articles were explored, yielding different results depending on the step of the annotation extraction workflow. A common characteristic observed in all three tasks was that the combination of system outputs could yield better results than any single system. Finally, the development of the first text-mining meta-server was promoted within the context of this community challenge.
5.
N. Cesbron A.-L. Royer Y. Guitton A. Sydor B. Le Bizec G. Dervilly-Pinel 《Metabolomics : Official journal of the Metabolomic Society》2017,13(8):99
Introduction
Collecting feces is easy. It offers direct outcome to endogenous and microbial metabolites.
Objectives
In a context of lack of consensus about fecal sample preparation, especially in animal species, we developed a robust protocol allowing untargeted LC-HRMS fingerprinting.
Methods
The conditions of extraction (quantity, preparation, solvents, dilutions) were investigated in bovine feces.
Results
A rapid and simple protocol involving feces extraction with methanol (1/3, M/V) followed by centrifugation and a step filtration (10 kDa) was developed.
Conclusion
The workflow generated repeatable and informative fingerprints for robust metabolome characterization.
6.
Marta Michalczuk Beata Urban Tadeusz Porowski Anna Wasilewska Alina Bakunowicz-Łazarczyk 《Metabolomics : Official journal of the Metabolomic Society》2018,14(6):82
Introduction
Citrate is an old metabolite which is best known for its role in the Krebs cycle. Citrate is widely used in many branches of medicine. In ophthalmology citrate is considered a therapeutic agent and a useful diagnostic tool (biomarker).
Objectives
To summarize the published literature on citrate usage in the leading causes of blindness and highlight the new possibilities for this old metabolite.
Methods
We conducted a systematic search of the scientific literature about citrate usage in ophthalmology up to January 2018. The reference lists of identified articles were searched to provide in-depth information.
Results
This systematic review included 30 articles. The role of citrate in the leading causes of blindness is presented.
Conclusions
Citrate might help inhibit cataract progression, confirm a glaucoma diagnosis in doubtful cases, or improve cornea repair treatment as an adjuvant agent (therapy of ulcerating cornea after alkali injury, crosslinking procedure). However, possible citrate usage in ophthalmology is not widely known. Promoting recent scientific knowledge about citrate usage in ophthalmology may not only bring medical improvement but may also limit the economic costs of the leading causes of blindness. Further studies on citrate usage in ophthalmology should remain a field of scientific interest.
7.
Douglas B. Kell Stephen G. Oliver 《Metabolomics : Official journal of the Metabolomic Society》2016,12(9):148
Background
The term ‘metabolome’ was introduced to the scientific literature in September 1998.
Aim and key scientific concepts of the review
To mark its 18-year-old ‘coming of age’, two of the co-authors of that paper review the genesis of metabolomics, whence it has come and where it may be going.
8.
Chatr-aryamontri A Kerrien S Khadake J Orchard S Ceol A Licata L Castagnoli L Costa S Derow C Huntley R Aranda B Leroy C Thorneycroft D Apweiler R Cesareni G Hermjakob H 《Genome biology》2008,9(Z2):S5
Background
In the absence of consolidated pipelines to archive biological data electronically, information dispersed in the literature must be captured by manual annotation. Unfortunately, manual annotation is time consuming and the coverage of published interaction data is therefore far from complete. The use of text-mining tools to identify relevant publications and to assist in the initial information extraction could help to improve the efficiency of the curation process and, as a consequence, the database coverage of data available in the literature. The 2006 BioCreative competition was aimed at evaluating text-mining procedures in comparison with manual annotation of protein-protein interactions.
Results
To aid the BioCreative protein-protein interaction task, IntAct and MINT (Molecular INTeraction) provided both the training and the test datasets. Data from both databases are comparable because they were curated according to the same standards. During the manual curation process, the major cause of data loss in mining the articles for information was ambiguity in the mapping of the gene names to stable UniProtKB database identifiers. It was also observed that most of the information about interactions was contained only within the full-text of the publication; hence, text mining of protein-protein interaction data will require the analysis of the full-text of the articles and cannot be restricted to the abstract.
Conclusion
The development of text-mining tools to extract protein-protein interaction information may increase the literature coverage achieved by manual curation. To support the text-mining community, databases will highlight those sentences within the articles that describe the interactions. These will supply data-miners with a high quality dataset for algorithm development. Furthermore, the dictionary of terms created by the BioCreative competitors could enrich the synonym list of the PSI-MI (Proteomics Standards Initiative-Molecular Interactions) controlled vocabulary, which is used by both databases to annotate their data content.
9.
Sonia Liggi Christine Hinz Zoe Hall Maria Laura Santoru Simone Poddighe John Fjeldsted Luigi Atzori Julian L. Griffin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):52
Introduction
Data processing is one of the biggest problems in metabolomics, given the high number of samples analyzed and the need for multiple software packages for each step of the processing workflow.
Objectives
To merge the steps required for metabolomics data processing into a single platform.
Methods
KniMet is a workflow for the processing of mass spectrometry-metabolomics data based on the KNIME Analytics platform.
Results
The approach includes key steps to follow in metabolomics data processing: feature filtering, missing value imputation, normalization, batch correction and annotation.
Conclusion
KniMet provides the user with a local, modular and customizable workflow for the processing of both GC–MS and LC–MS open profiling data.
10.
Rachel A. Spicer Christoph Steinbeck 《Metabolomics : Official journal of the Metabolomic Society》2018,14(1):16
Introduction
Data sharing is increasingly required by journals and has been heralded as a solution to the ‘replication crisis’.
Objectives
(i) Review data sharing policies of journals publishing the most metabolomics papers associated with open data and (ii) compare these journals’ policies to those that publish the most metabolomics papers.
Methods
A PubMed search was used to identify metabolomics papers. Metabolomics data repositories were manually searched for linked publications.
Results
Journals that support data sharing are not necessarily those with the most papers associated with open metabolomics data.
Conclusion
Further efforts are required to improve data sharing in metabolomics.
11.
D. Jacob C. Deborde M. Lefebvre M. Maucourt A. Moing 《Metabolomics : Official journal of the Metabolomic Society》2017,13(4):36
Introduction
In NMR-based metabolomics, 1D spectra processing often requires an expert eye for disentangling the intertwined peaks.
Objectives
The objective of NMRProcFlow is to assist the expert in this task in the best way without requiring programming skills.
Methods
NMRProcFlow was developed to be a graphical and interactive 1D NMR (1H & 13C) spectra processing tool.
Results
NMRProcFlow (http://nmrprocflow.org), dedicated to metabolic fingerprinting and targeted metabolomics, covers all spectra processing steps including baseline correction, chemical shift calibration and alignment.
Conclusion
Biologists and NMR spectroscopists can easily interact and develop synergies by visualizing the NMR spectra along with their corresponding experimental-factor levels, thus setting a bridge between experimental design and subsequent statistical analyses.
12.
Morgan AA Lu Z Wang X Cohen AM Fluck J Ruch P Divoli A Fundel K Leaman R Hakenberg J Sun C Liu HH Torres R Krauthammer M Lau WW Liu H Hsu CN Schuemie M Cohen KB Hirschman L 《Genome biology》2008,9(Z2):S3
Background:
The goal of the gene normalization task is to link genes or gene products mentioned in the literature to biological databases. This is a key step in an accurate search of the biological literature. It is a challenging task, even for the human expert; genes are often described rather than referred to by gene symbol and, confusingly, one gene name may refer to different genes (often from different organisms). For BioCreative II, the task was to list the Entrez Gene identifiers for human genes or gene products mentioned in PubMed/MEDLINE abstracts. We selected abstracts associated with articles previously curated for human genes. We provided 281 expert-annotated abstracts containing 684 gene identifiers for training, and a blind test set of 262 documents containing 785 identifiers, with a gold standard created by expert annotators. Inter-annotator agreement was measured at over 90%.
Results:
Twenty groups submitted one to three runs each, for a total of 54 runs. Three systems achieved F-measures (balanced precision and recall) between 0.80 and 0.81. Combining the system outputs using simple voting schemes and classifiers obtained improved results; the best composite system achieved an F-measure of 0.92 with 10-fold cross-validation. A 'maximum recall' system based on the pooled responses of all participants gave a recall of 0.97 (with precision 0.23), identifying 763 out of 785 identifiers.
Conclusion:
Major advances for the BioCreative II gene normalization task include broader participation (20 versus 8 teams) and a pooled system performance comparable to human experts, at over 90% agreement. These results show promise as tools to link the literature with biological databases.
13.
Background
Measurement-unit conflicts are a perennial problem in integrative research domains such as clinical meta-analysis. As multi-national collaborations grow, as new measurement instruments appear, and as Linked Open Data infrastructures become increasingly pervasive, the number of such conflicts will similarly increase.
Methods
We propose a generic approach to the problem of (a) encoding measurement units in datasets in a machine-readable manner, (b) detecting when a dataset contains mixtures of measurement units, and (c) automatically converting any conflicting units into a desired unit, as defined for a given study.
Results
We utilized existing ontologies and standards for scientific data representation, measurement unit definition, and data manipulation to build a simple and flexible Semantic Web Service-based approach to measurement-unit harmonization. A cardiovascular patient cohort in which clinical measurements were recorded in a number of different units (e.g., mmHg and cmHg for blood pressure) was automatically classified into a number of clinical phenotypes, semantically defined using different measurement units.
Conclusions
We demonstrate that through a combination of semantic standards and frameworks, unit integration problems can be automatically detected and resolved.
14.
Background
In recent years the visualization of biomagnetic measurement data by so-called pseudo current density maps or Hosaka-Cohen (HC) transformations has become popular.
Methods
The physical basis of these intuitive maps is clarified by means of analytically solvable problems.
Results
Examples in magnetocardiography, magnetoencephalography and magnetoneurography demonstrate the usefulness of this method.
Conclusion
Hardware realizations of the HC-transformation and some similar transformations are discussed which could advantageously support cross-platform comparability of biomagnetic measurements.
15.
J. C. Martínez-Ávila A. García Bartolomé I. García I. Dapía Hoi Y. Tong L. Díaz P. Guerra J. Frías A. J. Carcás Sansuan A. M. Borobia 《Metabolomics : Official journal of the Metabolomic Society》2018,14(5):70
Introduction
Zonisamide is a new-generation anticonvulsant antiepileptic drug metabolized primarily in the liver, with subsequent elimination via the renal route.
Objectives
Our objective was to evaluate the utility of pharmacometabolomics in the detection of zonisamide metabolites that could be related to its disposition and, therefore, to its efficacy and toxicity.
Methods
This study was nested in a bioequivalence clinical trial with 28 healthy volunteers. Each participant received a single dose of zonisamide on two separate occasions (period 1 and period 2), with a washout period between them. Blood samples of zonisamide were obtained from all patients at baseline for each period, before volunteers were administered any medication, for metabolomics analysis.
Results
After a Lasso regression was applied, age, height, branched-chain amino acids, steroids, triacylglycerols, diacyl glycerophosphoethanolamine, glycerophospholipids susceptible to methylation, phosphatidylcholines with 20:4 FA (arachidonic acid) and cholesterol ester and lysophosphatidylcholine were obtained in both periods.
Conclusion
To our knowledge, this is the only research study to date that has attempted to link basal metabolomic status with pharmacokinetic parameters of zonisamide.
16.
Introduction
Untargeted metabolomics is a powerful tool for biological discoveries. To analyze the complex raw data, significant advances in computational approaches have been made, yet it is not clear how exhaustive and reliable the data analysis results are.
Objectives
Assessment of the quality of raw data processing in untargeted metabolomics.
Methods
Five published untargeted metabolomics studies were reanalyzed.
Results
Omissions of at least 50 relevant compounds from the original results as well as examples of representative mistakes were reported for each study.
Conclusion
Incomplete raw data processing shows unexplored potential of current and legacy data.
17.
Background
The reconstruction of ancestral genomes must deal with the problem of resolution, necessarily involving a trade-off between trying to identify genomic details and being overwhelmed by noise at higher resolutions.
Results
We use the median reconstruction at the synteny block level, of the ancestral genome of the order Gentianales, based on coffee, Rhazya stricta and grape, to exemplify the effects of resolution (granularity) on comparative genomic analyses.
Conclusions
We show how decreased resolution blurs the differences between evolving genomes, with respect to rate, mutational process and other characteristics.
18.
Jamie V. de Seymour Stephanie Tu Xiaoling He Hua Zhang Ting-Li Han Philip N. Baker Karolina Sulek 《Metabolomics : Official journal of the Metabolomic Society》2018,14(6):79
Introduction
Intrahepatic cholestasis of pregnancy (ICP) is a common maternal liver disease; development can result in devastating consequences, including sudden fetal death and stillbirth. Currently, recognition of ICP only occurs following onset of clinical symptoms.
Objective
Investigate the maternal hair metabolome for predictive biomarkers of ICP.
Methods
The maternal hair metabolome (gestational age of sampling between 17 and 41 weeks) of 38 Chinese women with ICP and 46 pregnant controls was analysed using gas chromatography–mass spectrometry.
Results
Of 105 metabolites detected in hair, none were significantly associated with ICP.
Conclusion
Hair samples represent accumulative environmental exposure over time. Samples collected at the onset of ICP did not reveal any metabolic shifts, suggesting rapid development of the disease.
19.
Renato de Souza Pinto Lemgruber Kaspar Valgepea Mark P. Hodson Ryan Tappel Sean D. Simpson Michael Köpke Lars K. Nielsen Esteban Marcellin 《Metabolomics : Official journal of the Metabolomic Society》2018,14(3):35
Introduction
Quantification of tetrahydrofolates (THFs), important metabolites in the Wood–Ljungdahl pathway (WLP) of acetogens, is challenging given their sensitivity to oxygen.
Objective
To develop a simple anaerobic protocol to enable reliable THFs quantification from bioreactors.
Methods
Anaerobic cultures were mixed with anaerobic acetonitrile for extraction. Targeted LC–MS/MS was used for quantification.
Results
Tetrahydrofolates can only be quantified if sampled anaerobically. THF levels showed a strong correlation to acetyl-CoA, the end product of the WLP.
Conclusion
Our method is useful for relative quantification of THFs across different growth conditions. Absolute quantification of THFs requires the use of labelled standards.
20.
Antonio Rosato Leonardo Tenori Marta Cascante Pedro Ramon De Atauri Carulla Vitor A. P. Martins dos Santos Edoardo Saccenti 《Metabolomics : Official journal of the Metabolomic Society》2018,14(4):37