Similar literature
 Found 20 similar articles (search time: 15 ms)
1.

Background  

Keyword searching through PubMed and other systems is the standard means of retrieving information from Medline. However, ad-hoc retrieval systems do not meet all of the needs of databases that curate information from literature, or of text miners developing a corpus on a topic that has many terms indicative of relevance. Several databases have developed supervised learning methods that operate on a filtered subset of Medline, to classify Medline records so that fewer articles have to be manually reviewed for relevance. A few studies have considered generalisation of Medline classification to operate on the entire Medline database in a non-domain-specific manner, but existing applications lack speed, available implementations, or a means to measure performance in new domains.

2.
With the rapid advances of various high-throughput technologies, generation of '-omics' data is commonplace in almost every biomedical field. Effective data management and analytical approaches are essential to fully decipher the biological knowledge contained in the tremendous amount of experimental data. Meta-analysis, a set of statistical tools for combining multiple studies of a related hypothesis, has become popular in genomic research. Here, we perform a systematic search from PubMed and manual collection to obtain 620 genomic meta-analysis papers, of which 333 microarray meta-analysis papers are summarized as the basis of this paper and the other 249 GWAS meta-analysis papers are discussed in the next companion paper. The review in the present paper focuses on various biological purposes of microarray meta-analysis, databases and software and related statistical procedures. Statistical considerations of such an analysis are further scrutinized and illustrated by a case study. Finally, several open questions are listed and discussed.

3.
4.

Background

The exponential increase of published biomedical literature prompts the use of text mining tools to manage the information overload automatically. One of the most common applications is to mine protein-protein interactions (PPIs) from PubMed abstracts. Currently, most tools for mining PPIs from the literature use co-occurrence-based or rule-based approaches. Hybrid methods (frame-based approaches) that combine these two approaches may perform better in predicting PPIs. However, the PPIs predicted by these methods are rarely evaluated against known PPI databases and co-occurring terms in the Gene Ontology (GO) database.

Methodology/Principal Findings

Here we developed a web-based tool, PPI Finder, to mine human PPIs from PubMed abstracts based on their co-occurrences and interaction words, followed by evidence from human PPI databases and shared terms in the GO database. Only 28% of the co-occurring pairs in PubMed abstracts appeared in any of the commonly used human PPI databases (HPRD, BioGRID and BIND). On the other hand, of the known PPIs in HPRD, 69% showed co-occurrences in the literature, and 65% shared GO terms.

Conclusions

PPI Finder provides a useful tool for biologists to uncover potential novel PPIs. It is freely accessible at http://liweilab.genetics.ac.cn/tm/.
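
The co-occurrence idea described in this abstract can be illustrated with a minimal sketch. It is not PPI Finder's implementation: the protein lexicon, the interaction-word list, the function name `cooccurrence_counts` and the sample sentences are all hypothetical, and real tools would use proper named-entity recognition rather than exact token matching.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical inputs: a tiny protein lexicon and a list of interaction words.
PROTEINS = {"TP53", "MDM2", "BRCA1"}
INTERACTION_WORDS = {"binds", "interacts", "phosphorylates"}


def cooccurrence_counts(abstracts):
    """Count protein pairs that share a sentence with an interaction word."""
    counts = Counter()
    for text in abstracts:
        # Naive sentence split on terminal punctuation followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", text):
            tokens = set(re.findall(r"[A-Za-z0-9]+", sentence))
            found = sorted(PROTEINS & tokens)
            has_interaction_word = bool(
                INTERACTION_WORDS & {t.lower() for t in tokens}
            )
            if has_interaction_word:
                for pair in combinations(found, 2):
                    counts[pair] += 1
    return counts


# Made-up example sentences, for illustration only.
abstracts = [
    "MDM2 binds TP53 and promotes its degradation. BRCA1 was also measured.",
    "TP53 interacts with MDM2 in stressed cells.",
]
counts = cooccurrence_counts(abstracts)
print(counts)
```

Pairs counted this way are only candidates; the abstract's point is that such candidates would then be checked against curated PPI databases and shared GO terms.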

5.
6.

Context

Identifying patients at risk for adverse outcomes of Clostridium difficile infection (CDI), including recurrence and death, will become increasingly important as novel therapies emerge that are more effective than traditional approaches but very expensive. Clinical prediction rules (CPRs) can improve the accuracy of medical decision-making. Several CPRs have been developed for CDI, but none has gained widespread acceptance.

Methods

We systematically reviewed studies describing the derivation or validation of CPRs for unfavourable outcomes of CDI, in medical databases (Medline, Embase, PubMed, Web of Science and Cochrane) and abstracts of conferences.

Results

Of 2945 titles and abstracts screened, 13 studies on the derivation of a CPR were identified: two on recurrences, five on complications (including mortality), five on mortality alone and one on response to treatment. Two studies on the validation of different severity indices were also retrieved. Most CPRs were developed as secondary analyses using cohorts assembled for other purposes. CPRs presented several methodological limitations that could explain their limited use in clinical practice. Except for leukocytosis, albumin and age, there was much heterogeneity in the variables used, and most studies were limited by small sample sizes. Eight models used a retrospective design. Only four studies reported the incidence of the outcome of interest, even if this is essential to evaluate the potential usefulness of a model in other populations. Only five studies performed multivariate analyses to adjust for confounders.

Conclusions

The lack of weighting of variables, of validation, calibration and measures of reproducibility, the weak validities and performances when assessed, and the absence of sensitivity analyses, all led to suboptimal quality and debatable utility of those CPRs. Evidence-based tools developed through appropriate prospective cohorts would be more valuable for clinicians than empirically-developed CPRs.

7.
The molecular probe data base (MPDB) contains detailed information on synthetic oligonucleotides, including their identification, target genes, applications and bibliographic references. It is available on-line through the Internet and can be searched by using Network Information Retrieval tools. In this article the most recent enhancements of MPDB, both in terms of data contents and new ways of access, are described. These include a recently established collaboration with the EMBL Data Library, in the sphere of the SRSWWW network browser, in view of a better integration of MPDB with other molecular biology databases.

8.
A wealth of bioinformatics tools and databases has been created over the last decade and most are freely available to the general public. However, these valuable resources live a shadow existence compared to experimental results and methods that are widely published in journals and relatively easily found through publication databases such as PubMed. For the general scientist as well as bioinformaticists, these tools can deliver great value to the design and analysis of biological and medical experiments, but there is no inventory presenting an up-to-date and easily searchable index of all these resources. To remedy this, the BioWareDB search engine has been created. BioWareDB is an extensive and current catalog of software and databases of relevance to researchers in the fields of biology and medicine, and presently consists of 2800 validated entries. AVAILABILITY: BioWareDB is freely available over the Internet at http://www.biowaredb.org/

9.
Science is a social process with far-reaching impact on our modern society. In recent years, for the first time, we have been able to study science itself scientifically. This is enabled by the massive amounts of data on scientific publications that are increasingly becoming available. The data are contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders this study. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We found that identifying a single "best" database is far from easy. Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies.

10.

Background

In recent years large bibliographic databases have made much of the published literature of biology available for searches. However, the capabilities of the search engines integrated into these databases for text-based bibliographic searches are limited. To enable searches that deliver the results expected by comparative anatomists, an underlying logical structure known as an ontology is required.

Development and Testing of the Ontology

Here we present the Mammalian Feeding Muscle Ontology (MFMO), a multi-species ontology focused on anatomical structures that participate in feeding and other oral/pharyngeal behaviors. A unique feature of the MFMO is that a simple, computable, definition of each muscle, which includes its attachments and innervation, is true across mammals. This construction mirrors the logical foundation of comparative anatomy and permits searches using language familiar to biologists. Further, it provides a template for muscles that will be useful in extending any anatomy ontology. The MFMO is developed to support the Feeding Experiments End-User Database Project (FEED, https://feedexp.org/), a publicly-available, online repository for physiological data collected from in vivo studies of feeding (e.g., mastication, biting, swallowing) in mammals. Currently the MFMO is integrated into FEED and also into two literature-specific implementations of Textpresso, a text-mining system that facilitates powerful searches of a corpus of scientific publications. We evaluate the MFMO by asking questions that test the ability of the ontology to return appropriate answers (competency questions). We compare the results of queries of the MFMO to results from similar searches in PubMed and Google Scholar.

Results and Significance

Our tests demonstrate that the MFMO is competent to answer queries formed in the common language of comparative anatomy, but PubMed and Google Scholar are not. Overall, our results show that by incorporating anatomical ontologies into searches, an expanded and anatomically comprehensive set of results can be obtained. The broader scientific and publishing communities should consider taking up the challenge of semantically enabled search capabilities.

11.

Background

Evidence about relevant outcomes is required in the evaluation of clinical interventions for children with autism spectrum disorders (ASD). However, to date, the variety of outcome measurement tools being used, and lack of knowledge about the measurement properties of some, compromise conclusions regarding the most effective interventions.

Objectives

This two-stage systematic review aimed to identify the tools used in studies evaluating interventions for anxiety for high-functioning children with ASD in middle childhood, and then to evaluate the tools for their appropriateness and measurement properties.

Methods

Electronic databases including Medline, PsycINFO, Embase, and the Cochrane database and registers were searched for anxiety intervention studies for children with ASD in middle childhood. Articles examining the measurement properties of the tools used were then searched for using a methodological filter in PubMed, and the quality of the papers was evaluated using the COSMIN checklist.

Results

Ten intervention studies were identified in which six tools measuring anxiety and one of overall symptom change were used as primary outcomes. One further tool was included as it is recommended for standard use in UK children's mental health services. Sixty three articles on the properties of the tools were evaluated for the quality of evidence, and the quality of the measurement properties of each tool was summarised.

Conclusions

Overall, three questionnaires were found to be robust in their measurement properties: the Spence Children's Anxiety Scale; its revised version, the Revised Children's Anxiety and Depression Scale; and the Screen for Child Anxiety Related Emotional Disorders. Crucially, the articles on measurement properties provided almost no evidence on responsiveness to change, nor on the validity of use of the tools for evaluation of interventions for children with ASD.

PROSPERO Registration number

CRD42012002684.

12.
Traditional laboratory experiments, rehabilitation clinics, and wearable sensors offer biomechanists a wealth of data on healthy and pathological movement. To harness the power of these data and make research more efficient, modern machine learning techniques are starting to complement traditional statistical tools. This survey summarizes the current usage of machine learning methods in human movement biomechanics and highlights best practices that will enable critical evaluation of the literature. We carried out a PubMed/Medline database search for original research articles that used machine learning to study movement biomechanics in patients with musculoskeletal and neuromuscular diseases. Most studies that met our inclusion criteria focused on classifying pathological movement, predicting risk of developing a disease, estimating the effect of an intervention, or automatically recognizing activities to facilitate out-of-clinic patient monitoring. We found that research studies build and evaluate models inconsistently, which motivated our discussion of best practices. We provide recommendations for training and evaluating machine learning models and discuss the potential of several underutilized approaches, such as deep learning, to generate new knowledge about human movement. We believe that cross-training biomechanists in data science and a cultural shift toward sharing of data and tools are essential to maximize the impact of biomechanics research.

13.
The large amount of information contained in bibliographic databases has recently boosted the use of citations, and other indicators based on citation numbers, as tools for the quantitative assessment of scientific research. Citation counts are often interpreted as proxies for the scientific influence of papers, journals, scholars, and institutions. However, a rigorous and scientifically grounded methodology for a correct use of citation counts is still missing. In particular, cross-disciplinary comparisons in terms of raw citation counts systematically favor scientific disciplines with higher citation and publication rates. Here we perform an exhaustive study of the citation patterns of millions of papers, and derive a simple transformation of citation counts able to suppress the disproportionate citation counts among scientific domains. We find that the transformation is well described by a power-law function, and that the parameter values of the transformation are typical features of each scientific discipline. Universal properties of citation patterns descend therefore from the fact that citation distributions for papers in a specific field are all part of the same family of univariate distributions.

14.

Background

The number of retracted scholarly articles has risen precipitously in recent years. Past surveys of the retracted literature each limited their scope to articles in PubMed, though many retracted articles are not indexed in PubMed. To understand the scope and characteristics of retracted articles across the full spectrum of scholarly disciplines, we surveyed 42 of the largest bibliographic databases for major scholarly fields and publisher websites to identify retracted articles. This study examines various trends among them.

Results

We found 4,449 scholarly publications retracted from 1928–2011. Unlike Math, Physics, Engineering and Social Sciences, the percentages of retractions in Medicine, Life Science and Chemistry exceeded their percentages among Web of Science (WoS) records. Retractions due to alleged publishing misconduct (47%) outnumbered those due to alleged research misconduct (20%) or questionable data/interpretations (42%). This total exceeds 100% since multiple justifications were listed in some retraction notices. Retraction/WoS record ratios vary among author affiliation countries. Though retractions are widespread, only minuscule percentages of publications for individual years, countries, journals, or disciplines have been retracted. Fifteen prolific individuals accounted for more than half of all retractions due to alleged research misconduct, and strongly influenced all retraction characteristics. The number of articles retracted per year increased by a factor of 19.06 from 2001 to 2010, though excluding repeat offenders and adjusting for growth of the published literature decreases it to a factor of 11.36.

Conclusions

Retracted articles occur across the full spectrum of scholarly disciplines. Most retracted articles do not contain flawed data; and the authors of most retracted articles have not been accused of research misconduct. Despite recent increases, the proportion of published scholarly literature affected by retraction remains very small. Articles and editorials discussing retractions, or their relation to research integrity, should always consider individual cases in these broad contexts. However, better mechanisms are still needed for raising researchers’ awareness of the retracted literature in their field.

15.
OBJECTIVES: To describe the multicentre clinical databases that exist in the United Kingdom, to report on their quality, to explore which organisational and managerial features are associated with high quality, and to make recommendations for improvements. DESIGN: Cross sectional survey, with interviews with database custodians and search of electronic bibliographic database (PubMed). STUDIES REVIEWED: 105 clinical databases across the United Kingdom. RESULTS: Clinical databases existed in all areas of health care, but their distribution was uneven: cancer and surgery were better covered than mental health and obstetrics. They varied greatly in age, size, growth rate, and geographical areas covered. Their scope (and thus their potential uses) and the quality of the data collected also varied. The latter was not associated with any organisational characteristics. Despite impressive achievements, many faced substantial financial uncertainty. Considerable scope existed for improvements: greater use of nationally approved codes; more support from relevant professional organisations; greater involvement by nurses, allied health professionals, managers, and laypeople in database management teams; and more attention to data security and ensuring patient confidentiality. With some notable exceptions, the audit and research potential of most databases had not been realised: half the databases had each produced only four or fewer peer reviewed research articles. CONCLUSIONS: At least one clinical database support unit is needed in the United Kingdom to provide assistance in organisation and management, information technology, epidemiology, and statistics. Without such an initiative, the variable picture of databases reported here is likely to persist and their potential not be realised.

16.
Collecting and analysing all available literature before starting a new animal experiment is important and it is indispensable when writing systematic reviews of animal research. In practice, finding all animal studies relevant to a specific research question turns out to be anything but simple. In order to facilitate this search process, we previously developed a search filter for retrieving animal studies in the most often used biomedical database, PubMed. It is a general requirement for systematic reviews, however, that at least two databases are searched. In this report, we therefore present a similar search filter for a second important database, namely Embase. We show that our filter retrieves more animal studies than (a combination of) the options currently available in Embase. Our search filters for PubMed and Embase therefore represent valuable tools for improving the quality of (systematic) reviews and thereby of new animal experiments.

17.
SUMMARY: With the availability of whole genome sequence in many species, linkage analysis, positional cloning and microarray are gradually becoming powerful tools for investigating the links between phenotype and genotype or genes. However, in these methods, causative genes underlying a quantitative trait locus, or a disease, are usually located within a large genomic region or a large set of genes. Examining the function of every gene is very time consuming and needs to retrieve and integrate the information from multiple databases or genome resources. PGMapper is a software tool for automatically matching phenotype to genes from a defined genome region or a group of given genes by combining the mapping information from the Ensembl database and gene function information from the OMIM and PubMed databases. PGMapper is currently available for candidate gene search of human, mouse, rat, zebrafish and 12 other species. AVAILABILITY: Available online at http://www.genediscovery.org/pgmapper/index.jsp.

18.
MOTIVATION: Protein-protein interactions play critical roles in biological processes, and many biologists try to find or to predict crucial information concerning these interactions. Before verifying interactions in biological laboratory work, validating them from previous research is necessary. Although many efforts have been made to create databases that store verified information in a structured form, much interaction information still remains as unstructured text. As the number of new publications has increased rapidly, a large amount of research has sought to extract interactions from the text automatically. However, there remain various difficulties associated with the process of applying automatically generated results to manually annotated databases. For interactions that are not found in manually stored databases, researchers attempt to search for abstracts or full papers. RESULTS: As a result of a search for two proteins, PubMed frequently returns hundreds of abstracts. In this paper, a method is introduced that validates protein-protein interactions from PubMed abstracts. A query is generated from two given proteins automatically and abstracts are then collected from PubMed. Following this, target proteins and their synonyms are recognized and their interaction information is extracted from the collection. It was found that 67.37% of the interactions from the DIP-PPI corpus were found in the PubMed abstracts and 87.37% of the interactions were found in the given full texts. AVAILABILITY: Contact authors.
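
The query-generation step mentioned in this abstract might look something like the sketch below. The abstract does not describe the authors' actual query format, so the OR-within-protein, AND-across-proteins structure, the helper names `build_pair_query` and `esearch_url`, and the example synonyms are assumptions. The URL targets the NCBI E-utilities `esearch` endpoint with its `db`, `term` and `retmax` parameters; the sketch only builds the URL and makes no network request.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint for PubMed.
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_pair_query(names_a, names_b):
    """OR together the synonyms of each protein, then AND the two groups."""
    def group(names):
        return "(" + " OR ".join(f'"{n}"[Title/Abstract]' for n in names) + ")"
    return f"{group(names_a)} AND {group(names_b)}"


def esearch_url(term, retmax=100):
    """Build (but do not send) an esearch request URL for the given term."""
    params = {"db": "pubmed", "term": term, "retmax": retmax}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Hypothetical synonym lists for two proteins.
term = build_pair_query(["MDM2", "HDM2"], ["TP53", "p53"])
url = esearch_url(term)
print(term)
print(url)
```

Restricting the match to the title and abstract fields is one possible design choice; a broader query would retrieve more abstracts at the cost of more irrelevant hits to filter in the later extraction step.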

19.
The DNA microarray technology has arguably caught the attention of the worldwide life science community and is now systematically supporting major discoveries in many fields of study. The majority of the initial technical challenges of conducting experiments are being resolved, only to be replaced with new informatics hurdles, including statistical analysis, data visualization, interpretation, and storage. Two systems of databases, one containing expression data and one containing annotation data are quickly becoming essential knowledge repositories of the research community. This present paper surveys several databases, which are considered "pillars" of research and important nodes in the network. This paper focuses on a generalized workflow scheme typical for microarray experiments using two examples related to cancer research. The workflow is used to reference appropriate databases and tools for each step in the process of array experimentation. Additionally, benefits and drawbacks of current array databases are addressed, and suggestions are made for their improvement.

20.
Text processing through Web services: calling Whatizit   Total citations: 1, self-citations: 0, citations by others: 1
MOTIVATION: Text-mining (TM) solutions are developing into efficient services to researchers in the biomedical research community. Such solutions have to scale with the growing number and size of resources (e.g. available controlled vocabularies), with the amount of literature to be processed (e.g. about 17 million documents in PubMed) and with the demands of the user community (e.g. different methods for fact extraction). These demands motivated the development of a server-based solution for literature analysis. Whatizit is a suite of modules that analyse text for contained information, e.g. any scientific publication or Medline abstracts. Special modules identify terms and then link them to the corresponding entries in bioinformatics databases such as UniProtKB/Swiss-Prot data entries and gene ontology concepts. Other modules identify a set of selected annotation types like the set produced by the EBIMed analysis pipeline for proteins. In the case of Medline abstracts, Whatizit offers access to EBI's in-house installation via PMID or term query. For large quantities of the user's own text, the server can be operated in a streaming mode (http://www.ebi.ac.uk/webservices/whatizit).
