Similar articles
 20 similar articles found (search time: 31 ms)
1.
Kim D, Yu H. PLoS ONE. 2011;6(1):e15338

Background

Figures are ubiquitous in biomedical full-text articles, and they represent important biomedical knowledge. However, the sheer volume of biomedical publications has made it necessary to develop computational approaches for accessing figures. Therefore, we are developing the Biomedical Figure Search engine (http://figuresearch.askHERMES.org) to allow bioscientists to access figures efficiently. Since text frequently appears in figures, automatically extracting such text may assist the task of mining information from figures. Little research, however, has been conducted exploring text extraction from biomedical figures.

Methodology

We first evaluated an off-the-shelf Optical Character Recognition (OCR) tool on its ability to extract text from figures appearing in biomedical full-text articles. We then developed a Figure Text Extraction Tool (FigTExT) to improve the performance of the OCR tool for figure text extraction through the use of three innovative components: image preprocessing, character recognition, and text correction. We first developed image preprocessing to enhance image quality and to improve text localization. Then we adapted the off-the-shelf OCR tool on the improved text localization for character recognition. Finally, we developed and evaluated a novel text correction framework by taking advantage of figure-specific lexicons.
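
The text correction step described above can be illustrated with a minimal sketch. It assumes a simple approach in which each OCR token is replaced by its closest entry in a figure-specific lexicon; the function name `correct_tokens`, the similarity cutoff, and the example lexicon are all hypothetical, and the abstract does not state that FigTExT works this way.

```python
import difflib

def correct_tokens(ocr_tokens, lexicon, cutoff=0.8):
    """Replace each OCR token with its closest lexicon entry, if one is similar enough.

    Illustrative only: the cutoff and the matching strategy are assumptions,
    not details taken from the FigTExT paper.
    """
    corrected = []
    for token in ocr_tokens:
        # get_close_matches returns the best matches above the similarity cutoff.
        matches = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected

# Hypothetical lexicon (e.g. words drawn from a figure's caption) and OCR output.
lexicon = ["apoptosis", "control", "kinase", "wild-type"]
ocr_output = ["ap0ptosis", "contro1", "kinase", "xyz"]
print(correct_tokens(ocr_output, lexicon))
```

Tokens with no sufficiently similar lexicon entry are left unchanged, so the sketch never discards OCR output that the lexicon does not cover.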

Results/Conclusions

The evaluation on 382 figures (9,643 figure texts in total) randomly selected from PubMed Central full-text articles shows that FigTExT performed with 84% precision, 98% recall, and 90% F1-score for text localization and with 62.5% precision, 51.0% recall and 56.2% F1-score for figure text extraction. When limiting figure texts to those judged by domain experts to be important content, FigTExT performed with 87.3% precision, 68.8% recall, and 77% F1-score. FigTExT significantly improved the performance of the off-the-shelf OCR tool we used, which on its own performed with 36.6% precision, 19.3% recall, and 25.3% F1-score for text extraction. In addition, our results show that FigTExT can extract texts that do not appear in figure captions or other associated text, further suggesting the potential utility of FigTExT for improving figure search.

Figure 9. Additional reasons for OCR errors. (A) High image complexity. (B) Thick stroke. (C) Low image contrast. (D) Small font size. (E) Non-standard font type.
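
The F1-scores reported in the results above are the harmonic mean of precision and recall. As a quick check, the sketch below recomputes F1 from each reported precision/recall pair; the dictionary labels are shorthand introduced here, not names used in the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall) pairs as reported in the abstract.
reported = {
    "text localization": (0.84, 0.98),
    "figure text extraction": (0.625, 0.510),
    "important content only": (0.873, 0.688),
    "off-the-shelf OCR alone": (0.366, 0.193),
}

for name, (p, r) in reported.items():
    print(f"{name}: F1 = {f1_score(p, r):.1%}")
```

The recomputed values agree with the reported F1-scores up to rounding.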

2.
SUMMARY: Figures in biomedical articles present visual evidence for research facts and help readers understand the article better. However, when figures are taken out of context, it is difficult to understand their content. We developed a summarization algorithm to summarize the content of figures and used it in our figure search engine (http://figuresearch.askhermes.org/). In this article, we report on the development of web browser extensions for Mozilla Firefox, Google Chrome and Apple Safari to display summaries for figures in PubMed Central and NCBI Images. AVAILABILITY: The extensions can be downloaded from http://figuresearch.askhermes.org/articlesearch/extensions.php.

3.

Background

After many years of general neglect, interest has grown and efforts have come under way for the mapping, control, surveillance, and eventual elimination of neglected tropical diseases (NTDs). Disease risk estimates are a key feature to target control interventions, and serve as a benchmark for monitoring and evaluation. What is currently missing is a georeferenced global database for NTDs providing open access to the available survey data that is constantly updated and can be utilized by researchers and disease control managers to support other relevant stakeholders. We describe the steps taken toward the development of such a database that can be employed for spatial disease risk modeling and control of NTDs.

Methodology

With an emphasis on schistosomiasis in Africa, we systematically searched the literature (peer-reviewed journals and ‘grey literature’), contacted Ministries of Health and research institutions in schistosomiasis-endemic countries for location-specific prevalence data and survey details (e.g., study population, year of survey and diagnostic techniques). The data were extracted, georeferenced, and stored in a MySQL database with a web interface allowing free database access and data management.

Principal Findings

At the beginning of 2011, our database contained more than 12,000 georeferenced schistosomiasis survey locations from 35 African countries available under http://www.gntd.org. Currently, the database is being expanded into a global repository, including a host of other NTDs, e.g. soil-transmitted helminthiasis and leishmaniasis.

Conclusions

An open-access, spatially explicit NTD database offers unique opportunities for disease risk modeling, targeting control interventions, disease monitoring, and surveillance. Moreover, it allows for detailed geostatistical analyses of disease distribution in space and time. With an initial focus on schistosomiasis in Africa, we demonstrate the proof-of-concept that the establishment and running of a global NTD database is feasible and should be expanded without delay.

4.
5.

Background

Flower colour is of great importance in various fields relating to floral biology and pollinator behaviour. However, subjective human judgements of flower colour may be inaccurate and are irrelevant to the ecology and vision of the flower's pollinators. For precise, detailed information about the colours of flowers, a full reflectance spectrum for the flower of interest should be used rather than relying on such human assessments.

Methodology/Principal Findings

The Floral Reflectance Database (FReD) has been developed to make an extensive collection of such data available to researchers. It is freely available at http://www.reflectance.co.uk. The database allows users to download spectral reflectance data for flower species collected from all over the world. These could, for example, be used in modelling interactions between pollinator vision and plant signals, or analyses of flower colours in various habitats. The database contains functions for calculating flower colour loci according to widely-used models of bee colour space, reflectance graphs of the spectra and an option to search for flowers with similar colours in bee colour space.
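
As an illustration of a colour locus in bee colour space, the sketch below uses the colour hexagon, one widely used model. It assumes the photoreceptor excitations for the UV, blue, and green receptors (each between 0 and 1) have already been derived from a reflectance spectrum; the function names and the example excitation values are hypothetical, and the database's own calculations may differ.

```python
import math

def hexagon_locus(e_uv, e_blue, e_green):
    """Map UV, blue, and green receptor excitations (0..1) to colour hexagon coordinates."""
    x = math.sqrt(3) / 2 * (e_green - e_uv)
    y = e_blue - 0.5 * (e_uv + e_green)
    return x, y

def colour_distance(locus_a, locus_b):
    """Euclidean distance between two loci in the hexagon plane."""
    return math.hypot(locus_a[0] - locus_b[0], locus_a[1] - locus_b[1])

# Hypothetical excitation values for two flowers.
flower_a = hexagon_locus(0.2, 0.7, 0.5)
flower_b = hexagon_locus(0.3, 0.6, 0.5)
print(flower_a, flower_b, colour_distance(flower_a, flower_b))
```

A search for flowers with similar colours could then rank species by `colour_distance` to a query locus, with equal excitation of all three receptors mapping to the achromatic centre at (0, 0).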

Conclusions/Significance

The Floral Reflectance Database is a valuable new tool for researchers interested in the colours of flowers and their association with pollinator colour vision, containing raw spectral reflectance data for a large number of flower species.

6.

Background

Personalized feedback is a promising self-help for problem gamblers. Such interventions have shown consistently positive results with other addictive behaviours, and our own pilot test of personalized normative feedback materials for gamblers yielded positive findings. The current randomized controlled trial evaluated the effectiveness, and the sustained efficacy, of the personalized feedback intervention materials for problem gamblers.

Methodology/Principal Findings

Respondents recruited by a general population telephone screener of Ontario adults included gamblers with moderate and severe gambling problems. Those who agreed to participate were randomly assigned to receive: 1) the full personalized normative feedback intervention; 2) a partial feedback that contained all the feedback information provided to those in condition 1 but without the normative feedback content (i.e., no comparisons provided to general population gambling norms); or 3) a waiting list control condition. The primary hypothesis was that problem gamblers who received the personalized normative feedback intervention would reduce their gambling more than problem gamblers who did not receive any intervention (waiting list control condition) by the six-month follow-up.

Conclusions/Significance

The study found no evidence for the impact of normative personalized feedback. However, participants who received the partial feedback (without norms) reduced the number of days they gambled compared to participants who did not receive the intervention. We concluded that personalized feedback interventions were well received and the materials may be helpful at reducing gambling. Realistically, it can be expected that the personalized feedback intervention may have a limited, short-term impact on the severity of participants' problem gambling because the intervention is just a brief screener. An Internet-based version of the personalized feedback intervention tool, however, may offer an easy-to-access and non-threatening portal that can be used to motivate participants to seek further help online or in person.

Trial Registration

ClinicalTrials.gov NCT00578357

7.

Background

Interventions delivered via the Internet have the potential to address the problem of hazardous alcohol consumption at minimal incremental cost, with potentially major public health implications. It was hypothesised that providing access to a psychologically enhanced website would result in greater reductions in drinking and related problems than giving access to a typical alcohol website simply providing information on potential harms of alcohol. DYD-RCT Trial registration: ISRCTN 31070347.

Methodology/Principal Findings

A two-arm randomised controlled trial was conducted entirely on-line through the Down Your Drink (DYD) website. A total of 7935 individuals who screened positive for hazardous alcohol consumption were recruited and randomized. At entry to the trial, the geometric mean reported past week alcohol consumption was 46.0 (SD 31.2) units. Consumption levels reduced substantially in both groups at the principal 3 month assessment point to an average of 26.0 (SD 22.3) units. Similar changes were reported at 1 month and 12 months. There were no significant differences between the groups for either alcohol consumption at 3 months (intervention: control ratio of geometric means 1.03, 95% CI 0.97 to 1.10) or for this outcome and the main secondary outcomes at any of the assessments. The results were not materially changed following imputation of missing values, nor was there any evidence that the impact of the intervention varied with baseline measures or level of exposure to the intervention.

Conclusions/Significance

Findings did not provide support for the hypothesis that access to a psychologically enhanced website confers additional benefit over standard practice and indicate the need for further research to optimise the effectiveness of Internet-based behavioural interventions. The trial demonstrates a widespread and potentially sustainable demand for Internet based interventions for people with hazardous alcohol consumption, which could be delivered internationally.

Trial Registration

Controlled-Trials.com ISRCTN31070347

8.
9.

Background

Current technologies have led to the availability of multiple genomic data types in sufficient quantity and quality to serve as a basis for automatic global network inference. Accordingly, there are currently a large variety of network inference methods that learn regulatory networks to varying degrees of detail. These methods have different strengths and weaknesses and thus can be complementary. However, combining different methods in a mutually reinforcing manner remains a challenge.

Methodology

We investigate how three scalable methods can be combined into a useful network inference pipeline. The first is a novel t-test–based method that relies on a comprehensive steady-state knock-out dataset to rank regulatory interactions. The remaining two are previously published mutual information and ordinary differential equation based methods (tlCLR and Inferelator 1.0, respectively) that use both time-series and steady-state data to rank regulatory interactions; the latter has the added advantage of also inferring dynamic models of gene regulation which can be used to predict the system's response to new perturbations.

Conclusion/Significance

Our t-test based method proved powerful at ranking regulatory interactions, tying for first out of methods in the DREAM4 100-gene in-silico network inference challenge. We demonstrate complementarity between this method and the two methods that take advantage of time-series data by combining the three into a pipeline whose ability to rank regulatory interactions is markedly improved compared to either method alone. Moreover, the pipeline is able to accurately predict the response of the system to new conditions (in this case new double knock-out genetic perturbations). Our evaluation of the performance of multiple methods for network inference suggests avenues for future methods development and provides simple considerations for genomic experimental design. Our code is publicly available at http://err.bio.nyu.edu/inferelator/.

10.
Biomedical literature incorporates millions of figures, which are a rich and important knowledge resource for biomedical researchers. Scientists need access to the figures and the knowledge they represent in order to validate research findings and to generate new hypotheses. By themselves, these figures are nearly always incomprehensible to both humans and machines and their associated texts are therefore essential for full comprehension. The associated text of a figure, however, is scattered throughout its full-text article and contains redundant information content. In this paper, we report the continued development and evaluation of several figure summarization systems, the FigSum+ systems, that automatically identify associated texts, remove redundant information, and generate a text summary for every figure in an article. Using a set of 94 annotated figures selected from 19 different journals, we conducted an intrinsic evaluation of FigSum+. We evaluate the performance by precision, recall, F1, and ROUGE scores. The best FigSum+ system is based on an unsupervised method, achieving F1 score of 0.66 and ROUGE-1 score of 0.97. The annotated data is available at figshare.com (http://figshare.com/articles/Figure_Associated_Text_Summarization_and_Evaluation/858903).
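
To make the ROUGE-1 metric mentioned above concrete, here is a minimal sketch of its recall form: the fraction of reference unigrams that also appear in the generated summary, with clipped counts. The abstract does not say which ROUGE-1 variant or tokenization was used, so whitespace tokenization, lowercasing, and the example sentences are assumptions made for illustration.

```python
from collections import Counter

def rouge1_recall(candidate, reference):
    """Unigram recall of a candidate summary against a reference summary."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    # Each reference word is credited at most as often as it occurs in the candidate.
    overlap = sum(min(count, cand_counts[word]) for word, count in ref_counts.items())
    total = sum(ref_counts.values())
    return overlap / total if total else 0.0

# Hypothetical summaries: four of the five reference words are recovered.
print(rouge1_recall("the figure shows protein expression",
                    "the figure shows expression levels"))
```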

11.

Background

Observers misperceive the location of points within a scene as compressed towards the goal of a saccade. However, recent studies suggest that saccadic compression does not occur for discrete elements such as dots when they are perceived as unified objects like a rectangle.

Methodology/Principal Findings

We investigated the magnitude of horizontal vs. vertical compression for Kanizsa figures (collections of discrete elements unified into single perceptual objects by illusory contours) and control rectangle figures. Participants were presented with Kanizsa and control figures and had to decide whether the horizontal or vertical length of the stimulus was longer using the two-alternative forced-choice method. Our findings show that large but not small Kanizsa figures are perceived as compressed, and that such compression is large in the horizontal dimension and small or nil in the vertical dimension. In contrast to recent findings, we found no saccadic compression for control rectangles.

Conclusions

Our data suggest that compression of Kanizsa figures has been overestimated in previous research due to methodological artifacts, and highlight the importance of studying perceptual phenomena by multiple methods.

12.

Background

Omega-3 fatty acids are dietary essentials, and the current low intakes in most modern developed countries are believed to contribute to a wide variety of physical and mental health problems. Evidence from clinical trials indicates that dietary supplementation with long-chain omega-3 may improve child behavior and learning, although most previous trials have involved children with neurodevelopmental disorders such as attention-deficit/hyperactivity disorder (ADHD) or developmental coordination disorder (DCD). Here we investigated whether such benefits might extend to the general child population.

Objectives

To determine the effects of dietary supplementation with the long-chain omega-3 docosahexaenoic acid (DHA) on the reading, working memory, and behavior of healthy schoolchildren.

Design

Parallel group, fixed-dose, randomized, double-blind, placebo-controlled trial (RCT).

Setting

Mainstream primary schools in Oxfordshire, UK (n = 74).

Participants

Healthy children aged 7–9 years initially underperforming in reading (≤33rd centile). 1376 invited, 362 met study criteria.

Intervention

600 mg/day DHA (from algal oil), or taste/color matched corn/soybean oil placebo.

Main Outcome Measures

Age-standardized measures of reading, working memory, and parent- and teacher-rated behavior.

Results

ITT analyses showed no effect of DHA on reading in the full sample, but significant effects in the pre-planned subgroup of 224 children whose initial reading performance was ≤20th centile (the target population in our original study design). Parent-rated behavior problems (ADHD-type symptoms) were significantly reduced by active treatment, but little or no effects were seen for either teacher-rated behavior or working memory.

Conclusions

DHA supplementation appears to offer a safe and effective way to improve reading and behavior in healthy but underperforming children from mainstream schools. Replication studies are clearly warranted, as such children are known to be at risk of low educational and occupational outcomes in later life.

Trial Registration

ClinicalTrials.gov NCT01066182 and Controlled-Trials.com ISRCTN99771026

13.
When reading bioscience journal articles, many researchers focus attention on the figures and their captions. This observation led to the development of the BioText literature search engine [1], a freely available Web-based application that allows biologists to search over the contents of Open Access Journals, and see figures from the articles displayed directly in the search results. This article presents a qualitative assessment of this system in the form of a usability study with 20 biologist participants using and commenting on the system. 19 out of 20 participants expressed a desire to use a bioscience literature search engine that displays articles' figures alongside the full text search results. 15 out of 20 participants said they would use a caption search and figure display interface either frequently or sometimes, while 4 said rarely and 1 said undecided. 10 out of 20 participants said they would use a tool for searching the text of tables and their captions either frequently or sometimes, while 7 said they would use it rarely if at all, 2 said they would never use it, and 1 was undecided. This study found evidence, supporting results of an earlier study, that bioscience literature search systems such as PubMed should show figures from articles alongside search results. It also found evidence that full text and captions should be searched along with the article title, metadata, and abstract. Finally, for a subset of users and information needs, allowing for explicit search within captions for figures and tables is a useful function, but it is not entirely clear how to cleanly integrate this within a more general literature search interface. Such a facility supports Open Access publishing efforts, as it requires access to full text of documents and the lifting of restrictions in order to show figures in the search interface.

14.
15.
Jones M, Ghoorah A, Blaxter M. PLoS ONE. 2011;6(4):e19259

Background

DNA barcoding and other DNA sequence-based techniques for investigating and estimating biodiversity require explicit methods for associating individual sequences with taxa, as it is at the taxon level that biodiversity is assessed. For many projects, the bioinformatic analyses required pose problems for laboratories whose prime expertise is not in bioinformatics. User-friendly tools are required for both clustering sequences into molecular operational taxonomic units (MOTU) and for associating these MOTU with known organismal taxonomies.

Results

Here we present jMOTU, a Java program for the analysis of DNA barcode datasets that uses an explicit, determinate algorithm to define MOTU. We demonstrate its usefulness for both individual specimen-based Sanger sequencing surveys and bulk-environment metagenetic surveys using long-read next-generation sequencing data. jMOTU is driven through a graphical user interface, and can analyse tens of thousands of sequences in a short time on a desktop computer. A companion program, Taxonerator, that adds traditional taxonomic annotation to MOTU, is also presented. Clustering and taxonomic annotation data are stored in a relational database, and are thus amenable to subsequent data mining and web presentation.
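
The idea of clustering sequences into MOTU with an explicit cutoff can be sketched generically. The example below is not jMOTU's algorithm, which the abstract does not detail; it assumes pre-aligned, equal-length sequences, counts mismatched positions, and groups any sequences linked by a chain of pairs within the cutoff (single-linkage clustering). The function names and the toy sequences are hypothetical.

```python
def mismatches(seq_a, seq_b):
    """Number of differing positions between two aligned, equal-length sequences."""
    return sum(a != b for a, b in zip(seq_a, seq_b))

def cluster_motu(sequences, cutoff):
    """Group sequence indices into clusters linked by pairs within `cutoff` mismatches."""
    parent = list(range(len(sequences)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(sequences)):
        for j in range(i + 1, len(sequences)):
            if mismatches(sequences[i], sequences[j]) <= cutoff:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(sequences)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())

# Toy barcodes: the first three chain together at a cutoff of one mismatch.
barcodes = ["ACGTACGT", "ACGTACGA", "ACGAACGA", "TTTTCCCC"]
print(cluster_motu(barcodes, cutoff=1))
```

The all-pairs comparison is quadratic in the number of sequences, so a real tool handling tens of thousands of sequences would likely need a more efficient strategy than this sketch.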

Conclusions

jMOTU efficiently and robustly identifies the molecular taxa present in survey datasets, and Taxonerator decorates the MOTU with putative identifications. jMOTU and Taxonerator are freely available from http://www.nematodes.org/.

16.

Background

Linkage studies often yield intervals containing several hundred positional candidate genes. Different manual or automatic approaches exist for the determination of the gene most likely to cause the disease. While the manual search is very flexible and takes advantage of the researchers' background knowledge and intuition, it may be very cumbersome to collect and study the relevant data. Automatic solutions on the other hand usually focus on certain models, remain “black boxes” and do not offer the same degree of flexibility.

Methodology

We have developed a web-based application that combines the advantages of both approaches. Information from various data sources such as gene-phenotype associations, gene expression patterns and protein-protein interactions was integrated into a central database. Researchers can select which information for the genes within a candidate interval or for single genes shall be displayed. Genes can also interactively be filtered, sorted and prioritised according to criteria derived from the background knowledge and preconception of the disease under scrutiny.

Conclusions

GeneDistiller provides knowledge-driven, fully interactive and intuitive access to multiple data sources. It displays maximum relevant information, while saving the user from drowning in the flood of data. A typical query takes less than two seconds, thus allowing an interactive and explorative approach to the hunt for the candidate gene.

Access

GeneDistiller can be freely accessed at http://www.genedistiller.org

17.

Background

Nanolipoprotein particles (NLPs) are discoidal, nanometer-sized particles comprised of self-assembled phospholipid membranes and apolipoproteins. NLPs assembled with human apolipoproteins have been used for myriad biotechnology applications, including membrane protein solubilization, drug delivery, and diagnostic imaging. To expand the repertoire of lipoproteins for these applications, insect apolipophorin-III (apoLp-III) was evaluated for the ability to form discretely-sized, homogeneous, and stable NLPs.

Methodology

Four NLP populations distinct with regard to particle diameters (ranging in size from 10 nm to >25 nm) and lipid-to-apoLp-III ratios were readily isolated to high purity by size exclusion chromatography. Remodeling of the purified NLP species over time at 4°C was monitored by native gel electrophoresis, size exclusion chromatography, and atomic force microscopy. Purified 20 nm NLPs displayed no remodeling and remained stable for over 1 year. Purified NLPs with 10 nm and 15 nm diameters ultimately remodeled into 20 nm NLPs over a period of months. Intra-particle chemical cross-linking of apoLp-III stabilized NLPs of all sizes.

Conclusions

ApoLp-III-based NLPs can be readily prepared, purified, characterized, and stabilized, suggesting their utility for biotechnological applications.

18.
19.
20.