Similar articles
 20 similar articles found (search time: 15 ms)
1.
Hundreds of millions of figures are available in biomedical literature, representing important biomedical experimental evidence. Since text is a rich source of information in figures, automatically extracting such text may assist in the task of mining figure information. A high-quality ground truth standard can greatly facilitate the development of an automated system. This article describes DeTEXT: A database for evaluating text extraction from biomedical literature figures. It is the first publicly available, human-annotated, high-quality, and large-scale figure-text dataset with 288 full-text articles, 500 biomedical figures, and 9308 text regions. This article describes how figures were selected from open-access full-text biomedical articles and how annotation guidelines and annotation tools were developed. We also discuss the inter-annotator agreement and the reliability of the annotations. We summarize the statistics of the DeTEXT data and make available evaluation protocols for DeTEXT. Finally, we lay out challenges we observed in the automated detection and recognition of figure text and discuss research directions in this area. DeTEXT is publicly available for downloading at http://prir.ustb.edu.cn/DeTEXT/.

2.
3.
Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, storage technology and visualization tools need to evolve to offer the scientific community fast and convenient access to these data. We simultaneously introduce a database system to store and query 3D genomic data (3DBG) and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions and, importantly, scales gracefully with the size of the data. We also illustrate the usefulness of our 3D genome Web browser for exploring human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.ca/.

4.
Viral phylodynamics is defined as the study of how epidemiological, immunological, and evolutionary processes act and potentially interact to shape viral phylogenies. Since the coining of the term in 2004, research on viral phylodynamics has focused on transmission dynamics in an effort to shed light on how these dynamics impact viral genetic variation. Transmission dynamics can be considered at the level of cells within an infected host, individual hosts within a population, or entire populations of hosts. Many viruses, especially RNA viruses, rapidly accumulate genetic variation because of short generation times and high mutation rates. Patterns of viral genetic variation are therefore heavily influenced by how quickly transmission occurs and by which entities transmit to one another. Patterns of viral genetic variation will also be affected by selection acting on viral phenotypes. Although viruses can differ with respect to many phenotypes, phylodynamic studies have to date tended to focus on a limited number of viral phenotypes. These include virulence phenotypes, phenotypes associated with viral transmissibility, cell or tissue tropism phenotypes, and antigenic phenotypes that can facilitate escape from host immunity. Due to the impact that transmission dynamics and selection can have on viral genetic variation, viral phylogenies can therefore be used to investigate important epidemiological, immunological, and evolutionary processes, such as epidemic spread [2], spatio-temporal dynamics including metapopulation dynamics [3], zoonotic transmission, tissue tropism [4], and antigenic drift [5]. The quantitative investigation of these processes through the consideration of viral phylogenies is the central aim of viral phylodynamics.
This is a “Topic Page” article for PLOS Computational Biology.

5.
The Marine Compound Database consists of marine natural products and chemical entities, collected from various literature sources, that are known to possess bioactivity against human diseases. The database is built with HTML. It covers 182 compounds in 12 categories, each provided with its source, compound name, 2-dimensional structure, bioactivity and clinical trial information. The database is freely available online and can be accessed at http://www.progenebio.in/mcdb/index.htm

6.
Objectives: To evaluate how acceptable authors find the BMJ's current practice of publishing short versions of research articles in the paper journal and a longer version on the web and to determine authors' attitudes towards publishing only abstracts in the paper journal and publishing unedited versions on bmj.com once papers have been accepted for publication.
Design: Two cross sectional surveys.
Setting: General medical journal.
Participants: Survey 1: corresponding authors of a consecutive sample of published BMJ research articles that had undergone the ELPS (electronic long, paper short) process. Survey 2: corresponding authors of consecutive research articles submitted to BMJ.
Results: Response rates were 90% (104/115) in survey 1 and 75% (213/283) in survey 2. ELPS is largely acceptable to BMJ authors, but there is some concern that electronic information is not permanent and uncertainty about how versions are referenced. While authors who had experienced ELPS reported some problems with editors shortening papers, most were able to rectify these. Overall, 70% thought that the BMJ should continue to use ELPS; 49% thought that publishing just the abstract in the printed journal with the full version only on bmj.com was unacceptable; and 23% thought it unacceptable to post unedited versions on bmj.com once a paper had been accepted for publication.
Conclusions: It is acceptable to authors to publish short versions of research articles in the printed version of a general medical journal with longer versions on the website. Authors dislike the idea of publishing only abstracts in the printed journal but are in favour of posting accepted articles on the website ahead of the printed version.

7.
8.
9.
We have analyzed host cell genes linked to HIV replication that were identified in nine genome-wide studies, including three independent siRNA screens. Overlaps among the siRNA screens were very modest (<7% for any pairwise combination), and similarly modest overlaps were seen in pairwise comparisons with other types of genome-wide studies. Combining all genes from the genome-wide studies with genes reported in the literature to affect HIV yields 2,410 protein-coding genes, or fully 9.5% of all human genes (though some of these are, of course, false positive calls). Here we report an “encyclopedia” of all overlaps between studies (available at http://www.hostpathogen.org), which yielded a more extensively corroborated set of host factors assisting HIV replication. We used these genes to calculate refined networks that specify cellular subsystems recruited by HIV to assist in replication, and present additional analysis specifying host cell genes that are attractive as potential therapeutic targets.
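
The pairwise comparison of gene lists described above can be illustrated with a minimal sketch. The abstract does not say how the overlap percentage was defined, so the Jaccard index (shared genes divided by the union of both lists) is used here purely as an assumption, and the screen names and gene identifiers are hypothetical toy data rather than anything from the study.

```python
from itertools import combinations


def pairwise_overlap(screens):
    """Return {(name_a, name_b): fraction} for every pair of gene sets.

    Overlap is measured as |A & B| / |A | B| (Jaccard index). This
    definition is an assumption; the abstract does not state which one
    the authors used.
    """
    result = {}
    for (name_a, genes_a), (name_b, genes_b) in combinations(sorted(screens.items()), 2):
        union = genes_a | genes_b
        shared = genes_a & genes_b
        result[(name_a, name_b)] = len(shared) / len(union) if union else 0.0
    return result


# Hypothetical toy data: three screens with made-up gene identifiers.
screens = {
    "screen_A": {"G1", "G2", "G3", "G4"},
    "screen_B": {"G3", "G4", "G5"},
    "screen_C": {"G6", "G1"},
}
overlaps = pairwise_overlap(screens)
for pair, fraction in overlaps.items():
    print(pair, f"{fraction:.0%}")
```

A different denominator (for example, the size of the smaller list) would give different percentages, so any comparison with the reported <7% figure would first need the authors' definition.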

10.
The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a public resource that curates interactions between environmental chemicals and gene products, and their relationships to diseases, as a means of understanding the effects of environmental chemicals on human health. CTD provides a triad of core information in the form of chemical-gene, chemical-disease, and gene-disease interactions that are manually curated from scientific articles. To increase the efficiency, productivity, and data coverage of manual curation, we have leveraged text mining to help rank and prioritize the triaged literature. Here, we describe our text-mining process that computes and assigns each article a document relevancy score (DRS), wherein a high DRS suggests that an article is more likely to be relevant for curation at CTD. We evaluated our process by first text mining a corpus of 14,904 articles triaged for seven heavy metals (cadmium, cobalt, copper, lead, manganese, mercury, and nickel). Based upon initial analysis, a representative subset corpus of 3,583 articles was then selected from this corpus and sent to five CTD biocurators for review. The resulting curation of these 3,583 articles was analyzed for a variety of parameters, including article relevancy, novel data content, interaction yield rate, mean average precision, and biological and toxicological interpretability. We show that for all measured parameters, the DRS is an effective indicator for scoring and improving the ranking of literature for the curation of chemical-gene-disease information at CTD. Here, we demonstrate how fully incorporating text mining-based DRS scoring into our curation pipeline enhances manual curation by prioritizing more relevant articles, thereby increasing data content, productivity, and efficiency.
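
Mean average precision, one of the parameters listed above, can be sketched as follows. This is a generic illustration of the standard metric applied to articles ranked by a relevancy score. It does not reproduce how CTD computes the DRS, and the scores, relevance labels, and the grouping into two lists are hypothetical.

```python
def average_precision(scored_articles):
    """Average precision for one ranked list.

    scored_articles: list of (score, is_relevant) pairs. Articles are
    ranked by descending score; precision is sampled at every rank
    where a relevant article appears and then averaged.
    """
    ranked = sorted(scored_articles, key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, relevant) in enumerate(ranked, start=1):
        if relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(ranked_lists):
    """Mean of the average precision over several ranked lists."""
    return sum(average_precision(lst) for lst in ranked_lists) / len(ranked_lists)


# Hypothetical scores and curator judgements (True = relevant).
list_one = [(0.9, True), (0.8, False), (0.7, True), (0.1, False)]
list_two = [(0.2, True), (0.9, False)]
ap_one = average_precision(list_one)
ap_two = average_precision(list_two)
map_score = mean_average_precision([list_one, list_two])
print(round(ap_one, 3), round(ap_two, 3), round(map_score, 3))
```

A score that ranks relevant articles near the top pushes the value toward 1, which is the sense in which a ranking signal such as the DRS can be judged by this metric.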

11.
FORMIDABEL is a database of Belgian ants containing more than 27,000 occurrence records. These records originate from collections, field sampling and literature. The database gives information on 76 native and 9 introduced ant species found in Belgium. The collection records originate mainly from the ant collection of the Royal Belgian Institute of Natural Sciences (RBINS), the ‘Gaspar’ ant collection in Gembloux and the zoological collection of the University of Liège (ULG). The oldest occurrences date back to May 1866; the most recent are from August 2012. FORMIDABEL is a work in progress and the database is updated twice a year.

The latest version of the dataset is publicly and freely accessible at http://ipt.biodiversity.be/resource.do?r=formidabel. The dataset is also retrievable via the GBIF data portal at http://data.gbif.org/datasets/resource/14697. A dedicated geo-portal, developed by the Belgian Biodiversity Platform, is accessible at http://www.formicidae-atlas.be.

Purpose: FORMIDABEL is a joint effort of the Flemish ants working group “Polyergus” (http://formicidae.be) and the Wallonian ants working group “FourmisWalBru” (http://fourmiswalbru.be). The original database was created in 2002 in the context of the preliminary red data book of Flemish ants (Dekoninck et al. 2003). Later, in 2005, data from the southern part of Belgium (Wallonia and Brussels) were added. In 2012 the dataset was updated again for the creation of the first Belgian Ants Atlas (Figure 1) (Dekoninck et al. 2012). The main purpose of this atlas was to generate maps for all outdoor-living ant species in Belgium using an overlay of the standard Belgian ecoregions. With this overlay, we can discern for most species a clear and often restricted distribution pattern in Belgium, mainly based on vegetation and soil types.

Figure 1. www.formicidae-atlas.be

12.
Studies describing intricate patterns of DNA methylation in nematodes and ciliates are controversial because it is uncertain whether DNA methylation enzymes are evolutionarily conserved in these genomes.

See related research articles http://genomebiology.com/2012/13/10/R99 and http://genomebiology.com/2012/13/10/R100

13.
14.
15.
To mark our tenth anniversary at PLOS Biology, we are launching a special, celebratory Tenth Anniversary PLOS Biology Collection, which showcases 10 specially selected PLOS Biology research articles drawn from a decade of publishing excellent science. It also features newly commissioned articles, including thought-provoking pieces on the Open Access movement (past and present), on article-level metrics, and on the history of the Public Library of Science. Each research article highlighted in the collection is also accompanied by a PLOS Biologue blog post to extend the impact of these remarkable studies to the widest possible audience.

As we celebrate 10 years of PLOS Biology, 10 years of the Public Library of Science, and 10 years of strong advocacy and trail-blazing for the Open Access movement, we mustn't forget the real star of the show – the fantastic science that we've published.

It's hard to cast one's mind back 10 years and recall the scepticism with which open access publishing was initially received. A key concern at the time was that the model would be tainted with the stigma of “vanity publishing,” and that this model, in which the author pays to publish, is incompatible with integrity, editorial rigour, and scientific excellence. As also discussed in the accompanying editorial [1], the sheer quality of the science that has appeared in PLOS Biology has been vital for dispelling this myth.

Our tenth anniversary provides us with a great opportunity to celebrate all of the 1800 or so research articles published in PLOS Biology since our launch in 2003. Unable to showcase each one in turn, we turned to our Editorial Board to help us pick the top 10 research articles to feature in a special Tenth Anniversary PLOS Biology Collection (www.ploscollections.org/Biology10thAnniversary). During the month of October, we will also publish a PLOS Biologue blog post (http://blogs.plos.org/biologue/) for each of these selected research articles, trying to capture and convey what it is about them that the staff editors, the editorial board, and the authors feel is special.

By now, you're probably wondering which papers we selected. The selection is detailed in Box 1, with links to each article. If you haven't read these articles before, we urge you to read them now and to judge for yourself. As Editorial Board Member Steve O'Rahilly put it, “I think a common theme in many of the best PLOS Biology papers is that they are rich in data that is analysed very carefully and self-critically and presented without hype. However the conclusions are important for the biological community and their insights are likely to stand the test of time.”

As well as publishing research articles, PLOS Biology has a thriving Magazine section that has hosted scientific and policy debates, aired polemical and provocative views, celebrated scientific lives in obituaries, reviewed interesting books, and explored unsolved mysteries. One example of how this section has triggered productive community debate is Rosie Redfield's Perspective on how genetics should be taught to undergraduates [2]. Yet we don't seek just to provoke debate, but also to enlighten; take a moment to read Georgina Mace's editorial on the current issues and debates in the sustainability sciences [3]. We also try to break down barriers between fields [4] and to promote public engagement with science [5],[6].

We feel strongly that our role doesn't end with publishing the research article itself. Instead, we aim to unpack the fascinating discoveries published in PLOS Biology by commissioning articles that explain the significance and impact of the research we publish to audiences of varying expertise. These companion articles range from Primers, which are written by experts who contextualise research articles for those in the field; to Synopses, which are written by science writers who digest an article for our wider readership of biologists; and finally, to PLOS Biologue blog posts, which distil research discoveries for a more general scientifically engaged public. We also use social media to bring these findings to the attention of a global online audience.

Of course, the continued success of PLOS Biology doesn't rest solely on the amazing research we've already published; it also hinges on the ground-breaking science we strive to publish in the future. Maintaining the high quality of the biology that we publish is of vital importance to us, not least because, as Editorial Board Member Robert Insall reflects, “What I like about PLOS Biology is that it avoids other journals' fixation on fashion and the biggest names. This means the papers PLOS Biology is publishing now will last longer and mean more in a generation's time.”

Box 1. Research Articles Featured in the Tenth Anniversary PLOS Biology Collection

Our Editorial Board Members helped us select 10 articles from the great science published during PLOS Biology's first decade to feature in our Tenth Anniversary Collection. Please access these articles from the list below and from our Collection page. To read the PLOS Biologue blog posts that accompany them, please go to http://blogs.plos.org/biologue/.

Carmena J et al. (2003) Learning to Control a Brain–Machine Interface for Reaching and Grasping by Primates
  Primer: Current Approaches to the Study of Movement Control
  Synopsis: Retraining the Brain to Recover Movement
Brennecke J et al. (2004) Principles of MicroRNA–Target Recognition
  Synopsis: Seeds of Destruction: Predicting How microRNAs Choose Their Target
Voight BF et al. (2005) A Map of Recent Positive Selection in the Human Genome
  Synopsis: Clues to Our Past: Mining the Human Genome for Signs of Recent Selection
Palmer C et al. (2007) Development of the Human Infant Intestinal Microbiota
  Synopsis: Microbes Colonize a Baby's Gut with Distinction
Levy S et al. (2007) The Diploid Genome Sequence of an Individual Human
  Synopsis: A New Human Genome Sequence Paves the Way for Individualized Genomics
Illingworth R et al. (2008) A Novel CpG Island Set Identifies Tissue-Specific Methylation at Developmental Gene Loci
Silva J et al. (2008) Promotion of Reprogramming to Ground State Pluripotency by Signal Inhibition
  Synopsis: A Shortcut to Immortality: Rapid Reprogramming with Tissue Cells
Coppé J-P et al. (2008) Senescence-Associated Secretory Phenotypes Reveal Cell-Nonautonomous Functions of Oncogenic RAS and the p53 Tumor Suppressor
Shu X et al. (2011) A Genetically Encoded Tag for Correlated Light and Electron Microscopy of Intact Cells, Tissues, and Organisms
Bonds MH et al. (2012) Disease Ecology, Biodiversity, and the Latitudinal Gradient in Income
  Synopsis: Which Came First: Burden of Infectious Disease or Poverty?

16.
Human mitochondrial DNA (mtDNA) encodes a set of 37 genes which are essential structural and functional components of the electron transport chain. Variations in these genes have been implicated in a broad spectrum of diseases and are extensively reported in the literature and in various databases. In this study, we describe MitoLSDB, an integrated platform to catalogue disease association studies on mtDNA (http://mitolsdb.igib.res.in). The main goal of MitoLSDB is to provide a central platform for direct submissions of novel variants that can be curated by the Mitochondrial Research Community. MitoLSDB provides access to standardized and annotated data from literature and databases encompassing information from 5231 individuals, 675 populations and 27 phenotypes. This platform is developed using the Leiden Open (source) Variation Database (LOVD) software. MitoLSDB houses information on all 37 genes in each population, amounting to 132,397 variants, of which 5,147 are unique. For each variant, its genomic location as per the Revised Cambridge Reference Sequence, codon and amino acid change (for variations in protein-coding regions), frequency, disease/phenotype, population, reference and remarks are also listed. MitoLSDB curators have also reported errors documented in the literature, including 94 phantom mutations, 10 NUMTs, six documentation errors and one artefactual recombination. MitoLSDB is the largest repository of mtDNA variants systematically standardized and presented using the LOVD platform. We believe that this is a good starting resource for curating mtDNA variants and that it will facilitate direct submissions, enhancing data coverage, annotation in the context of pathogenesis, and quality control by ensuring non-redundancy in the reporting of novel disease-associated variants.

17.
18.
While a huge amount of information about biological literature can be obtained by searching the PubMed database, reading through all the titles and abstracts resulting from such a search for useful information is inefficient. Text mining makes it possible to increase this efficiency. Some websites use text mining to gather information from the PubMed database; however, they are database-oriented, relying on pre-defined search keywords and lacking a query interface for user-defined search inputs. We present the PubMed Abstract Reading Helper (PubstractHelper) website, which combines text mining and reading assistance for an efficient PubMed search. PubstractHelper accepts a maximum of ten groups of keywords, each group containing up to ten keywords. The principle behind the text-mining function of PubstractHelper is that keywords contained in the same sentence are likely to be related. PubstractHelper highlights sentences with co-occurring keywords in different colors. The user can download the PMIDs and the abstracts with color markings for later review. The PubstractHelper website can help users identify relevant publications based on the presence of related keywords, which should make it a handy tool for their research.

Availability

http://bio.yungyun.com.tw/ATM/PubstractHelper.aspx and http://holab.med.ncku.edu.tw/ATM/PubstractHelper.aspx
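
The sentence-level co-occurrence principle described for PubstractHelper can be sketched roughly as below. This is not the site's implementation. The sentence splitter is deliberately naive, matching is by case-insensitive substring, and the rule that a sentence qualifies when it contains keywords from at least two different groups is an assumption. The function names and the sample abstract are hypothetical.

```python
import re


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def cooccurring_sentences(abstract, keyword_groups, min_groups=2):
    """Return (sentence, matched_group_indices) for qualifying sentences.

    A sentence qualifies when it contains keywords from at least
    `min_groups` different groups (an assumed rule). Matching is by
    case-insensitive substring, so "lead" would also match "leads".
    """
    results = []
    for sentence in split_sentences(abstract):
        lowered = sentence.lower()
        matched = [
            index
            for index, group in enumerate(keyword_groups)
            if any(keyword.lower() in lowered for keyword in group)
        ]
        if len(matched) >= min_groups:
            results.append((sentence, matched))
    return results


# Hypothetical abstract and keyword groups.
abstract = (
    "Cadmium exposure alters p53 expression in liver cells. "
    "Copper was also measured. "
    "No effect of lead on p53 was observed."
)
groups = [["cadmium", "lead"], ["p53"]]
hits = cooccurring_sentences(abstract, groups)
for sentence, matched in hits:
    print(matched, sentence)
```

A highlighting front end could then color each returned sentence according to the groups it matched. Word-boundary matching would be a natural refinement if substring hits prove too noisy.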

19.
The Alliance of Genome Resources (the Alliance) is a combined effort of 7 knowledgebase projects: Saccharomyces Genome Database, WormBase, FlyBase, Mouse Genome Database, the Zebrafish Information Network, Rat Genome Database, and the Gene Ontology Resource. The Alliance seeks to provide several benefits: better service to the various communities served by these projects; a harmonized view of data for all biomedical researchers, bioinformaticians, clinicians, and students; and a more sustainable infrastructure. The Alliance has harmonized cross-organism data to provide useful comparative views of gene function, gene expression, and human disease relevance. The basis of the comparative views is shared calls of orthology relationships and the use of common ontologies. The key types of data are alleles and variants, gene function based on gene ontology annotations, phenotypes, association to human disease, gene expression, protein–protein and genetic interactions, and participation in pathways. The information is presented on uniform gene pages that allow facile summarization of information about each gene in each of the 7 organisms covered (budding yeast, roundworm Caenorhabditis elegans, fruit fly, house mouse, zebrafish, brown rat, and human). The harmonized knowledge is freely available on the alliancegenome.org portal, as downloadable files, and by APIs. We expect other existing and emerging knowledge bases to join in the effort to provide the union of useful data and features that each knowledge base currently provides.

20.