
1.

Is searching full text more effective than searching abstracts?

Jimmy Lin 《BMC bioinformatics》2009,10(1):46

Background

With the growing availability of full-text articles online, scientists and other consumers of the life sciences literature now have the ability to go beyond searching bibliographic records (title, abstract, metadata) to directly access full-text content. Motivated by this emerging trend, I posed the following question: is searching full text more effective than searching abstracts? This question is answered by comparing text retrieval algorithms on MEDLINE^® abstracts, full-text articles, and spans (paragraphs) within full-text articles using data from the TREC 2007 genomics track evaluation. Two retrieval models are examined: BM25 and the ranking algorithm implemented in the open-source Lucene search engine.
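The BM25 model named in the abstract above can be illustrated with a minimal sketch. This is not the paper's implementation: the function name `bm25_scores`, the parameter values `k1=1.2` and `b=0.75`, the smoothed IDF variant, and the toy documents are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each tokenized document in `docs` against a tokenized `query`.

    Assumes `docs` is a non-empty list of non-empty token lists.
    k1 and b are common default values, not values taken from the paper.
    """
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for d in docs for term in set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            # Smoothed IDF (assumed variant); the +1 keeps it non-negative.
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with document-length normalization.
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / denom
        scores.append(score)
    return scores


# Toy "abstracts" (hypothetical data): the third mentions "gene" twice.
docs = [
    ["gene", "expression", "in", "yeast"],
    ["protein", "structure"],
    ["gene", "gene", "regulation"],
]
scores = bm25_scores(["gene"], docs)
```

The same function could in principle be applied to abstracts, full texts, or paragraph spans by changing what counts as a document, which is the comparison the abstract describes.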

2.

PageRank without hyperlinks: Reranking with PubMed related article networks for biomedical text retrieval

Jimmy Lin 《BMC bioinformatics》2008,9(1):270

Background

Graph analysis algorithms such as PageRank and HITS have been successful in Web environments because they are able to extract important inter-document relationships from manually-created hyperlinks. We consider the application of these techniques to biomedical text retrieval. In the current PubMed^® search interface, a MEDLINE^® citation is connected to a number of related citations, which are in turn connected to other citations. Thus, a MEDLINE record represents a node in a vast content-similarity network. This article explores the hypothesis that these networks can be exploited for text retrieval, in the same manner as hyperlink graphs on the Web.

Results

We conducted a number of reranking experiments using the TREC 2005 genomics track test collection in which scores extracted from PageRank and HITS analysis were combined with scores returned by an off-the-shelf retrieval engine. Experiments demonstrate that incorporating PageRank scores yields significant improvements in terms of standard ranked-retrieval metrics.

Conclusion

The link structure of content-similarity networks can be exploited to improve the effectiveness of information retrieval systems. These results generalize the applicability of graph analysis algorithms to text retrieval in the biomedical domain.
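The reranking idea described in this abstract can be sketched as follows. The abstract does not say how the graph scores and retrieval scores were combined, so the linear interpolation with weight `lam`, the min-max normalization, the function names, and the toy graph are all assumptions made for illustration.

```python
def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank.

    `graph` maps each node to a list of nodes it links to; every target
    is assumed to also appear as a key.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            # Nodes with no outgoing links spread their rank uniformly.
            targets = graph[v] or nodes
            share = damping * rank[v] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank


def rerank(retrieval_scores, graph, lam=0.8):
    """Order documents by an interpolation of retrieval and PageRank scores.

    Assumes every document in `retrieval_scores` is a node in `graph`.
    """
    def normalize(scores):
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0  # avoid division by zero when all scores are equal
        return {k: (v - lo) / span for k, v in scores.items()}

    r = normalize(retrieval_scores)
    p = normalize(pagerank(graph))
    combined = {d: lam * r[d] + (1 - lam) * p[d] for d in retrieval_scores}
    return sorted(combined, key=combined.get, reverse=True)


# Hypothetical related-article network and retrieval scores.
graph = {"a": ["b"], "b": ["c"], "c": ["a", "b"], "d": ["b"]}
retrieval = {"a": 1.0, "b": 0.9, "c": 0.1}
by_retrieval_only = rerank(retrieval, graph, lam=1.0)
with_graph = rerank(retrieval, graph, lam=0.5)
```

In this toy example, document "b" is the most linked-to node, so giving the graph score equal weight moves it ahead of "a".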


3.

Moara: a Java library for extracting and normalizing gene and protein mentions

Mariana L Neves José-María Carazo Alberto Pascual-Montano 《BMC bioinformatics》2010,11(1):157

Background

Gene/protein recognition and normalization are important preliminary steps for many biological text mining tasks, such as information retrieval, protein-protein interactions, and extraction of semantic information, among others. Despite dedication to these problems and effective solutions being reported, easily integrated tools to perform these tasks are not readily available.

4.

Revealing biases inherent in recombination protocols

Javier F Chaparro-Riggers Bernard LW Loo Karen M Polizzi Phillip R Gibbs Xiao-Song Tang Mark J Nelson Andreas S Bommarius 《BMC biotechnology》2007,7(1):77

Background

The recombination of homologous genes is an effective protein engineering tool to evolve proteins. DNA shuffling by gene fragmentation and reassembly has dominated the literature since its first publication, but this fragmentation-based method is labor intensive. Recently, a fragmentation-free PCR based protocol has been published, termed recombination-dependent PCR, which is easy to perform. However, a detailed comparison of both methods is still missing.

5.

BIGSdb: Scalable analysis of bacterial genome variation at the population level

Keith A Jolley Martin CJ Maiden 《BMC bioinformatics》2010,11(1):595

Background

The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner.

6.

Application of protein structure alignments to iterated hidden Markov model protocols for structure prediction

Eric D Scheeff Philip E Bourne 《BMC bioinformatics》2006,7(1):410-17

Background

One of the most powerful methods for the prediction of protein structure from sequence information alone is the iterative construction of profile-type models. Because profiles are built from sequence alignments, the sequences included in the alignment and the method used to align them will be important to the sensitivity of the resulting profile. The inclusion of highly diverse sequences will presumably produce a more powerful profile, but distantly related sequences can be difficult to align accurately using only sequence information. Therefore, it would be expected that the use of protein structure alignments to improve the selection and alignment of diverse sequence homologs might yield improved profiles. However, the actual utility of such an approach has remained unclear.

7.

Encoding the states of interacting proteins to facilitate biological pathways reconstruction

Alberto Termanini Paolo Tieri Claudio Franceschi 《Biology direct》2010,5(1):52

Background

In a systems biology perspective, protein-protein interactions (PPI) are encoded in machine-readable formats to avoid issues encountered in their retrieval for the reconstruction of comprehensive interaction maps and biological pathways. However, the information stored in electronic formats currently used doesn't allow a valid automatic reconstruction of biological pathways.

8.

Model based analysis of real-time PCR data from DNA binding dye protocols

Mariano J Alvarez Guillermo J Vila-Ortiz Mariano C Salibe Osvaldo L Podhajcer Fernando J Pitossi 《BMC bioinformatics》2007,8(1):85


9.

OntoDas – a tool for facilitating the construction of complex queries to the Gene Ontology

Kieran O'Neill Alexander Garcia Anita Schwegmann Rafael C Jimenez Dan Jacobson Henning Hermjakob 《BMC bioinformatics》2008,9(1):437

Background

Ontologies such as the Gene Ontology can enable the construction of complex queries over biological information in a conceptual way; however, existing systems to do this are too technical. Within the biological domain there is an increasing need for software that facilitates the flexible retrieval of information. OntoDas aims to fulfil this need by allowing the definition of queries by selecting valid ontology terms.

10.

Visualization of large influenza virus sequence datasets using adaptively aggregated trees with sampling-based subscale representation

Leonid Zaslavsky Yiming Bao Tatiana A Tatusova 《BMC bioinformatics》2008,9(1):237

Background

With the amount of influenza genome sequence data growing rapidly, researchers need machine assistance in selecting datasets and exploring the data. Enhanced visualization tools are required to represent results of the exploratory analysis on the web in an easy-to-comprehend form and to facilitate convenient information retrieval.

11.

Exploring supervised and unsupervised methods to detect topics in biomedical text

Minsuk Lee Weiqing Wang Hong Yu 《BMC bioinformatics》2006,7(1):140-11

Background

Topic detection is a task that automatically identifies topics (e.g., "biochemistry" and "protein structure") in scientific articles based on information content. Topic detection will benefit many other natural language processing tasks including information retrieval, text summarization and question answering; and is a necessary step towards the building of an information system that provides an efficient way for biologists to seek information from an ocean of literature.

12.

Retrieval with gene queries

Aditya K Sehgal Padmini Srinivasan 《BMC bioinformatics》2006,7(1):220-12

Background

Accuracy of document retrieval from MEDLINE for gene queries is crucially important for many applications in bioinformatics. We explore five information retrieval-based methods to rank documents retrieved by PubMed gene queries for the human genome. The aim is to rank relevant documents higher in the retrieved list. We address the special challenges faced due to ambiguity in gene nomenclature: gene terms that refer to multiple genes, gene terms that are also English words, and gene terms that have other biological meanings.

13.

Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation

Kimberly Van Auken Joshua Jaffery Juancarlos Chan Hans-Michael Müller Paul W Sternberg 《BMC bioinformatics》2009,10(1):228

Background

Manual curation of experimental data from the biomedical literature is an expensive and time-consuming endeavor. Nevertheless, most biological knowledge bases still rely heavily on manual curation for data extraction and entry. Text mining software that can semi- or fully automate information retrieval from the literature would thus provide a significant boost to manual curation efforts.

14.

The TREC/KREC Assay for the Diagnosis and Monitoring of Patients with DiGeorge Syndrome

Eva Froňková Adam Klocperk Michael Svatoň Michaela Nováková Michaela Kotrová Jana Kayserová Tomáš Kalina Petra Keslová Felix Votava Hana Vinohradská Tomáš Freiberger Ester Mejstříková Jan Trka Anna Šedivá 《PloS one》2014,9(12)

DiGeorge syndrome (DGS) presents with a wide spectrum of thymic pathologies. Nationwide neonatal screening programs of lymphocyte production using T-cell recombination excision circles (TREC) have repeatedly identified patients with DGS. We tested what proportion of DGS patients could be identified at birth by combined TREC and kappa-deleting element recombination circle (KREC) screening. Furthermore, we followed TREC/KREC levels in peripheral blood (PB) to monitor postnatal changes in lymphocyte production.

Methods

TREC/KREC copies were assessed by quantitative PCR (qPCR) and were related to the albumin control gene in dry blood spots (DBSs) from control (n = 56), severe immunodeficiency syndrome (SCID, n = 10) and DGS (n = 13) newborns. PB was evaluated in DGS children (n = 32), in diagnostic samples from SCID babies (n = 5) and in 91 controls.

Results

All but one DGS patient had TREC levels in the normal range at birth, albeit quantitative TREC values were significantly lower in the DGS cohort. One patient had slightly reduced KREC at birth. Postnatal DGS samples revealed reduced TREC numbers in 5 of 32 (16%) patients, whereas KREC copy numbers were similar to controls. Both TREC and KREC levels showed a more pronounced decrease with age in DGS patients than in controls (p<0.0001 for both in a linear model). DGS patients had higher percentages of NK cells at the expense of T cells (p<0.0001). The patients with reduced TREC levels had repeated infections in infancy and developed allergy and/or autoimmunity, but they were not strikingly different from other patients. In 12 DGS patients with paired DBS and blood samples, the TREC/KREC levels were mostly stable or increased and showed similar kinetics in respective patients.

Conclusions

The combined TREC/KREC approach with correction via control gene identified 1 of 13 (8%) DiGeorge syndrome patients at birth in our cohort. The majority of patients had TREC/KREC levels in the normal range.

15.

The Ontology Lookup Service, a lightweight cross-platform tool for controlled vocabulary queries

Richard G Côté Philip Jones Rolf Apweiler Henning Hermjakob 《BMC bioinformatics》2006,7(1):97-7

Background

With the vast amounts of biomedical data being generated by high-throughput analysis methods, controlled vocabularies and ontologies are becoming increasingly important to annotate units of information for ease of search and retrieval. Each scientific community tends to create its own locally available ontology. The interfaces to query these ontologies tend to vary from group to group. We saw the need for a centralized location to perform controlled vocabulary queries that would offer both a lightweight web-accessible user interface as well as a consistent, unified SOAP interface for automated queries.

16.

Concept-based query expansion for retrieving gene related publications from MEDLINE

Sérgio Matos Joel P Arrais João Maia-Rodrigues José Luis Oliveira 《BMC bioinformatics》2010,11(1):212

Background

Advances in biotechnology and in high-throughput methods for gene analysis have contributed to an exponential increase in the number of scientific publications in these fields of study. While much of the data and results described in these articles are entered and annotated in the various existing biomedical databases, the scientific literature is still the major source of information. There is, therefore, a growing need for text mining and information retrieval tools to help researchers find the relevant articles for their study. To tackle this, several tools have been proposed to provide alternative solutions for specific user requests.

17.

Orientation determination by wavelets matching for 3D reconstruction of very noisy electron microscopic virus images

Ali Samir Saad 《BMC structural biology》2005,5(1):5

Background

In order to perform a 3D reconstruction of electron microscopic images of viruses, it is necessary to determine the orientation (Euler angles) of the 2D projections of the virus. The projections containing high resolution information are usually very noisy. This paper proposes a new method, based on weighted-projection matching in wavelet space, for virus orientation determination. In order to speed up the retrieval of the best match between projections from a model and real virus particles, a hierarchical correlation matching method is also proposed.

18.

RibAlign: a software tool and database for eubacterial phylogeny based on concatenated ribosomal protein subunits

Hanno Teeling Oliver Frank Gloeckner 《BMC bioinformatics》2006,7(1):66

Background

Until today, analysis of 16S ribosomal RNA (rRNA) sequences has been the de facto gold standard for the assessment of phylogenetic relationships among prokaryotes. However, the branching order of the individual phyla is not well resolved in 16S rRNA-based trees. In search of an improvement, new phylogenetic methods have been developed alongside the growing availability of complete genome sequences. Unfortunately, only a few genes in prokaryotic genomes qualify as universal phylogenetic markers and almost all of them have a lower information content than the 16S rRNA gene. Therefore, emphasis has been placed on methods that are based on multiple genes or even entire genomes. The concatenation of ribosomal protein sequences is one method which has been ascribed an improved resolution. Since there is neither a comprehensive database for ribosomal protein sequences nor a tool that assists in sequence retrieval and generation of respective input files for phylogenetic reconstruction programs, RibAlign has been developed to fill this gap.

19.

BioInfer: a corpus for information extraction in the biomedical domain

Sampo Pyysalo Filip Ginter Juho Heimonen Jari Björne Jorma Boberg Jouni Järvinen Tapio Salakoski 《BMC bioinformatics》2007,8(1):50

Background

Lately, there has been a great interest in the application of information extraction methods to the biomedical domain, in particular, to the extraction of relationships of genes, proteins, and RNA from scientific publications. The development and evaluation of such methods requires annotated domain corpora.

20.

LINNAEUS: A species name identification system for biomedical literature

Martin Gerner Goran Nenadic Casey M Bergman 《BMC bioinformatics》2010,11(1):85

Background

The task of recognizing and identifying species names in biomedical literature has recently been regarded as critical for a number of applications in text and data mining, including gene name recognition, species-specific document retrieval, and semantic enrichment of biomedical articles.