1.

Hairpins in bookstacks: information retrieval from biomedical text

Shatkay H 《Briefings in bioinformatics》2005,6(3):222-238

Current advances in high-throughput biology are accompanied by a tremendous increase in the number of related publications. Much biomedical information is reported in the vast amount of literature. The ability to rapidly and effectively survey the literature is necessary for both the design and the interpretation of large-scale experiments, and for curation of structured biomedical knowledge in public databases. Given the millions of published documents, the field of information retrieval, which is concerned with the automatic identification of relevant documents from large text collections, has much to offer. This paper introduces the basics of information retrieval, discusses its applications in biomedicine, and presents traditional and non-traditional ways in which it can be used.

2.

Linking genes to literature: text mining, information extraction, and retrieval applications for biology

Krallinger Martin Valencia Alfonso Hirschman Lynette 《Genome biology》2008,9(2):1-3

A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics "In silico Biology: Gene Discovery and Systems Genomics", Atlanta, USA, 15-17 November, 2007. Technological developments have had a profound impact on biology during the past decade, spectacularly augmenting our ability to survey and interrogate biological phenomena. In particular, they have increased capacity for data generation by several orders of magnitude and made computation a necessary partner of biology. The sixth meeting in the biennial series of bioinformatics conferences co-sponsored by Georgia Institute of Technology in Atlanta and the Oak Ridge National Laboratory addressed the challenges that this technology-driven avalanche of data poses to bioinformatics: increasing the complexity of longstanding problems and creating new ones.

3.

Linking genes to literature: text mining, information extraction, and retrieval applications for biology

Krallinger M Valencia A Hirschman L 《Genome biology》2008,9(Z2):S8

4.

Molecular profiling of thyroid cancer subtypes using large-scale text mining

Chengkun Wu Jean-Marc Schwartz Georg Brabant Goran Nenadic 《BMC medical genomics》2014,7(Z3):S3

Background

Thyroid cancer is the most common endocrine tumor with a steady increase in incidence. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer literature is vital for understanding of the disease and developing targeted therapeutics.

Results

We developed a large-scale text mining system to generate a molecular profiling of thyroid cancer subtypes. The system first uses a subtype classification method for the thyroid cancer literature, which employs a scoring scheme to assign different subtypes to articles. We evaluated the classification method on a gold standard derived from the PubMed Supplementary Concept annotations, achieving a micro-average F1-score of 85.9% for primary subtypes. We then used the subtype classification results to extract genes and pathways associated with different thyroid cancer subtypes and successfully unveiled important genes and pathways, including some instances that are missing from current manually annotated databases or most recent review articles.

Conclusions

Identification of key genes and pathways plays a central role in understanding the molecular biology of thyroid cancer. An integration of subtype context can allow prioritized screening for diagnostic biomarkers and novel molecular targeted therapeutics. Source code used for this study is made freely available online at https://github.com/chengkun-wu/GenesThyCan.

5.

New challenges for text mining: mapping between text and manually curated pathways

Oda K Kim JD Ohta T Okanohara D Matsuzaki T Tateisi Y Tsujii J 《BMC bioinformatics》2008,9(Z3):S5

6.

Androgen receptor gene methylation and exon one CAG repeat length in ovarian cancer: differences from breast cancer

Kassim S Zoheiry NM Hamed WM Going JJ Craft JA 《IUBMB life》2004,56(7):417-426

More than one neoplastic founder clone can exist in benign epithelial tumours. Although theories of clonal selection make pluriclonality appear unlikely in carcinomas, published data do not exclude this possibility. This study looked for evidence of multiclonal X inactivation in ovarian carcinoma using AR methylation as a marker. Fifteen unifocal ovarian carcinomas and 14 multifocal carcinomas, all in Scottish patients, were studied. One representative formalin-fixed paraffin-embedded tumour block was chosen for each of the former and two for the latter. From each of these 43 tumour blocks three samples each of approximately 10⁴ carcinoma cells were obtained by microdissection (129 in all). DNA released by proteinase K digestion was subjected to PCR amplification of the androgen receptor gene AR exon I CAG repeat polymorphism with and without prior digestion with methylation-sensitive restriction enzymes HpaII and HhaI. Complex amplification patterns were consistent with mosaic X inactivation in some ovarian carcinomas but acquired anomalies of AR methylation cannot be excluded. Parallel analysis of other X-linked polymorphic loci would strengthen the inference of clonality status from DNA methylation data in tumour X studies. Strikingly, the number of CAG repeats in the 29 ovarian tumour patients (median 16, range 11-20) was substantially fewer than in 34 previously studied breast cancer patients from the same Scottish population (median 21, range 14-26; P < 0.0001), and women homozygous for the AR CAG repeat were over-represented in the ovarian cancer patients but not in the breast cancer series. These findings reinforce recent suggestions that AR may have a role in ovarian carcinogenesis.

7.

Frontiers of biomedical text mining: current progress (total citations: 3; self-citations: 0; citations by others: 3)

Zweigenbaum P Demner-Fushman D Yu H Cohen KB 《Briefings in bioinformatics》2007,8(5):358-375

It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or 'BioNLP' in general, focusing primarily on papers published within the past year.

8.

Key epigenetic changes associated with lung cancer development: Results from dense methylation array profiling

Heather H. Nelson Carmen J. Marsit Brock C. Christensen E.A. Houseman Milica Kontic Joseph L. Wiemels Margaret R. Karagas Margaret R. Wrensch Shichun Zheng John K. Wiencke Karl T. Kelsey 《Epigenetics》2012,7(6):559-566

Epigenetic alterations are a common event in lung cancer and their identification can serve to inform on the carcinogenic process and provide clinically relevant biomarkers. Using paired tumor and non-tumor lung tissues from 146 individuals from three independent populations we sought to identify common changes in DNA methylation associated with the development of non-small cell lung cancer. Pathologically normal lung tissue taken at the time of cancer resection was matched to tumorous lung tissue and together were probed for methylation using Illumina GoldenGate arrays in the discovery set (n = 47 pairs) followed by bisulfite pyrosequencing for validation sets (n = 99 pairs). For each matched pair the change in methylation at each CpG was calculated (the odds ratio), and these ratios were averaged across individuals and ranked by magnitude to identify the CpGs with the greatest change in methylation associated with tumor development. We identified the top gene-loci representing an increase in methylation (HOXA9, 10.3-fold and SOX1, 5.9-fold) and decrease in methylation (DDR1, 8.1-fold). In replication testing sets, methylation was higher in tumors for HOXA9 (p < 2.2 × 10⁻¹⁶) and SOX1 (p < 2.2 × 10⁻¹⁶) and lower for DDR1 (p < 2.2 × 10⁻¹⁶). The magnitude and strength of these changes were consistent across squamous cell and adenocarcinoma tumors. Our data indicate that the identified genes consistently have altered methylation in lung tumors. Our identified genes should be included in translational studies that aim to develop screening for early disease detection.

9.

Novel methylation and expression markers associated with breast cancer

Kuznetsova EB Kekeeva TV Larin SS Zemliakova VV Babenko OV Nemtsova MV Zaletaev DV Strel'nikov VV 《Molekuliarnaia biologiia》2007,41(4):624-633

We have developed a modification of methylation-sensitive arbitrarily primed PCR, one of the methods of screening for differentially methylated CpG islands in cancer cell genomes. Seven genes undergoing abnormal epigenetic regulation in breast cancer, SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4 and PSMF1, have been identified by this method. Methylation and loss of expression frequencies were evaluated for each of the identified genes on 100 paired (cancer/morphologically intact control) breast tissue samples. Significant frequencies of abnormal methylation were detected for SEMA6B, BIN1, and LAMC3 (38%, 18%, and 8%, respectively). Methylation of the above genes was not characteristic of morphologically intact breast tissues. Downregulation of SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4 and PSMF1 in breast cancer was as frequent as 44-94% by real-time PCR expression assay. The most pronounced functional alterations were demonstrated for the SEMA6B and LAMC3 genes, which supports recommending their inclusion in diagnostic panels for carcinogenesis. Fine methylation mapping was performed for the genes most frequently methylated in breast cancer (SEMA6B, BIN1, LAMC3), providing a fundamental basis for the development of effective methylation tests for these genes.

10.

Comparison of vocabularies, representations and ranking algorithms for gene prioritization by text mining

Yu S Van Vooren S Tranchevent LC De Moor B Moreau Y 《Bioinformatics (Oxford, England)》2008,24(16):i119-i125

11.

Gene prioritization and clustering by multi-view text mining

Shi Yu Leon-Charles Tranchevent Bart De Moor Yves Moreau 《BMC bioinformatics》2010,11(1):28

Background

Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model.

12.

Robust prediction of gene regulation in colorectal cancer tissues from DNA methylation profiles

Hagen Klett Yesilda Balavarca Reka Toth Biljana Gigic Nina Habermann Dominique Scherer 《Epigenetics》2018,13(4):386-397

DNA methylation is recognized as one of several epigenetic regulators of gene expression and as a potential driver of carcinogenesis through gene-silencing of tumor suppressors and activation of oncogenes. However, abnormal methylation, even of promoter regions, does not necessarily alter gene expression levels, especially if the gene is already silenced, leaving the exact mechanisms of methylation unanswered. Using a large cohort of matching DNA methylation and gene expression samples of colorectal cancer (CRC; n = 77) and normal adjacent mucosa tissues (n = 108), we investigated the regulatory role of methylation on gene expression. We show that on a subset of genes enriched in common cancer pathways, methylation is significantly associated with gene regulation through gene-specific mechanisms. We built two classification models to infer gene regulation in CRC from methylation differences of tumor and normal tissues, taking into account both gene-silencing and gene-activation effects through hyper- and hypo-methylation of CpGs. The classification models result in high prediction performances in both training and independent CRC testing cohorts (0.92 < AUC < 0.97) as well as in individual patient data (average AUC = 0.82), suggesting a robust interplay between methylation and gene regulation. Validation analysis in other cancerous tissues resulted in lower prediction performances (0.69 < AUC < 0.90); however, it identified genes that share robust dependencies across cancerous tissues. In conclusion, we present a robust classification approach that predicts the gene-specific regulation through DNA methylation in CRC tissues with possible transition to different cancer entities. Furthermore, we present HMGA1 as consistently associated with methylation across cancers, suggesting a potential candidate for DNA methylation targeting cancer therapy.

13.

Novel markers of gene methylation and expression in breast cancer

E. B. Kuznetsova T. V. Kekeeva S. S. Larin V. V. Zemlyakova O. V. Babenko M. V. Nemtsova D. V. Zaletayev V. V. Strelnikov 《Molecular Biology》2007,41(4):562-570

An optimized methylation-sensitive restriction fingerprinting technique was used to search for differentially methylated CpG islands in the tumor genome and detected seven genes subject to abnormal epigenetic regulation in breast cancer: SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4, and PSMF1. For each gene, the rate of promoter methylation and changes in expression were estimated in tumor and morphologically intact paired specimens of breast tissue (N = 100). Significant methylation rates of 38, 18, and 8% were found for SEMA6B, BIN1, and LAMC3, respectively. The genes were not methylated in morphologically intact breast tissue. The expression of SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4, and PSMF1 was decreased in 44–94% of tumor specimens by the real-time RT-PCR assay. The most profound changes in SEMA6B and LAMC3 suggest that these genes can be included in biomarker panels for breast cancer diagnosis. Fine methylation mapping of the most frequently methylated CpG islands (SEMA6B, BIN1, and LAMC3) provides a fundamental basis for developing efficient methylation tests for these genes.

14.

Conceptual biology, hypothesis discovery, and text mining: Swanson's legacy

Tanja Bekhuis 《Biomedical Digital Libraries》2006,3(1):2

Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT) is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. Report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians.

15.

RNA-Seq analysis of yak ovary: improving yak gene structure information and mining reproduction-related genes (total citations: 1; self-citations: 0; citations by others: 1)

DaoLiang Lan XianRong Xiong YanLi Wei Tong Xu JinCheng Zhong XiangDong Zhi Yong Wang Jian Li 《Science China: Life Sciences (English edition)》2014,57(9):925-935

16.

Optimization models for cancer classification: extracting gene interaction information from microarray expression data

Antonov AV Tetko IV Mader MT Budczies J Mewes HW 《Bioinformatics (Oxford, England)》2004,20(5):644-652

17.

Text mining and ontologies in biomedicine: making sense of raw text (total citations: 1; self-citations: 0; citations by others: 1)

Spasic I Ananiadou S McNaught J Kumar A 《Briefings in bioinformatics》2005,6(3):239-251

The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically distill information, extract facts, discover implicit links and generate hypotheses relevant to user needs. Ontologies, as conceptual models, provide the necessary framework for semantic representation of textual information. The principal link between text and an ontology is terminology, which maps terms to domain-specific concepts. This paper summarises different approaches in which ontologies have been used for text-mining applications in biomedicine.

18.

Age-dependent methylation of ESR1 gene in prostate cancer (total citations: 4; self-citations: 0; citations by others: 4)

Li LC Shiina H Deguchi M Zhao H Okino ST Kane CJ Carroll PR Igawa M Dahiya R 《Biochemical and biophysical research communications》2004,321(2):455-461

The incidence of prostate cancer increases dramatically with age and the mechanism underlying this association is unclear. Age-dependent methylation of estrogen receptor alpha (ESR1) gene has been previously implicated in other cancerous and benign diseases. We evaluated the age-dependent methylation of ESR1 in prostate cancer. The methylation status of ESR1 in 83 prostate cancer samples from patients aged 49 to 77 years (mean age at 67.4 years) was examined using the bisulfite genomic sequencing technique. The samples were divided into three age groups: men aged 60 years and under (n = 14), men aged 61-70 years (n = 40), and men aged over 70 years (n = 29). Overall, ESR1 promoter methylation was detected in 54 out of 83 (65.1%) prostate samples. The methylation rate of ESR1 increased dramatically with age from 50.0% in patients aged 60 years and under to 89.7% for patients aged 70 years and over. Logistic regression analyses revealed that age and Gleason score were the only variables that affect incidence of ESR1 methylation; other clinical factors such as prostate-specific antigen level and clinical stage did not. We also calculated ESR1 methylation density (the percentage of methylated CpGs among all CpGs within the analyzed region) and severity (the percentage of methylated CpG alleles) for each sample analyzed. Multiple regression analyses showed a positive correlation between age and methylation density (beta, 0.35; P, 0.012; 95% CI, 0.26-2.01); while Gleason score was positively associated with methylation severity (beta, 0.45; P, 0.018; 95% CI, 1.04-4.26). These findings suggest that methylation of ESR1 is both age-dependent and tumor differentiation-dependent and age-dependent methylation of ESR1 may represent a mechanism linking aging and prostate cancer.

19.

Evaluation of text data mining for database curation: lessons learned from the KDD Challenge Cup

Yeh AS Hirschman L Morgan AA 《Bioinformatics (Oxford, England)》2003,19(Z1):i331-i339

MOTIVATION: The biological literature is a major repository of knowledge. Many biological databases draw much of their content from a careful curation of this literature. However, as the volume of literature increases, the burden of curation increases. Text mining may provide useful tools to assist in the curation process. To date, the lack of standards has made it impossible to determine whether text mining techniques are sufficiently mature to be useful. RESULTS: We report on a Challenge Evaluation task that we created for the Knowledge Discovery and Data Mining (KDD) Challenge Cup. We provided a training corpus of 862 articles consisting of journal articles curated in FlyBase, along with the associated lists of genes and gene products, as well as the relevant data fields from FlyBase. For the test, we provided a corpus of 213 new ('blind') articles; the 18 participating groups provided systems that flagged articles for curation, based on whether the article contained experimental evidence for gene expression products. We report on the evaluation results and describe the techniques used by the top performing groups.

20.

Manually structured digital abstracts: a scaffold for automatic text mining

Seringhaus M Gerstein M 《FEBS letters》2008,582(8):1170
