Found 20 similar articles; search took 15 ms
1.
Shatkay H 《Briefings in bioinformatics》2005,6(3):222-238
Current advances in high-throughput biology are accompanied by a tremendous increase in the number of related publications. Much biomedical information is reported in the vast amount of literature. The ability to rapidly and effectively survey the literature is necessary for both the design and the interpretation of large-scale experiments, and for curation of structured biomedical knowledge in public databases. Given the millions of published documents, the field of information retrieval, which is concerned with the automatic identification of relevant documents from large text collections, has much to offer. This paper introduces the basics of information retrieval, discusses its applications in biomedicine, and presents traditional and non-traditional ways in which it can be used.
2.
3.
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics "In silico Biology: Gene Discovery and Systems Genomics", Atlanta, USA, 15-17 November, 2007. Technological developments have had a profound impact on biology during the past decade, spectacularly augmenting our ability to survey and interrogate biological phenomena. In particular, they have increased capacity for data generation by several orders of magnitude and made computation a necessary partner of biology. The sixth meeting in the biennial series of bioinformatics conferences co-sponsored by Georgia Institute of Technology in Atlanta and the Oak Ridge National Laboratory addressed the challenges that this technology-driven avalanche of data poses to bioinformatics: increasing the complexity of longstanding problems and creating new ones.
4.
Background
Thyroid cancer is the most common endocrine tumor with a steady increase in incidence. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer literature is vital for understanding of the disease and developing targeted therapeutics.
Results
We developed a large-scale text mining system to generate a molecular profiling of thyroid cancer subtypes. The system first uses a subtype classification method for the thyroid cancer literature, which employs a scoring scheme to assign different subtypes to articles. We evaluated the classification method on a gold standard derived from the PubMed Supplementary Concept annotations, achieving a micro-average F1-score of 85.9% for primary subtypes. We then used the subtype classification results to extract genes and pathways associated with different thyroid cancer subtypes and successfully unveiled important genes and pathways, including some instances that are missing from current manually annotated databases or most recent review articles.
Conclusions
Identification of key genes and pathways plays a central role in understanding the molecular biology of thyroid cancer. An integration of subtype context can allow prioritized screening for diagnostic biomarkers and novel molecular targeted therapeutics. Source code used for this study is made freely available online at https://github.com/chengkun-wu/GenesThyCan.
5.
6.
More than one neoplastic founder clone can exist in benign epithelial tumours. Although theories of clonal selection make pluriclonality appear unlikely in carcinomas, published data do not exclude this possibility. This study looked for evidence of multiclonal X inactivation in ovarian carcinoma using AR methylation as a marker. Fifteen unifocal ovarian carcinomas and 14 multifocal carcinomas all in Scottish patients were studied. One representative formalin-fixed paraffin-embedded tumour block was chosen for each of the former and two for the latter. From each of these 43 tumour blocks three samples each of approximately 10^4 carcinoma cells were obtained by microdissection (129 in all). DNA released by proteinase K digestion was subjected to PCR amplification of the androgen receptor gene AR exon I CAG repeat polymorphism with and without prior digestion with methylation-sensitive restriction enzymes HpaII and HhaI. Complex amplification patterns were consistent with mosaic X inactivation in some ovarian carcinomas but acquired anomalies of AR methylation cannot be excluded. Parallel analysis of other X-linked polymorphic loci would strengthen the inference of clonality status from DNA methylation data in tumour X studies. Strikingly, the number of CAG repeats in the 29 ovarian tumour patients (median 16, range 11-20) was substantially fewer than in 34 previously studied breast cancer patients from the same Scottish population (median 21, range 14-26; P < 0.0001), and women homozygous for the AR CAG repeat were over-represented in the ovarian cancer patients but not in the breast cancer series. These findings reinforce recent suggestions that AR may have a role in ovarian carcinogenesis.
7.
Heather H. Nelson Carmen J. Marsit Brock C. Christensen E.A. Houseman Milica Kontic Joseph L. Wiemels Margaret R. Karagas Margaret R. Wrensch Shichun Zheng John K. Wiencke Karl T. Kelsey 《Epigenetics》2012,7(6):559-566
Epigenetic alterations are a common event in lung cancer and their identification can serve to inform on the carcinogenic process and provide clinically relevant biomarkers. Using paired tumor and non-tumor lung tissues from 146 individuals from three independent populations we sought to identify common changes in DNA methylation associated with the development of non-small cell lung cancer. Pathologically normal lung tissue taken at the time of cancer resection was matched to tumorous lung tissue and together were probed for methylation using Illumina GoldenGate arrays in the discovery set (n = 47 pairs) followed by bisulfite pyrosequencing for validation sets (n = 99 pairs). For each matched pair the change in methylation at each CpG was calculated (the odds ratio), and these ratios were averaged across individuals and ranked by magnitude to identify the CpGs with the greatest change in methylation associated with tumor development. We identified the top gene-loci representing an increase in methylation (HOXA9, 10.3-fold and SOX1, 5.9-fold) and decrease in methylation (DDR1, 8.1-fold). In replication testing sets, methylation was higher in tumors for HOXA9 (p < 2.2 × 10^-16) and SOX1 (p < 2.2 × 10^-16) and lower for DDR1 (p < 2.2 × 10^-16). The magnitude and strength of these changes were consistent across squamous cell and adenocarcinoma tumors. Our data indicate that the identified genes consistently have altered methylation in lung tumors. Our identified genes should be included in translational studies that aim to develop screening for early disease detection.
8.
Frontiers of biomedical text mining: current progress (cited by: 3 total, 0 self-citations, 3 by others)
It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or 'BioNLP' in general, focusing primarily on papers published within the past year.
9.
Kuznetsova EB Kekeeva TV Larin SS Zemliakova VV Babenko OV Nemtsova MV Zaletaev DV Strel'nikov VV 《Molekuliarnaia biologiia》2007,41(4):624-633
We have developed a modification of methylation-sensitive arbitrarily primed PCR, one of the methods for screening differentially methylated CpG islands in cancer cell genomes. Seven genes undergoing abnormal epigenetic regulation in breast cancer, SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4 and PSMF1, have been identified by this method. Methylation and loss of expression frequencies were evaluated for each of the identified genes on 100 paired (cancer/morphologically intact control) breast tissue samples. Significant frequencies of abnormal methylation were detected for SEMA6B, BIN1, and LAMC3 (38%, 18%, and 8%, respectively). Methylation of the above genes was not characteristic of morphologically intact breast tissues. Downregulation of SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4 and PSMF1 in breast cancer was as frequent as 44-94% by real-time PCR expression assay. The most pronounced functional alterations were demonstrated for the SEMA6B and LAMC3 genes, which supports recommending their inclusion in carcinogenesis diagnostic panels. Fine methylation mapping was performed for the genes most frequently methylated in breast cancer (SEMA6B, BIN1, LAMC3), providing a fundamental basis for the development of effective methylation tests for these genes.
10.
11.
Hagen Klett Yesilda Balavarca Reka Toth Biljana Gigic Nina Habermann Dominique Scherer 《Epigenetics》2018,13(4):386-397
DNA methylation is recognized as one of several epigenetic regulators of gene expression and as potential driver of carcinogenesis through gene-silencing of tumor suppressors and activation of oncogenes. However, abnormal methylation, even of promoter regions, does not necessarily alter gene expression levels, especially if the gene is already silenced, leaving the exact mechanisms of methylation unanswered. Using a large cohort of matching DNA methylation and gene expression samples of colorectal cancer (CRC; n = 77) and normal adjacent mucosa tissues (n = 108), we investigated the regulatory role of methylation on gene expression. We show that on a subset of genes enriched in common cancer pathways, methylation is significantly associated with gene regulation through gene-specific mechanisms. We built two classification models to infer gene regulation in CRC from methylation differences of tumor and normal tissues, taking into account both gene-silencing and gene-activation effects through hyper- and hypo-methylation of CpGs. The classification models result in high prediction performances in both training and independent CRC testing cohorts (0.92<AUC<0.97) as well as in individual patient data (average AUC = 0.82), suggesting a robust interplay between methylation and gene regulation. Validation analysis in other cancerous tissues resulted in lower prediction performances (0.69<AUC<0.90); however, it identified genes that share robust dependencies across cancerous tissues. In conclusion, we present a robust classification approach that predicts the gene-specific regulation through DNA methylation in CRC tissues with possible transition to different cancer entities. Furthermore, we present HMGA1 as consistently associated with methylation across cancers, suggesting a potential candidate for DNA methylation targeting cancer therapy.
12.
13.
E. B. Kuznetsova T. V. Kekeeva S. S. Larin V. V. Zemlyakova O. V. Babenko M. V. Nemtsova D. V. Zaletayev V. V. Strelnikov 《Molecular Biology》2007,41(4):562-570
An optimized methylation-sensitive restriction fingerprinting technique was used to search for differentially methylated CpG islands in the tumor genome and detected seven genes subject to abnormal epigenetic regulation in breast cancer: SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4, and PSMF1. For each gene, the rate of promoter methylation and changes in expression were estimated in tumor and morphologically intact paired specimens of breast tissue (N = 100). Significant methylation rates of 38, 18, and 8% were found for SEMA6B, BIN1, and LAMC3, respectively. The genes were not methylated in morphologically intact breast tissue. The expression of SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4, and PSMF1 was decreased in 44–94% of tumor specimens by the real-time RT-PCR assay. The most profound changes in SEMA6B and LAMC3 suggest that these genes can be included in biomarker panels for breast cancer diagnosis. Fine methylation mapping of the most frequently methylated CpG islands (SEMA6B, BIN1, and LAMC3) provides a fundamental basis for developing efficient methylation tests for these genes.
14.
Background
Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effect of disease-gene identification varies in different text mining models. Thus, the idea of incorporating more text mining models may be beneficial to obtain more refined and accurate knowledge. However, how to effectively combine these models still remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model.
15.
16.
17.
Age-dependent methylation of ESR1 gene in prostate cancer (cited by: 4 total, 0 self-citations, 4 by others)
Li LC Shiina H Deguchi M Zhao H Okino ST Kane CJ Carroll PR Igawa M Dahiya R 《Biochemical and biophysical research communications》2004,321(2):455-461
The incidence of prostate cancer increases dramatically with age and the mechanism underlying this association is unclear. Age-dependent methylation of estrogen receptor alpha (ESR1) gene has been previously implicated in other cancerous and benign diseases. We evaluated the age-dependent methylation of ESR1 in prostate cancer. The methylation status of ESR1 in 83 prostate cancer samples from patients aged 49 to 77 years (mean age at 67.4 years) was examined using the bisulfite genomic sequencing technique. The samples were divided into three age groups: men aged 60 years and under (n = 14), men aged 61-70 years (n = 40), and men aged over 70 years (n = 29). Overall, ESR1 promoter methylation was detected in 54 out of 83 (65.1%) prostate samples. The methylation rate of ESR1 increased dramatically with age from 50.0% in patients aged 60 years and under to 89.7% for patients aged 70 years and over. Logistic regression analyses revealed that age and Gleason score were the only variables that affect incidence of ESR1 methylation; other clinical factors such as prostate-specific antigen level and clinical stage did not. We also calculated ESR1 methylation density (the percentage of methylated CpGs among all CpGs within the analyzed region) and severity (the percentage of methylated CpG alleles) for each sample analyzed. Multiple regression analyses showed a positive correlation between age and methylation density (beta, 0.35; P, 0.012; 95% CI, 0.26-2.01); while Gleason score was positively associated with methylation severity (beta, 0.45; P, 0.018; 95% CI, 1.04-4.26). These findings suggest that methylation of ESR1 is both age-dependent and tumor differentiation-dependent and age-dependent methylation of ESR1 may represent a mechanism linking aging and prostate cancer.
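The two per-sample measures defined in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes that bisulfite sequencing results for one sample are stored as a list of sequenced alleles (clones), each a list of 0/1 calls per CpG site. The function names and the example data are hypothetical, and reading "severity" as the share of alleles carrying at least one methylated CpG is an interpretation of the abstract's wording, not something the abstract confirms.

```python
from typing import List


def methylation_density(calls: List[List[int]]) -> float:
    """Percentage of methylated CpG calls among all CpG calls in the region.

    `calls` holds one row per sequenced allele; each entry is 1 (methylated)
    or 0 (unmethylated) for one CpG site. This layout is an assumption.
    """
    total = sum(len(row) for row in calls)
    if total == 0:
        raise ValueError("no CpG calls provided")
    methylated = sum(sum(row) for row in calls)
    return 100.0 * methylated / total


def methylation_severity(calls: List[List[int]]) -> float:
    """Percentage of alleles with at least one methylated CpG.

    Assumed interpretation of "percentage of methylated CpG alleles".
    """
    if not calls:
        raise ValueError("no alleles provided")
    methylated_alleles = sum(1 for row in calls if any(row))
    return 100.0 * methylated_alleles / len(calls)


# Hypothetical sample: 4 sequenced alleles, 4 CpG sites each.
example = [
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
]

density = methylation_density(example)    # 7 of 16 calls methylated
severity = methylation_severity(example)  # 3 of 4 alleles methylated
print(f"density = {density:.2f}%, severity = {severity:.2f}%")
```

Under these assumptions the two measures can diverge: a sample with many lightly methylated alleles has high severity but low density, which is one reason they might associate with different clinical variables.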
18.
Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT) is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. Report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians.
19.
The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically distill information, extract facts, discover implicit links and generate hypotheses relevant to user needs. Ontologies, as conceptual models, provide the necessary framework for semantic representation of textual information. The principal link between text and an ontology is terminology, which maps terms to domain-specific concepts. This paper summarises different approaches in which ontologies have been used for text-mining applications in biomedicine.
20.
Jung-Hoon Park Jinah Park Jung Kyoon Choi Jaemyun Lyu Min-Gyun Bae Young-Gun Lee Jae-Bum Bae Dong Yoon Park Han-Kwang Yang Tae-You Kim Young-Joon Kim 《BMC medical genomics》2011,4(1):1-15