Similar articles
1.
Current advances in high-throughput biology are accompanied by a tremendous increase in the number of related publications. Much biomedical information is reported in the vast amount of literature. The ability to rapidly and effectively survey the literature is necessary for both the design and the interpretation of large-scale experiments, and for curation of structured biomedical knowledge in public databases. Given the millions of published documents, the field of information retrieval, which is concerned with the automatic identification of relevant documents from large text collections, has much to offer. This paper introduces the basics of information retrieval, discusses its applications in biomedicine, and presents traditional and non-traditional ways in which it can be used.

2.
3.
A report of the 6th Georgia Tech-Oak Ridge National Lab International Conference on Bioinformatics, "In silico Biology: Gene Discovery and Systems Genomics", Atlanta, USA, 15-17 November 2007. Technological developments have had a profound impact on biology during the past decade, spectacularly augmenting our ability to survey and interrogate biological phenomena. In particular, they have increased capacity for data generation by several orders of magnitude and made computation a necessary partner of biology. The sixth meeting in the biennial series of bioinformatics conferences co-sponsored by Georgia Institute of Technology in Atlanta and the Oak Ridge National Laboratory addressed the challenges that this technology-driven avalanche of data poses to bioinformatics: increasing the complexity of longstanding problems and creating new ones.

4.

Background

Thyroid cancer is the most common endocrine tumor, and its incidence is rising steadily. It is classified into multiple histopathological subtypes with potentially distinct molecular mechanisms. Identifying the most relevant genes and biological pathways reported in the thyroid cancer literature is vital for understanding the disease and developing targeted therapeutics.

Results

We developed a large-scale text mining system to generate a molecular profile of thyroid cancer subtypes. The system first applies a subtype classification method to the thyroid cancer literature, using a scoring scheme to assign subtypes to articles. We evaluated the classification method on a gold standard derived from the PubMed Supplementary Concept annotations, achieving a micro-average F1-score of 85.9% for primary subtypes. We then used the subtype classification results to extract genes and pathways associated with different thyroid cancer subtypes and successfully unveiled important genes and pathways, including some instances that are missing from current manually annotated databases or the most recent review articles.

Conclusions

Identification of key genes and pathways plays a central role in understanding the molecular biology of thyroid cancer. An integration of subtype context can allow prioritized screening for diagnostic biomarkers and novel molecular targeted therapeutics. Source code used for this study is made freely available online at https://github.com/chengkun-wu/GenesThyCan.
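The micro-average F1-score reported in the Results pools true positives, false positives, and false negatives over all subtypes before computing precision and recall. The following is a minimal sketch of that metric only, not the authors' evaluation code; the function name `micro_f1`, the article IDs, and the toy subtype labels are assumptions made for illustration.

```python
from typing import Dict, Set


def micro_f1(gold: Dict[str, Set[str]], predicted: Dict[str, Set[str]]) -> float:
    """Micro-averaged F1 over multi-label assignments keyed by article ID."""
    tp = fp = fn = 0
    # Pool the counts over every article seen in either mapping.
    for article_id in gold.keys() | predicted.keys():
        g = gold.get(article_id, set())
        p = predicted.get(article_id, set())
        tp += len(g & p)   # labels assigned and correct
        fp += len(p - g)   # labels assigned but not in the gold standard
        fn += len(g - p)   # gold labels that were missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical article IDs and labels, not taken from the study).
gold = {"a1": {"papillary"}, "a2": {"follicular", "papillary"}, "a3": {"medullary"}}
predicted = {"a1": {"papillary"}, "a2": {"follicular"}, "a3": {"anaplastic"}}
score = micro_f1(gold, predicted)
```

Because the counts are pooled, frequent subtypes weigh more heavily in the score than rare ones; a macro-average would instead average per-subtype F1 values.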

5.
6.
More than one neoplastic founder clone can exist in benign epithelial tumours. Although theories of clonal selection make pluriclonality appear unlikely in carcinomas, published data do not exclude this possibility. This study looked for evidence of multiclonal X inactivation in ovarian carcinoma using AR methylation as a marker. Fifteen unifocal ovarian carcinomas and 14 multifocal carcinomas, all in Scottish patients, were studied. One representative formalin-fixed paraffin-embedded tumour block was chosen for each of the former and two for the latter. From each of these 43 tumour blocks, three samples each of approximately 10^4 carcinoma cells were obtained by microdissection (129 in all). DNA released by proteinase K digestion was subjected to PCR amplification of the androgen receptor gene AR exon I CAG repeat polymorphism with and without prior digestion with the methylation-sensitive restriction enzymes HpaII and HhaI. Complex amplification patterns were consistent with mosaic X inactivation in some ovarian carcinomas, but acquired anomalies of AR methylation cannot be excluded. Parallel analysis of other X-linked polymorphic loci would strengthen the inference of clonality status from DNA methylation data in tumour X studies. Strikingly, the number of CAG repeats in the 29 ovarian tumour patients (median 16, range 11-20) was substantially lower than in 34 previously studied breast cancer patients from the same Scottish population (median 21, range 14-26; P < 0.0001), and women homozygous for the AR CAG repeat were over-represented in the ovarian cancer patients but not in the breast cancer series. These findings reinforce recent suggestions that AR may have a role in ovarian carcinogenesis.

7.
Epigenetic alterations are a common event in lung cancer and their identification can serve to inform on the carcinogenic process and provide clinically relevant biomarkers. Using paired tumor and non-tumor lung tissues from 146 individuals from three independent populations we sought to identify common changes in DNA methylation associated with the development of non-small cell lung cancer. Pathologically normal lung tissue taken at the time of cancer resection was matched to tumorous lung tissue and together were probed for methylation using Illumina GoldenGate arrays in the discovery set (n = 47 pairs) followed by bisulfite pyrosequencing for validation sets (n = 99 pairs). For each matched pair the change in methylation at each CpG was calculated (the odds ratio), and these ratios were averaged across individuals and ranked by magnitude to identify the CpGs with the greatest change in methylation associated with tumor development. We identified the top gene-loci representing an increase in methylation (HOXA9, 10.3-fold and SOX1, 5.9-fold) and decrease in methylation (DDR1, 8.1-fold). In replication testing sets, methylation was higher in tumors for HOXA9 (p < 2.2 × 10^-16) and SOX1 (p < 2.2 × 10^-16) and lower for DDR1 (p < 2.2 × 10^-16). The magnitude and strength of these changes were consistent across squamous cell and adenocarcinoma tumors. Our data indicate that the identified genes consistently have altered methylation in lung tumors. Our identified genes should be included in translational studies that aim to develop screening for early disease detection.
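The per-pair odds ratio and ranking step described in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the study's pipeline: methylation is taken to be a fraction between 0 and 1, the averaging is done on the log2 scale (the abstract says only that the ratios were averaged), and the names `odds`, `rank_cpgs`, and the CpG identifiers are hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List, Tuple


def odds(beta: float, eps: float = 1e-6) -> float:
    """Convert a methylation fraction (0..1) to odds, clamped away from 0 and 1."""
    b = min(max(beta, eps), 1.0 - eps)
    return b / (1.0 - b)


def rank_cpgs(pairs: Dict[str, List[Tuple[float, float]]]) -> List[Tuple[str, float]]:
    """Rank CpGs by mean log2 odds ratio (tumor vs. normal) across matched pairs.

    `pairs` maps a CpG ID to (tumor, normal) methylation fractions, one tuple
    per individual. Averaging on the log scale is an assumption of this sketch.
    """
    scores = {
        cpg: mean(math.log2(odds(t) / odds(n)) for t, n in samples)
        for cpg, samples in pairs.items()
    }
    # Largest absolute change first; the sign gives the direction of change.
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Toy data (hypothetical CpG IDs and values, not from the study).
toy = {
    "cpg_A": [(0.8, 0.2), (0.8, 0.2)],  # gains methylation in tumor
    "cpg_B": [(0.5, 0.5), (0.5, 0.5)],  # unchanged
    "cpg_C": [(0.2, 0.5), (0.2, 0.5)],  # loses methylation in tumor
}
ranked = rank_cpgs(toy)
```

A positive score marks higher methylation in tumor tissue and a negative score marks lower methylation, so ranking by absolute value surfaces both kinds of change.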

8.
Frontiers of biomedical text mining: current progress   Total citations: 3, self-citations: 0, citations by others: 3
It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a number of problems at the frontiers of biomedical text mining continue to present interesting challenges and opportunities for great improvements and interesting research. In this article we review the current state of the art in biomedical text mining or 'BioNLP' in general, focusing primarily on papers published within the past year.

9.
We have developed a modification of methylation-sensitive arbitrarily primed PCR, one of the methods for screening cancer cell genomes for differentially methylated CpG islands. Seven genes undergoing abnormal epigenetic regulation in breast cancer, SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4 and PSMF1, have been identified by this method. Methylation and loss-of-expression frequencies were evaluated for each of the identified genes in 100 paired (cancer/morphologically intact control) breast tissue samples. Significant frequencies of abnormal methylation were detected for SEMA6B, BIN1, and LAMC3 (38%, 18%, and 8%, respectively). Methylation of these genes was not characteristic of morphologically intact breast tissues. Downregulation of SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4 and PSMF1 in breast cancer was found in 44-94% of samples by real-time PCR expression assay. The most pronounced functional alterations were demonstrated for the SEMA6B and LAMC3 genes, which supports recommending their inclusion in diagnostic panels for carcinogenesis. Fine methylation mapping was performed for the genes most frequently methylated in breast cancer (SEMA6B, BIN1, LAMC3), providing a fundamental basis for the development of effective methylation tests for these genes.

10.
11.
DNA methylation is recognized as one of several epigenetic regulators of gene expression and as a potential driver of carcinogenesis through gene-silencing of tumor suppressors and activation of oncogenes. However, abnormal methylation, even of promoter regions, does not necessarily alter gene expression levels, especially if the gene is already silenced, leaving the exact mechanisms of methylation unresolved. Using a large cohort of matching DNA methylation and gene expression samples of colorectal cancer (CRC; n = 77) and normal adjacent mucosa tissues (n = 108), we investigated the regulatory role of methylation on gene expression. We show that on a subset of genes enriched in common cancer pathways, methylation is significantly associated with gene regulation through gene-specific mechanisms. We built two classification models to infer gene regulation in CRC from methylation differences of tumor and normal tissues, taking into account both gene-silencing and gene-activation effects through hyper- and hypo-methylation of CpGs. The classification models achieve high prediction performance in both training and independent CRC testing cohorts (0.92 < AUC < 0.97) as well as in individual patient data (average AUC = 0.82), suggesting a robust interplay between methylation and gene regulation. Validation analysis in other cancerous tissues resulted in lower prediction performance (0.69 < AUC < 0.90); however, it identified genes that share robust dependencies across cancerous tissues. In conclusion, we present a robust classification approach that predicts the gene-specific regulation through DNA methylation in CRC tissues with possible transition to different cancer entities. Furthermore, we present HMGA1 as consistently associated with methylation across cancers, suggesting a potential candidate for DNA methylation-targeting cancer therapy.

12.
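Wang Libo, Wang Fang, Zhang Yan. 《生物信息学》, 2014, 12(3): 213-217
DNA methylation is one of the important epigenetic marks and plays a direct role in transcriptional regulation. Aberrant DNA methylation is closely associated with the onset and progression of cancer. High-throughput sequencing has made it possible to measure genome-wide DNA methylation levels at single-base resolution. In this paper, DNA methylation linkage regions are mined on the basis of the correlation between the methylation levels of neighboring CpG sites. The results show that the methylation levels and patterns of these linkage regions are aberrant in cancer and that the regions are significantly enriched for biological functions related to differentiation and development. Mining DNA methylation linkage regions contributes to a better understanding of epigenetic marks with biological function and to the discovery of epigenetic markers for cancer diagnosis.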
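One way to picture the mining of linkage regions from neighboring CpG correlation is the sketch below. It is an assumption-laden illustration, not the paper's algorithm: the abstract does not state the correlation measure or any cut-offs, so Pearson correlation, the threshold `min_r = 0.8`, the minimum region size `min_sites = 3`, and the function names are all hypothetical choices.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def linked_regions(
    levels: List[Sequence[float]], min_r: float = 0.8, min_sites: int = 3
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of consecutive CpG sites.

    `levels[i]` holds the methylation levels of CpG site i across samples,
    with sites ordered by genomic position. A region grows while each
    adjacent pair of sites has correlation >= min_r (hypothetical cut-off).
    """
    regions = []
    start = 0
    for i in range(1, len(levels) + 1):
        linked = i < len(levels) and pearson(levels[i - 1], levels[i]) >= min_r
        if not linked:
            # Close the current run and keep it if it is long enough.
            if i - start >= min_sites:
                regions.append((start, i - 1))
            start = i
    return regions


# Toy data: five CpG sites measured in three samples (values are invented).
levels = [
    [0.10, 0.50, 0.90],
    [0.20, 0.60, 0.80],
    [0.15, 0.55, 0.95],
    [0.90, 0.10, 0.50],
    [0.50, 0.50, 0.50],
]
regions = linked_regions(levels)
```

In practice a distance limit between neighboring CpGs would probably also be needed, since correlation between sites that are far apart on the genome is less likely to reflect local linkage.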

13.
An optimized methylation-sensitive restriction fingerprinting technique was used to search for differentially methylated CpG islands in the tumor genome and detected seven genes subject to abnormal epigenetic regulation in breast cancer: SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4, and PSMF1. For each gene, the rate of promoter methylation and changes in expression were estimated in tumor and morphologically intact paired specimens of breast tissue (N = 100). Significant methylation rates of 38, 18, and 8% were found for SEMA6B, BIN1, and LAMC3, respectively. The genes were not methylated in morphologically intact breast tissue. The expression of SEMA6B, BIN1, VCPIP1, LAMC3, KCNH2, CACNG4, and PSMF1 was decreased in 44-94% of tumor specimens, as measured by real-time RT-PCR. The most profound changes in SEMA6B and LAMC3 suggest that these genes can be included in biomarker panels for breast cancer diagnosis. Fine methylation mapping of the most frequently methylated CpG islands (SEMA6B, BIN1, and LAMC3) provides a fundamental basis for developing efficient methylation tests for these genes.

14.

Background  

Text mining has become a useful tool for biologists trying to understand the genetics of diseases. In particular, it can help identify the most interesting candidate genes for a disease for further experimental analysis. Many text mining approaches have been introduced, but the effectiveness of disease-gene identification varies across text mining models. Combining several text mining models may therefore yield more refined and accurate knowledge. However, how to combine these models effectively remains a challenging question in machine learning. In particular, it is a non-trivial issue to guarantee that the integrated model performs better than the best individual model.

15.
16.
17.
Age-dependent methylation of ESR1 gene in prostate cancer   Total citations: 4, self-citations: 0, citations by others: 4
The incidence of prostate cancer increases dramatically with age and the mechanism underlying this association is unclear. Age-dependent methylation of the estrogen receptor alpha (ESR1) gene has been previously implicated in other cancerous and benign diseases. We evaluated the age-dependent methylation of ESR1 in prostate cancer. The methylation status of ESR1 in 83 prostate cancer samples from patients aged 49 to 77 years (mean age 67.4 years) was examined using the bisulfite genomic sequencing technique. The samples were divided into three age groups: men aged 60 years and under (n = 14), men aged 61-70 years (n = 40), and men aged over 70 years (n = 29). Overall, ESR1 promoter methylation was detected in 54 out of 83 (65.1%) prostate samples. The methylation rate of ESR1 increased dramatically with age, from 50.0% in patients aged 60 years and under to 89.7% in patients aged over 70 years. Logistic regression analyses revealed that age and Gleason score were the only variables that affected the incidence of ESR1 methylation; other clinical factors such as prostate-specific antigen level and clinical stage did not. We also calculated ESR1 methylation density (the percentage of methylated CpGs among all CpGs within the analyzed region) and severity (the percentage of methylated CpG alleles) for each sample analyzed. Multiple regression analyses showed a positive correlation between age and methylation density (beta, 0.35; P, 0.012; 95% CI, 0.26-2.01), while Gleason score was positively associated with methylation severity (beta, 0.45; P, 0.018; 95% CI, 1.04-4.26). These findings suggest that methylation of ESR1 is both age-dependent and tumor differentiation-dependent, and that age-dependent methylation of ESR1 may represent a mechanism linking aging and prostate cancer.

18.
Innovative biomedical librarians and information specialists who want to expand their roles as expert searchers need to know about profound changes in biology and parallel trends in text mining. In recent years, conceptual biology has emerged as a complement to empirical biology. This is partly in response to the availability of massive digital resources such as the network of databases for molecular biologists at the National Center for Biotechnology Information. Developments in text mining and hypothesis discovery systems based on the early work of Swanson, a mathematician and information scientist, are coincident with the emergence of conceptual biology. Very little has been written to introduce biomedical digital librarians to these new trends. In this paper, background for data and text mining, as well as for knowledge discovery in databases (KDD) and in text (KDT), is presented, then a brief review of Swanson's ideas, followed by a discussion of recent approaches to hypothesis discovery and testing. 'Testing' in the context of text mining involves partially automated methods for finding evidence in the literature to support hypothetical relationships. Concluding remarks follow regarding (a) the limits of current strategies for evaluation of hypothesis discovery systems and (b) the role of literature-based discovery in concert with empirical research. A report of an informatics-driven literature review for biomarkers of systemic lupus erythematosus is mentioned. Swanson's vision of the hidden value in the literature of science and, by extension, in biomedical digital databases, is still remarkably generative for information scientists, biologists, and physicians.

19.
Text mining and ontologies in biomedicine: making sense of raw text   Total citations: 1, self-citations: 0, citations by others: 1
The volume of biomedical literature is increasing at such a rate that it is becoming difficult to locate, retrieve and manage the reported information without text mining, which aims to automatically distill information, extract facts, discover implicit links and generate hypotheses relevant to user needs. Ontologies, as conceptual models, provide the necessary framework for semantic representation of textual information. The principal link between text and an ontology is terminology, which maps terms to domain-specific concepts. This paper summarises different approaches in which ontologies have been used for text-mining applications in biomedicine.

20.

Background

Multiple breast cancer gene expression profiles have been developed that appear to provide similar abilities to predict outcome and may outperform clinical-pathologic criteria; however, the extent to which seemingly disparate profiles provide additive prognostic information is not known, nor do we know whether prognostic profiles perform equally across clinically defined breast cancer subtypes. We evaluated whether combining the prognostic powers of standard breast cancer clinical variables with a large set of gene expression signatures could improve on our ability to predict patient outcomes.

Methods

Using clinical-pathological variables and a collection of 323 gene expression "modules", including 115 previously published signatures, we built multivariate Cox proportional hazards models using a dataset of 550 node-negative systemically untreated breast cancer patients. Models predictive of pathological complete response (pCR) to neoadjuvant chemotherapy were also built using this approach.

Results

We identified statistically significant prognostic models for relapse-free survival (RFS) at 7 years for the entire population, and for the subgroups of patients with ER-positive, or Luminal tumors. Furthermore, we found that combined models that included both clinical and genomic parameters improved prognostication compared with models with either clinical or genomic variables alone. Finally, we were able to build statistically significant combined models for pathological complete response (pCR) predictions for the entire population.

Conclusions

Integration of gene expression signatures and clinical-pathological factors improves on either variable type alone. Highly prognostic models could be created when using all patients, and for the subset of patients with lymph node-negative and ER-positive breast cancers. Other variables beyond gene expression and clinical-pathological variables, like gene mutation status or DNA copy number changes, will be needed to build robust prognostic models for ER-negative breast cancer patients. This combined clinical and genomics model approach can also be used to build predictors of therapy responsiveness, and could ultimately be applied to other tumor types.
