Similar articles
 20 similar articles found (search time: 15 ms)
1.
caCORE: a common infrastructure for cancer informatics   Total citations: 4 (self-citations: 0, citations by others: 4)
MOTIVATION: Sites with substantive bioinformatics operations are challenged to build data processing and delivery infrastructure that provides reliable access and enables data integration. Locally generated data must be processed and stored such that relationships to external data sources can be presented. Consistency and comparability across data sets require annotation with controlled vocabularies and, further, metadata standards for data representation. Programmatic access to the processed data should be supported to ensure the maximum possible value is extracted. Confronted with these challenges at the National Cancer Institute Center for Bioinformatics, we decided to develop a robust infrastructure for data management and integration that supports advanced biomedical applications. RESULTS: We have developed an interconnected set of software and services called caCORE. Enterprise Vocabulary Services (EVS) provide controlled vocabulary, dictionary and thesaurus services. The Cancer Data Standards Repository (caDSR) provides a metadata registry for common data elements. Cancer Bioinformatics Infrastructure Objects (caBIO) implements an object-oriented model of the biomedical domain and provides Java, Simple Object Access Protocol and HTTP-XML application programming interfaces. caCORE has been used to develop scientific applications that bring together data from distinct genomic and clinical science sources. AVAILABILITY: caCORE downloads and web interfaces can be accessed from links on the caCORE web site (http://ncicb.nci.nih.gov/core). caBIO software is distributed under an open source license that permits unrestricted academic and commercial use. Vocabulary and metadata content in the EVS and caDSR, respectively, is similarly unrestricted, and is available through web applications and FTP downloads.
SUPPLEMENTARY INFORMATION: http://ncicb.nci.nih.gov/core/publications contains links to the caBIO 1.0 class diagram and the caCORE 1.0 Technical Guide, which provide detailed information on the present caCORE architecture, data sources and APIs. Updated information appears on a regular basis on the caCORE web site (http://ncicb.nci.nih.gov/core).

2.
3.
PROSIT (Pseudo-Rotational Online Service and Interactive Tool) is a free online service, available at http://cactus.nci.nih.gov/prosit/, that performs pseudorotational analysis of nucleosides and nucleotides. PROSIT reads the 3D coordinates of nucleosides and returns the pseudorotational phase angle P, puckering amplitude νmax, and other related information. As examples, the sugar conformations in a parallel-stranded guanine tetraplex and a four-way Holliday junction are presented here.
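The abstract does not say how P and νmax are derived. The following is a minimal sketch, assuming the standard Altona-Sundaralingam relations tan P = ((ν4 + ν1) − (ν3 + ν0)) / (2 ν2 (sin 36° + sin 72°)) and νmax = ν2 / cos P, and assuming the five endocyclic sugar torsions ν0 to ν4 have already been measured from the coordinates. The function names are hypothetical, and this is not PROSIT's own code.

```python
import math

# Constant factor sin 36 deg + sin 72 deg from the Altona-Sundaralingam relation.
_S = math.sin(math.radians(36.0)) + math.sin(math.radians(72.0))


def pseudorotation(nu0, nu1, nu2, nu3, nu4):
    """Return (P, nu_max) in degrees from the five ring torsions in degrees.

    Assumed torsion definitions (not stated in the abstract):
    nu0 = C4'-O4'-C1'-C2', nu1 = O4'-C1'-C2'-C3', nu2 = C1'-C2'-C3'-C4',
    nu3 = C2'-C3'-C4'-O4', nu4 = C3'-C4'-O4'-C1'.
    """
    num = (nu4 + nu1) - (nu3 + nu0)
    den = 2.0 * nu2 * _S
    # atan2 picks the quadrant in which cos P has the sign of nu2,
    # which is the usual "add 180 degrees when nu2 < 0" rule.
    phase = math.degrees(math.atan2(num, den)) % 360.0
    # Algebraically identical to nu2 / cos(P), but stable near P = 90 or 270.
    nu_max = math.hypot(num, den) / (2.0 * _S)
    return phase, nu_max


def ideal_torsions(phase, nu_max):
    """Torsions of an ideal ring: nu_j = nu_max * cos(P + 144 deg * (j - 2))."""
    return [
        nu_max * math.cos(math.radians(phase + 144.0 * (j - 2)))
        for j in range(5)
    ]
```

Feeding `ideal_torsions(18.0, 38.0)` back into `pseudorotation` should recover a phase near 18° (the C3'-endo region) and an amplitude near 38°. Torsions from real structures deviate from the ideal cosine form, so the recovered values are then approximations.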

4.
MOTIVATION: Microarray techniques provide a valuable way of characterizing the molecular nature of disease. Unfortunately, expense and limited specimen availability often lead to studies with small sample sizes. This makes accurate estimation of variability difficult, since variance estimates made on a gene-by-gene basis will have few degrees of freedom, and the assumption that all genes share equal variance is unlikely to be true. RESULTS: We propose a model by which the within-gene variances are drawn from an inverse gamma distribution, whose parameters are estimated across all genes. This results in a test statistic that is a minor variation of those used in standard linear models. We demonstrate that the model assumptions are valid on experimental data, and that the model has more power than standard tests to pick up large changes in expression, while not increasing the rate of false positives. AVAILABILITY: This method is incorporated into BRB-ArrayTools version 3.0 (http://linus.nci.nih.gov/BRB-ArrayTools.html). SUPPLEMENTARY MATERIAL: ftp://linus.nci.nih.gov/pub/techreport/RVM_supplement.pdf
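To illustrate how an inverse gamma prior turns into a "minor variation" of the usual t-statistic, here is a minimal two-group sketch. It assumes the parameterization 1/σ² ~ Gamma(shape a, scale b), under which the pooled variance with d degrees of freedom is shrunk to (d·s² + 2/b) / (d + 2a) and the statistic is referred to a t distribution with d + 2a degrees of freedom. The function name `rvm_t` is hypothetical, and a and b are taken as given here rather than estimated across all genes as in the abstract. This is not the BRB-ArrayTools implementation.

```python
import math
from statistics import mean, variance


def rvm_t(group1, group2, a, b):
    """Two-sample t-statistic with a variance shrunk toward an inverse gamma prior.

    a, b: assumed prior parameters, with 1/sigma^2 ~ Gamma(shape=a, scale=b).
    Returns (t, degrees_of_freedom).
    """
    n1, n2 = len(group1), len(group2)
    d = n1 + n2 - 2  # degrees of freedom of the ordinary pooled variance
    pooled = ((n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)) / d
    # Weighted average of the gene's own variance and the prior value 1/(a*b):
    # (d * s^2 + 2a * 1/(a*b)) / (d + 2a)
    shrunk = (d * pooled + 2.0 / b) / (d + 2.0 * a)
    t = (mean(group1) - mean(group2)) / math.sqrt(shrunk * (1.0 / n1 + 1.0 / n2))
    return t, d + 2.0 * a
```

With three samples per group the ordinary test has only 4 degrees of freedom. In this sketch the prior contributes 2a more, which is where the extra power for small studies would come from.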

5.
MOTIVATION: Business Architecture Models (BAMs) describe what a business does, who performs the activities, where and when activities are performed, how activities are accomplished and which data are present. The purpose of a BAM is to provide a common resource for understanding business functions and requirements and to guide software development. The cancer Biomedical Informatics Grid (caBIG®) Life Science BAM (LS BAM) provides a shared understanding of the vocabulary, goals and processes that are common in the business of LS research. RESULTS: LS BAM 1.1 includes 90 goals and 61 people and groups within Use Case and Activity Unified Modeling Language (UML) Diagrams. Here we report on the model's current release, LS BAM 1.1, its utility and usage, and plans for future use and continuing development for future releases. AVAILABILITY AND IMPLEMENTATION: The LS BAM is freely available as UML, PDF and HTML (https://wiki.nci.nih.gov/x/OFNyAQ).

6.
SUMMARY: VISDA (Visual Statistical Data Analyzer) is a caBIG analytical tool for cluster modeling, visualization and discovery that has met silver-level compatibility under the caBIG initiative. Being statistically principled and visually interfaced, VISDA exploits both hierarchical statistical modeling and the human gift for pattern recognition to allow a progressive yet interactive discovery of hidden clusters within high-dimensional and complex biomedical datasets. The distinctive features of VISDA are particularly useful for users across the cancer research and broader research communities to analyze complex biological data. AVAILABILITY: http://gforge.nci.nih.gov/projects/visda/

7.
Validation of clinical biomarkers and response to therapy is a challenging topic in cancer research. An important source of information for virtual validation is the datasets generated from multi-center cancer research projects such as The Cancer Genome Atlas project (TCGA). These data enable investigation of genetic and epigenetic changes responsible for cancer onset and progression, response to cancer therapies, and discovery of the molecular profiles of various cancers. However, these analyses often require bulk download of data and substantial bioinformatics expertise, which can be intimidating for investigators. Here, we report on the development of a new resource available to scientists: a database called Glioblastoma Bio Discovery Portal (GBM-BioDP). GBM-BioDP is a free web-accessible resource that hosts a subset of the glioblastoma TCGA data and enables an intuitive query and interactive display of the resultant data. This resource provides visualization tools for the exploration of gene, miRNA, and protein expression, differential expression within the subtypes of GBM, and potential associations with clinical outcome, which are useful for virtual biological validation. The tool may also enable generation of hypotheses on how therapies impact GBM molecular profiles, which can help in personalization of treatment for optimal outcome. The resource can be accessed freely at http://gbm-biodp.nci.nih.gov (a tutorial is included).

8.
9.

Background  

In the biological and medical domains, web services have made data and computational functionality accessible in a unified manner, which has helped automate data pipelines that were previously run manually. Workflow technology is widely used to orchestrate multiple services to facilitate in-silico research. The Cancer Biomedical Informatics Grid (caBIG) is an information network enabling the sharing of cancer research related resources, and caGrid is its underlying service-based computation infrastructure. caBIG requires that services be composed and orchestrated in a given sequence to realize data pipelines, which are often called scientific workflows.

10.
Gene Set Expression Comparison kit for BRB-ArrayTools   Total citations: 1 (self-citations: 0, citations by others: 1)

11.
MOTIVATION: A plugin for the Java-based PathVisio pathway editor has been developed to help users draw diagrams of bioregulatory networks according to the Molecular Interaction Map (MIM) notation. Together with the core PathVisio application, this plugin presents a simple-to-use and cross-platform application for the construction of complex MIM diagrams with the ability to annotate diagram elements with comments, literature references and links to external databases. This tool extends the capabilities of the PathVisio pathway editor by providing both MIM-specific glyphs and support for a MIM-specific markup language file format for exchange with other MIM-compatible tools and diagram validation. AVAILABILITY: The PathVisio-MIM plugin is freely available and works with versions of PathVisio 2.0.11 and later on Windows, Mac OS X and Linux. Information about MIM notation and the MIMML format is available at http://discover.nci.nih.gov/mim. The plugin, along with diagram examples, instructions and Java source code, may be downloaded at http://discover.nci.nih.gov/mim/mim_pathvisio.html.

12.
The NCBI (National Center for Biotechnology Information) at the National Institutes of Health collects a wide range of molecular biological data, and develops tools and databases to analyse and disseminate this information. Many life scientists are familiar with the website maintained by the NCBI (http://www.ncbi.nlm.nih.gov), because they use it to search GenBank for homologues of their genes of interest or to search the PubMed database for scientific literature of interest. There is also a database called the Bookshelf that includes searchable popular life science textbooks, medical and research reference books and NCBI reference materials. The Bookshelf can be useful for researchers and educators to find basic biological information. This article includes a representative list of the resources currently available on the Bookshelf, as well as instructions on how to access the information in these resources.

13.
SUMMARY: Bambino is a variant detector and graphical alignment viewer for next-generation sequencing data in the SAM/BAM format, which is capable of pooling data from multiple source files. The variant detector takes advantage of SAM-specific annotations, and produces detailed output suitable for genotyping and identification of somatic mutations. The assembly viewer can display reads in the context of either a user-provided or automatically generated reference sequence, retrieve genome annotation features from a UCSC genome annotation database, display histograms of non-reference allele frequencies, and predict protein-coding changes caused by SNPs. AVAILABILITY: Bambino is written in platform-independent Java and available from https://cgwb.nci.nih.gov/goldenPath/bamview/documentation/index.html, along with documentation and example data. Bambino may be launched online via Java Web Start or downloaded and run locally.

14.
VarSifter is a graphical software tool for desktop computers that allows investigators of varying computational skills to easily and quickly sort, filter, and sift through sequence variation data. A variety of filters and a custom query framework allow filtering based on any combination of sample and annotation information. By simplifying visualization and analyses of exome-scale sequence variation data, this program will help bring the power and promise of massively parallel DNA sequencing to a broader group of researchers. AVAILABILITY AND IMPLEMENTATION: VarSifter is written in Java, and is freely available in source and binary versions, along with a User Guide, at http://research.nhgri.nih.gov/software/VarSifter/.

15.
Application of support vector machines for T-cell epitopes prediction   Total citations: 5 (self-citations: 0, citations by others: 5)
MOTIVATION: The T-cell receptor, a major histocompatibility complex (MHC) molecule, and a bound antigenic peptide play major roles in the process of antigen-specific T-cell activation. T-cell recognition was long considered exquisitely specific. Recent data also indicate that it is highly flexible, and one receptor may recognize thousands of different peptides. Deciphering the patterns of peptides that elicit an MHC-restricted T-cell response is critical for vaccine development. RESULTS: For the first time we develop a support vector machine (SVM) for T-cell epitope prediction with an MHC type I restricted T-cell clone. Using cross-validation, we demonstrate that SVMs can be trained on relatively small data sets to provide predictions more accurate than those based on previously published methods or on MHC binding. SUPPLEMENTARY INFORMATION: Data for 203 synthesized peptides are available at http://linus.nci.nih.gov/Data/LAU203_Peptide.pdf

16.
17.
Combined analysis of microarray and drug-activity datasets has the potential to reveal valuable knowledge about relations among gene expression and drug activity in the malignant cell. In this paper, we apply Bayesian networks, a tool for compact representation of the joint probability distribution, to such analysis. To alleviate the data dimensionality problem, the huge datasets were condensed using a feature abstraction technique. The proposed analysis method was applied to the NCI60 dataset (http://discover.nci.nih.gov), consisting of gene expression profiles and drug activity patterns on human cancer cell lines. The Bayesian networks, learned from the condensed dataset, identified most of the salient pairwise correlations and some known relationships among several features in the original dataset, confirming the effectiveness of the proposed feature abstraction method. Also, a survey of the recent literature confirms that several relationships appearing in the learned Bayesian network are biologically meaningful.

18.
The Comparative Mouse Genomics Centers Consortium (CMGCC) is a branch of the Environmental Genome Project sponsored by the National Institute of Environmental Health Sciences (NIEHS), focusing on the identification of human single nucleotide polymorphisms (SNPs) that may confer disease susceptibility within the human population. The goal of the CMGCC (http://www.niehs.nih.gov/cmgcc/) is to make genetic mouse models for human SNPs within cell cycle control, DNA replication and DNA repair genes that may be associated with human pathologies. In order to facilitate information sharing and analysis within the consortium, a set of informatics resources has been generated to support the mouse model development efforts. The primary entry point for information about the mouse models developed by the consortium is the CMGCC Genotype Database (http://mrages.niehs.nih.gov/genotype/), which maintains both a consortium-specific and a public access display of the available and developing mouse models.

19.
NCBI's LocusLink and RefSeq   Total citations: 1 (self-citations: 0, citations by others: 1)

20.
dbSNP: a database of single nucleotide polymorphisms   Total citations: 12 (self-citations: 0, citations by others: 12)
In response to a need for a general catalog of genome variation to address the large-scale sampling designs required by association studies, gene mapping and evolutionary biology, the National Center for Biotechnology Information (NCBI) has established the dbSNP database. Submissions to dbSNP will be integrated with other sources of information at NCBI such as GenBank, PubMed, LocusLink and the Human Genome Project data. The complete contents of dbSNP are available to the public at http://www.ncbi.nlm.nih.gov/SNP. Submitted SNPs can also be downloaded via anonymous FTP at ftp://ncbi.nlm.nih.gov/snp/
