Similar literature
 Found 20 similar articles (search time: 15 ms)
1.
Systematic extraction of relevant biological facts from the massive scientific knowledge sources now available is emerging as a significant task for the science community. Its success depends on several key factors, including the precision of a given search, the time needed to complete it, and how effectively the mined information is communicated to users. GeneCite, a stand-alone Java-based high-throughput data mining tool, is designed to carry out these tasks for several important knowledge sources simultaneously, allowing users to integrate the results and interpret their biological significance in a time-efficient manner. GeneCite provides an integrated high-throughput search platform that serves as an information retrieval (IR) tool for probing the online literature database (PubMed) and the sequence-tagged sites database (UniSTS). It also operates as a data retrieval (DR) tool to mine an archive of biological pathways integrated into the software itself. Furthermore, GeneCite supports a retrieved data management system (DMS) that presents the final output in a spreadsheet format. Each cell of the output file holds a real-time connection (hyperlink) to the given online archive, reachable at the users' convenience. The software is free and currently available online at www.bioinformatics.org and www.wrair.army.mil/Resources.

2.
A web-based version of the RLIMS-P literature mining system was developed for online mining of protein phosphorylation information from MEDLINE abstracts. The online tool presents extracted phosphorylation objects (phosphorylated proteins, phosphorylation sites and protein kinases) in summary tables and full reports with evidence-tagged abstracts. The tool further allows mapping of phosphorylated proteins to protein entries in the UniProt Knowledgebase based on PubMed ID and/or protein name. The literature mining, coupled with database association, allows retrieval of rich biological information for the phosphorylated proteins and facilitates database annotation of phosphorylation features.

3.
DAGchainer: a tool for mining segmental genome duplications and synteny   (Total citations: 8; self-citations: 0; citations by others: 8)
SUMMARY: Given the positions of protein-coding genes along a genomic sequence and probability values for protein alignments between genes, DAGchainer identifies chains of gene pairs sharing conserved order between genomic regions, by identifying paths through a directed acyclic graph (DAG). These chains of collinear gene pairs can represent segmentally duplicated regions and genes within a single genome or syntenic regions between related genomes. Automated mining of the Arabidopsis genome for segmental duplications illustrates the use of DAGchainer.
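The chaining idea in this abstract can be illustrated with a small sketch. It is not DAGchainer's own code: the function name `longest_collinear_chain`, the `max_gap` parameter, and the use of a plain additive score per gene pair (for example, a value derived from the alignment probability) are all assumptions made for illustration. Each gene pair is a node, an edge links two pairs when both coordinates increase within the allowed gap, and dynamic programming finds the highest-scoring path through the resulting DAG.

```python
def longest_collinear_chain(pairs, max_gap=10):
    """Return the best-scoring chain of collinear gene pairs.

    pairs   -- list of (pos_a, pos_b, score); positions are gene indices in
               two regions, score is a hypothetical per-pair match score.
    max_gap -- hypothetical limit on the index distance between chained pairs.
    """
    if not pairs:
        return []
    pairs = sorted(pairs)                 # sort by position in region A
    best = [p[2] for p in pairs]          # best chain score ending at each pair
    prev = [None] * len(pairs)            # back-pointers for path recovery
    for j, (aj, bj, sj) in enumerate(pairs):
        for i in range(j):
            ai, bi, _ = pairs[i]
            # A DAG edge i -> j exists only if order is conserved in both
            # regions and the two pairs are close enough to each other.
            collinear = ai < aj and bi < bj
            close = aj - ai <= max_gap and bj - bi <= max_gap
            if collinear and close and best[i] + sj > best[j]:
                best[j] = best[i] + sj
                prev[j] = i
    # Walk back from the best-scoring end point.
    end = max(range(len(pairs)), key=best.__getitem__)
    chain = []
    while end is not None:
        chain.append(pairs[end][:2])
        end = prev[end]
    return chain[::-1]


example = [(1, 1, 5.0), (2, 2, 5.0), (3, 9, 1.0), (4, 3, 5.0), (20, 20, 5.0)]
print(longest_collinear_chain(example))
```

In this toy input the pair (3, 9) breaks collinearity with (4, 3) and the pair (20, 20) lies beyond the gap limit, so neither joins the reported chain. A real implementation would also penalize gaps and report several chains rather than only the single best one.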

4.
MBA: a literature mining system for extracting biomedical abbreviations   (Total citations: 1; self-citations: 0; citations by others: 1)

Background  

The explosive growth of the biomedical literature presents many challenges for biological researchers. One such challenge arises from the use of a large number of abbreviations. Extracting abbreviations and their definitions accurately is very helpful to biologists and also facilitates biomedical text analysis. Existing approaches fall into four broad categories: rule based, machine learning based, text alignment based and statistically based. State-of-the-art methods either focus exclusively on acronym-type abbreviations or cannot recognize rare abbreviations. We propose a systematic method to extract abbreviations effectively. First, a scoring method is used to classify the abbreviations into acronym-type and non-acronym-type abbreviations; their corresponding definitions are then identified by two different methods: a text alignment algorithm for the former and a statistical method for the latter.
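As a rough illustration of the acronym-type case only, the sketch below aligns the letters of a parenthesized short form with the initials of the words that precede it. This is a deliberately simple stand-in, not the MBA scoring or alignment algorithm, which the abstract does not specify; the function name `find_acronym_definitions` and the regular expression are assumptions.

```python
import re


def find_acronym_definitions(text):
    """Map each parenthesized acronym to the preceding words whose initials match.

    Hypothetical helper: handles only the simplest acronym pattern, where
    every letter of the short form is the first letter of one word.
    """
    results = {}
    # Short forms: 2 to 10 characters, starting with a capital letter.
    for match in re.finditer(r"\(([A-Z][A-Za-z0-9-]{1,9})\)", text):
        short = match.group(1)
        letters = [c.lower() for c in short if c.isalnum()]
        words = text[:match.start()].split()
        candidate = words[-len(letters):]
        aligned = len(candidate) == len(letters) and all(
            word[0].lower() == letter
            for word, letter in zip(candidate, letters)
        )
        if aligned:
            results[short] = " ".join(candidate)
    return results


sample = ("We modified the Bond Energy Algorithm (BEA). Variants, in particular "
          "single nucleotide polymorphisms (SNPs), are key elements.")
print(find_acronym_definitions(sample))
```

The second short form, SNPs, is not resolved because its plural "s" has no matching word initial. Cases like this, along with non-acronym abbreviations, are what motivate the more flexible alignment and statistical methods described in the abstract.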

5.
Sequence variants, in particular single nucleotide polymorphisms (SNPs), are key elements for the identification of genes associated with complex diseases and with particular drug responses. The search for literature about sequence variation is hampered by the large number of allelic variants reported for many genes and by the variability in both gene and sequence variant nomenclature. We describe OSIRIS, a search tool that integrates different sources of information to retrieve literature about the sequence variation of a gene. In addition, it provides a method to link a dbSNP entry with the articles referring to it. AVAILABILITY: OSIRIS is available for public use at http://ibi.imim.es/

6.
CressExpress is a user-friendly, online, coexpression analysis tool for Arabidopsis (Arabidopsis thaliana) microarray expression data that computes patterns of correlated expression between user-entered query genes and the rest of the genes in the genome. Unlike other coexpression tools, CressExpress allows characterization of tissue-specific coexpression networks through user-driven filtering of input data based on sample tissue type. CressExpress also performs pathway-level coexpression analysis on each set of query genes, identifying and ranking genes based on their common connections with two or more query genes. This allows identification of novel candidates for involvement in common processes and functions represented by the query group. Users launch experiments using an easy-to-use Web-based interface and then receive the full complement of results, along with a record of tool settings and parameters, via an e-mail link to the CressExpress Web site. Data sets featured in CressExpress are strictly versioned and include expression data from MAS5, GCRMA, and RMA array processing algorithms. To demonstrate applications for CressExpress, we present coexpression analyses of cellulose synthase genes, indolic glucosinolate biosynthesis, and flowering. We show that subselecting sample types produces a richer network for genes involved in flowering in Arabidopsis. CressExpress provides direct access to expression values via an easy-to-use URL-based Web service, allowing users to determine quickly if their query genes are coexpressed with each other and likely to yield informative pathway-level coexpression results. The tool is available at http://www.cressexpress.org.
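To make "correlated expression between a query gene and the rest of the genome" concrete, here is a minimal sketch that ranks genes by their correlation with one query gene. The abstract does not say which statistic CressExpress computes, so the Pearson coefficient is an assumption, as are the names `pearson` and `rank_coexpressed` and the toy expression values.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def rank_coexpressed(expression, query):
    """Rank all other genes by correlation with the query gene, best first.

    expression -- dict mapping gene name to a list of values, one per sample
                  (hypothetical layout; restricting the samples to one tissue
                  type would amount to slicing these lists first).
    """
    scores = [
        (gene, pearson(expression[query], values))
        for gene, values in expression.items()
        if gene != query
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)


toy = {
    "Q": [1, 2, 3, 4],
    "G1": [2, 4, 6, 8],
    "G2": [4, 3, 2, 1],
    "G3": [1, 3, 2, 4],
}
for gene, r in rank_coexpressed(toy, "Q"):
    print(gene, round(r, 2))
```

Profiles with zero variance would cause a division by zero in `pearson`; a production tool would need to filter or handle such genes.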

7.
Proton Transfer Reaction-Mass Spectrometry (PTR-MS), in its recently developed implementation based on a time-of-flight mass spectrometer (PTR-ToF-MS), has been evaluated as a possible tool for rapid non-destructive investigation of the volatile compounds present in the metabolome of apple cultivars and clones. Clone characterization is a cutting-edge problem in technical management and royalty application, not only for apple, aiming at unveiling the real properties that differentiate the mutated individuals. We show that PTR-ToF-MS coupled with multivariate and data mining methods may successfully be employed to obtain accurate varietal and clonal apple fingerprints. In particular, we studied the VOC emission profile of five different clones belonging to three well-known apple cultivars: 'Fuji', 'Golden Delicious' and 'Gala'. In all three cases it was possible to build classification models that can distinguish all cultivars and some of the clones considered in this study. Furthermore, in the case of 'Gala' we also identified estragole and hexyl 2-methyl butanoate as contributing to such clone characterization. Besides its applied relevance (no data on the volatile profiling of apple clones have been available so far), our study indicates the general viability of a metabolomic approach for volatile compounds in fruit based on rapid PTR-ToF-MS fingerprinting.

8.
9.
The use of quantitative PCR is recommended to monitor the level of residual hematological malignancies. The proposed multiplex IgH/ras PCR uses a co-amplification of the clonal CDR3 rearrangement of the immunoglobulin heavy chain gene (IgH) as a disease marker and a segment of the Hras 1 gene containing codon 61 (ras) as a control gene. Serial dilutions of stored diagnostic DNAs are examined together in the same PCR at a sub-plateau phase and, after analysis by densitometry, the amount of CDR3 product is related to the ras product. An increase of this ratio at comparable amounts of DNA is viewed as an increase of malignant cells. This endpoint PCR quantification approach appears to be applicable in monitoring B-lymphoproliferative disorders, as was shown in B-cell non-Hodgkin's lymphoma, and may provide information on disease activity and treatment outcome.

10.

Background  

Over the past two decades more than fifty thousand unique clinical and biological samples have been assayed using the Affymetrix HG-U133 and HG-U95 GeneChip microarray platforms. This substantial repository has been used extensively to characterize changes in gene expression between biological samples, but has not been previously mined en masse for changes in mRNA processing. We explored the possibility of using HG-U133 microarray data to identify changes in alternative mRNA processing in several available archival datasets.

11.
12.

Background  

Extracting and visualizing protein-protein interactions (PPIs) from the text literature is a meaningful topic in protein science. It assists the identification of interactions among proteins. There is a lack of tools to extract PPIs and to visualize and classify the results.

13.
SUMMARY: An efficient tool for mining complex inbred genealogies that identifies clusters of individuals sharing the same expected amount of relatedness is described. Additionally, it allows for the systematic reconstruction of sub-pedigrees suitable for genetic mapping. AVAILABILITY: http://www.jenti.org.

14.
15.
Partitioning closely related genes into clusters has become an important element of practically all statistical analyses of microarray data. A number of computer algorithms have been developed for this task. Although these algorithms have demonstrated their usefulness for gene clustering, some basic problems remain. This paper describes our work on extracting functional keywords from MEDLINE for a set of genes that are isolated for further study from microarray experiments based on their differential expression patterns. The sharing of functional keywords among genes is used as a basis for clustering in a new approach called BEA-PARTITION in this paper. Functional keywords associated with genes were extracted from MEDLINE abstracts. We modified the Bond Energy Algorithm (BEA), which is widely accepted in psychology and database design but is virtually unknown in bioinformatics, to cluster genes by functional keyword associations. The results showed that BEA-PARTITION and the hierarchical clustering algorithm outperformed k-means clustering and the self-organizing map by correctly assigning 25 of 26 genes in a test set of four known gene groups. To evaluate the effectiveness of BEA-PARTITION for clustering genes identified by microarray profiles, 44 yeast genes that are differentially expressed during the cell cycle and have been widely studied in the literature were used as a second test set. Using established measures of cluster quality, the results produced by BEA-PARTITION had higher purity, lower entropy, and higher mutual information than those produced by k-means and the self-organizing map. Whereas BEA-PARTITION and hierarchical clustering produced clusters of similar quality, BEA-PARTITION provides clearer cluster boundaries than hierarchical clustering. BEA-PARTITION is simple to implement and provides a powerful approach to clustering genes or to any clustering problem where starting matrices are available from experimental observations.
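Two of the cluster-quality measures named in this abstract, purity and entropy, can be sketched as follows. The abstract does not give the exact formulas used in the paper, so the code assumes the common textbook definitions: purity is the fraction of items that belong to the majority class of their cluster, and entropy is the size-weighted average of each cluster's class entropy. The function name `purity_and_entropy` is illustrative.

```python
from collections import Counter
from math import log2


def purity_and_entropy(cluster_ids, true_labels):
    """Return (purity, entropy) for a clustering against known class labels.

    Assumes the common definitions: higher purity and lower entropy both
    indicate clusters that agree better with the known groups.
    """
    n = len(true_labels)
    clusters = {}
    for cluster, label in zip(cluster_ids, true_labels):
        clusters.setdefault(cluster, []).append(label)

    # Purity: count the majority class in each cluster.
    purity = sum(max(Counter(m).values()) for m in clusters.values()) / n

    # Entropy: class entropy of each cluster, weighted by cluster size.
    entropy = 0.0
    for members in clusters.values():
        size = len(members)
        h = -sum((k / size) * log2(k / size) for k in Counter(members).values())
        entropy += (size / n) * h
    return purity, entropy


assigned = [0, 0, 0, 1, 1, 1]
known = ["a", "a", "b", "b", "b", "b"]
print(purity_and_entropy(assigned, known))
```

In the toy example one of six genes sits in the wrong cluster, so purity drops below 1 and entropy rises above 0; a perfect clustering gives purity 1 and entropy 0.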

16.

Background  

Allium sativum, commonly known as garlic, is a species in the onion genus (Allium), which is a large and diverse genus containing over 1,250 species. Its close relatives include chives, onion, leek and shallot. Garlic has been used throughout recorded history for culinary and medicinal purposes and for its health benefits. Currently, interest in garlic is increasing rapidly because of its nutritional and pharmaceutical value, including for high blood pressure and cholesterol, atherosclerosis and cancer. Despite this, there are no comprehensive databases of garlic Expressed Sequence Tags (ESTs) available for gene discovery and future genome annotation efforts. That is why we developed a new garlic database and applications to enable comprehensive analysis of garlic gene expression.

17.
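Liang Qun, Wang Qi, Zhang Xiuqing, Wang Jian. 《生物信息学》 (Bioinformatics), 2010, 8(2): 150-152, 155
Screening and classifying single-base mutations is the foundation of SNP analysis. To overcome the difficulty of mining mutation sites by hand, the VersusSNP software was written. It parses and filters sequence alignment results, classifies sites by mutation type, and presents them to the user in a graphical interface. With VersusSNP, users can intuitively see the single-base mutations in a genome. The program and its source code can be downloaded from http://sourceforge.net/projects/versussnp.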

18.
Amyloid fibril forming regions in protein sequences are associated with a number of diseases. Experimental evidence favors the hypothesis that short motif regions are responsible for a protein's amyloidogenic behavior. Thus, identifying these short peptides is critical for understanding the cause of diseases associated with protein aggregation and for developing sequence-targeted anti-aggregation drugs. Owing to the constraints of wet lab molecular techniques for identifying amyloid fibril forming targets, computational methods are used to offer better and more affordable in silico predictions. The present study offers an assessment of, and perspective on, the recent tools available for predicting a peptide's status: amyloidogenic or non-amyloidogenic. To the best of our knowledge, the existing review articles on amyloidogenic prediction tools have not touched upon their effectiveness in terms of true positive rates or false positive rates. In this work, we compare a few tools, such as Aggrescan, Amylpred and FoldAmyloid, to evaluate their predictive performance on experimentally validated data in terms of specificity, sensitivity, Matthews Correlation Coefficient and balanced accuracy. As evident from the results, a significant reduction in sensitivity associated with a gain in specificity is noted for all the tools considered in the present study.

19.
20.
Biological processes involve complex networks of interactions between molecules. Various large-scale experiments and curation efforts have led to preliminary versions of complete cellular networks for a number of organisms. To grapple with these networks, we developed TopNet-like Yale Network Analyzer (tYNA), a Web system for managing, comparing and mining multiple networks, both directed and undirected. tYNA efficiently implements methods that have proven useful in network analysis, including identifying defective cliques, finding small network motifs (such as feed-forward loops), calculating global statistics (such as the clustering coefficient and eccentricity), and identifying hubs and bottlenecks. It also allows one to manage a large number of private and public networks using a flexible tagging system, to filter them based on a variety of criteria, and to visualize them through an interactive graphical interface. A number of commonly used biological datasets have been pre-loaded into tYNA, standardized and grouped into different categories. AVAILABILITY: The tYNA system can be accessed at http://networks.gersteinlab.org/tyna. The source code, JavaDoc API and WSDL can also be downloaded from the website. tYNA can also be accessed from the Cytoscape software using a plugin.
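Two of the network statistics mentioned in this abstract can be sketched in a few lines. This is not tYNA's implementation or API: the function names `local_clustering` and `feed_forward_loops` and the toy networks are assumptions. The clustering coefficient of a node in an undirected network is the fraction of its neighbour pairs that are themselves linked, and a feed-forward loop in a directed network is a triple where A regulates B, B regulates C, and A also regulates C directly.

```python
from itertools import combinations, permutations


def local_clustering(adjacency, node):
    """Clustering coefficient of one node in an undirected network.

    adjacency -- dict mapping each node to the set of its neighbours.
    """
    neighbours = adjacency[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in adjacency[u])
    return 2 * links / (k * (k - 1))


def feed_forward_loops(edges):
    """List all (a, b, c) with directed edges a->b, b->c and a->c.

    Brute force over node triples; fine for small examples only.
    """
    edge_set = set(edges)
    nodes = {n for edge in edges for n in edge}
    return sorted(
        (a, b, c)
        for a, b, c in permutations(nodes, 3)
        if (a, b) in edge_set and (b, c) in edge_set and (a, c) in edge_set
    )


undirected = {"A": {"B", "C", "D"}, "B": {"A", "C"}, "C": {"A", "B"}, "D": {"A"}}
directed = [("X", "Y"), ("Y", "Z"), ("X", "Z"), ("Z", "W")]
print(local_clustering(undirected, "A"))
print(feed_forward_loops(directed))
```

The brute-force motif search grows cubically with the number of nodes, so genome-scale networks require the more efficient methods that the abstract attributes to tYNA.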
