Similar literature
 Found 20 similar documents (search time: 46 ms)
1.
Microarray technology is widely used by biological researchers to identify genes associated with conditions such as diseases and drugs. Many methods have been developed to analyze data covering a large number of genes, but they focus only on statistical significance and cannot interpret the data in terms of biological concepts. Gene Ontology (GO) is used to give the data a biological interpretation; however, it is restricted to specific ontologies, namely biological process, molecular function, and cellular component. Here, we apply MeSH (Medical Subject Headings) to interpret groups of genes from a biological viewpoint. To assign MeSH terms to genes, contexts associated with the genes are retrieved from the full set of MEDLINE data using machine learning, and MeSH terms are then extracted from the retrieved articles. Using this method, we implemented a software tool called BioCompass. It uses the MeSH and GO trees to generate high-scoring lists and hierarchical lists of disease MeSH terms associated with groups of genes, and it draws a wiring diagram that links genes through associations extracted from articles. Researchers can easily retrieve genes and keywords of interest, such as diseases and drugs, associated with groups of genes. Using the retrieved MeSH terms in conjunction with OMIM, we could obtain more disease information associated with a target gene. BioCompass helps researchers interpret groups of genes, such as those from microarray data, from a biological viewpoint.

2.

Background  

Visualization of sequence annotation is a common feature in many bioinformatics tools. For many applications it is desirable to restrict the display of such annotation according to a score cutoff, as biological interpretation can be difficult in the presence of the entire data. Unfortunately, many visualisation solutions are somewhat static in the way they handle such score cutoffs.

3.
The exploding number of computational models produced by Systems Biologists over the last years is an invitation to structure and exploit this new wealth of information. Researchers would like to trace models relevant to specific scientific questions, to explore their biological content, to align and combine them, and to match them with experimental data. To automate these processes, it is essential to consider semantic annotations, which describe their biological meaning. As a prerequisite for a wide range of computational methods, we propose general and flexible similarity measures for Systems Biology models computed from semantic annotations. By using these measures and a large extensible ontology, we implement a platform that can retrieve, cluster, and align Systems Biology models and experimental data sets. At present, its major application is the search for relevant models in the BioModels Database, starting from initial models, data sets, or lists of biological concepts. Beyond similarity searches, the representation of models by semantic feature vectors may pave the way for visualisation, exploration, and statistical analysis of large collections of models and corresponding data.
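
As a rough illustration of comparing models through semantic feature vectors, the sketch below scores two annotation sets with cosine similarity. The abstract does not state which similarity measure the authors use, so cosine similarity is an assumption here, and the function names and term identifiers (`term:A` and so on) are hypothetical.

```python
from collections import Counter
from math import sqrt


def feature_vector(annotations):
    """Turn a list of annotation terms into a sparse term-count vector."""
    return Counter(annotations)


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical annotation sets for two models.
model_1 = feature_vector(["term:A", "term:B"])
model_2 = feature_vector(["term:B", "term:C"])
print(cosine_similarity(model_1, model_2))
```

A measure of this kind ignores the ontology structure; a measure that also credits related but non-identical terms would need the term hierarchy as an extra input.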

4.
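MicroRNAs are small biological molecules that can regulate gene expression positively or negatively, and studying the relationship between microRNAs and genes is important both for maintaining homeostasis and for treating disease. This work uses deep learning to predict microRNA-gene targeting relationships and proposes the TransformerMGI model. In the feature-engineering stage, to address the difficulty of accurately extracting the latent information in biological sequences, TransformerMGI uses the graph-convolutional-network-based GP-GCN method and the DNA2Vec model to extract latent information from the microRNA and gene data, respectively, yielding a representation embedding matrix for each. On the modelling side, TransformerMGI introduces power normalization to improve on the classic deep learning model. The two representation matrices obtained by feature extraction are each fed into TransformerMGI, whose internal attention mechanism aggregates and relates the feature information within and between them, and the model finally predicts the probability that a microRNA regulates a gene. Using the area under the ROC curve and the precision-recall curve as performance metrics, TransformerMGI was compared with other existing models. The experimental results show that TransformerMGI achieves AUC and AUPRC scores above 0.91, outperforming the other existing models. TransformerMGI can predict microRNA target genes from the base sequences of the microRNA and the gene alone, without considering biological principles or genomic context, and thus offers a deep learning method that later research on microRNA target gene prediction can draw on.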
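
The area under the ROC curve used as a metric above can be computed directly from predicted probabilities. The following is a minimal sketch, not the authors' evaluation code: it uses the rank-based (Mann-Whitney) formulation, and the labels and scores are made-up illustration data.

```python
def roc_auc(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of predicted probabilities, same length as labels.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up predictions for four microRNA-gene pairs.
print(roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))
```

The pairwise loop is quadratic in the number of examples, which is fine for a sketch; a sort-based version would be preferable for large test sets.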

5.
MOTIVATION: Biological data come in very different shapes. Databanks are maintained and used by distinct organizations. Text is the de facto standard exchange format. The SRS system can integrate heterogeneous textual databanks but it was lacking a way to structure the extracted data. RESULTS: This paper presents a CORBA interface to the SRS system which manages databanks in a flat file format. SRS Object Servers are CORBA wrappers for SRS. They allow client applications (visualisation tools, data mining tools, etc.) to access and query SRS servers remotely through an Object Request Broker (ORB). They provide loader objects that contain the information extracted from the databanks by SRS. Loader objects are not hard-coded but generated in a flexible way by using loader specifications which allow SRS administrators to package data coming from distinct databanks. AVAILABILITY: The prototype may be available for beta-testing. Please contact the SRS group (http://srs.ebi.ac.uk).

6.
With the explosive growth of biological data, the development of new means of data storage was needed. More and more often biological information is no longer published in the conventional way via a publication in a scientific journal, but only deposited into a database. In the last two decades these databases have become essential tools for researchers in biological sciences. Biological databases can be classified according to the type of information they contain. There are basically three types of sequence-related databases (nucleic acid sequences, protein sequences and protein tertiary structures) as well as various specialized data collections. It is important to provide the users of biomolecular databases with a degree of integration between these databases as by nature all of these databases are connected in a scientific sense and each one of them is an important piece to biological complexity. In this review we will highlight our effort in connecting biological information as demonstrated in the SWISS-PROT protein database.

7.
An integral part of functional genomics studies is to assess the enrichment of specific biological terms in lists of genes found to be playing an important role in biological phenomena. Contrasting the observed frequency of annotated terms with those of the background is at the core of overrepresentation analysis (ORA). Gene Ontology (GO) is a means to consistently classify and annotate gene products and has become a mainstay in ORA. Alternatively, Medical Subject Headings (MeSH) offers a comprehensive life science vocabulary including additional categories that are not covered by GO. Although MeSH is applied predominantly in human and model organism research, its full potential in livestock genetics is yet to be explored. In this study, MeSH ORA was evaluated to discern biological properties of identified genes and contrast them with the results obtained from GO enrichment analysis. Three published datasets were employed for this purpose, representing a gene expression study in dairy cattle, the use of SNPs for genome‐wide prediction in swine and the identification of genomic regions targeted by selection in horses. We found that several overrepresented MeSH annotations linked to these gene sets share similar concepts with those of GO terms. Moreover, MeSH yielded unique annotations, which are not directly provided by GO terms, suggesting that MeSH has the potential to refine and enrich the representation of biological knowledge. We demonstrated that MeSH can be regarded as another choice of annotation to draw biological inferences from genes identified via experimental analyses. When used in combination with GO terms, our results indicate that MeSH can enhance our functional interpretations for specific biological conditions or the genetic basis of complex traits in livestock species.
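
Overrepresentation analysis, as described above, contrasts how often a term appears in a gene list with how often it appears in the background. A common way to do this is an upper-tail hypergeometric test; the sketch below assumes that test (the abstract does not name the exact statistic used), and the function name and the counts in the usage line are hypothetical.

```python
from math import comb


def ora_p_value(population, annotated, sample, hits):
    """Upper-tail hypergeometric p-value, P(X >= hits).

    population: number of background genes
    annotated:  background genes carrying the term (e.g. a MeSH or GO term)
    sample:     size of the gene list of interest
    hits:       genes in the list that carry the term
    """
    total = comb(population, sample)
    upper = min(annotated, sample)
    # comb(n, k) is 0 when k > n, so impossible draws contribute nothing.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Hypothetical counts: 1000 background genes, 50 annotated, 20 selected, 5 hits.
print(ora_p_value(1000, 50, 20, 5))
```

In practice one p-value is computed per term, so a multiple-testing correction would normally follow; that step is left out of this sketch.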

8.
9.
Oldham P, Hall S, Burton G. PloS one, 2012, 7(4): e34368
This article uses data from Thomson Reuters Web of Science to map and analyse the scientific landscape for synthetic biology. The article draws on recent advances in data visualisation and analytics with the aim of informing upcoming international policy debates on the governance of synthetic biology by the Subsidiary Body on Scientific, Technical and Technological Advice (SBSTTA) of the United Nations Convention on Biological Diversity. We use mapping techniques to identify how synthetic biology can best be understood and the range of institutions, researchers and funding agencies involved. Debates under the Convention are likely to focus on a possible moratorium on the field release of synthetic organisms, cells or genomes. Based on the empirical evidence we propose that guidance could be provided to funding agencies to respect the letter and spirit of the Convention on Biological Diversity in making research investments. Building on the recommendations of the United States Presidential Commission for the Study of Bioethical Issues we demonstrate that it is possible to promote independent and transparent monitoring of developments in synthetic biology using modern information tools. In particular, public and policy understanding and engagement with synthetic biology can be enhanced through the use of online interactive tools. As a step forward in this process we make existing data on the scientific literature on synthetic biology available in an online interactive workbook so that researchers, policy makers and civil society can explore the data and draw conclusions for themselves.

10.
11.

Background

Meaningful exchange of microarray data is currently difficult because it is rare that published data provide sufficient information depth or are even in the same format from one publication to another. Only when data can be easily exchanged will the entire biological community be able to derive the full benefit from such microarray studies.

Results

To this end we have developed three key ingredients towards standardizing the storage and exchange of microarray data. First, we have created a minimal information for the annotation of a microarray experiment (MIAME)-compliant conceptualization of microarray experiments modeled using the unified modeling language (UML) named MAGE-OM (microarray gene expression object model). Second, we have translated MAGE-OM into an XML-based data format, MAGE-ML, to facilitate the exchange of data. Third, some of us are now using MAGE (or its progenitors) in data production settings. Finally, we have developed a freely available software tool kit (MAGE-STK) that eases the integration of MAGE-ML into end users' systems.

Conclusions

MAGE will help microarray data producers and users to exchange information by providing a common platform for data exchange, and MAGE-STK will make the adoption of MAGE easier.

12.
The strength of the rat as a model organism lies in its utility in pharmacology, biochemistry and physiology research. Data resulting from such studies is difficult to represent in databases and the creation of user-friendly data mining tools has proved difficult. The Rat Genome Database has developed a comprehensive ontology-based data structure and annotation system to integrate physiological data along with environmental and experimental factors, as well as genetic and genomic information. RGD uses multiple ontologies to integrate complex biological information from the molecular level to the whole organism, and to develop data mining and presentation tools. This approach allows RGD to indicate not only the phenotypes seen in a strain but also the specific values under each diet and atmospheric condition, as well as gender differences. Harnessing the power of ontologies in this way allows the user to gather and filter data in a customized fashion, so that a researcher can retrieve all phenotype readings for which a high hypoxia is a factor. Utilizing the same data structure for expression data, pathways and biological processes, RGD will provide a comprehensive research platform which allows users to investigate the conditions under which biological processes are altered and to elucidate the mechanisms of disease.

13.
Harrington ED, Jensen LJ, Bork P. FEBS letters, 2008, 582(8): 1251-1258
Continuing improvements in DNA sequencing technologies are providing us with vast amounts of genomic data from an ever-widening range of organisms. The resulting challenge for bioinformatics is to interpret this deluge of data and place it back into its biological context. Biological networks provide a conceptual framework with which we can describe part of this context, namely the different interactions that occur between the molecular components of a cell. Here, we review the computational methods available to predict biological networks from genomic sequence data and discuss how they relate to high-throughput experimental methods.

14.
15.
16.
Biological data, and particularly annotation data, are increasingly being represented in directed acyclic graphs (DAGs). However, while relevant biological information is implicit in the links between multiple domains, annotations from these different domains are usually represented in distinct, unconnected DAGs, making links between the domains represented difficult to determine. We develop a novel family of general statistical tests for the discovery of strong associations between two directed acyclic graphs. Our method takes the topology of the input graphs and the specificity and relevance of associations between nodes into consideration. We apply our method to the extraction of associations between biomedical ontologies in an extensive use-case. Through a manual and an automatic evaluation, we show that our tests discover biologically relevant relations. The suite of statistical tests we develop for this purpose is implemented and freely available for download.

17.
MOTIVATION: Many models and analyses of signaling pathways have been proposed. However, none of them takes into account that a biological pathway is not a fixed system but depends on the organism, tissue and cell type as well as on physiological, pathological and experimental conditions. RESULTS: The Biological Connection Markup Language (BCML) is a format to describe, annotate and visualize pathways. BCML is able to store multiple kinds of information, permitting a selective view of the pathway as it exists and/or behaves in specific organisms, tissues and cells. Furthermore, BCML can be automatically converted into data formats suitable for analysis and into a fully SBGN-compliant graphical representation, making it an important tool that can be used by both computational biologists and 'wet lab' scientists. AVAILABILITY AND IMPLEMENTATION: The XML schema and the BCML software suite are freely available under the LGPL for download at http://bcml.dc-atlas.net. They are implemented in Java and supported on MS Windows, Linux and OS X.

18.
The significance of co‐evolution over ecological timescales is well established, yet it remains unclear to what extent co‐evolutionary processes contribute to driving large‐scale evolutionary and ecological changes over geological timescales. Some of the most intriguing and pervasive long‐term co‐evolutionary hypotheses relate to proposed interactions between herbivorous non‐avian dinosaurs and Mesozoic plants, including cycads. Dinosaurs have been proposed as key dispersers of cycad seeds during the Mesozoic, and temporal variation in cycad diversity and abundance has been linked to dinosaur faunal changes. Here we assess the evidence for proposed hypotheses of trophic and evolutionary interactions between these two groups using diversity analyses, a new database of Cretaceous dinosaur and plant co‐occurrence data, and a geographical information system (GIS) as a visualisation tool. Phylogenetic evidence suggests that the origins of several key biological properties of cycads (e.g. toxins, bright‐coloured seeds) likely predated the origin of dinosaurs. Direct evidence of dinosaur–cycad interactions is lacking, but evidence from extant ecosystems suggests that dinosaurs may plausibly have acted as seed dispersers for cycads, although it is likely that other vertebrate groups (e.g. birds, early mammals) also played a role. Although the Late Triassic radiations of dinosaurs and cycads appear to have been approximately contemporaneous, few significant changes in dinosaur faunas coincide with the late Early Cretaceous cycad decline. No significant spatiotemporal associations between particular dinosaur groups and cycads can be identified – GIS visualisation reveals disparities between the spatiotemporal distributions of some dinosaur groups (e.g. sauropodomorphs) and cycads that are inconsistent with co‐evolutionary hypotheses. 
The available data provide no unequivocal support for any of the proposed co‐evolutionary interactions between cycads and herbivorous dinosaurs – diffuse co‐evolutionary scenarios that are proposed to operate over geological timescales are plausible, but such hypotheses need to be firmly grounded on direct evidence of interaction and may be difficult to support given the patchiness of the fossil record.

19.
Biological data, represented by the data from omics platforms, are accumulating exponentially. Like some other data-intensive scientific disciplines such as high-energy physics, climatology, meteorology, geology, geography and environmental sciences, modern life sciences have entered the information-rich era, the era of the 4th paradigm. The creation of a Chinese information engineering infrastructure for pan-omics studies (CIEIPOS) has been long overdue as part of the national scientific infrastructure, to accelerate the further development of Chinese life sciences and to translate rich data into knowledge and medical applications. By gathering facts on the current status of the international and Chinese bioinformatics communities in collecting, managing and utilizing biological data, the essay stresses the significance and urgency of creating a 'data hub' in CIEIPOS, and discusses challenges and possible solutions for integrating, querying and visualizing these data. Another important component of CIEIPOS, which is not part of traditional biological data centers such as NCBI and EBI, is omics informatics. A mass spectroscopy platform is taken as an example to illustrate the complexity of omics informatics, and its heavy dependency on computational power is highlighted. Meeting the demand for such power in omics studies is argued to be the fundamental function of CIEIPOS. The implementation outlook for CIEIPOS in hardware and network is discussed.

20.
Biological specificity is usually described in terms of the lock-and-key metaphor. However, this metaphor is to a certain extent misleading and does not grasp the complexity underlying biological specificity. The failure of the lock-and-key metaphor makes it difficult to understand immune recognition. This is the reason why immune specificity has been described as the "Specificity Enigma." In this article, I point at three important differences between biological specificity and mechanical specificity, and suggest an alternative lens through which immune specificity can be considered.
