Similar literature
1.
The diversity of online resources storing biological data in different formats makes it challenging for bioinformaticians to integrate and analyse their biological data. The semantic web provides a standard to facilitate knowledge integration using statements built as triples, each describing a relation between two objects. WikiPathways, an online collaborative pathway resource, is now available in the semantic web through a SPARQL endpoint at http://sparql.wikipathways.org. Having biological pathways in the semantic web allows rapid integration, via SPARQL queries, with data from other resources that contain information about elements present in pathways. To convert WikiPathways content into meaningful triples we developed two new vocabularies that capture the graphical representation and the pathway logic, respectively. Each gene, protein, and metabolite in a given pathway is defined with a standard set of identifiers to support linking to several other biological resources in the semantic web. WikiPathways triples were loaded into the Open PHACTS discovery platform and are available through its Web API (https://dev.openphacts.org/docs) to be used in various tools for drug development. We combined various semantic web resources with the newly converted WikiPathways content using a variety of SPARQL query types and third-party resources, such as the Open PHACTS API. The ability to use pathway information to form new links across diverse biological data highlights the utility of integrating WikiPathways in the semantic web.

2.
The National Agricultural Biotechnology Information Center (NABIC) in South Korea reconstructed the RiceQTLPro database for gene positional analysis and structure prediction of the chromosomes. This database is an integrated web-based system providing information about quantitative trait loci (QTL) markers in rice. RiceQTLPro has three main features: (1) a list of QTL markers, (2) keyword search of markers, and (3) search of marker positions on the rice chromosomes. The updated database provides information on 112 QTL markers with 817 polymorphic markers on each of the 12 rice chromosomes.

Availability

The database is available for free at http://nabic.rda.go.kr/gere/rice/geneticMap/

3.
We present Quip, a lossless compression algorithm for next-generation sequencing data in the FASTQ and SAM/BAM formats. In addition to implementing reference-based compression, we have developed, to our knowledge, the first assembly-based compressor, using a novel de novo assembly algorithm. A probabilistic data structure is used to dramatically reduce the memory required by traditional de Bruijn graph assemblers, allowing millions of reads to be assembled very efficiently. Read sequences are then stored as positions within the assembled contigs. This is combined with statistical compression of read identifiers, quality scores, alignment information and sequences, effectively collapsing very large data sets to <15% of their original size with no loss of information. Availability: Quip is freely available under the 3-clause BSD license from http://cs.washington.edu/homes/dcjones/quip.
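The abstract does not say which probabilistic data structure is used, so the following is only a rough sketch of the general idea, assuming a plain Bloom filter as a stand-in: k-mers from the reads are recorded in a fixed-size bit array instead of an exact hash table, trading a small false-positive rate for much lower memory. The class and function names (`BloomFilter`, `build_kmer_filter`) and the default sizes are hypothetical and are not taken from Quip.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; membership tests may give false positives,
    but never false negatives. A stand-in, not Quip's actual structure."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive several bit positions by salting a single hash function.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


def kmers(read: str, k: int):
    """Yield every length-k substring of a read, left to right."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]


def build_kmer_filter(reads, k, num_bits=1 << 16, num_hashes=3):
    """Record all k-mers of all reads in one Bloom filter (hypothetical sizes)."""
    bf = BloomFilter(num_bits, num_hashes)
    for read in reads:
        for kmer in kmers(read, k):
            bf.add(kmer)
    return bf


bf = build_kmer_filter(["ACGTACGT"], k=4)
print("ACGT" in bf)  # True: inserted k-mers are always reported present
```

An assembler built on such a filter would walk the de Bruijn graph by asking whether each possible successor k-mer is present; the occasional false positive is the price paid for the memory saving.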

4.
Agricultural soils have tremendous potential to sequester soil organic carbon (SOC) and mitigate global climate change. However, agricultural land use has a profound impact on SOC dynamics, and few studies have explored how agricultural land use combined with soil conditions affect SOC changes throughout the soil profile. Based on a paired soil resampling campaign in the 1980s and 2010s, this study investigated the SOC changes of the soil profile caused by agricultural land use and the correlations with parent material and topography across the Chengdu Plain of China. The results showed that the SOC content increased by 3.78 g C/kg in the topsoil (0–20 cm), but decreased in the 20–40 cm and 40–60 cm soil layers by 0.90 and 1.26 g C/kg respectively. SOC increases in topsoil were observed for all types of agricultural land. Afforestation on former agricultural land also caused SOC decreases in the 20–60 cm soil layers, while SOC decreases only occurred in the 40–60 cm soil layer for agricultural land using a traditional crop rotation (i.e. traditional rice–wheat/rapeseed rotation) and with rice–vegetable rotations converted from the traditional rotations. For each agricultural land use, SOC decreases in deep soils only occurred in high relief areas and in soils formed from Q4 (Quaternary Holocene) grey‐brown alluvium and Q4 grey alluvium that had a relatively low soil bulk density and clay content. The results indicated that SOC change caused by agricultural land use was depth dependent and that the effects of agricultural land use on soil profile SOC dynamics varied with soil characteristics and topography. Subsoil SOC decreases were more likely to occur in high relief areas and in soils with low soil bulk density and low clay content.

5.
Advances in the development of bioinformatic tools continue to improve investigators’ ability to interrogate, organize, and derive knowledge from large amounts of heterogeneous information. These tools often require advanced technical skills not possessed by life scientists. User-friendly, low-barrier-to-entry methods of visualizing nutrigenomics information are yet to be developed. We utilized concept mapping software from the Institute for Human and Machine Cognition to create a conceptual model of diet and health-related data that provides a foundation for future nutrigenomics ontologies describing published nutrient–gene/polymorphism–phenotype data. In this model, maps containing phenotype, nutrient, gene product, and genetic polymorphism interactions are visualized as triples of two concepts linked together by a linking phrase. These triples, or “knowledge propositions,” contextualize aggregated data and information into easy-to-read knowledge maps. Maps of these triples enable visualization of genes spanning the One-Carbon Metabolism (OCM) pathway, their sequence variants, and multiple literature-mined associations including concepts relevant to nutrition, phenotypes, and health. The concept map development process documents the incongruity of information derived from pathway databases versus literature resources. This conceptual model highlights the importance of incorporating information about genes in upstream pathways that provide substrates, as well as downstream pathways that utilize products of the pathway under investigation, in this case OCM. Other genes and their polymorphisms, such as TCN2 and FUT2, although not directly involved in OCM, potentially alter OCM pathway functionality. These upstream gene products regulate substrates such as B12. Constellations of polymorphisms affecting the functionality of genes along OCM, together with substrate and cofactor availability, may impact resultant phenotypes. 
These conceptual maps provide a foundational framework for development of nutrient–gene/polymorphism–phenotype ontologies and systems visualization.

6.
Gao Meixiang, Zhu Jiaqi, Liu Shuang, Cheng Xin, Liu Dong, Li Yansheng. Acta Ecologica Sinica (《生态学报》), 2023, 43(16): 6862-6877
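Soil zoology is entering a period of transformation in which an entirely new knowledge system serves as the framework for scientific research; at its core are artificial intelligence methods that are chiefly data-driven. The database-based data processing and analysis methods now in wide use face a tension between multi-source, heterogeneous, rapidly growing data and insufficient processing capacity. Data mining methods built on fast-developing big data science and artificial intelligence have clear advantages in resolving this tension, but they depend on a strong domain knowledge base, and research on knowledge graphs for soil fauna is very scarce. A soil fauna knowledge graph is a knowledge base with a directed graph structure, in which the nodes represent entities or concepts related to soil fauna and the edges represent the various semantic relations between those entities or concepts. This paper proposes the definition, scope, theoretical model and construction method of the soil fauna knowledge graph. Using the diversity of soil mites on Tianmu Mountain, Zhejiang, as an example, it analyses the technical methods for building a knowledge graph of mountain soil fauna; using species distribution, species coexistence and the effects of environmental conditions on species, which are central concerns of soil fauna diversity research, as examples, it discusses the scientific questions that a mountain soil fauna knowledge graph can help address. The study shows that soil fauna knowledge graphs have distinctive potential and advantages for addressing important scientific questions in biodiversity, and strongly promote the development of soil fauna informatics at the intersection of soil zoology, information science and data science.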
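As a minimal sketch of the directed-graph structure described in this abstract, the snippet below stores a knowledge graph as (head, relation, tail) triples and answers a simple co-occurrence question. Everything in it is an assumption made for illustration: the `KnowledgeGraph` class, the relation names (`found_in`, `located_on`) and the placeholder entities (`species_A`, `plot_1`) are hypothetical and do not come from the paper or its data.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Directed, labelled graph: nodes are entities or concepts,
    edges are semantic relations between them."""

    def __init__(self) -> None:
        # head -> list of (relation, tail) pairs, in insertion order
        self._out = defaultdict(list)

    def add(self, head: str, relation: str, tail: str) -> None:
        self._out[head].append((relation, tail))

    def tails(self, head: str, relation: str):
        """All nodes reached from `head` along edges labelled `relation`."""
        return [t for r, t in self._out.get(head, []) if r == relation]

    def heads(self, relation: str, tail: str):
        """All nodes that point to `tail` along edges labelled `relation`."""
        return [
            h
            for h, edges in self._out.items()
            for r, t in edges
            if r == relation and t == tail
        ]


kg = KnowledgeGraph()
# Placeholder facts, invented purely to show the shape of the data.
kg.add("species_A", "found_in", "plot_1")
kg.add("species_B", "found_in", "plot_1")
kg.add("plot_1", "located_on", "Tianmu Mountain")

# Species coexistence: which species share plot_1?
print(kg.heads("found_in", "plot_1"))    # ['species_A', 'species_B']
print(kg.tails("plot_1", "located_on"))  # ['Tianmu Mountain']
```

Questions about species distribution or environmental effects would follow the same pattern: add the relevant relation types as edges and query along them.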

7.
The National Agricultural Biotechnology Information Center (NABIC) reconstructed the AllergenPro database for allergenic protein analysis and allergenicity prediction. AllergenPro is an integrated web-based system providing information about allergens in foods, microorganisms, animals and plants. The database has three main features: (1) an allergen list with epitopes, (2) keyword search of allergens, and (3) methods for allergenicity prediction. The updated AllergenPro presents search-based allergen information through a user-friendly web interface, and users can run allergenicity prediction tools using three different methods: (1) FAO/WHO, (2) motif-based and (3) epitope-based.

Availability

The database is available for free at http://nabic.rda.go.kr/allergen/

8.
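Rural revitalization is the necessary path to the modernization of agriculture and rural areas, and agricultural universities bear the historic mission of training talent to serve it. Graduates in bioengineering, biotechnology and biological science (the "three bio" majors, "三生") play an important supporting role in the rural revitalization strategy. However, training in these majors at agricultural universities has shortcomings: students' understanding of and commitment to agriculture needs strengthening, the room for self-selected elective courses needs to be expanded, and practical teaching needs to be reinforced. Taking Huazhong Agricultural University as an example, this paper proposes measures for strengthening the "three bio" majors in the new era, such as strengthening values education, restructuring the curriculum, promoting collaborative education, building up the teaching staff and establishing a quality culture, so as to train high-quality "three bio" graduates to serve the rural revitalization strategy.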

9.
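The past decade of China's new era has seen the deepest understanding of, the greatest effort on, the most concrete measures for, the fastest progress in, and the most notable results from ecological and environmental protection. As environmental governance has achieved results, management measures have matured and become standardized, and multimodal data recording ecological management knowledge, such as text, video and photographs, has grown steadily. Applying knowledge graph concepts to innovate China's ecological and environmental protection work is important for winning the battle against pollution and building a modern environmental governance system. Focusing on the Beautiful China initiative and ecological civilization construction projects, this study takes multimodal material from typical pollution prevention and control campaigns and ecological restoration projects as its data source and, through data integration, knowledge extraction and knowledge fusion, forms a standard knowledge representation and builds an ecological management knowledge graph system. Specifically, it (1) quantitatively analyses data from successful cases of rectifying "scattered, disorderly and polluting" ("散乱污") enterprises in Shenzhen, extracts entities such as management subjects and management objects, and mines the relations among their spatial characteristics, pollution characteristics and treatment outcomes; (2) performs association analysis of the relationships among enterprise locations, pollutant hotspots and urban space; (3) analyses the specific relation "action taken - object damaged - function impaired" in typical Chinese ecological and environmental damage compensation cases, and from it extracts an ecological and environmental management knowledge graph of the form "ecological governance action - affected environmental element - degree of ecosystem service improvement"; and (4) finally forms a comprehensive ecological management knowledge graph integrating "scattered, disorderly and polluting" enterprise rectification and ecological governance actions, building a graph database containing 12 ontology classes, 82 entities, and 201 relations of 4 types. The study shows that building China's ecological management knowledge graph from multimodal data on successful pollution control cases and the outcomes of ecological restoration projects yields a knowledge system close to real-world needs, supports the whole process of law-based, science-based and targeted pollution control, and also supports the analysis of "multiple causes, one effect" and "one cause, multiple effects" in the appraisal and assessment of ecological and environmental damage. It is recommended that the application of ecological management knowledge graphs be expanded in future to identify management objects precisely, achieve scientific analysis and intelligent decision-making, promote public participation in ecological management, and accelerate the realization of the value of ecological products.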

10.
Short-chain dehydrogenase Gox2181 from Gluconobacter oxydans catalyzes the reduction of 2,3-pentanedione using NADH as the physiological electron donor. To enable its use in synthetic biology for coenzyme recycling, computational design and site-directed mutagenesis were used to engineer Gox2181 to utilize not only NADH but also NADPH as the electron donor. Single and double mutations at residues Q20 and D43 were made in a recombinant expression system, yielding Gox2181-D43Q and Gox2181-Q20R&D43Q, respectively. The design of mutant Q20R not only resolved the hydrogen bond interaction and electrostatic interaction between R and the 2′-phosphate of NADPH, but could also enhance binding to the 2′-phosphate of NADPH when combined with D43Q. Molecular dynamics simulation was carried out to verify the hydrogen bond interactions between the mutation sites and the 2′-phosphate of NADPH. Steady-state turnover measurements indicated that Gox2181-D43Q could use both NADH and NADPH as its coenzyme, as could Gox2181-Q20R&D43Q. Compared to the wild-type enzyme, however, Gox2181-D43Q exhibited dramatically reduced enzymatic activity, while Gox2181-Q20R&D43Q retained the majority of enzymatic activity.

11.
In 2013, the National Agricultural Biotechnology Information Center (NABIC) reconstructed a molecular marker database for useful genetic resources. The web-based marker database consists of three major functional categories: map viewer, RSN marker and gene annotation. It provides 7250 marker locations, 3301 RSN marker properties and 3280 molecular marker annotations for agricultural plants. Each molecular marker entry provides information such as marker name, expressed sequence tag number, gene definition and general marker information. The updated marker database provides useful information through a user-friendly web interface that assists in tracing new chromosome structures and gene positional functions using specific molecular markers.

Availability

The database is available for free at http://nabic.rda.go.kr/gere/rice/molecularMarkers/

12.
Medical forms are very heterogeneous: on a European scale there are thousands of data items in several hundred different systems. To enable data exchange for clinical care and research purposes there is a need to develop interoperable documentation systems with harmonized forms for data capture. A prerequisite in this harmonization process is comparison of forms. To our knowledge, no automated method for comparison of medical forms has been available so far. A form contains a list of data items with corresponding medical concepts. An automatic comparison needs data types, item names and, especially, items annotated with unique concept codes from medical terminologies. The scope of the proposed method is a comparison of these items by comparing their concept codes (coded in UMLS). Each data item is represented by item name, concept code and value domain. Two items are called identical if item name, concept code and value domain are the same. Two items are called matching if only concept code and value domain are the same. Two items are called similar if their concept codes are the same but the value domains are different. Based on these definitions an open-source implementation for automated comparison of medical forms in ODM format with UMLS-based semantic annotations was developed. It is available as package compareODM from http://cran.r-project.org. To evaluate this method, it was applied to a set of 7 real medical forms with 285 data items from a large public ODM repository with forms for different medical purposes (research, quality management, routine care). Comparison results were visualized with grid images and dendrograms. Automated comparison of semantically annotated medical forms is feasible. Dendrograms allow a view on clustered similar forms. The approach is scalable for a large set of real medical forms.
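The three definitions in this abstract (identical, matching, similar) can be written down as a small decision function. The sketch below is an illustration only and is not the compareODM package, which is distributed for R; the `Item` class, the `compare_items` function, the extra label `"different"` for items with unequal concept codes, and the placeholder codes are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """A form data item as described in the abstract."""
    name: str
    concept_code: str   # e.g. a UMLS concept code; placeholders used below
    value_domain: str   # simplified here to a string such as "integer"


def compare_items(a: Item, b: Item) -> str:
    """Classify a pair of items using the abstract's three definitions."""
    if a.concept_code != b.concept_code:
        return "different"   # hypothetical label; not defined in the abstract
    if a.value_domain != b.value_domain:
        return "similar"     # same concept code, different value domain
    if a.name != b.name:
        return "matching"    # same concept code and value domain only
    return "identical"       # name, concept code and value domain all equal


# Placeholder items; the codes are invented and are not real UMLS identifiers.
age_a = Item("Age", "C_PLACEHOLDER_1", "integer")
age_b = Item("Patient age", "C_PLACEHOLDER_1", "integer")
age_c = Item("Age", "C_PLACEHOLDER_1", "text")
sex = Item("Sex", "C_PLACEHOLDER_2", "text")

print(compare_items(age_a, age_a))  # identical
print(compare_items(age_a, age_b))  # matching
print(compare_items(age_a, age_c))  # similar
print(compare_items(age_a, sex))    # different
```

Checking the concept code first keeps the function in line with the abstract's emphasis: the code decides whether two items are comparable at all, and the name only separates identical items from matching ones.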

13.
Graphlets are small subgraphs, usually containing up to five vertices, that can be found in a larger graph. Identifying the graphlets that a vertex in an explored graph touches can provide useful information about the local structure of the graph around that vertex. Finding all graphlets in a large graph can be time-consuming, however. As the graphlets grow in size, more distinct graphlets emerge and the time needed to find each graphlet also increases. If it is not necessary to find each instance of each graphlet, and knowing the number of graphlets touching each node of the graph suffices, the problem is less hard. Previous research shows a way to simplify counting the graphlets: instead of searching for the graphlets of the desired size, smaller graphlets are searched for, together with the number of common neighbors of vertices. Solving a system of equations then gives the number of times a vertex is part of each graphlet of the desired size. However, until now, equations have existed only for counting graphlets with 4 or 5 nodes. In this paper, two new techniques are presented. The first generates the required equations automatically. This eliminates the tedious work of deriving them manually each time an extra node is added to the graphlets. The technique is independent of the number of nodes in the graphlets and can thus be used to count larger graphlets than previously possible. The second technique gives all graphlets a unique ordering which is easily extended to name graphlets of any size. Both techniques were used to generate equations to count graphlets with 4, 5 and 6 vertices, which extends all previous results. Code can be found at https://github.com/IneMelckenbeeck/equation-generator and https://github.com/IneMelckenbeeck/graphlet-naming.
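The counting idea described in this abstract (use common-neighbor counts and simple equations rather than enumerating every graphlet) can be illustrated on the smallest case. The sketch below handles only 3-vertex graphlets, which is a deliberate simplification: the paper concerns 4, 5 and 6 vertices, and this code is not taken from its repositories. The function name `three_node_orbits` and the orbit labels are hypothetical.

```python
def three_node_orbits(adj):
    """Count, for every vertex, how often it sits in each 3-vertex graphlet
    position, using degrees and common-neighbour counts only.

    adj: dict mapping each vertex to the set of its neighbours
         (undirected simple graph, no self-loops).
    """
    counts = {}
    for v, nbrs in adj.items():
        deg = len(nbrs)
        # Common neighbours of v and u close a triangle; each triangle
        # through v is seen once from each of its two other vertices.
        triangles = sum(len(nbrs & adj[u]) for u in nbrs) // 2
        # Every pair of neighbours of v either closes a triangle
        # or forms an induced path with v in the middle.
        path_centre = deg * (deg - 1) // 2 - triangles
        # Two-step walks v-u-w with w != v; those where w is adjacent to v
        # are triangles, and each triangle accounts for two such walks.
        path_end = sum(len(adj[u]) - 1 for u in nbrs) - 2 * triangles
        counts[v] = {
            "triangle": triangles,
            "path_centre": path_centre,
            "path_end": path_end,
        }
    return counts


# A triangle a-b-c with one extra vertex d attached to c.
graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
result = three_node_orbits(graph)
print(result["c"])  # {'triangle': 1, 'path_centre': 2, 'path_end': 0}
print(result["d"])  # {'triangle': 0, 'path_centre': 0, 'path_end': 2}
```

The two path counts are obtained by subtraction rather than by listing paths, which is the same trade the abstract describes for larger graphlets: more equations, less enumeration.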

14.
Data presentation for scientific publications in small sample size studies has not changed substantially in decades. It relies on static figures and tables that may not provide sufficient information for critical evaluation, particularly of the results from small sample size studies. Interactive graphics have the potential to transform scientific publications from static reports of experiments into interactive datasets. We designed an interactive line graph that demonstrates how dynamic alternatives to static graphics for small sample size studies allow for additional exploration of empirical datasets. This simple, free, web-based tool (http://statistika.mfub.bg.ac.rs/interactive-graph/) demonstrates the overall concept and may promote widespread use of interactive graphics.

15.
Accurate identification of DNA polymorphisms using next-generation sequencing technology is challenging because of a high rate of sequencing error and incorrect mapping of reads to reference genomes. Currently available short read aligners and DNA variant callers suffer from these problems. We developed the Coval software to improve the quality of short read alignments. Coval is designed to minimize the incidence of spurious alignment of short reads, by filtering mismatched reads that remained in alignments after local realignment and error correction of mismatched reads. The error correction is executed based on the base quality and allele frequency at the non-reference positions for an individual or pooled sample. We demonstrated the utility of Coval by applying it to simulated genomes and experimentally obtained short-read data of rice, nematode, and mouse. Moreover, we found an unexpectedly large number of incorrectly mapped reads in ‘targeted’ alignments, where the whole genome sequencing reads had been aligned to a local genomic segment, and showed that Coval effectively eliminated such spurious alignments. We conclude that Coval significantly improves the quality of short-read sequence alignments, thereby increasing the calling accuracy of currently available tools for SNP and indel identification. Coval is available at http://sourceforge.net/projects/coval105/.

16.
Electronic health records and scientific articles possess differing linguistic characteristics that may impact the performance of natural language processing tools developed for one or the other. In this paper, we investigate the performance of four extant concept recognition tools: the clinical Text Analysis and Knowledge Extraction System (cTAKES), the National Center for Biomedical Ontology (NCBO) Annotator, the Biomedical Concept Annotation System (BeCAS) and MetaMap. Each of the four concept recognition systems is applied to four different corpora: the i2b2 corpus of clinical documents, a PubMed corpus of Medline abstracts, a clinical trials corpus and the ShARe/CLEF corpus. In addition, we assess the individual system performances with respect to one gold standard annotation set, available for the ShARe/CLEF corpus. Furthermore, we built a silver standard annotation set from the individual systems’ output and assess the quality as well as the contribution of individual systems to the quality of the silver standard. Our results demonstrate that mainly the NCBO Annotator and cTAKES contribute to the silver standard corpora (F1-measures in the range of 21% to 74%) and their quality (best F1-measure of 33%), independent of the type of text investigated. While BeCAS and MetaMap can contribute to the precision of silver standard annotations (precision of up to 42%), the F1-measure drops when combined with NCBO Annotator and cTAKES due to a low recall. In conclusion, the performances of individual systems need to be improved independently of the text types, and the leveraging strategies to best take advantage of individual systems’ annotations need to be revised. The textual content of the PubMed corpus, accession numbers for the clinical trials corpus, and assigned annotations of the four concept recognition systems as well as the generated silver standard annotation sets are available from http://purl.org/phenotype/resources. 
The textual content of the ShARe/CLEF (https://sites.google.com/site/shareclefehealth/data) and i2b2 (https://i2b2.org/NLP/DataSets/) corpora must be requested from the individual corpus providers.
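For readers less familiar with the measures quoted in this abstract, the sketch below shows how precision, recall and F1 are computed when a system's annotations are compared against a reference set. It assumes exact matching of (start, end, concept) triples, which is a simplification chosen for the example; the abstract does not state the matching criterion used in the study. The function name and the toy annotations are hypothetical.

```python
def precision_recall_f1(predicted: set, reference: set):
    """Score a set of predicted annotations against a reference set.

    Annotations are compared by exact equality (an assumption of this sketch).
    """
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy annotations as (start offset, end offset, concept label); invented data.
reference = {(0, 5, "X1"), (10, 18, "X2"), (20, 25, "X3"), (30, 34, "X4")}
predicted = {(0, 5, "X1"), (10, 18, "X2"), (40, 44, "X9")}

p, r, f1 = precision_recall_f1(predicted, reference)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
# precision=0.667 recall=0.500 F1=0.571
```

Because F1 is the harmonic mean of the two, a system with high precision but low recall, as the abstract reports for some combinations, still ends up with a low F1.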

17.
Agricultural production, food systems and population health are intimately linked. While there is a strong evidence base to inform our knowledge of what constitutes a healthy human diet, we know little about actual food production or consumption in many populations and how developments in the food and agricultural system will affect dietary intake patterns and health. The paucity of information on food production and consumption is arguably most acute in low- and middle-income countries, where it is most urgently needed to monitor levels of under-nutrition, the health impacts of rapid dietary transition and the increasing ‘double burden’ of nutrition-related disease. Food availability statistics based on food commodity production data are currently widely used as a proxy measure of national-level food consumption, but using data from the UK and Mexico we highlight the potential pitfalls of this approach. Despite limited resources for data collection, better systems of measurement are possible. Important drivers to improve collection systems may include efforts to meet international development goals and partnership with the private sector. A clearer understanding of the links between the agriculture and food system and population health will ensure that health becomes a critical driver of agricultural change.  相似文献   

18.
19.
20.
Background

Mobile text messaging and medication monitors (medication monitor boxes) have the potential to improve adherence to tuberculosis (TB) treatment and reduce the need for directly observed treatment (DOT), but to our knowledge they have not been properly evaluated in TB patients. We assessed the effectiveness of text messaging and medication monitors to improve medication adherence in TB patients.

Conclusions

This study is the first to our knowledge to utilise a randomised trial design to demonstrate the effectiveness of a medication monitor to improve medication adherence in TB patients. Reminders from medication monitors improved medication adherence in TB patients, but text messaging reminders did not. In a setting such as China where universal use of DOT is not feasible, innovative approaches to support patients in adhering to TB treatment, such as this, are needed.

Trial Registration

Current Controlled Trials, ISRCTN46846388
