Similar literature
 20 similar documents found; search time: 15 ms
1.
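Census, survey, evaluation and conservation data on crop germplasm resources were systematically extracted and analysed, and a data-element-based approach was used to define a data standard and a data element catalogue for crop germplasm resource surveys. The survey datasets and the mapping between objects and attributes were defined, and an XML-based strategy for storing and exchanging data under the standard was proposed. The standard unifies crop germplasm resource surveys at the "data layer", standardizes database construction, and promotes the integration and sharing of crop germplasm resource survey data.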
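
The XML storage and exchange idea described above can be illustrated with a minimal sketch. The data element identifiers (`DE001` and so on), their names, and the `SurveyRecord`/`DataElement` tags are hypothetical and are not taken from the standard itself; the sketch only shows how a data element catalogue might drive serialization and type checking.

```python
import xml.etree.ElementTree as ET

# Hypothetical data element catalogue: identifier -> (name, Python type).
# These entries are illustrative assumptions, not part of the actual standard.
DATA_ELEMENTS = {
    "DE001": ("accession_name", str),
    "DE002": ("collection_site", str),
    "DE003": ("altitude_m", float),
}


def record_to_xml(record):
    """Serialize one survey record (dict keyed by data element id) to an XML string."""
    root = ET.Element("SurveyRecord")
    for element_id, value in record.items():
        name, expected_type = DATA_ELEMENTS[element_id]  # KeyError for unknown ids
        if not isinstance(value, expected_type):
            raise TypeError(f"{element_id} expects {expected_type.__name__}")
        child = ET.SubElement(root, "DataElement", id=element_id, name=name)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text):
    """Parse the XML back into a dict, converting values with the catalogue types."""
    root = ET.fromstring(xml_text)
    record = {}
    for child in root.findall("DataElement"):
        element_id = child.get("id")
        _, expected_type = DATA_ELEMENTS[element_id]
        record[element_id] = expected_type(child.text)
    return record
```

Keeping the catalogue as the single source of names and types means that both the writer and the reader of the exchange file validate against the same definitions.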

2.
3.

Background

In recent years, increasing amounts of genomic and clinical cancer data have become publicly available through large-scale collaborative projects such as The Cancer Genome Atlas (TCGA). However, as long as these datasets are difficult to access and interpret, they are essentially useless for a major part of the research community and their scientific potential will not be fully realized. To address these issues we developed MEXPRESS, a straightforward and easy-to-use web tool for the integration and visualization of the expression, DNA methylation and clinical TCGA data on a single-gene level (http://mexpress.be).

Results

In comparison to existing tools, MEXPRESS allows researchers to quickly visualize and interpret the different TCGA datasets and their relationships for a single gene, as demonstrated for GSTP1 in prostate adenocarcinoma. We also used MEXPRESS to reveal the differences in the DNA methylation status of the PAM50 marker gene MLPH between the breast cancer subtypes and how these differences were linked to the expression of MLPH.

Conclusions

We have created a user-friendly tool for the visualization and interpretation of TCGA data, offering clinical researchers a simple way to evaluate the TCGA data for their genes or candidate biomarkers of interest.

Electronic supplementary material

The online version of this article (doi:10.1186/s12864-015-1847-z) contains supplementary material, which is available to authorized users.

4.
As projects progress from pilot studies with a few simple variables and small samples, the research process as a whole becomes qualitatively more complex and more vulnerable to contamination by an array of errors and mistakes. Data usually undergo a series of manipulations (e.g., recording, computer entry, transmission) prior to final statistical analysis. The process thus consists of numerous operations that end only with the eventual statistical analysis and write-up. We present a means of estimating the impact of process error in the same terms as psychometric reliability and discuss the implications for reducing the impact of errors on overall data quality.
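
One hedged way to read "process error in the same terms as psychometric reliability" uses the classical test theory definition of reliability as the share of observed variance due to true scores. The second expression below rests on an assumption introduced here, not stated in the abstract: each of the K processing steps adds independent error with variance \sigma_k^2. It is a sketch of the idea, not necessarily the authors' formulation.

```latex
\rho = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2},
\qquad
\rho_{\text{process}} = \frac{\sigma_T^2}{\sigma_T^2 + \sigma_E^2 + \sum_{k=1}^{K} \sigma_k^2}
```

Here \sigma_T^2 is the true-score variance and \sigma_E^2 the measurement error variance. Under these assumptions every additional error-prone step can only lower \rho_{\text{process}} relative to \rho.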

5.
Pollen stratigraphies are the most spatially extensive data available for the reconstruction of past land-cover change. Detailed knowledge of past land-cover is becoming increasingly important to evaluate the present trends in, and drivers of, vegetation composition. The European Pollen Database (EPD) was established in the late 1980s and developed in the early 1990s to provide a structure for archiving, exchanging, and analysing Quaternary pollen data from Europe. It provides a forum for scientists to meet and engage in collaborative investigations or data analysis. In May 2007 several EPD support groups were developed to assist in the task of maintaining and updating the database. The mapping and data accuracy work group (MADCAP) aims to produce an atlas of past plant distributions as detected by pollen analyses in Europe, in order to meet the growing need for these data from palaeoecologists and the wider scientific community. Due to data handling problems in the past, a significant number of EPD datasets have errors. The initial task of the work group, therefore, was a systematic review of pollen sequences, in order to identify and correct errors. The EPD currently (January 2009) archives 1,032 pollen sequences, of which 668 have age-depth models that allow chronological comparison. Many errors have been identified and corrected, or flagged for users, most notably errors in the pollen count data. The application of spatial analyses to pollen data is related to the number of data points that are available for analysis. We therefore take this opportunity to encourage the submission of pollen analytical results to the EPD or other relevant pollen databases. Only in this way will the scientific community be able to gain a better understanding of past vegetation dynamics.

6.
The application of mass spectrometry imaging (MS imaging) is rapidly growing with a constantly increasing number of different instrumental systems and software tools. The data format imzML was developed to allow the flexible and efficient exchange of MS imaging data between different instruments and data analysis software. imzML data are divided into two files which are linked by a universally unique identifier (UUID). Experimental details are stored in an XML file which is based on the HUPO-PSI format mzML. Information is provided in the form of a 'controlled vocabulary' (CV) in order to unequivocally describe the parameters and to avoid redundancy in nomenclature. Mass spectral data are stored in a binary file in order to allow efficient storage. imzML is supported by a growing number of software tools. Users will no longer be limited to proprietary software, but will be able to use the processing software best suited for a specific question or application. MS imaging data from different instruments can be converted to imzML and displayed with identical parameters in one software package for easier comparison. All technical details necessary to implement imzML and additional background information are available at www.imzml.org.
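
The UUID link between the two files can be checked with a short sketch. It assumes two details that the abstract does not state and that should be confirmed against the specification at www.imzml.org: that the XML file carries the UUID in a `cvParam` with accession `IMS:1000080`, and that the binary file begins with the same UUID as 16 raw bytes. The function names are hypothetical.

```python
import uuid
import xml.etree.ElementTree as ET

# Assumed controlled-vocabulary accession for the UUID entry (verify against the spec).
UUID_ACCESSION = "IMS:1000080"


def uuid_from_xml(xml_text):
    """Return the UUID stored in the XML metadata, or None if no such cvParam exists."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Tags may carry an XML namespace prefix such as "{...}cvParam".
        if elem.tag.endswith("cvParam") and elem.get("accession") == UUID_ACCESSION:
            return uuid.UUID(elem.get("value"))  # accepts values with or without braces
    return None


def uuid_from_binary(binary_bytes):
    """Return the UUID assumed to occupy the first 16 bytes of the binary file."""
    if len(binary_bytes) < 16:
        raise ValueError("binary data is too short to contain a UUID")
    return uuid.UUID(bytes=binary_bytes[:16])


def files_are_linked(xml_text, binary_bytes):
    """True if the XML metadata and the binary data carry the same UUID."""
    xml_uuid = uuid_from_xml(xml_text)
    return xml_uuid is not None and xml_uuid == uuid_from_binary(binary_bytes)
```

A check like this is a cheap guard against pairing a metadata file with the wrong binary file before any spectra are read.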

7.
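This paper introduces the DICOM 3.0 medical image file format and the features of the C# language. For the first time, Visual C# is used to display and process images conforming to this standard: the program reads raw image data directly from DICOM files and can batch-convert them to BMP and other formats for processing. This work can lay a foundation for medical image processing research and the development of related medical imaging software.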
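
As a minimal illustration of reading the DICOM file format directly, the sketch below checks the file header and decodes the first data element header. It is written in Python rather than the C# used by the paper, and it is not the authors' code. It relies on the DICOM Part 10 layout, in which a 128-byte preamble is followed by the four bytes `DICM` and the File Meta Information is encoded as Explicit VR Little Endian; the function names are hypothetical.

```python
import struct

PREAMBLE_LENGTH = 128   # bytes of preamble before the magic prefix
MAGIC = b"DICM"         # four-byte prefix that marks a DICOM Part 10 file


def is_dicom_part10(data):
    """True if the bytes start with a 128-byte preamble followed by 'DICM'."""
    start = PREAMBLE_LENGTH
    return len(data) >= start + 4 and data[start:start + 4] == MAGIC


def read_first_element_header(data):
    """Decode (group, element, VR, value length) of the first data element.

    Assumes the short explicit-VR header form (2-byte length), which applies
    to VRs such as UL that are used at the start of the File Meta Information.
    """
    if not is_dicom_part10(data):
        raise ValueError("not a DICOM Part 10 file")
    offset = PREAMBLE_LENGTH + 4
    group, element, vr, length = struct.unpack_from("<HH2sH", data, offset)
    return group, element, vr.decode("ascii"), length
```

Pixel data extraction and BMP conversion need considerably more parsing than this and are better left to an established DICOM library.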

8.
Data visualization and interactive data exploration are important aspects of illustrating complex concepts and results from analyses of omics data. A suitable visualization has to be intuitive and accessible. Web-based dashboards have become popular tools for the arrangement, consolidation, and display of such visualizations. However, the combination of automated data processing pipelines handling omics data and dynamically generated, interactive dashboards is poorly solved. Here, we present i2dash, an R package intended to encapsulate functionality for the programmatic creation of customized dashboards. It supports interactive and responsive (linked) visualizations across a set of predefined graphical layouts. i2dash addresses the needs of data analysts and software developers for a tool that is compatible with, and attachable to, any R-based analysis pipeline, thereby fostering the separation of data visualization on the one hand and data analysis tasks on the other. In addition, the generic design of i2dash enables the development of modular extensions for specific needs. As a proof of principle, we provide an extension of i2dash optimized for single-cell RNA sequencing analysis, supporting the creation of dashboards for the visualization needs of such experiments. Equipped with these features, i2dash is suitable for extensive use in large-scale sequencing/bioinformatics facilities. Along this line, we provide i2dash as a containerized solution, enabling a straightforward large-scale deployment and sharing of dashboards using cloud services. i2dash is freely available via the R package archive CRAN (https://CRAN.R-project.org/package=i2dash).

9.
Development of a clearer understanding of the causes and consequences of environmental change is an important issue globally. The consequent demand for objective, reliable and up-to-date environmental information has led to the establishment of long-term integrated environmental monitoring programmes, including the UK's Environmental Change Network (ECN). Databases form the core information resource for such programmes. The UK Environmental Change Network Data Centre manages data on behalf of ECN (as well as other related UK integrated environmental monitoring networks) and provides a robust and integrated system of information management. This paper describes how data are captured – through standardised protocols and data entry systems – as well as more recent approaches such as wireless sensors. Data are managed centrally through a database and GIS. Quality control is built in at all levels of the system. Data are then made accessible through a variety of data access methods – through bespoke web interfaces, as well as third-party data portals. This paper describes the informatics approach of the ECN Data Centre which aims to develop a seamless system of data capture, management and data access interfaces to support research.

10.
11.
12.
Absolute protein concentration determination is becoming increasingly important in a number of fields including diagnostics, biomarker discovery and systems biology modeling. The recently introduced quantification concatamer methodology provides a novel approach to performing such determinations, and it has been applied to both microbial and mammalian systems. While a number of software tools exist for performing analyses of quantitative data generated by related methodologies such as SILAC, there is currently no analysis package dedicated to the quantification concatamer approach. Furthermore, most tools that are currently available in the field of quantitative proteomics do not manage storage and dissemination of such data sets.

13.
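Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS) is widely used in food microbiological testing and clinical microbial identification because it is fast, accurate and high-throughput. Preprocessing and analysis of MALDI-TOF MS data are key steps in microbial identification: data processing extracts characteristic peptide or protein information from large volumes of data, and supervised and unsupervised learning methods then classify and cluster these features, enabling microbial identification, typing and homology analysis. This paper reviews the statistical analysis methods and data analysis software used in MALDI-TOF MS-based microbial identification.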
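
The feature-extraction and classification steps described above can be sketched in a deliberately simplified form: peak lists are binned into presence/absence fingerprints and a query spectrum is assigned to the most similar reference by Jaccard similarity. This nearest-reference matching is a stand-in chosen for brevity, not one of the specific methods covered by the review; the bin width, peak values and function names are illustrative assumptions.

```python
def bin_peaks(peaks_mz, bin_width=10.0):
    """Map peak m/z values to a set of bin indices (a presence/absence fingerprint)."""
    return {int(mz // bin_width) for mz in peaks_mz}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty fingerprints count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def identify(query_peaks, reference_library, bin_width=10.0):
    """Return (best matching reference name, similarity score) for a query peak list."""
    query = bin_peaks(query_peaks, bin_width)
    scores = {
        name: jaccard(query, bin_peaks(peaks, bin_width))
        for name, peaks in reference_library.items()
    }
    best = max(scores, key=scores.get)
    return best, scores[best]
```

Fixed-width binning tolerates small mass shifts only when both peaks fall in the same bin, so real pipelines usually add calibration and peak alignment before any comparison.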

14.
Protein design aims at designing new protein molecules of desired structure and functionality. One of the major obstacles to large-scale protein design is the extensive time and manpower required for experimental validation of designed sequences. Recent advances in protein structure prediction have opened the possibility of automated assessment of the designed sequences via folding simulations. We present a new protocol for protein design and validation. The sequence space is initially searched by Monte Carlo sampling guided by a public atomic potential, with candidate sequences selected by the clustering of sequence decoys. The designed sequences are then assessed by I-TASSER folding simulations, which generate full-length atomic structural models by the iterative assembly of threading fragments. The protocol is tested on 52 nonhomologous single-domain proteins, with an average sequence identity of 24% between the designed sequences and the native sequences. Despite this low sequence identity, three-dimensional models predicted for the first designed sequence have an RMSD of < 2 Å to the target structure in 62% of cases. This percentage increases to 77% if we consider the three-dimensional models from the top 10 designed sequences. Such a striking consistency between the target structure and the structural prediction from nonhomologous sequences, despite the fact that the design and folding algorithms adopt completely different force fields, indicates that the design algorithm captures the features essential to the global fold of the target. On average, the designed sequences have a free energy that is 0.39 kcal/(mol residue) lower than in the native sequences, potentially affording a greater stability to synthesized target folds.
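
The Monte Carlo search over sequence space can be illustrated with a minimal Metropolis sketch. The energy function here is a toy placeholder (the number of mismatches against a hypothetical target pattern), not the atomic potential used in the paper, and the step count, temperature and function names are assumptions made for illustration.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def toy_energy(sequence, target):
    """Placeholder energy: count of positions that differ from a hypothetical target."""
    return sum(1 for a, b in zip(sequence, target) if a != b)


def metropolis_design(length, energy_fn, steps=2000, temperature=1.0, seed=0):
    """Search sequence space by single-residue mutations with Metropolis acceptance."""
    rng = random.Random(seed)
    current = [rng.choice(AMINO_ACIDS) for _ in range(length)]
    current_e = energy_fn(current)
    best, best_e = list(current), current_e
    for _ in range(steps):
        pos = rng.randrange(length)
        old = current[pos]
        current[pos] = rng.choice(AMINO_ACIDS)  # propose a point mutation
        new_e = energy_fn(current)
        delta = new_e - current_e
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current_e = new_e
            if current_e < best_e:
                best, best_e = list(current), current_e
        else:
            current[pos] = old  # reject: restore the previous residue
    return "".join(best), best_e
```

Accepting some uphill moves lets the search escape local minima, which is the reason for sampling rather than greedy minimization. In the protocol above the sampled decoys are additionally clustered to pick candidates, a step this sketch omits.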

15.
16.
Four spatial points that define enzyme families   Total citations: 1, self-citations: 0, citations by others: 1
The catalytic properties of enzymes containing Asp-His-Ser triads have long been investigated in depth. Serine endopeptidases, cutinases, acetylcholinesterases and cellulases, among other enzymes, contain these triads. We found that the geometric properties of just four points in the spatial structure of these enzymes are, on their own, characteristic of their family (Fig. 3).

17.
Patil A, Nakamura H. FEBS Letters, 2006, 580(8): 2041-2045
We investigate the structural properties of hubs that enable them to interact with several partners in protein-protein interaction networks. We find that hubs have more observed and predicted disordered residues with fewer loops/coils, and more charged residues on the surface as compared to non-hubs. Smaller hubs have fewer disordered residues and more charged residues on the surface than larger hubs. We conclude that the global flexibility provided by disordered domains, and high surface charge are complementary factors that play a significant role in the binding ability of hubs.

18.
19.
20.
A method is described for preparing large gelatine-embedded soil sections for ecological studies. Sampling methods reduce structural disturbance of the samples to a minimum and include freezing the samples in the field to kill soil invertebrates in their natural micro-habitats. The samples are vacuum-embedded in gelatine and the construction of a simple embedding apparatus is described. Projects are suggested, suitable for upper secondary school or tertiary level education, where soil sections can be used to investigate various aspects of soil micro-arthropod ecology and the dynamics of organic matter in soil and litter profiles.


Copyright © Beijing Qinyun Technology Development Co., Ltd.  Beijing ICP No. 09084417