Similar literature
 Found 20 similar documents; search time: 15 ms
1.

Background

Applications in biomedical science and life science produce large data sets using increasingly powerful imaging devices and computer simulations. It is becoming increasingly difficult for scientists to explore and analyze these data using traditional tools. Interactive data processing and visualization tools can help scientists overcome these limitations.

Results

We show that new data processing tools and visualization systems can be used successfully in biomedical and life science applications. We present an adaptive high-resolution display system suitable for biomedical image data, algorithms for analyzing and visualizing protein surfaces and retinal optical coherence tomography data, and visualization tools for 3D gene expression data.

Conclusion

We demonstrated that interactive processing and visualization methods and systems can support scientists in a variety of biomedical and life science application areas concerned with massive data analysis.

2.
3.
4.

Background  

Modern, high-throughput biological experiments generate copious, heterogeneous, interconnected data sets. Research is dynamic, with frequently changing protocols, techniques, instruments, and file formats. Because of these factors, systems designed to manage and integrate modern biological data sets often end up as large, unwieldy databases that become difficult to maintain or evolve. The novel rule-based approach of the Ultra-Structure design methodology presents a potential solution to this problem. By representing both data and processes as formal rules within a database, an Ultra-Structure system constitutes a flexible framework that enables users to explicitly store domain knowledge in both a machine- and human-readable form. End users themselves can change the system's capabilities without programmer intervention, simply by altering database contents; no computer code or schemas need be modified. This provides flexibility in adapting to change, and allows integration of disparate, heterogeneous data sets within a small core set of database tables, facilitating joint analysis and visualization without becoming unwieldy. Here, we examine the application of Ultra-Structure to our ongoing research program for the integration of large proteomic and genomic data sets (proteogenomic mapping).

5.

Background

The visualization of large volumes of data is a computationally challenging task that often promises rewarding new insights. There is great potential in the application of new algorithms and models from combinatorial optimisation. Datasets often contain “hidden regularities” and a combined identification and visualization method should reveal these structures and present them in a way that helps analysis. While several methodologies exist, including those that use non-linear optimization algorithms, severe limitations exist even when working with only a few hundred objects.

Methodology/Principal Findings

We present a new data visualization approach (QAPgrid) that reveals patterns of similarities and differences in large datasets of objects for which a similarity measure can be computed. Objects are assigned to positions on an underlying square grid in a two-dimensional space. We use the Quadratic Assignment Problem (QAP) as a mathematical model to provide an objective function for assignment of objects to positions on the grid. We employ a Memetic Algorithm (a powerful metaheuristic) to tackle the large instances of this NP-hard combinatorial optimization problem, and we show its performance on the visualization of real data sets.
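
As a rough illustration of the Quadratic Assignment Problem objective described above, the following minimal sketch scores an assignment of objects to grid cells and finds the best one by exhaustive search. It is not the authors' implementation: the function names, the choice of Manhattan distance between grid cells, and the brute-force search (used here in place of the Memetic Algorithm, and feasible only for a handful of objects) are all assumptions made for the example.

```python
from itertools import permutations


def grid_positions(rows, cols):
    """Return the (row, col) coordinates of every cell of a rows x cols grid."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def manhattan(p, q):
    """Grid distance between two cells (an assumed choice of metric)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def qap_cost(similarity, positions, assignment):
    """QAP objective: sum over object pairs of similarity times grid distance.

    assignment[i] is the index in `positions` of the cell given to object i.
    Similar objects placed far apart contribute a large cost.
    """
    n = len(similarity)
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += similarity[i][j] * manhattan(
                positions[assignment[i]], positions[assignment[j]]
            )
    return total


def best_layout(similarity, positions):
    """Exhaustively search all assignments (only practical for tiny instances)."""
    n = len(similarity)
    return min(
        permutations(range(len(positions)), n),
        key=lambda a: qap_cost(similarity, positions, a),
    )


# Hypothetical toy data: objects 0 and 1 are similar, as are objects 2 and 3.
similarity = [
    [0, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]
positions = grid_positions(2, 2)
layout = best_layout(similarity, positions)
print(layout, qap_cost(similarity, positions, layout))
```

In this toy instance any optimal layout puts each similar pair in neighbouring cells. Because the number of assignments grows factorially with the number of objects, realistic instances need a heuristic such as the metaheuristic the abstract mentions.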

Conclusions/Significance

Overall, the results show that the QAPgrid algorithm is able to produce a layout that represents the relationships between objects in the data set. Furthermore, it also represents the relationships between clusters that are fed into the algorithm. We apply QAPgrid to the 84 Indo-European languages instance, producing a near-optimal layout. Next, we produce a layout of 470 world universities with an observed high degree of correlation with the score used by the Academic Ranking of World Universities compiled by Shanghai Jiao Tong University, without the need for an ad hoc weighting of attributes. Finally, our Gene Ontology-based study on Saccharomyces cerevisiae fully demonstrates the scalability and precision of our method as a novel alternative tool for functional genomics.

6.

Background

To achieve more realistic simulations, meteorologists develop and use models with increasing spatial and temporal resolution. Analyzing, comparing, and visualizing the resulting simulations becomes more and more challenging due to the growing amount and multifaceted character of the data. Various data sources, numerous variables, and multiple simulations lead to a complex database. Although a variety of software exists that is suited for the visualization of meteorological data, none of it fulfills all of the typical domain-specific requirements: support for quasi-standard data formats and different grid types, standard visualization techniques for scalar and vector data, visualization of the context (e.g., topography) and other static data, support for multiple presentation devices used in modern sciences (e.g., virtual reality), a user-friendly interface, and suitability for cooperative work.

Methods and Results

Instead of attempting to develop yet another new visualization system to fulfill all possible needs in this application domain, our approach is to provide a flexible workflow that combines different existing state-of-the-art visualization software components in order to hide the complexity of 3D data visualization tools from the end user. To complete the workflow and to enable the domain scientists to interactively visualize their data without advanced skills in 3D visualization systems, we developed a lightweight custom visualization application (MEVA - multifaceted environmental data visualization application) that supports the most relevant visualization and interaction techniques and can be easily deployed. Specifically, our workflow combines a variety of different data abstraction methods provided by a state-of-the-art 3D visualization application with the interaction and presentation features of a computer-games engine. Our customized application includes solutions for the analysis of multirun data, specifically with respect to data uncertainty and differences between simulation runs. In an iterative development process, our easy-to-use application was developed in close cooperation with meteorologists and visualization experts. The usability of the application has been validated with user tests. We report on how this application helps users to prove or disprove existing hypotheses and to discover new insights. In addition, the application has been used at public events to communicate research results.

7.
MOTIVATION: Experimental limitations have resulted in the popularity of parametric statistical tests as a method for identifying differentially regulated genes in microarray data sets. However, these tests assume that the data follow a normal distribution. To date, the assumption that replicate expression values for any gene are normally distributed has not been critically addressed for Affymetrix GeneChip data. RESULTS: The normality of the expression values calculated using four different commercial and academic software packages was investigated with a combination of statistical tests and visualization techniques, using a data set consisting of the same target RNA applied to 59 human Affymetrix U95A GeneChips. For the majority of probe sets obtained from each analysis suite, the expression data showed a good correlation with normality. The exception was a large number of low-expressed genes in the data set produced using Affymetrix Microarray Suite 5.0, which showed a striking non-normal distribution. In summary, our data provide strong support for the application of parametric tests to GeneChip data sets without the need for data transformation.

8.

Background  

Multiple sequence alignments are a fundamental tool for the comparative analysis of proteins and nucleic acids. However, large data sets are no longer manageable for visualization and investigation using the traditional stacked sequence alignment representation.

9.
The effective extraction of information from multidimensional data sets derived from phenotyping experiments is a growing challenge in biology. Data visualization tools are important resources that can aid in exploratory data analysis of complex data sets. Phenotyping experiments of model organisms produce data sets in which a large number of phenotypic measures are collected for each individual in a group. A critical initial step in the analysis of such multidimensional data sets is the exploratory analysis of data distribution and correlation. To facilitate the rapid visualization and exploratory analysis of multidimensional complex trait data, we have developed a user-friendly, web-based software tool called Phenostat. Phenostat is composed of a dynamic graphical environment that allows the user to inspect the distribution of multiple variables in a data set simultaneously. Individuals can be selected by clicking directly on the graphs, which displays their identity, highlights the corresponding values in all graphs, and allows their inclusion in or exclusion from the analysis. Statistical analysis is provided by R package functions. Phenostat is particularly suited for rapid distribution and correlation analysis of subsets of data. An analysis of behavioral and physiologic data stemming from a large mouse phenotyping experiment using Phenostat reveals previously unsuspected correlations. Phenostat is freely available to academic institutions and nonprofit organizations and can be used from our website at .

10.
Serial analysis of gene expression (SAGE) technology produces large sets of interesting genes that are difficult to analyze directly. Bioinformatics tools are needed to interpret the functional information in these gene sets. We present an interactive web-based tool, called Gene Class, which allows functional annotation of SAGE data using the Gene Ontology (GO) database. This tool performs searches in the GO database for each SAGE tag, making associations in the selected GO category for a level selected in the hierarchy. This system provides user-friendly data navigation and visualization for mapping SAGE data onto the gene ontology structure. This tool also provides graphical visualization of the percentage of SAGE tags in each GO category, along with confidence intervals and hypothesis testing.
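
To make the idea of a per-category percentage with a confidence interval concrete, here is a minimal sketch. The abstract does not say which interval Gene Class computes, so the Wilson score interval used here is an assumption, as are the function name and the tag counts.

```python
import math


def wilson_interval(hits, total, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%).

    hits  -- number of tags annotated to the category (hypothetical input)
    total -- number of tags examined
    """
    if total <= 0:
        raise ValueError("total must be positive")
    p = hits / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2.0 * total)) / denom
    half = z * math.sqrt(p * (1.0 - p) / total + z * z / (4.0 * total * total)) / denom
    return center - half, center + half


# Hypothetical counts: 50 of 100 tags fall into one GO category.
low, high = wilson_interval(50, 100)
print(f"50.0% of tags, interval {low:.3f} to {high:.3f}")
```

The Wilson interval is chosen for the sketch because it stays within 0 and 1 even for categories containing very few tags; a tool could equally use another interval.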

11.
The availability of cellular markers tagged with the green fluorescent protein (GFP) has recently allowed a large number of cell biological studies to be carried out in live cells, thereby addressing the dynamic organization of cellular structures. Typically, microscopes capable of video recording are used to generate time-resolved data sets. Dynamic imaging data are complex and often difficult to interpret by pure visual inspection. Therefore, specialized image processing methods for object detection, motion estimation, visualization, and quantitation are required. In this review, we discuss concepts for automated analysis of multidimensional image data from live cell microscopy and their application to the dynamics of cell nuclear subcompartments.

12.
In just the past 20 years systematics has progressed from the sequencing of individual genes for a few taxa to routine sequencing of complete plastid and even nuclear genomes. Recent technological advances have made it possible to compile very large data sets, the analyses of which have in turn provided unprecedented insights into phylogeny and evolution. Indeed, this narrow window of a few decades will likely be viewed as a golden era in systematics. Relationships have been resolved at all taxonomic levels across all groups of photosynthetic life. In the angiosperms, problematic deep-level relationships have either been largely resolved, or will be resolved within the next several years. The same large data sets have also provided new insights into the many rapid radiations that have characterized angiosperm evolution. For example, all of the major lineages of angiosperms likely arose within a narrow window of just a few million years. At the population level, the ease of DNA sequencing has given new life to phylogeographic studies, and microsatellite analyses have become more commonplace, with a concomitant impact on conservation and population biology. With the wealth of sequence data soon to be available, we are on the cusp of assembling the first semi-comprehensive tree of life for many of the 15,000 genera of flowering plants and indeed for much of green life. Accompanying these opportunities are also enormous new computational/informatic challenges including the management and phylogenetic analysis of such large, sometimes fragmentary data sets, and visualization of trees with thousands of terminals.

13.

Background  

The enormity of the information contained in large data sets makes it difficult to develop intuitive understanding. It would be useful to have software that allows visualization of possible correlations between properties that can be associated with a core data set. In the case of bacterial genomes, existing visualization tools focus on either global properties such as variations in composition or detailed local displays of the features that comprise the annotation. It is not easy to visualize other information in the context of this core information.

14.
DnaSP, DNA polymorphism analyses by the coalescent and other methods   Total citations: 170 (self-citations: 0; citations by others: 170)
SUMMARY: DnaSP is a software package for the analysis of DNA polymorphism data. The present version introduces several new modules and features which, among other options, allow: (1) handling big data sets (approximately 5 Mb per sequence); (2) conducting a large number of coalescent-based tests by Monte Carlo computer simulations; (3) extensive analyses of the genetic differentiation and gene flow among populations; (4) analysing the evolutionary pattern of preferred and unpreferred codons; (5) generating graphical outputs for an easy visualization of results. AVAILABILITY: The software package, including complete documentation and examples, is freely available to academic users from: http://www.ub.es/dnasp

15.

Background  

Genomics research produces vast amounts of experimental data that need to be integrated in order to understand, model, and interpret the underlying biological phenomena. Interpreting these large and complex data sets is challenging, and different visualization methods are needed to help produce knowledge from the data.

16.
Recently, applications of mass spectrometry in the field of clinical proteomics have gained tremendous visibility in the scientific and clinical community. One major objective is the search for potential biomarkers in complex body fluids like serum, plasma, urine, saliva, or cerebral spinal fluid. For this purpose, efficient visualization of large data sets derived from patient cohorts is crucial to give clinical experts an interactive impression of the data quality. Additionally, it is necessary to apply statistical analysis and pattern matching algorithms to attain validated signal patterns that may allow for later applications in sample classification. We introduce the new ClinProTools bioinformatics software, which performs all major steps of profiling, screening, and monitoring applications in clinical proteomics. ClinProTools is the data interpretation software of the mass spectrometry-based ClinProt solutions for biomarker analysis. ClinProTools performs data pretreatment, visualization, statistics, pattern determination, pattern evaluation, and classification of spectra. This article will focus on ClinProTools' powerful and intuitive visualization options for clinical proteomics applications.

17.
Journal of Molecular Biology, 2019, 431(8): 1519-1539
The epiproteome describes the set of all post-translational modifications (PTMs) made to the proteins comprising a cell or organism. The extent of the epiproteome is still largely unknown; however, advances in experimental techniques are beginning to produce a deluge of data, tracking dynamic changes to the epiproteome in response to cellular stimuli. These data have potential to revolutionize our understanding of biology and disease. This review covers a range of recent visualization methods and tools developed specifically for dynamic epiproteome data sets. These methods have been designed primarily for data sets on phosphorylation, as this is the most studied PTM; however, most of these methods are also applicable to other types of PTMs. Unfortunately, the currently available methods are often inadequate for existing data sets; thus, realizing the potential buried in epiproteome data sets will require new, tailored bioinformatics methods that will help researchers analyze, visualize, and interactively explore these complex data sets.

18.
Recent advances in computer networks and information technologies have created exciting new possibilities for sharing and analyzing scientific research data. Although individual datasets can be studied efficiently, many scientists are still largely limited to considering data collected by themselves, their students, or closely affiliated research groups. Increasingly widespread high-speed network connections and the existence of large, coordinated research programs suggest the potential for scientists to access and learn from data from outside their immediate research circle. We are developing a web-based application that facilitates the sharing of scientific data within a research network using the now-common “virtual globe” in combination with advanced visualization methods designed for geographically distributed scientific data. Two major components of the system enable the rapid assessment of geographically distributed scientific data: a database built from information submitted by network members, and a module featuring novel and sophisticated geographic data visualization techniques. By enabling scientists to share results with each other and view their shared data through a common virtual-globe interface, the system provides a new platform for important meta-analyses and the analysis of broad-scale patterns. Here we present the design and capabilities of the SFMN GeoSearch platform for the Sustainable Forest Management Network, a pan-Canadian network of forest researchers who have accumulated data for more than a decade. Through the development and dissemination of this new tool, we hope to help scientists, students, and the general public to understand the depth and breadth of scientific data across potentially large areas.

19.
The use of self-organizing maps (SOMs) to analyze data often depends on finding effective methods to visualize the SOM's structure. In this paper we propose a new way to perform that visualization using a variant of Andrews' Curves. We also show that the interaction between these two methods allows us to find sub-clusters within identified clusters. Perhaps more importantly, using the SOM to pre-process data by identifying gross features enables us to use Andrews' Curves on data sets which would previously have been too large for the methodology. Finally, we show how a three-way interaction between the human user and these two methods can be a valuable exploratory data analysis tool.
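
For readers unfamiliar with Andrews' Curves, the sketch below evaluates the classic form of the curve, in which each data vector becomes a finite Fourier series over t in [-π, π]. It shows the standard definition only, not the variant proposed in the paper, and the function name and sample vectors are made up for the example.

```python
import math


def andrews_curve(x, t):
    """Evaluate the classic Andrews curve of data vector x at angle t.

    f_x(t) = x[0]/sqrt(2) + x[1]*sin(t) + x[2]*cos(t)
             + x[3]*sin(2t) + x[4]*cos(2t) + ...
    """
    value = x[0] / math.sqrt(2.0)
    for k, coeff in enumerate(x[1:], start=1):
        harmonic = (k + 1) // 2  # 1, 1, 2, 2, 3, 3, ...
        if k % 2 == 1:
            value += coeff * math.sin(harmonic * t)
        else:
            value += coeff * math.cos(harmonic * t)
    return value


# Hypothetical vectors (for instance, SOM prototype vectors): sample each
# curve on a grid of t values; similar vectors give curves that stay close.
ts = [-math.pi + i * (2.0 * math.pi / 8) for i in range(9)]
for vec in ([1.0, 0.5, 0.2], [1.1, 0.4, 0.2], [-1.0, 2.0, 0.0]):
    print([round(andrews_curve(vec, t), 2) for t in ts])
```

Plotting one curve per vector turns high-dimensional similarity into visual closeness, which is why clusters appear as bundles of curves.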

20.
A large and growing network (“cloud”) of interlinked terms and records of items of Systems Biology knowledge is available from the web. These items include pathways, reactions, substances, literature references, organisms, and anatomy, all described in different data sets. Here, we discuss how the knowledge from the cloud can be molded into representations (views) useful for data visualization and modeling. We discuss methods to create and use various views relevant for visualization, modeling, and model annotations, while hiding irrelevant details without unacceptable loss or distortion. We show that views are compatible with understanding substances and processes as sets of microscopic compounds and events respectively, which allows the representation of specializations and generalizations as subsets and supersets respectively. We explain how these methods can be implemented based on the bridging ontology Systems Biological Pathway Exchange (SBPAX) in the Systems Biology Linker (SyBiL) we have developed.
