Similar literature
1.
Hive plots--rational approach to visualizing networks   Cited by: 1 (self-citations: 0, citations by others: 1)
Networks are typically visualized with force-based or spectral layouts. These algorithms lack reproducibility and perceptual uniformity because they do not use a node coordinate system. The layouts can be difficult to interpret and are unsuitable for assessing differences in networks. To address these issues, we introduce hive plots (http://www.hiveplot.com) for generating informative, quantitative and comparable network layouts. Hive plots depict network structure transparently, are simple to understand and can be easily tuned to identify patterns of interest. The method is computationally straightforward, scales well and is amenable to a plugin for existing tools.

2.
Purpose: Monte Carlo (MC) is the reference computation method for medical physics. In radiotherapy, MC computations are necessary for some tasks (such as assessing figures of merit, double checks, and dose conversions). A tool based on GATE is proposed to easily create full MC simulations of the Varian TrueBeam STx. Methods: GAMMORA is a package that contains photon phase spaces as a pre-trained generative adversarial network (GAN) and the TrueBeam’s full geometry. It allows users to easily create MC simulations for simple or complex radiotherapy plans such as VMAT. To validate the model, the characteristics of generated photons are first compared to those provided by Varian (IAEA format). Simulated data are also compared to measurements in water and heterogeneous media. Simulations of 8 SBRT plans are compared to measurements (in a phantom). Two examples of applications (a second check and interplay effect assessment) are presented. Results: The simulated photons generated by the GAN have the same characteristics (energy, position, and direction) as the IAEA data. Computed dose distributions of simple cases (in water) and complex plans delivered in a phantom are compared to measurements, and the Gamma index (3%/3 mm) was always above 98%. The feasibility of both clinical applications is shown. Conclusions: This model is now shared as a free and open-source tool that generates radiotherapy MC simulations. It has been validated and used for five years. Several applications can be envisaged for research and clinical purposes.

3.

Background  

Clearly visualized biopathways provide a great help in understanding biological systems. However, manual drawing of large-scale biopathways is time consuming. We proposed a grid layout algorithm that can handle gene-regulatory networks and signal transduction pathways by considering edge-edge crossing, node-edge crossing, distance measure between nodes, and subcellular localization information from Gene Ontology. Consequently, the layout algorithm succeeded in drastically reducing these crossings in the apoptosis model. However, for larger-scale networks, we encountered three problems: (i) the initial layout is often very far from any local optimum because nodes are initially placed at random; (ii) from a biological viewpoint, human layouts are still easier to understand than automatic layouts because, apart from subcellular localization, the algorithm does not fully utilize biological information about pathways; and (iii) it employs a local search strategy in which the neighborhood is obtained by moving one node at each step, whereas automatic layouts suggest that simultaneous movements of multiple nodes are necessary for better layouts, although such an extension may worsen the time complexity.

4.
Genome scans with many genetic markers provide the opportunity to investigate local adaptation in natural populations and identify candidate genes under selection. In particular, SNPs are dense throughout the genome of most organisms and are commonly observed in functional genes, making them ideal markers to study adaptive molecular variation. This approach has become commonly employed in ecological and population genetics studies to detect outlier loci that are putatively under selection. However, there are several challenges to address with outlier approaches, including genotyping errors, underlying population structure and false positives, variation in mutation rate and limited sensitivity (false negatives). In this study, we evaluated multiple outlier tests and their type I (false positive) and type II (false negative) error rates in a series of simulated data sets. Comparisons included simulation procedures (FDIST2, ARLEQUIN v.3.5 and BAYESCAN) as well as more conventional tools such as global F(ST) histograms. Of the three simulation methods, FDIST2 and BAYESCAN typically had the lowest type II error, BAYESCAN had the least type I error and Arlequin had the highest type I and II error. High error rates in Arlequin with a hierarchical approach were partially because of confounding scenarios where patterns of adaptive variation were contrary to neutral structure; however, Arlequin consistently had the highest type I and type II error in all four simulation scenarios tested in this study. Given the results provided here, it is important that outlier loci are interpreted cautiously and error rates of various methods are taken into consideration in studies of adaptive molecular variation, especially when hierarchical structure is included.

5.
Networks are rarely completely observed and prediction of unobserved edges is an important problem, especially in disease spread modeling where networks are used to represent the pattern of contacts. We focus on a partially observed cattle movement network in the U.S. and present a method for scaling up to a full network based on Bayesian inference, with the aim of informing epidemic disease spread models in the United States. The observed network is a 10% state-stratified sample of Interstate Certificates of Veterinary Inspection that are required for interstate movement, describing approximately 20,000 movements from 47 of the contiguous states, with origins and destinations aggregated at the county level. We address how to scale up the 10% sample and predict unobserved intrastate movements based on observed movement distances. Edge prediction based on a distance kernel is not straightforward because the probability of movement does not always decline monotonically with distance due to underlying industry infrastructure. Hence, we propose a spatially explicit model where the probability of movement depends on distance, number of premises per county and historical imports of animals. Our model performs well in recapturing overall metrics of the observed network at the node level (U.S. counties), including degree centrality and betweenness, and performs better than randomized networks. Kernel-generated movement networks also recapture observed global network metrics, including network size, transitivity, reciprocity, and assortativity, better than randomized networks. In addition, predicted movements are similar to observed movements when aggregated at the state level (a broader geographic level relevant for policy) and are concentrated around states where key infrastructures, such as feedlots, are common. We conclude that the method generally performs well in predicting both coarse geographical patterns and network structure and is a promising method to generate full networks that incorporate the uncertainty of sampled and unobserved contacts.

6.
A certain minimal amount of RNA from biological samples is necessary to perform a microarray experiment with suitable replication. In some cases, the amount of RNA available is insufficient, necessitating RNA amplification prior to target synthesis. However, there is some uncertainty about the reliability of targets that have been generated from amplified RNA, because of nonlinearity and preferential amplification. This work develops a straightforward strategy to assess the reliability of microarray data obtained from amplified RNA. The tabular method we developed, which utilises a Down-Up-Missing-Below (DUMB) classification scheme, shows that microarrays generated with amplified RNA targets are reliable within constraints. There was an increase in false negatives because of the need for increased filtering. Furthermore, this analysis method is generic and can be broadly applied to evaluate all microarray data. A copy of the Microsoft Excel spreadsheet is available upon request from Edward Bearden.

7.
MOTIVATION: Due to the steadily growing computational demands in bioinformatics and related scientific disciplines, one is forced to make optimal use of the available resources. A straightforward solution is to build a network of idle computers and let each of them work on a small piece of a scientific challenge, as done by Seti@Home (http://setiathome.berkeley.edu), the world's largest distributed computing project. RESULTS: We developed a generally applicable distributed computing solution that uses a screensaver system similar to Seti@Home. The software exploits the coarse-grained nature of typical bioinformatics projects. Three major considerations for the design were: (1) often, many different programs are needed, while the time is lacking to parallelize them. Models@Home can run any program in parallel without modifications to the source code; (2) in contrast to the Seti project, bioinformatics applications are normally more sensitive to lost jobs. Models@Home therefore includes stringent control over job scheduling; (3) to allow use in heterogeneous environments, Linux and Windows based workstations can be combined with dedicated PCs to build a homogeneous cluster. We present three practical applications of Models@Home, running the modeling programs WHAT IF and YASARA on 30 PCs: force field parameterization, molecular dynamics docking, and database maintenance.

8.
This is a methodological study exploring the use of quantitative histopathology applied to the cervix to discriminate between normal and cancerous (consisting of adenocarcinoma and adenocarcinoma in situ) tissue samples. The goal is classifying tissue samples, which are populations of cells, from measurements on the cells. Our method uses one particular feature, the IODs-Index, to create a tissue level feature. The specific goal of this study is to find a threshold for the IODs-Index that is used to create the tissue level feature. The main statistical tool is Receiver Operating Characteristic (ROC) curve analysis. When applied to the data, our method achieved promising results with good estimated sensitivity and specificity for our data set. The optimal threshold for the IODs-Index was found to be 2.12.

9.
It has been suggested that neural systems across several scales of organization show optimal component placement, in which any spatial rearrangement of the components would lead to an increase of total wiring. Using extensive connectivity datasets for diverse neural networks combined with spatial coordinates for network nodes, we applied an optimization algorithm to the network layouts, in order to search for wire-saving component rearrangements. We found that optimized component rearrangements could substantially reduce total wiring length in all tested neural networks. Specifically, total wiring among 95 primate (Macaque) cortical areas could be decreased by 32%, and wiring of neuronal networks in the nematode Caenorhabditis elegans could be reduced by 48% on the global level, and by 49% for neurons within frontal ganglia. Wiring length reductions were possible due to the existence of long-distance projections in neural networks. We explored the role of these projections by comparing the original networks with minimally rewired networks of the same size, which possessed only the shortest possible connections. In the minimally rewired networks, the number of processing steps along the shortest paths between components was significantly increased compared to the original networks. Additional benchmark comparisons also indicated that neural networks are more similar to network layouts that minimize the length of processing paths, rather than wiring length. These findings suggest that neural systems are not exclusively optimized for minimal global wiring, but for a variety of factors including the minimization of processing steps.

10.
Genome engineering has been developed to create useful strains for biological studies and industrial uses. However, a continuous challenge remained in the field: technical limitations in high-throughput screening and precise manipulation of strains. Today, technical improvements have made genome engineering more rapid and efficient. This review introduces recent advances in genome engineering technologies applied to Escherichia coli as well as multiplex automated genome engineering (MAGE), a recent technique proposed as a powerful toolkit due to its straightforward process, rapid experimental procedures, and highly efficient properties.

11.
Split networks are increasingly being used in phylogenetic analysis. Usually, a simple equal angle algorithm is used to draw such networks, producing layouts that leave much room for improvement. Addressing the problem of producing better layouts of split networks, this paper presents an algorithm for maximizing the area covered by the network, describes an extension of the equal-daylight algorithm to networks, looks into using a spring embedder and discusses how to construct rooted split networks.

12.
The figures included in many biomedical publications play an important role in understanding the biological experiments and facts described within. Recent studies have shown that it is possible to integrate the information that is extracted from figures in classical document classification and retrieval tasks in order to improve their accuracy. One important observation about the figures included in biomedical publications is that they are often composed of multiple subfigures or panels, each describing different methodologies or results. The use of these multimodal figures is a common practice in bioscience, as experimental results are graphically validated via multiple methodologies or procedures. Thus, for a better use of multimodal figures in document classification or retrieval tasks, as well as for providing the evidence source for derived assertions, it is important to automatically segment multimodal figures into subfigures and panels. This is a challenging task, however, as different panels can contain similar objects (e.g., bar charts and line charts) with multiple layouts. Also, certain types of biomedical figures are text-heavy (e.g., DNA sequence and protein sequence images) and they differ from traditional images. As a result, classical image segmentation techniques based on low-level image features, such as edges or color, are not directly applicable to robustly partition multimodal figures into single modal panels. In this paper, we describe a robust solution for automatically identifying and segmenting unimodal panels from a multimodal figure. Our framework starts by robustly harvesting figure-caption pairs from biomedical articles. We base our approach on the observation that the document layout can be used to identify encoded figures and figure boundaries within PDF files. Taking into consideration the document layout allows us to correctly extract figures from the PDF document and associate their corresponding caption.
We combine pixel-level representations of the extracted images with information gathered from their corresponding captions to estimate the number of panels in the figure. Thus, our approach simultaneously identifies the number of panels and the layout of figures. In order to evaluate the approach described here, we applied our system on documents containing protein-protein interactions (PPIs) and compared the results against a gold standard that was annotated by biologists. Experimental results showed that our automatic figure segmentation approach surpasses pure caption-based and image-based approaches, achieving a 96.64% accuracy. To allow for efficient retrieval of information, as well as to provide the basis for integration into document classification and retrieval systems among others, we further developed a web-based interface that lets users easily retrieve panels containing the terms specified in the user queries.

13.
The selection of a single method of analysis is problematic when the data could have been generated by one of several possible models. We examine the properties of two tests designed to have high power over a range of models. The first one, the maximum efficiency robust test (MERT), uses the linear combination of the optimal statistics for each model that maximizes the minimum efficiency. The second procedure, called the MX, uses the maximum of the optimal statistics. Both approaches yield efficiency robust procedures for survival analysis and ordinal categorical data. Guidelines for choosing between them are provided.
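As a rough illustration of the two procedures named in the abstract above, the sketch below combines standardized test statistics. It assumes only two candidate models whose optimal statistics have unit variance under the null and a known correlation `rho`; in that special case the MERT reduces to the sum of the two statistics rescaled to unit variance. The function names and this two-model restriction are illustrative assumptions, not details taken from the abstract.

```python
import math


def mert_two_models(z1: float, z2: float, rho: float) -> float:
    """MERT for two standardized optimal statistics with null correlation rho.

    Assumption: exactly two models. Var(z1 + z2) = 2 * (1 + rho), so the
    sum is divided by its standard deviation to stay on a unit-variance scale.
    """
    if not -1.0 < rho <= 1.0:
        raise ValueError("rho must lie in (-1, 1]")
    return (z1 + z2) / math.sqrt(2.0 * (1.0 + rho))


def mx(*z: float) -> float:
    """MX statistic: the maximum of the optimal statistics."""
    return max(z)
```

Note that the MX statistic is not standard normal under the null even when each component is, so its critical values have to account for taking a maximum of correlated statistics.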

14.
An algorithm using a feedforward neural network model for determining optimal substrate feeding policies for a fed-batch fermentation process is presented in this work. The algorithm involves developing the neural network model of the process using the sampled data. The trained neural network model in turn is used for optimization purposes. The advantages of this technique are that optimization can be achieved without a detailed kinetic model of the process and that the computation of the gradient of the objective function with respect to the control variables is straightforward. The application of the technique is demonstrated with two examples, namely, production of secreted protein and invertase. The simulation results show that the discrete-time dynamics of a fed-batch bioreactor can be satisfactorily approximated using a feedforward sigmoidal neural network. The optimal policies obtained with the neural network model agree reasonably well with the previously reported results.

15.
Spatial and temporal characteristics of human walking are frequently evaluated to identify possible gait impairments, mainly in orthopedic and neurological patients [1-4], but also in healthy older adults [5,6]. The quantitative gait analysis described in this protocol is performed with a recently introduced photoelectric system (see Materials table) which has the potential to be used in the clinic because it is portable, easy to set up (no subject preparation is required before a test), and does not require maintenance and sensor calibration. The photoelectric system consists of a series of high-density floor-based photoelectric cells with light-emitting and light-receiving diodes that are placed parallel to each other to create a corridor, and are oriented perpendicular to the line of progression [7]. The system simply detects interruptions in the light signal, for instance due to the presence of feet within the recording area. Temporal gait parameters and 1D spatial coordinates of consecutive steps are subsequently calculated to provide common gait parameters such as step length, single limb support and walking velocity [8], whose validity against a criterion instrument has recently been demonstrated [7,9]. The measurement procedures are very straightforward; a single patient can be tested in less than 5 min and a comprehensive report can be generated in less than 1 min.

16.
Understanding the age structure of vegetation is important for effective land management, especially in fire-prone landscapes where the effects of fire can persist for decades and centuries. In many parts of the world, such information is limited due to an inability to map disturbance histories before the availability of satellite images (~1972). Here, we describe a method for creating a spatial model of the age structure of canopy species that established pre-1972. We built predictive neural network models based on remotely sensed data and ecological field survey data. These models determined the relationship between sites of known fire age and remotely sensed data. The predictive model was applied across a 104,000 km² study region in semi-arid Australia to create a spatial model of vegetation age structure, which is primarily the result of stand-replacing fires which occurred before 1972. An assessment of the predictive capacity of the model using independent validation data showed a significant correlation (r_s = 0.64) between predicted and known age at test sites. Application of the model provides valuable insights into the distribution of vegetation age-classes and fire history in the study region. This is a relatively straightforward method which uses widely available data sources that can be applied in other regions to predict age-class distribution beyond the limits imposed by satellite imagery.

17.
Synthetic biology aims at designing and engineering organisms. The engineering process typically requires the establishment of suitable DNA constructs generated through fusion of multiple protein coding and regulatory sequences. Conventional cloning techniques, including those involving restriction enzymes and ligases, are often of limited scope, in particular when many DNA fragments must be joined or scar-free fusions are mandatory. Overlap-based cloning methods have the potential to overcome such limitations. One such method uses seamless ligation cloning extract (SLiCE) prepared from Escherichia coli cells for straightforward and efficient in vitro fusion of DNA fragments. Here, we systematically characterized extracts prepared from the unmodified E. coli strain DH10B for SLiCE-mediated cloning and determined DNA sequence-associated parameters that affect cloning efficiency. Our data revealed the virtual absence of length restrictions for vector backbone (up to 13.5 kbp) and insert (90 bp to 1.6 kbp). Furthermore, differences in GC content in homology regions are easily tolerated and the deletion of unwanted vector sequences concomitant with targeted fragment insertion is straightforward. Thus, SLiCE represents a highly versatile DNA fusion method suitable for cloning in virtually all molecular and synthetic biology projects.

18.
We have developed a software package named PEAS to facilitate analyses of large data sets of single nucleotide polymorphisms (SNPs) for population genetics and molecular phylogenetics studies. PEAS reads SNP data in various formats as input and is versatile in data formatting; using PEAS, it is easy to create input files for many popular packages, such as STRUCTURE, frappe, Arlequin, Haploview, LDhat, PLINK, EIGENSOFT, PHASE, fastPHASE, MEGA and PHYLIP. In addition, PEAS fills several analysis gaps in currently available computer programs in population genetics and molecular phylogenetics. Notably, (i) it calculates genetic distance matrices with bootstrapping for both individuals and populations from genome-wide high-density SNP data, and the output can be passed directly to the MEGA and PHYLIP programs for further processing; (ii) it calculates genetic distances from STRUCTURE output and generates a MEGA file to reconstruct component trees; (iii) it provides tools to conduct haplotype sharing analysis for phylogenetic studies based on high-density SNP data. To our knowledge, these analyses are not available in any other computer program. PEAS for Windows is freely available for academic users from http://www.picb.ac.cn/~xushua/index.files/Download_PEAS.htm.

19.
The force‐directed layout is commonly used in computer‐generated visualizations of protein–protein interaction networks. While it is good for providing a visual outline of the protein complexes and their interactions, it has two limitations when used as a visual analysis method. The first is poor reproducibility. Repeated running of the algorithm does not necessarily generate the same layout, thereby demanding cognitive readaptation on the investigator's part. The second limitation is that it does not explicitly display complementary biological information, e.g. Gene Ontology, other than the protein names or gene symbols. Here, we present an alternative layout called the clustered circular layout. Using the human DNA replication protein–protein interaction network as a case study, we compared the two network layouts for their merits and limitations in supporting visual analysis.

20.
In light of the vast amounts of genomic data that are now being generated, we propose a new measure, the Bayesian false-discovery probability (BFDP), for assessing the noteworthiness of an observed association. BFDP shares the ease of calculation of the recently proposed false-positive report probability (FPRP) but uses more information, has a noteworthy threshold defined naturally in terms of the costs of false discovery and nondiscovery, and has a sound methodological foundation. In addition, in a multiple-testing situation, it is straightforward to estimate the expected numbers of false discoveries and false nondiscoveries. We provide an in-depth discussion of FPRP, including a comparison with the q value, and examine the empirical behavior of these measures, along with BFDP, via simulation. Finally, we use BFDP to assess the association between 131 single-nucleotide polymorphisms and lung cancer in a case-control study.
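The abstract above does not give the formula, so the following is only a minimal sketch of how a BFDP-style quantity could be computed under stated assumptions: the effect estimate `theta_hat` (for example a log odds ratio) is treated as normal with variance `v`, the prior on the effect under the alternative is normal with mean zero and variance `w`, and `pi0` is the prior probability of no association. All names and the example values are hypothetical.

```python
import math


def approx_bayes_factor(theta_hat: float, v: float, w: float) -> float:
    """Bayes factor of H0 (no effect) versus H1 under the normal assumptions.

    Ratio of the N(0, v) density to the N(0, v + w) density at theta_hat.
    """
    if v <= 0 or w <= 0:
        raise ValueError("v and w must be positive")
    z2 = theta_hat ** 2 / v
    return math.sqrt((v + w) / v) * math.exp(-0.5 * z2 * w / (v + w))


def bfdp(theta_hat: float, v: float, w: float, pi0: float) -> float:
    """Posterior probability of H0 given the estimate (assumed setup above)."""
    if not 0.0 < pi0 < 1.0:
        raise ValueError("pi0 must lie strictly between 0 and 1")
    prior_odds = pi0 / (1.0 - pi0)  # prior odds of no association
    abf = approx_bayes_factor(theta_hat, v, w)
    return abf * prior_odds / (1.0 + abf * prior_odds)


if __name__ == "__main__":
    # Hypothetical numbers: estimate 0.4, variance 0.01, prior variance 0.04.
    print(bfdp(0.4, 0.01, 0.04, pi0=0.99))
```

Under this setup, an association would be called noteworthy when the returned value falls below a threshold chosen from the relative costs of false discovery and false nondiscovery, which is the cost-based threshold the abstract refers to.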
