Similar literature
 Found 20 similar documents.
1.
The emergence of cloud computing has made it an attractive solution for large-scale data processing and storage applications. Cloud infrastructures give users remote access to powerful computing capacity, large storage space, and high network bandwidth for deploying various applications. With the support of cloud computing, many large-scale applications have been migrated to cloud infrastructures instead of running on in-house local servers. Among these applications, continuous write applications (CWAs), such as online surveillance systems, can benefit significantly from the flexibility and advantages of cloud computing. However, because CWAs have specific characteristics such as continuous data writing and processing and a high demand for data availability, cloud service providers prefer to use sophisticated models for provisioning resources to meet CWAs' demands while minimizing the operational cost of the infrastructure. In this paper, we present a novel architecture of multiple cloud service providers (CSPs), commonly referred to as Cloud-of-Clouds. Based on this architecture, we propose two operational cost-aware algorithms for provisioning cloud resources for CWAs, namely the neighboring optimal resource provisioning algorithm (NORPA) and the global optimal resource provisioning algorithm (GORPA), in order to minimize the operational cost and thereby maximize the revenue of CSPs. We validate the proposed algorithms through comprehensive simulations. The two proposed algorithms are compared against each other to assess their effectiveness, and against a commonly used and practically viable round-robin approach. The results demonstrate that NORPA and GORPA outperform the conventional round-robin algorithm, reducing the operational cost by up to 28% and 57%, respectively. The low complexity of the proposed cost-aware algorithms allows them to be applied to realistic Cloud-of-Clouds environments in industry as well as academia.

2.
Cloud computing should inherently support various types of data-intensive workloads with different storage access patterns. This makes a high-performance storage system an important component of the Cloud. Emerging flash device technologies such as solid state drives (SSDs) are a viable choice for building high performance computing (HPC) cloud storage systems that address more fine-grained data access patterns. However, the price per bit of SSDs is still higher than that of HDDs. This study proposes an optimized progressive file layout (PFL) method to leverage the advantages of SSDs in a parallel file system such as Lustre so that small file I/O performance can be significantly improved. A PFL can dynamically adjust chunk sizes and stripe patterns according to the I/O traffic. Extensive experimental results show that this approach (i.e., building a hybrid storage system based on a combination of SSDs and HDDs) can achieve balanced throughput over mixed I/O workloads consisting of large and small file access patterns.

3.
In many scientific and engineering areas, software services are becoming available over the Web. Such services are deployed in the Cloud either to reduce operational costs or to support peaks in their usage profiles. The algorithms employed in such services are usually the result of long-term research and technology development work, so it is beneficial to reuse those critical application parts when developing new Cloud applications. This paper investigates the possibilities of introducing a Model Driven Architecture (MDA) for the Cloud computing domain, which would support composition, customization, flexibility, maintenance, and reusability of Cloud application components in the particular case of scientific and engineering applications. The underlying middleware technology of choice is the mOSAIC Platform as a Service (PaaS) solution. This choice is motivated by the fact that in mOSAIC a Cloud application consists of loosely coupled components, which are either generic and provide the key resource types needed by an application (computation, storage, communication) or custom made, e.g., based on existing legacy software. The MDA approach is illustrated through the design and operation of an application for the analysis of structures under static loading. It is shown that a relatively simple design can be used to address two application bottlenecks: the varying number of users and the computational complexity of the given problem. The design reduces the necessary application development effort, and the key components can be reused for similar applications.

4.
BRANCH SUPPORT AND TREE STABILITY   Total citations: 38, self-citations: 1, citations by others: 37
Abstract: Branch support is quantified as the extra length needed to lose a branch in the consensus of near-most-parsimonious trees. This approach is based solely on the original data, as opposed to the data perturbation used in the bootstrap procedure. If trees have been generated by Farris's successive approximations approach to character weighting, branch support should be examined in terms of the weighted extra length needed to lose a branch. The sum of all branch support values over the tree divided by the length of the most parsimonious tree[s] provides a new index, the total support index. This index is a measure of tree stability in terms of supported resolutions, which is of prime importance in cladistic analysis.
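The total support index described in this abstract is a simple ratio, and a minimal sketch may make it concrete. The function name and the example numbers below are hypothetical and are not taken from the paper; only the formula (sum of branch support values divided by the length of the most parsimonious tree) comes from the abstract.

```python
def total_support_index(branch_supports, tree_length):
    """Sum of branch support values divided by the length of the
    most parsimonious tree, as stated in the abstract."""
    if tree_length <= 0:
        raise ValueError("tree_length must be positive")
    return sum(branch_supports) / tree_length


# Hypothetical example: four resolved branches, each with the number of
# extra steps needed to lose it, on a most parsimonious tree of length 50.
supports = [1, 3, 2, 4]
print(total_support_index(supports, 50))
```

For weighted analyses the abstract suggests using weighted extra lengths; under that reading, the same function would apply with weighted values passed in.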

5.
Classification is a data mining task the goal of which is to learn a model, from a training dataset, that can predict the class of a new data instance, while clustering aims to discover natural instance-groupings within a given dataset. Learning cluster-based classification systems involves partitioning a training set into data subsets (clusters) and building a local classification model for each data cluster. The class of a new instance is predicted by first assigning the instance to its nearest cluster and then using that cluster’s local classification model to predict the instance’s class. In this paper, we present an ant colony optimization (ACO) approach to building cluster-based classification systems. Our ACO approach optimizes the number of clusters, the positioning of the clusters, and the choice of classification algorithm to use as the local classifier for each cluster. We also present an ensemble approach that allows the system to decide on the class of a given instance by considering the predictions of all local classifiers, employing a weighted voting mechanism based on the fuzzy degree of membership in each cluster. Our experimental evaluation employs five widely used classification algorithms: naïve Bayes, nearest neighbour, Ripper, C4.5, and support vector machines, and results are reported on a suite of 54 popular UCI benchmark datasets.
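The two-stage prediction described in this abstract (assign the instance to its nearest cluster, then apply that cluster's local model) can be sketched as follows. This is an illustration under stated assumptions, not the paper's method: the centroids are fixed by hand rather than optimized by ACO, and the local model is a majority-class stand-in rather than one of the five classifiers named in the abstract. All function names are hypothetical.

```python
import math
from collections import Counter


def nearest_cluster(x, centroids):
    """Index of the centroid closest to x (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(x, centroids[i]))


def fit_local_models(X, y, centroids):
    """Build one local model per cluster. Assumption: the local model is
    just the majority class of the training instances in that cluster."""
    labels = [[] for _ in centroids]
    for xi, yi in zip(X, y):
        labels[nearest_cluster(xi, centroids)].append(yi)
    return [Counter(l).most_common(1)[0][0] if l else None for l in labels]


def predict(x, centroids, models):
    """Assign x to its nearest cluster, then use that cluster's local model."""
    return models[nearest_cluster(x, centroids)]


# Hypothetical toy data with two hand-picked centroids.
X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (6.0, 5.0)]
y = ["a", "a", "b", "b"]
centroids = [(0.0, 0.5), (5.5, 5.0)]
models = fit_local_models(X, y, centroids)
print(predict((0.2, 0.1), centroids, models), predict((5.0, 6.0), centroids, models))
```

The ensemble variant mentioned in the abstract would instead combine all local predictions, weighting each by the instance's fuzzy membership in the corresponding cluster.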

6.
To use crystallography for the determination of the three-dimensional structures of proteins, protein crystals need to be grown. Automated imaging systems are increasingly being used to monitor these crystallization experiments. These systems present problems of data accessibility, repeatability of any image analysis performed, and the amount of storage required. Various image formats and techniques can be combined to provide effective solutions to high-volume processing problems such as these. However, the lack of widespread support for the most effective algorithms, such as JPEG 2000, which yielded a 64% improvement in file size over the bitmap, currently inhibits the immediate uptake of this approach.

7.
Three-dimensional reconstruction of trees and the estimation of biophysical parameters are significant for the management of forest resources and for studies of ecology, the carbon cycle, and biodiversity. Terrestrial LiDAR data provide detailed, objective, three-dimensional measurements of forest structure and exact metrics of tree canopies. Several methods for tree detection, including canopy height models and raster interpolation models, depend on commercial software and heavy data processing. The objective of this study is the three-dimensional reconstruction of trees by implementing segmentation algorithms, and thereby the estimation of the Leaf Area Index of individual tree segments from terrestrial laser scanning (TLS) data in the Mudumalai forests of the Western Ghats, India. The hierarchical minimum cut segmentation method is used for the three-dimensional reconstruction of individual trees by tracking cylinders along individual branches and trees in a hierarchical order. A supervoxel clustering method is also implemented for tree reconstruction and for estimating the tree parameters. Leaf area index is calculated by applying a multivariate regression technique to the heights and diameters obtained from both segmentation methods. The results indicate a strong correlation with in-situ instrument measurements. The approach demonstrates the applicability of segmentation algorithms that can be run fully automatically. It successfully reconstructed a high-precision, realistic model of trees in the Western Ghats region, where traditional tree modeling methods, which require multiple instruments operating simultaneously to extract each parameter, had failed. The method showed that multiple forest parameters can be estimated simultaneously using TLS.

8.
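Carbon storage of different forest-tea agroforestry ecosystems in Xishuangbanna   Total citations: 2, self-citations: 0, citations by others: 2
To clarify how overstory shade tree species affect carbon storage in tea plantations, the biomass of different forest-tea agroforestry ecosystems was estimated from biomass models established for the overstory tree species and the tea plants. Combined with measured carbon contents of plant and soil samples, the carbon storage of four tea plantation combination patterns and a pure tea plantation in Menghai County, Xishuangbanna Prefecture, was analyzed. The results showed that the carbon storage of the camphor tree + tea and camphor-fir + tea patterns was 22.701 and 3.871 t·hm⁻² higher, respectively, than that of the pure tea plantation (223.442 t·hm⁻²), whereas the carbon storage of the four-shade-species + tea and six-shade-species + tea patterns was 10.828 and 5.717 t·hm⁻² lower, respectively, than that of the pure tea plantation. In every plantation, soil carbon storage accounted for the largest share of total carbon storage (91.8% to 96.0%); it decreased as the number of overstory tree species increased and was lowest in the four-shade-species + tea pattern. Plant carbon storage accounted for only 4.0% to 8.2% of total carbon storage and first increased and then decreased as the number of overstory tree species increased. These results indicate that the artificial tea plantation agroforestry ecosystems of Xishuangbanna have a strong carbon storage capacity.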

9.
Data with categorical attributes are ubiquitous in the real world. However, existing partitional clustering algorithms for categorical data are prone to fall into local optima. To address this issue, in this paper we propose a novel clustering algorithm, ABC-K-Modes (Artificial Bee Colony clustering based on K-Modes), based on the traditional k-modes clustering algorithm and the artificial bee colony approach. In our approach, we first introduce a one-step k-modes procedure, and then integrate this procedure with the artificial bee colony approach to deal with categorical data. In the search process performed by scout bees, we adopt the multi-source search inspired by the idea of batch processing to accelerate the convergence of ABC-K-Modes. The performance of ABC-K-Modes is evaluated by a series of experiments in comparison with that of the other popular algorithms for categorical data.
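The abstract does not define its one-step k-modes procedure, so the sketch below shows only one plausible reading: a single iteration of standard k-modes, in which each record is assigned to the mode with the smallest Hamming distance and each mode is then recomputed as the per-attribute most frequent value. The function names and the toy data are hypothetical, and the artificial bee colony search itself is not shown.

```python
from collections import Counter


def hamming(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def one_step_k_modes(data, modes):
    """One assignment step followed by one mode-update step.
    Ties in the assignment go to the lowest-numbered mode."""
    assignment = [
        min(range(len(modes)), key=lambda k: hamming(row, modes[k]))
        for row in data
    ]
    new_modes = []
    for k, old_mode in enumerate(modes):
        members = [row for row, a in zip(data, assignment) if a == k]
        if not members:
            new_modes.append(old_mode)  # keep the old mode for an empty cluster
            continue
        # Most frequent category in each attribute column.
        new_modes.append(
            tuple(Counter(col).most_common(1)[0][0] for col in zip(*members))
        )
    return assignment, new_modes


# Hypothetical categorical records: (colour, size).
data = [
    ("red", "small"), ("red", "large"), ("red", "small"),
    ("blue", "large"), ("blue", "large"),
]
assignment, modes = one_step_k_modes(data, [("red", "small"), ("blue", "large")])
print(assignment, modes)
```

In a hybrid of the kind the abstract describes, a step like this would presumably serve as the local refinement applied to candidate solutions generated by the bee colony search.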

10.

Background

In genomics, hierarchical clustering (HC) is a popular method for grouping similar samples based on a distance measure. HC algorithms do not actually create clusters, but compute a hierarchical representation of the data set. Usually, a fixed height on the HC tree is used, and each contiguous branch of samples below that height is considered a separate cluster. Due to the fixed-height cutting, those clusters may not unravel significant functional coherence hidden deeper in the tree. Besides that, most existing approaches do not make use of available clinical information to guide cluster extraction from the HC. Thus, the identified subgroups may be difficult to interpret in relation to that information.

Results

We develop a novel framework for decomposing the HC tree into clusters by semi-supervised piecewise snipping. The framework, called guided piecewise snipping, utilizes both molecular data and clinical information to decompose the HC tree into clusters. It cuts the given HC tree at variable heights to find a partition (a set of non-overlapping clusters) which does not only represent a structure deemed to underlie the data from which HC tree is derived, but is also maximally consistent with the supplied clinical data. Moreover, the approach does not require the user to specify the number of clusters prior to the analysis. Extensive results on simulated and multiple medical data sets show that our approach consistently produces more meaningful clusters than the standard fixed-height cut and/or non-guided approaches.

Conclusions

The guided piecewise snipping approach features several novelties and advantages over existing approaches. The proposed algorithm is generic and can be combined with other algorithms that operate on detected clusters. This approach represents an advancement in several regards: (1) a piecewise tree snipping framework that efficiently extracts clusters by snipping the HC tree, possibly at variable heights, while preserving the HC tree structure; (2) a flexible implementation allowing a variety of data types for both building and snipping the HC tree, including patient follow-up data such as survival as auxiliary information. The data sets and R code are provided as supplementary files. The proposed method is available from Bioconductor as the R package HCsnip.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-014-0448-1) contains supplementary material, which is available to authorized users.

11.
Many different phylogenetic clustering techniques are currently in use. One approach is to first determine the topology with a common clustering method and then calculate the branch lengths of the tree. If the resulting tree is not optimal, exchanging tree branches can make some local changes in the tree topology. The whole process can be iterated until a satisfactory result has been obtained. The efficiency of this method depends entirely on the initially generated tree. Although local changes are made, the optimal tree will never be found if the initial tree is poorly chosen. In this article, genetic algorithms are applied so that the optimal tree can be found even with a bad initial tree topology. This tree-generating method is tested by comparing its results with the results of the FITCH program in the PHYLIP software package. Two simulated data sets and a real data set are used.

12.
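To explore how soil organic matter (SOM) content correlates with soil physicochemical indices and plant diversity indices during the natural restoration of karst forests, SOM, soil physicochemical properties, and plant diversity were studied in different forest types in the Maolan National Nature Reserve, Guizhou Province. Based on the importance values of the tree-layer species, the 41 survey plots in the reserve were classified into six forest types, named after their dominant trees: 香叶树-枫香林, 檵木-马尾松林, 槭树-朴树林, 小叶栾树-化香林, 灯台-小花梾木林, and 四照花-青冈栎林. The results showed that SOM in soil layer A or layer B differed significantly among some forest types, as did the number of plant species, diameter, height, and density, and the Margalef, Simpson, Shannon-Wiener, and Pielou indices. Soil porosity, water storage, and the main fertility and nutrient indices increased with SOM. Tree-layer plant diversity indices were positively correlated with SOM; the correlation with layer-A SOM was significant, and the Simpson and Pielou indices were significantly correlated with layer-B SOM. The plant diversity indices of the shrub and herb layers were not significantly correlated with SOM. Multivariate analysis showed that the total contribution of the plant diversity indices to layer-A SOM followed the sequence shrub layer, tree layer, herb layer, and the total contribution to layer-B SOM followed the sequence herb layer, tree layer, shrub layer. This indicates that plant diversity measures for SOM management in karst areas are best based on a mixed management approach centered on tree species and supplemented by shrub-layer and herb-layer plants. In addition, SOM is affected not only by the tree-layer plant diversity indices but also by the successional stage and structural indices of the stand. The inflection point of the quadratic polynomial of the plant diversity indices can serve as one reference for quantitative species management in karst rocky desertification control projects.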

13.
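Using data from the eighth continuous forest resources inventory together with parameters such as trunk density and carbon content for different tree species, the carbon storage and carbon density of the tree-layer vegetation of forests in the Tibet Autonomous Region were estimated with the biomass inventory method. The results showed that the total tree-layer carbon storage of Tibet's forest ecosystems was 1.067×10⁹ t, with a mean carbon density of 72.49 t·hm⁻². Tree-layer carbon storage by stand category ranked as follows: arbor forest > scattered trees > open forest > four-side trees. Tree-layer carbon storage by forest category ranked as follows: protection forest > special-purpose forest > timber forest > fuelwood forest, with the first two accounting for 88.5%; the mean tree-layer carbon density across forest categories was 88.09 t·hm⁻². Tree-layer carbon storage by age group followed the same order as distribution area: mature forest > over-mature forest > near-mature forest > middle-aged forest > young forest. Mature forest accounted for 50% of the total tree-layer carbon storage across age groups, and tree-layer carbon storage first rose and then fell with increasing stand age.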

14.
Gene family evolution is determined by microevolutionary processes (e.g., point mutations) and macroevolutionary processes (e.g., gene duplication and loss), yet macroevolutionary considerations are rarely incorporated into gene phylogeny reconstruction methods. We present a dynamic program to find the most parsimonious gene family tree with respect to a macroevolutionary optimization criterion, the weighted sum of the number of gene duplications and losses. The existence of a polynomial delay algorithm for duplication/loss phylogeny reconstruction stands in contrast to most formulations of phylogeny reconstruction, which are NP-complete. We next extend this result to obtain a two-phase method for gene tree reconstruction that takes both micro- and macroevolution into account. In the first phase, a gene tree is constructed from sequence data, using any of the previously known algorithms for gene phylogeny construction. In the second phase, the tree is refined by rearranging regions of the tree that do not have strong support in the sequence data to minimize the duplication/loss cost. Components of the tree with strong support are left intact. This hybrid approach incorporates both micro- and macroevolutionary considerations, yet its computational requirements are modest in practice because the two-phase approach constrains the search space. Our hybrid algorithm can also be used to resolve nonbinary nodes in a multifurcating gene tree. We have implemented these algorithms in a software tool, NOTUNG 2.0, that can be used as a unified framework for gene tree reconstruction or as an exploratory analysis tool that can be applied post hoc to any rooted tree with bootstrap values. The NOTUNG 2.0 graphical user interface can be used to visualize alternate duplication/loss histories, root trees according to duplication and loss parsimony, manipulate and annotate gene trees, and estimate gene duplication times. It also offers a command line option that enables high-throughput analysis of a large number of trees.

15.
The accumulation associated protein (Aap) of Staphylococcus epidermidis mediates intercellular adhesion events necessary for biofilm growth. This process depends upon Zn2+‐induced self‐assembly of G5 domains within the B‐repeat region of the protein, forming anti‐parallel, intertwined protein “ropes” between cells. Pleomorphism in the Zn2+‐coordinating residues was observed in previously solved crystal structures, suggesting that the metal binding site might accommodate other transition metals and thereby support dimerization. By use of carefully selected buffer systems and a specialized approach to analyze sedimentation velocity analytical ultracentrifugation data, we were able to analyze low‐affinity metal binding events in solution. Our data show that both Zn2+ and Cu2+ support B‐repeat assembly, whereas Mn2+, Co2+, and Ni2+ bind to Aap but do not support self‐association. As the number of G5 domains is increased in longer B‐repeat constructs, the total concentration of metal required for dimerization decreases and the transition between monomer and dimer becomes more abrupt. These characteristics allow Aap to function as an environmental sensor that regulates biofilm formation in response to local concentrations of Zn2+ and Cu2+, both of which are implicated in immune cell activity.

16.
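To elucidate the relationship between plant community structural characteristics and species diversity, three natural communities in the Baihualin National Forest Park in the Irtysh River basin were studied: pure Betula pendula forest, B. pendula and laurel-leaf poplar (苦杨) mixed forest, and B. pendula and white willow (白柳) mixed forest. Basic community parameters (height, clear bole height, crown width, diameter at breast height, coverage, etc.) were surveyed by forest layer; species importance values and richness, diversity, and evenness indices were calculated; and canonical correspondence analysis (CCA) was performed. The results were as follows. (1) The B. pendula and white willow mixed forest had the highest tree-layer height and clear bole height and the highest shrub-layer ground diameter and coverage. Herb-layer parameters (except basal diameter) differed significantly among the three communities. The coverage of the B. pendula and white willow forest was 19.1% and 51.8% higher than that of the B. pendula and laurel-leaf poplar forest and the pure B. pendula forest, respectively. (2) In the three communities, the species with the highest importance value was B. pendula in the tree layer, Rosa laxa and Crataegus altaica in the shrub layer, and Carex bohemica in the herb layer. (3) Across the three community types, the tree-layer richness index R, Simpson index, and Shannon-Wiener index and the shrub-layer Pielou index and Alatalo index showed the same pattern: B. pendula and white willow mixed forest > B. pendula and laurel-leaf poplar mixed forest > pure B. pendula forest. In the herb layer, all indices except the Alatalo index followed the order pure B. pendula forest > B. pendula and laurel-leaf poplar mixed forest > B. pendula and white willow mixed forest (P>0.05). (4) The CCA ordination showed that the relationship between community structural characteristics and species diversity differed among the B. pendula forests. In the pure B. pendula forest, species diversity was most strongly influenced by tree clear bole height, shrub height, crown width, and herb coverage; in the B. pendula and laurel-leaf poplar mixed forest, by tree height and crown width, shrub crown width, and herb height; and in the B. pendula and white willow forest, by tree diameter at breast height, shrub crown width and coverage, and herb height. The study shows that tree clear bole height, shrub crown width, and herb height are the main factors affecting species diversity in the three community types.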

17.
Toadlets of the genus Brachycephalus are endemic to the Atlantic rainforests of southeastern and southern Brazil. The 14 species currently described have snout-vent lengths less than 18 mm and are thought to have evolved through miniaturization: an evolutionary process leading to an extremely small adult body size. Here, we present the first comprehensive phylogenetic analysis for Brachycephalus, using a multilocus approach based on two nuclear (Rag-1 and Tyr) and three mitochondrial (Cyt b, 12S, and 16S rRNA) gene regions. Phylogenetic relationships were inferred using a partitioned Bayesian analysis of concatenated sequences and the hierarchical Bayesian method (BEST) that estimates species trees based on the multispecies coalescent model. Individual gene trees showed conflict and also varied in resolution. With the exception of the mitochondrial gene tree, no gene tree was completely resolved. The concatenated gene tree was completely resolved and is identical in topology and degree of statistical support to the individual mtDNA gene tree. On the other hand, the BEST species tree showed reduced significant node support relative to the concatenated tree and recovered a basal trichotomy, although some bipartitions were significantly supported at the tips of the species tree. Comparison of the log likelihoods for the concatenated and BEST trees suggests that the method implemented in BEST explains the multilocus data for Brachycephalus better than the Bayesian analysis of concatenated data. Landmark-based geometric morphometrics revealed marked variation in cranial shape between the species of Brachycephalus. In addition, a statistically significant association was demonstrated between variation in cranial shape and genetic distances estimated from the mtDNA and nuclear loci. Notably, B. ephippium and B. garbeana, which are predicted to be sister species in the individual and concatenated gene trees and the BEST species tree, share an evolutionary novelty, the hyperossified dorsal plate.

18.
Software architecture definition for on-demand cloud provisioning   Total citations: 1, self-citations: 0, citations by others: 1
Cloud computing is a promising paradigm for the provisioning of IT services. Cloud computing infrastructures, such as those offered by the RESERVOIR project, aim to facilitate the deployment, management and execution of services across multiple physical locations in a seamless manner. In order for service providers to meet their quality of service objectives, it is important to examine how software architectures can be described to take full advantage of the capabilities introduced by such platforms. When dealing with software systems involving numerous loosely coupled components, architectural constraints need to be made explicit to ensure continuous operation when allocating and migrating services from one host in the Cloud to another. In addition, the need for optimising resources and minimising over-provisioning requires service providers to control the dynamic adjustment of capacity throughout the entire service lifecycle. We discuss the implications for software architecture definitions of distributed applications that are to be deployed on Clouds. In particular, we identify novel primitives to support service elasticity, co-location and other requirements, and we propose language abstractions for these primitives. We define their behavioural semantics precisely by establishing constraints on the relationship between architecture definitions and Cloud management infrastructures, using a model denotational approach in order to derive appropriate service management cycles. Using these primitives and semantic definition as a basis, we define a service management framework implementation that supports on-demand cloud provisioning and present a novel monitoring framework that meets the demands of Cloud-based applications.

19.
Brain-computer interaction (BCI) and physiological computing are terms that refer to using processed neural or physiological signals to influence human interaction with computers, environment, and each other. A major challenge in developing these systems arises from the large individual differences typically seen in the neural/physiological responses. As a result, many researchers use individually-trained recognition algorithms to process this data. In order to minimize time, cost, and barriers to use, there is a need to minimize the amount of individual training data required, or equivalently, to increase the recognition accuracy without increasing the number of user-specific training samples. One promising method for achieving this is collaborative filtering, which combines training data from the individual subject with additional training data from other, similar subjects. This paper describes a successful application of a collaborative filtering approach intended for a BCI system. This approach is based on transfer learning (TL), active class selection (ACS), and a mean squared difference user-similarity heuristic. The resulting BCI system uses neural and physiological signals for automatic task difficulty recognition. TL improves the learning performance by combining a small number of user-specific training samples with a large number of auxiliary training samples from other similar subjects. ACS optimally selects the classes to generate user-specific training samples. Experimental results on 18 subjects, using both nearest neighbors and support vector machine classifiers, demonstrate that the proposed approach can significantly reduce the number of user-specific training data samples. This collaborative filtering approach will also be generalizable to handling individual differences in many other applications that involve human neural or physiological data, such as affective computing.

20.
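Comparison of ecosystem biomass and carbon density among the main forest types in Hubei Province   Total citations: 2, self-citations: 0, citations by others: 2
Using field survey data, the carbon density of forest ecosystems in Hubei Province was analyzed for secondary forest under mountain closure (closed secondary forest), secondary forest, and plantation. The results showed that the mean tree-layer carbon density of the closed secondary forest, secondary forest, and plantation ecosystems was 133.87, 73.42, and 111.62 t·hm⁻², respectively; the mean shrub-layer carbon density was 1.65, 1.40, and 1.52 t·hm⁻²; the mean herb-layer carbon density was 0.13, 0.09, and 0.13 t·hm⁻²; and the mean litter-layer carbon density was 0.47, 1.34, and 0.93 t·hm⁻². The tree layer, the main contributor to ecosystem carbon storage, accounted for 98.35%, 96.29%, and 97.74% of total biomass carbon density; understory vegetation (shrub and herb layers) accounted for 1.31%, 1.95%, and 1.44%; and the litter layer accounted for 0.34%, 1.76%, and 0.82%. The mean soil (0 to 100 cm) carbon density was 57.04, 66.92, and 54.12 t·hm⁻², respectively; 60% of the soil carbon density was stored in the 0 to 40 cm layer, and the carbon density of each soil layer decreased progressively with depth. For the tree-layer, shrub-layer, herb-layer, and litter-layer biomass and for the soil-layer carbon density, the closed secondary forest and the secondary forest exceeded the plantation. In the closed secondary forest, secondary forest, and plantation, carbon density was distributed in the order soil (0 to 100 cm) > tree layer > shrub layer > herb layer > litter layer. Secondary forest under mountain closure is therefore more conducive to increasing the forest carbon sink, and close-to-nature forest management is an important way to improve the forest carbon sink capacity of this region.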
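The tree-layer percentages reported in this abstract can be recomputed from the per-layer carbon densities it gives. The short check below is an illustration only: the dictionary keys and function name are hypothetical English labels, and it assumes that "total biomass carbon density" means the sum of the tree, shrub, herb, and litter layers (soil excluded).

```python
# Per-layer mean carbon density (t per hectare) as reported in the abstract.
carbon_density = {
    "closed secondary forest": {"tree": 133.87, "shrub": 1.65, "herb": 0.13, "litter": 0.47},
    "secondary forest": {"tree": 73.42, "shrub": 1.40, "herb": 0.09, "litter": 1.34},
    "plantation": {"tree": 111.62, "shrub": 1.52, "herb": 0.13, "litter": 0.93},
}


def layer_share(layers, name):
    """Percentage of total biomass carbon density held by one layer."""
    return 100.0 * layers[name] / sum(layers.values())


tree_shares = {
    forest: round(layer_share(layers, "tree"), 2)
    for forest, layers in carbon_density.items()
}
print(tree_shares)
```

Under that assumption the tree-layer shares come out at 98.35%, 96.29%, and 97.74%, matching the figures in the abstract.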
