Similar Documents
20 similar documents found (search time: 31 ms)
1.
Gossip protocols provide a means by which failures can be detected in large, distributed systems in an asynchronous manner without the limits associated with reliable multicasting for group communications. However, in order to be effective with application recovery and reconfiguration, these protocols require mechanisms by which failures can be detected with system-wide consensus in a scalable fashion. This paper presents three new gossip-style protocols supported by a novel algorithm to achieve consensus in scalable, heterogeneous clusters. The round-robin protocol improves on basic randomized gossiping by distributing gossip messages in a deterministic order that optimizes bandwidth consumption. Redundant gossiping is completely eliminated in the binary round-robin protocol, and the round-robin with sequence check protocol is a useful extension that yields efficient detection times without the need for system-specific optimization. The distributed consensus algorithm works with these gossip protocols to achieve agreement among the operable nodes in the cluster on the state of the system featuring either a flat or a layered design. The various protocols are simulated and evaluated in terms of consensus time and scalability using a high-fidelity, fault-injection model for distributed systems comprised of clusters of workstations connected by high-performance networks.
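The round-robin protocols above are deterministic refinements of basic randomized gossiping. A minimal sketch of that randomized baseline, under an assumed synchronous round model with invented names (`t_fail` as the suspicion threshold), might look like:

```python
import random

def simulate_gossip(n_nodes, failed, rounds, t_fail, seed=0):
    """Randomized gossip failure detection in a synchronous round model.
    Each live node keeps, per peer, the rounds elapsed since it last heard
    that peer's heartbeat (directly or through gossip).  Every round each
    live node ages its table, then pushes it to one random live peer, which
    keeps the fresher (smaller) entry per node.  A peer is suspected once
    its entry exceeds t_fail rounds."""
    rng = random.Random(seed)
    live = [i for i in range(n_nodes) if i not in failed]
    age = [[0] * n_nodes for _ in range(n_nodes)]
    for _ in range(rounds):
        for i in live:                       # age every entry; self stays fresh
            age[i] = [a + 1 for a in age[i]]
            age[i][i] = 0
        for i in live:                       # push table to one random peer
            peer = rng.choice([j for j in live if j != i])
            age[peer] = [min(a, b) for a, b in zip(age[peer], age[i])]
    return {i: {j for j in range(n_nodes) if age[i][j] > t_fail}
            for i in live}

# 8-node cluster where node 7 has crashed and never gossips
suspects = simulate_gossip(8, failed={7}, rounds=30, t_fail=10)
```

Since the failed node never refreshes anyone's entry for it, every live node's counter for it grows monotonically past the threshold, while live nodes keep refreshing one another epidemically.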

2.
Gossip protocols and services provide a means by which failures can be detected in large, distributed systems in an asynchronous manner without the limits associated with reliable multicasting for group communications. Extending the gossip protocol such that a system reaches consensus on detected faults can be performed via a flat structure, or it can be hierarchically distributed across cooperating layers of nodes. In this paper, the performance of gossip services employing flat and hierarchical schemes is analyzed on an experimental testbed in terms of consensus time, resource utilization and scalability. Performance associated with a hierarchically arranged gossip scheme is analyzed with varying group sizes and is shown to scale well. Resource utilization of the gossip-style failure detection and consensus service is measured in terms of network bandwidth utilization and CPU utilization. Analytical models are developed for resource utilization and performance projections are made for large system sizes.

3.
Failure instances in distributed computing systems (DCSs) have exhibited temporal and spatial correlations, where a single failure instance can trigger a set of failure instances simultaneously or successively within a short time interval. In this work, we propose a correlated failure prediction approach (CFPA) to predict correlated failures of computing elements in DCSs. The approach models correlated-failure patterns using the concept of probabilistic shared risk groups and makes a prediction for correlated failures by exploiting an association rule mining approach in a parallel way. We conduct extensive experiments to evaluate the feasibility and effectiveness of CFPA using both failure traces from Los Alamos National Lab and simulated datasets. The experimental results show that the proposed approach outperforms other approaches in both the failure prediction performance and the execution time, and can potentially provide better prediction performance in a larger system.
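CFPA itself uses probabilistic shared risk groups and parallel association rule mining; as a rough serial illustration of the rule-mining step only, with invented thresholds, one can group failures into time-correlated bursts and mine pairwise rules:

```python
from collections import Counter
from itertools import combinations

def mine_failure_rules(events, window, min_support, min_conf):
    """Mine pairwise failure-correlation rules from a time-sorted, non-empty
    list of (timestamp, node) failure events.  Events closer than `window`
    apart are grouped into one burst ("transaction"); a rule a -> b is kept
    if its support and confidence clear the thresholds."""
    bursts, current = [], [events[0]]
    for prev, cur in zip(events, events[1:]):
        if cur[0] - prev[0] <= window:
            current.append(cur)
        else:
            bursts.append({node for _, node in current})
            current = [cur]
    bursts.append({node for _, node in current})

    single, pair = Counter(), Counter()
    for burst in bursts:
        single.update(burst)
        pair.update(combinations(sorted(burst), 2))

    rules = []
    for (a, b), n_ab in pair.items():
        if n_ab / len(bursts) < min_support:
            continue                      # pair too rare overall
        for x, y in ((a, b), (b, a)):
            if n_ab / single[x] >= min_conf:
                rules.append((x, y, n_ab / single[x]))
    return rules

# toy trace: node A's failures are repeatedly followed by node B's
events = [(0, 'A'), (1, 'B'), (50, 'A'), (51, 'B'),
          (100, 'C'), (200, 'A'), (201, 'B')]
rules = mine_failure_rules(events, window=5, min_support=0.3, min_conf=0.8)
```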

4.
High performance and distributed computing systems such as peta-scale, grid and cloud infrastructure are increasingly used for running scientific models and business services. These systems experience large availability variations through hardware and software failures. Resource providers need to account for these variations while providing the required Quality of Service (QoS) at appropriate costs in dynamic resource and application environments. Although the performance and reliability of these systems have been studied separately, there has been little analysis of the lost QoS experienced with varying availability levels. In this paper, we present a resource performability model to estimate lost performance and corresponding cost considerations with varying availability levels. We use the resulting model in a multi-phase planning approach for scheduling a set of deadline-sensitive meteorological workflows atop grid and cloud resources to trade off performance, reliability and cost. We use simulation results driven by failure data collected over the lifetime of high performance systems to demonstrate how the proposed scheme better accounts for resource availability.
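The paper's performability model is not reproduced here; a toy stand-in, assuming a simple (probability, capacity-fraction) abstraction of availability states, shows the kind of lost-performance bookkeeping such a model performs:

```python
def performability(states):
    """Expected delivered and lost performance, given (probability,
    capacity_fraction) pairs describing resource availability states.
    Illustrative abstraction only, not the paper's model."""
    delivered = sum(prob * capacity for prob, capacity in states)
    return delivered, 1.0 - delivered

# e.g. fully up 90% of the time, degraded to half capacity 8%, down 2%
delivered, lost = performability([(0.90, 1.0), (0.08, 0.5), (0.02, 0.0)])
```

A scheduler could weigh `lost` against the cost of moving a deadline-sensitive workflow to a more reliable (and more expensive) resource.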

5.
Mark rate, or the proportion of the population with unique, identifiable marks, must be determined in order to estimate population size from photographic identification data. In this study we address field sampling protocols and estimation methods for robust estimation of mark rate and its uncertainty in cetacean populations. We present two alternatives for estimating the variance of mark rate: (1) a variance estimator for clusters of unequal sizes (SRCS) and (2) a hierarchical Bayesian model (SRCS-Bayes), and compare them to the simple random sampling (SRS) variance estimator. We tested these variance estimators using a simulation to see how they perform at varying mark rates, number of groups sampled, photos per group, and mean group sizes. The hierarchical Bayesian model outperformed the frequentist variance estimators, with the true mark rate of the population held in its 95% HDI 91.9% of the time (compared with coverage of 79% for the SRS method and 76.3% for the SRCS-Cochran method). The simulation results suggest that, ideally, mark rate and its precision should be quantified using hierarchical Bayesian modeling, and researchers should attempt to sample as many unique groups as possible to improve accuracy and precision.
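The two frequentist estimators compared above can be sketched directly (a Cochran-style ratio-estimator variance over clusters of unequal size versus the naive SRS variance, under assumed simple random sampling of groups):

```python
def mark_rate_variance(marked, sizes):
    """Mark rate p = total marked / total photographed, from n groups
    (clusters) of unequal size, with two variance estimates:
      - SRS: treats each photographed individual as an independent draw,
      - SRCS: Cochran-style ratio-estimator variance over clusters."""
    n = len(sizes)
    total = sum(sizes)
    p = sum(marked) / total
    var_srs = p * (1 - p) / total
    xbar = total / n                       # mean group size
    s2 = sum((m - p * x) ** 2 for m, x in zip(marked, sizes)) / (n - 1)
    var_srcs = s2 / (n * xbar ** 2)
    return p, var_srs, var_srcs

# groups with identical mark proportions: cluster variance collapses to 0
p1, v1_srs, v1_srcs = mark_rate_variance([1, 5, 4], [2, 10, 8])
# highly heterogeneous groups: cluster variance exceeds the SRS estimate
p2, v2_srs, v2_srcs = mark_rate_variance([0, 10], [10, 10])
```

The contrast between the two calls is the point: the SRS estimator ignores between-group heterogeneity entirely, which is why its coverage suffers in the simulation above.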

6.
Gossip protocols have proven to be effective means by which failures can be detected in large, distributed systems in an asynchronous manner without the limitations associated with reliable multicasting for group communications. In this paper, we discuss the development and features of a Gossip-Enabled Monitoring Service (GEMS), a highly responsive and scalable resource monitoring service, to monitor health and performance information in heterogeneous distributed systems. GEMS has many novel and essential features such as detection of network partitions and dynamic insertion of new nodes into the service. Easily extensible, GEMS also incorporates facilities for distributing arbitrary system and application-specific data. We present experiments and analytical projections demonstrating scalability, fast response times and low resource utilization requirements, making GEMS a potent solution for resource monitoring in distributed computing.

7.
Frailty models are useful for measuring unobserved heterogeneity in risk of failures across clusters, providing cluster-specific risk prediction. In a frailty model, the latent frailties shared by members within a cluster are assumed to act multiplicatively on the hazard function. In order to obtain parameter and frailty variate estimates, we consider the hierarchical likelihood (H-likelihood) approach (Ha, Lee and Song, 2001. Hierarchical-likelihood approach for frailty models. Biometrika 88, 233-243) in which the latent frailties are treated as "parameters" and estimated jointly with other parameters of interest. We find that the H-likelihood estimators perform well when the censoring rate is low; however, they are substantially biased when the censoring rate is moderate to high. In this paper, we propose a simple and easy-to-implement bias correction method for the H-likelihood estimators under a shared frailty model. We also extend the method to a multivariate frailty model, which incorporates complex dependence structure within clusters. We conduct an extensive simulation study and show that the proposed approach performs very well for censoring rates as high as 80%. We also illustrate the method with a breast cancer data set. Since the H-likelihood is the same as the penalized likelihood function, the proposed bias correction method is also applicable to the penalized likelihood estimators.

8.
Metabolic profiling using gas chromatography-mass spectrometry technologies is a technique whose potential in the field of functional genomics is largely untapped. To demonstrate the general usefulness of this technique, we applied to diverse plant genotypes a recently developed profiling protocol that allows detection of a wide range of hydrophilic metabolites within a single chromatographic run. For this purpose, we chose four independent potato genotypes characterized by modifications in sucrose metabolism. Using data-mining tools, including hierarchical cluster analysis and principal component analysis, we were able to assign clusters to the individual plant systems and to determine relative distances between these clusters. Extraction analysis allowed identification of the most important components of these clusters. Furthermore, correlation analysis revealed close linkages between a broad spectrum of metabolites. In a second, complementary approach, we subjected wild-type potato tissue to environmental manipulations. The metabolic profiles from these experiments were compared with the data sets obtained for the transgenic systems, thus illustrating the potential of metabolic profiling in assessing how a genetic modification can be phenocopied by environmental conditions. In summary, these data demonstrate the use of metabolic profiling in conjunction with data-mining tools as a technique for the comprehensive characterization of a plant genotype.
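Hierarchical cluster analysis of such profiles can be sketched in a few lines; this average-linkage toy (not the authors' tooling) groups metabolite-level vectors by Euclidean distance:

```python
def hcluster(profiles, n_clusters):
    """Plain agglomerative (average-linkage) clustering.  `profiles` is a
    list of equal-length tuples of metabolite levels; the two closest
    clusters are merged repeatedly until n_clusters remain.  Returns the
    clusters as lists of profile indices."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    clusters = [[i] for i in range(len(profiles))]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # average pairwise distance between the two clusters
                d = sum(dist(profiles[a], profiles[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # j > i, so index i stays valid
    return clusters

# two hypothetical genotypes with clearly separated two-metabolite profiles
clusters = hcluster([(1.0, 0.1), (1.1, 0.0), (5.0, 4.9), (5.1, 5.0)], 2)
```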

9.
In this paper, we present a fault-tolerant and recovery system called FRASystem (Fault Tolerant & Recovery Agent System) that uses multiple agents in distributed computing systems. Previous rollback-recovery protocols depended on inherent communication mechanisms and the underlying operating system, which degraded computing performance. We propose a rollback-recovery protocol that works independently of the operating system, improving portability and extensibility. We define four types of agents: (1) a recovery agent performs a rollback-recovery protocol after a failure, (2) an information agent constructs domain knowledge as rules of fault tolerance and information during failure-free operation, (3) a facilitator agent controls the communication between agents, and (4) a garbage collection agent removes useless fault tolerance information. Since agent failures may lead to inconsistent states of a system and a domino effect, we propose an agent recovery algorithm. A garbage collection protocol addresses the performance degradation caused by the growth of saved fault tolerance information in stable storage. We implemented a prototype of FRASystem using Java and CORBA and experimentally evaluated the proposed rollback-recovery protocol. The simulation results indicate that the performance of our protocol is better than that of previous rollback-recovery protocols, which use independent checkpointing and pessimistic message logging without agents. Our contributions are as follows: (1) this is the first rollback-recovery protocol using agents, (2) FRASystem does not depend on an operating system, and (3) FRASystem provides portability and extensibility.
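As background for the baseline the authors compare against (independent checkpointing plus pessimistic message logging), a minimal single-process sketch:

```python
import copy

class RecoverableProcess:
    """Pessimistic message logging: every received message is logged (as if
    to stable storage) before being applied, so after a failure the process
    restores its last checkpoint and deterministically replays the log."""
    def __init__(self):
        self.state = {"count": 0}
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []           # messages received since the last checkpoint

    def receive(self, msg):
        self._log.append(msg)    # log first (pessimistic), then apply
        self._apply(msg)

    def _apply(self, msg):
        self.state["count"] += msg

    def checkpoint(self):
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()        # older log entries are now garbage

    def recover(self):
        self.state = copy.deepcopy(self._checkpoint)
        for msg in self._log:    # replay in original receipt order
            self._apply(msg)

proc = RecoverableProcess()
proc.receive(3); proc.checkpoint(); proc.receive(4); proc.receive(5)
proc.state["count"] = -999       # simulate state corruption by a failure
proc.recover()
```

Note how `checkpoint()` doubles as garbage collection of the log, the same pressure FRASystem's garbage collection agent addresses at system scale.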

10.
T Noguti, N Go. Proteins, 1989, 5(2):104-112
Conformational fluctuations in a globular protein, bovine pancreatic trypsin inhibitor, in the time range between picoseconds and nanoseconds are studied by a Monte Carlo simulation method. Multiple energy minima are derived from sampled conformations by minimizing their energy. They are distributed in clusters in the conformational space. A hierarchical structure is observed in the simulated dynamics. In the time range between 10^-14 and 10^-10 seconds, dynamics is well represented by a superposition of vibrational motions within an energy well with transitions among minima within each cluster. Transitions among clusters take place in the time range of nanoseconds or longer.

11.
The calcium dependent plasticity (CaDP) approach to the modeling of synaptic weight change is applied using a neural field approach to realistic repetitive transcranial magnetic stimulation (rTMS) protocols. A spatially-symmetric nonlinear neural field model consisting of populations of excitatory and inhibitory neurons is used. The plasticity between excitatory cell populations is then evaluated using a CaDP approach that incorporates metaplasticity. The direction and size of the plasticity (potentiation or depression) depends on both the amplitude of stimulation and duration of the protocol. The breaks in the inhibitory theta-burst stimulation protocol are crucial to ensuring that the stimulation bursts are potentiating in nature. Tuning the parameters of a spike-timing dependent plasticity (STDP) window with a Monte Carlo approach to maximize agreement between STDP predictions and the CaDP results reproduces a realistically-shaped window with two regions of depression in agreement with the existing literature. Developing understanding of how TMS interacts with cells at a network level may be important for future investigation.
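For orientation only, the classic two-exponential STDP window (with illustrative constants; the paper fits a more complex window with two depression regions):

```python
import math

def stdp_dw(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Classic two-exponential STDP window.  dt = t_post - t_pre in ms:
    pre-before-post pairs (dt > 0) potentiate, post-before-pre pairs
    (dt < 0) depress, with exponential decay of the effect in |dt|.
    All parameter values here are illustrative, not fitted."""
    if dt > 0:
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0
```

A Monte Carlo fit like the one described above would treat `a_plus`, `a_minus`, `tau_plus`, `tau_minus` (and extra depression terms) as free parameters scored against the CaDP predictions.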

12.
Recently much effort has been spent on providing a shared address space abstraction on clusters of small-scale symmetric multiprocessors. However, advances in technology will soon make it possible to construct these clusters with larger-scale cc-NUMA nodes, connected with non-coherent networks that offer latencies and bandwidth comparable to interconnection networks used in hardware cache-coherent systems. The shared memory abstraction can be provided on these systems in software across nodes and hardware within nodes. Recent simulation results have demonstrated that certain features of modern system area networks can be used to greatly reduce shared virtual memory (SVM) overheads [5,19]. In this work we leverage these results and we use detailed system emulation to investigate building future software shared memory clusters. We use an existing, large-scale hardware cache-coherent system with 64 processors to emulate a complete future cluster. We port our existing infrastructure (communication layer and shared memory protocol) on this system and study the behavior of a set of real applications. We present results for both 32- and 64-processor system configurations. We find that: (i) System emulation is invaluable in quantifying potential benefits from changes in the technology of commodity components. More importantly, it reveals potential problems in future systems that are easily overlooked in simulation studies. Thus, system emulation should be used along with other modeling techniques (e.g., simulation, implementation) to investigate future trends. (ii) Our work shows that current SVM protocols can only partially take advantage of faster interconnects and wider nodes due to operating system and architectural implications. We quantify the related issues and identify the areas where more research is required for future SVM clusters.

13.
Diao G, Lin DY. Biometrics, 2005, 61(3):789-798
Statistical methods for the detection of genes influencing quantitative traits with the aid of genetic markers are well developed for normally distributed, fully observed phenotypes. Many experiments are concerned with failure-time phenotypes, which have skewed distributions and which are usually subject to censoring because of random loss to follow-up, failures from competing causes, or limited duration of the experiment. In this article, we develop semiparametric statistical methods for mapping quantitative trait loci (QTLs) based on censored failure-time phenotypes. We formulate the effects of the QTL genotype on the failure time through the Cox (1972, Journal of the Royal Statistical Society, Series B 34, 187-220) proportional hazards model and derive efficient likelihood-based inference procedures. In addition, we show how to assess statistical significance when searching several regions or the entire genome for QTLs. Extensive simulation studies demonstrate that the proposed methods perform well in practical situations. Applications to two animal studies are provided.
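The Cox model step can be illustrated with a bare-bones partial log-likelihood for a single 0/1 genotype covariate (no tied event times, and none of the paper's genome-wide significance machinery):

```python
import math

def cox_partial_loglik(beta, times, events, z):
    """Cox partial log-likelihood (no tied event times) for one covariate z,
    e.g. a 0/1 QTL genotype code.  events[i] is 1 for an observed failure
    and 0 for a censored phenotype."""
    ll = 0.0
    for i, (t, e) in enumerate(zip(times, events)):
        if not e:
            continue                 # censored subjects enter risk sets only
        risk = sum(math.exp(beta * z[j])
                   for j in range(len(times)) if times[j] >= t)
        ll += beta * z[i] - math.log(risk)
    return ll

def fit_beta(times, events, z, grid=None):
    """Crude grid-search MLE for beta; illustration only (the partial
    log-likelihood is concave, so the grid argmax is near the true MLE)."""
    grid = grid or [b / 100 for b in range(-300, 301)]
    return max(grid, key=lambda b: cox_partial_loglik(b, times, events, z))
```

On a toy dataset with failure times 1, 2, 3, 4 and genotypes 1, 0, 1, 0 (all failures observed), the grid MLE lands near 0.94, the root of the score equation for that data.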

14.
This paper presents a framework for building and deploying protocols for migrating mobile agents over the Internet. The framework enables network protocols for agent migration to be naturally implemented within mobile agents and then dynamically deployed at remote hosts by migrating the agents that perform the protocols. It is built on a hierarchical mobile agent system, called MobileSpaces, and several protocols for migrating agents for managing cluster computing systems have been designed and implemented based on the framework. This paper describes the framework and its prototype implementation, which uses Java as both the implementation language and the protocol development language.

15.
Case-cohort designs and analysis for clustered failure time data
Lu SE, Shih JH. Biometrics, 2006, 62(4):1138-1148
The case-cohort design is an efficient and economical design for studying risk factors of infrequent disease in a large cohort. It involves collecting covariate data from all failures ascertained throughout the entire cohort, and from the members of a random subcohort selected at the onset of follow-up. In the literature, the case-cohort design has been extensively studied, but exclusively for univariate failure time data. In this article, we propose case-cohort designs adapted to multivariate failure time data. An estimation procedure with the independence working model approach is used to estimate the regression parameters in the marginal proportional hazards model, where the correlation structure between individuals within a cluster is left unspecified. Statistical properties of the proposed estimators are developed. The performance of the proposed estimators and comparisons of statistical efficiencies are investigated with simulation studies. A data example from the Translating Research into Action for Diabetes (TRIAD) study is used to illustrate the proposed methodology.

16.
The mechanics of peptide–protein docking has long been an area of intense interest to the computational community. Here we discuss an improved docking protocol named XPairIt which uses a multitier approach, combining the PyRosetta docking software with the NAMD molecular dynamics package through a biomolecular simulation programming interface written in Python. This protocol is designed for systems where no a priori information of ligand structure (beyond sequence) or binding location is known. It provides for efficient incorporation of both ligand and target flexibility, is HPC-ready and is easily extensible for use of custom code. We apply this protocol to a set of 11 test cases drawn from benchmarking databases and from previously published studies for direct comparison with existing protocols. Strengths, weaknesses and areas of improvement are discussed.

17.
An environment to support the modeling, analysis, simulation, and development of state transition models, SMOOCHES (State Machines for Object-Oriented Concurrent Hierarchical Engineering Specifications), is presented. SMOOCHES allows the hierarchical construction, analysis, and simulation of state transition models in an object-oriented distributed environment. Statecharts (see Harel 1987b), a powerful mechanism for state transition specification, are fundamental to the development of SMOOCHES. To assist in the specification of hierarchical state transition models for distributed and reactive systems, statecharts are extended by introducing the concept of exit-safe states. SMOOCHES allows the specification of objects in the system with hierarchical state transition models and the derivation of new classes of objects through inheritance. A graphical monitoring system has been developed to represent and simulate the object state life cycles and monitor event generations. The example presented illustrates the modeling and simulation of different state life cycles of an assembly robot.

18.
We present a distributed component-object model (DCOM) based single system image (SSI) for dependable parallel implementation of genetic programming (DPIGP). DPIGP aims to significantly and reliably improve the computational performance of genetic programming (GP) by exploiting the inherent parallelism of GP in the evaluation of individuals. It runs on cost-effective clusters of commodity, non-dedicated, heterogeneous workstations or PCs. The developed SSI represents the pool of heterogeneous workstations as a single, unified virtual resource (a metacomputer) and addresses the issues of locating and allocating the physical resources, communicating between the entities of DPIGP, scheduling, and load balancing. In addition, addressing the issue of fault tolerance, the SSI allows for building a highly available metacomputer in which workstation failures result only in a corresponding partial degradation of the overall performance of DPIGP. Adopting DCOM as the communication paradigm offers the benefits of software-platform and network-protocol neutrality, and generic support for locating, allocating, and securing the distributed entities of DPIGP.

19.
We describe a Monte Carlo simulation of the within-host dynamics of human immunodeficiency virus 1 (HIV-1). The simulation proceeds at the level of individual T-cells and virions in a small volume of plasma, thus capturing the inherent stochasticity in viral replication, mutation and T-cell infection. When cell lifetimes are distributed exponentially in the Monte Carlo approach, our simulation results are in perfect agreement with the predictions of the corresponding systems of differential equations from the literature. The Monte Carlo model, however, uniquely allows us to estimate the natural variability in important parameters such as the T-cell count, viral load, and the basic reproductive ratio, in both the presence and absence of drug therapy. The simulation also yields the probability that an infection will not become established after exposure to a viral inoculum of a given size. Finally, we extend the Monte Carlo approach to include distributions of cell lifetimes that are less-dispersed than exponential.
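A Gillespie-style sketch of the basic T-cell/virion model conveys the idea (exponential waiting times and illustrative rates; the paper additionally studies less-dispersed lifetime distributions):

```python
import random

def gillespie_hiv(T, I, V, params, t_end, seed=1):
    """Stochastic simulation of the basic within-host model: uninfected
    T-cells T, infected cells I, free virions V.  Event rates mirror the
    deterministic ODE terms; the loop stops at t_end or once infection is
    cleared (I == V == 0).  Parameter values are illustrative."""
    lam, d, beta, delta, prod, c = params
    rng = random.Random(seed)
    t = 0.0
    while t < t_end and (I > 0 or V > 0):
        rates = [lam,            # T-cell production
                 d * T,          # T-cell death
                 beta * T * V,   # infection: T-cell + virion -> infected cell
                 delta * I,      # infected-cell death
                 prod * I,       # virion production
                 c * V]          # virion clearance
        t += rng.expovariate(sum(rates))
        event = rng.choices(range(6), weights=rates)[0]
        if event == 0:   T += 1
        elif event == 1: T -= 1
        elif event == 2: T -= 1; I += 1; V -= 1
        elif event == 3: I -= 1
        elif event == 4: V += 1
        else:            V -= 1
    return T, I, V

def r0(params, T0):
    """Basic reproductive ratio: virions per infected cell (prod/delta)
    times the approximate chance a virion infects before clearance
    (beta*T0/c)."""
    lam, d, beta, delta, prod, c = params
    return beta * T0 * prod / (delta * c)
```

Running many seeded replicates of `gillespie_hiv` from a small inoculum is exactly how one estimates the extinction probability the abstract mentions, since individual trajectories can die out even when `r0` exceeds one.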

20.
We conducted a comprehensive metabolic phenotyping of potato (Solanum tuberosum L. cv Desiree) tuber tissue that had been modified either by transgenesis or exposure to different environmental conditions using a recently developed gas chromatography-mass spectrometry profiling protocol. Applying this technique, we were able to identify and quantify the major constituent metabolites of the potato tuber within a single chromatographic run. The plant systems that we selected to profile were tuber discs incubated in varying concentrations of fructose, sucrose, and mannitol and transgenic plants impaired in their starch biosynthesis. The resultant profiles were then compared, first at the level of individual metabolites and then using the statistical tools hierarchical cluster analysis and principal component analysis. These tools allowed us to assign clusters to the individual plant systems and to determine relative distances between these clusters; furthermore, analyzing the loadings of these analyses enabled identification of the most important metabolites in the definition of these clusters. The metabolic profiles of the sugar-fed discs were dramatically different from the wild-type steady-state values. When these profiles were compared with one another and also with those we assessed in previous studies, however, we were able to evaluate potential phenocopies. These comparisons highlight the importance of such an approach in the functional and qualitative assessment of diverse systems to gain insights into important mediators of metabolism.


Copyright © 北京勤云科技发展有限公司 (Beijing Qinyun Technology Development Co., Ltd.) | 京ICP备09084417号