Similar Literature
20 similar documents found (search time: 31 ms)
1.
The large choice of Distributed Computing Infrastructures (DCIs) available allows users to select and combine their preferred architectures among Clusters, Grids, Clouds, Desktop Grids and more. In these hybrid DCIs, elasticity is emerging as a key property. In elastic infrastructures, the resources available to execute an application vary continuously, either because of application requirements or because of constraints on the infrastructure, such as node volatility. In the latter case, there is no guarantee that the computing resources will remain available for the entire execution of an application. In this paper, we show that Bag-of-Tasks (BoT) executions on these "Best-Effort" infrastructures suffer from a drop in the task completion rate at the end of the execution. The SpeQuloS service presented in this paper improves the Quality of Service (QoS) of BoT applications executed on hybrid and elastic infrastructures. SpeQuloS monitors the execution of the BoT and dynamically supplies fast, reliable Cloud resources when the critical part of the BoT is executed. SpeQuloS offers several features to hybrid-DCI users, such as estimates of completion time and execution speedup. Performance evaluation shows that BoT executions can be accelerated by a factor of 2 while offloading less than 2.5% of the workload to the Cloud. We report on several scenarios in which SpeQuloS is deployed on hybrid infrastructures featuring a large variety of infrastructure combinations. In the context of the European Desktop Grid Initiative (EDGI), SpeQuloS is operated to improve the QoS of Desktop Grids using resources from private Clouds. We present a use case in which SpeQuloS uses both regular and spot EC2 instances to decrease the cost of computation while preserving a similar QoS level. Finally, in the last scenario, SpeQuloS optimizes the utilization of Grid5000 resources.

2.
3.
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.

4.
5.
Hsu JB, Bretaña NA, Lee TY, Huang HD. PLoS ONE. 2011;6(11):e27567
Regulation of pre-mRNA splicing is achieved through the interaction of RNA sequence elements and a variety of RNA-splicing-related proteins (splicing factors). The splicing machinery in humans is not yet fully elucidated, partly because human splicing factors have not been exhaustively identified. Furthermore, experimental methods for splicing factor identification are time-consuming and labor-intensive. Although many computational methods have been proposed for the identification of RNA-binding proteins, none so far focuses on the identification of RNA-splicing-related proteins. We were therefore motivated to design a method for identifying human splicing factors using experimentally verified splicing factors. Investigation of amino acid composition reveals remarkable differences between splicing factors and non-splicing proteins. A support vector machine (SVM) is used to construct a predictive model, and five-fold cross-validation indicates that an SVM model trained on amino acid composition achieves a promising accuracy (80.22%). Another basic feature, amino acid dipeptide composition, yields similar predictive performance. In addition, this work shows that incorporating evolutionary information and domain information improves predictive performance. The constructed models effectively classify (73.65% accuracy) an independent data set of human splicing factors. The independent-test results indicate that in silico identification could be a feasible means of conducting preliminary analyses of splicing factors, significantly reducing the number of potential targets that require further in vivo or in vitro confirmation.
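As an illustration of the paper's basic feature, the 20-dimensional amino acid composition vector can be computed as below. This is a minimal sketch (the sequence shown is a made-up peptide); in the paper, such vectors would then be fed to an SVM classifier.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def aa_composition(sequence):
    """Return the 20-dimensional amino acid composition vector
    (fraction of each standard residue type) for a protein sequence."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    return [counts[aa] / total for aa in AMINO_ACIDS]

# Example on a short, made-up peptide
vec = aa_composition("MKRSRSRSRSPAGRRAK")
print(len(vec))            # 20 features, one per residue type
print(round(sum(vec), 6))  # fractions sum to 1.0
```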

6.
7.
In CAPRI rounds 6-12, RosettaDock successfully predicted 2 of 5 unbound-unbound targets to medium accuracy. Improvement over the previous method was achieved with computational mutagenesis to select decoys that match the energetics of experimentally determined hot spots. In the case of Target 21, Orc1/Sir1, this resulted in a successful docking prediction where RosettaDock alone or with simple site constraints failed. Experimental information also helped limit the interacting region of TolB/Pal, producing a successful prediction of Target 26. In addition, we docked multiple loop conformations for Target 20, and we developed a novel flexible docking algorithm to simultaneously optimize backbone conformation and rigid-body orientation to generate a wide diversity of conformations for Target 24. Continued challenges included docking of homology targets that differ substantially from their template (sequence identity <50%) and accounting for large conformational changes upon binding. Despite a larger number of unbound-unbound and homology model binding targets, Rounds 6-12 reinforced that RosettaDock is a powerful algorithm for predicting bound complex structures, especially when combined with experimental data.

8.
9.
To facilitate rigorous analysis of molecular motions in proteins, DNA, and RNA, we present a new version of ROTDIF, a program for determining the overall rotational diffusion tensor from single- or multiple-field nuclear magnetic resonance relaxation data. We introduce four major features that expand the program’s versatility and usability. The first feature is the ability to analyze, separately or together, ¹³C and/or ¹⁵N relaxation data collected at a single or multiple fields. A significant improvement in the accuracy compared to direct analysis of R2/R1 ratios, especially critical for analysis of ¹³C relaxation data, is achieved by subtracting high-frequency contributions to relaxation rates. The second new feature is an improved method for computing the rotational diffusion tensor in the presence of biased errors, such as large conformational exchange contributions, that significantly enhances the accuracy of the computation. The third new feature is the integration of the domain alignment and docking module for relaxation-based structure determination of multi-domain systems. Finally, to improve accessibility to all the program features, we introduced a graphical user interface that simplifies and speeds up the analysis of the data. Written in Java, the new ROTDIF can run on virtually any computer platform. In addition, the new ROTDIF achieves an order of magnitude speedup over the previous version by implementing a more efficient deterministic minimization algorithm. We not only demonstrate the improvement in accuracy and speed of the new algorithm for synthetic and experimental ¹³C and ¹⁵N relaxation data for several proteins and nucleic acids, but also show that careful analysis required especially for characterizing RNA dynamics allowed us to uncover subtle conformational changes in RNA as a function of temperature that were opaque to previous analysis.
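The full ROTDIF tensor fit is beyond a short sketch, but the simpler single-field, isotropic estimate of the overall correlation time from the R2/R1 ratio — the quantity the program improves upon — can be illustrated. This sketch assumes the textbook approximation τc ≈ √(6·R2/R1 − 7)/(4πν), with made-up example numbers; it is not ROTDIF's algorithm.

```python
import math

def tau_c_from_ratio(r2_over_r1, larmor_hz):
    """Estimate the overall rotational correlation time (in seconds)
    from the R2/R1 ratio via the standard isotropic, single-field
    approximation tau_c ~ sqrt(6*R2/R1 - 7) / (4*pi*nu),
    where nu is the heteronucleus Larmor frequency in Hz."""
    return math.sqrt(6.0 * r2_over_r1 - 7.0) / (4.0 * math.pi * larmor_hz)

# Example: R2/R1 = 4.0 at a 50.7 MHz 15N Larmor frequency
# (i.e. a 500 MHz-class spectrometer); values are illustrative only.
tau = tau_c_from_ratio(4.0, 50.7e6)
print(f"{tau * 1e9:.2f} ns")  # a few-ns value typical of a small protein
```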

10.
Selection of protein targets for study is central to structural biology and may be influenced by numerous factors. A key aim is to maximise returns for effort invested by identifying proteins with the balance of biophysical properties that are conducive to success at all stages (e.g. solubility, crystallisation) in the route towards a high resolution structural model. Selected targets can be optimised through construct design (e.g. to minimise protein disorder), switching to a homologous protein, and selection of experimental methodology (e.g. choice of expression system) to prime for efficient progress through the structural proteomics pipeline. Here we discuss computational techniques in target selection and optimisation, with more detailed focus on tools developed within the Scottish Structural Proteomics Facility (SSPF); namely XANNpred, ParCrys, OB-Score (target selection) and TarO (target optimisation). TarO runs a large number of algorithms, searching for homologues and annotating the pool of possible alternative targets. This pool of putative homologues is presented in a ranked, tabulated format and results are also visualised as an automatically generated and annotated multiple sequence alignment. The target selection algorithms each predict the propensity of a selected protein target to progress through the experimental stages leading to diffracting crystals. This single predictor approach has advantages for target selection, when compared with an approach using two or more predictors that each predict for success at a single experimental stage. The tools described here helped SSPF achieve a high (21%) success rate in progressing cloned targets to diffraction-quality crystals.

11.

Background

Barrett's esophagus predisposes to esophageal adenocarcinoma. However, the value of endoscopic surveillance in Barrett's esophagus has been debated because of the low incidence of esophageal adenocarcinoma in Barrett's esophagus. Moreover, high inter-observer and sampling-dependent variation in the histologic staging of dysplasia makes clinical risk assessment problematic. In this study, we developed a 3-tiered risk stratification strategy, based on systematically selected epigenetic and clinical parameters, to improve the efficiency of Barrett's esophagus surveillance.

Methods and Findings

We defined high-grade dysplasia as the endpoint of progression, and defined Barrett's esophagus progressor patients as patients with either no dysplasia or low-grade dysplasia who later developed high-grade dysplasia or esophageal adenocarcinoma. We analyzed 4 epigenetic and 3 clinical parameters in 118 Barrett's esophagus tissues obtained from 35 progressor and 27 non-progressor patients from the Baltimore Veterans Affairs Maryland Health Care System and the Mayo Clinic. Based on 2-year and 4-year prediction models using linear discriminant analysis (areas under the receiver-operating characteristic (ROC) curve: 0.8386 and 0.7910, respectively), Barrett's esophagus specimens were stratified into high-risk (HR), intermediate-risk (IR), or low-risk (LR) groups. This 3-tiered stratification method retained both the high specificity of the 2-year model and the high sensitivity of the 4-year model. Progression-free survival differed significantly among the 3 risk groups, with p = 0.0022 (HR vs. IR) and p < 0.0001 (HR or IR vs. LR). Incremental value analyses demonstrated that the number of methylated genes contributed most to prediction accuracy.

Conclusions

This 3-tiered risk stratification strategy has the potential to exert a profound impact on the accuracy and efficiency of Barrett's esophagus surveillance.
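A minimal sketch of the study's core machinery — a linear discriminant score thresholded into three risk tiers and evaluated by ROC AUC — is shown below on synthetic data. The features, sample counts, and tier cut-offs are placeholders, not the study's real parameters.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Synthetic stand-in for the study's predictors (e.g. number of
# methylated genes plus clinical covariates); NOT the real data.
X_prog = rng.normal(loc=1.0, size=(35, 4))   # 35 progressors
X_non  = rng.normal(loc=0.0, size=(27, 4))   # 27 non-progressors
X = np.vstack([X_prog, X_non])
y = np.array([1] * 35 + [0] * 27)

lda = LinearDiscriminantAnalysis().fit(X, y)
scores = lda.decision_function(X)
auc = roc_auc_score(y, scores)

# Three-tier stratification by thresholding the LDA score
# (the cut-offs here are arbitrary placeholders).
tiers = np.digitize(scores, bins=[-0.5, 0.5])  # 0 = LR, 1 = IR, 2 = HR
print(f"in-sample AUC = {auc:.3f}, tier counts = {np.bincount(tiers)}")
```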

12.
Kim C, Park D, Seol Y, Hahn J. Bioinformation. 2011;6(6):246-247
The National Agricultural Biotechnology Information Center (NABIC) has constructed an agricultural-biology infrastructure and developed a Web-based relational database of biotechnology information for agricultural plants. NABIC concentrates on the functional genomics of major agricultural plants, building an integrated database of agro-biotech information focused on the genomics of major agricultural resources. The genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.

13.
14.
15.
The fragmentation of habitats in intensively managed farming landscapes is often considered to be partly responsible for butterfly population decline in Europe and the USA. Although relatively little is known about New Zealand butterfly ecology, agricultural landscapes in lowland New Zealand are managed similarly to those in Europe and ecosystem services (ES) in these landscapes are generally at a low level. In the northern hemisphere, attempts are being made to address the problem through agri-environment schemes, but such farmer compensation is not available in New Zealand. Instead, landowner- and research-led initiatives are currently the only potential approaches. One such project in the Canterbury province, New Zealand, is the Greening Waipara project. This aims to return native plants to viticultural landscapes and enhance ES, and while research has sought to quantify economic benefits of the project, there has been no work to establish if the plantings are improving or are likely to improve non-target invertebrate biodiversity, for example arthropods that are not biocontrol agents. In the first study of its kind in New Zealand, butterfly surveys were conducted in vineyards and linear mixed modelling techniques were used to identify the most important vegetation and structural features which may influence butterfly distribution. While the native planting areas were not important for butterflies, remnant patches of native vegetation in unproductive areas were vital for sedentary species. These results are discussed in relation to the conservation of native species in New Zealand vineyards and in the context of conservation in and around farmland in general.

16.

Background

Predicting disease-causative genes (or simply, disease genes) plays a critical role in understanding the genetic basis of human diseases and in providing disease treatment guidelines. While various computational methods have been proposed for disease gene prediction, the recent increasing availability of biological information for genes strongly motivates leveraging these valuable data sources to extract useful information for accurately predicting disease genes.

Results

We present an integrative framework called N2VKO to predict disease genes. First, we learn node embeddings for genes from a protein-protein interaction (PPI) network by adapting the well-known representation learning method node2vec. Second, we combine the learned node embeddings with various biological annotations into a rich feature representation for genes, and build binary classification models for disease gene prediction. Finally, as the data for disease gene prediction are usually imbalanced (i.e. the number of causative genes for a specific disease is much smaller than the number of its non-causative genes), we address this serious data imbalance by applying oversampling techniques for imbalanced-data correction to improve prediction performance. Comprehensive experiments demonstrate that the proposed N2VKO significantly outperforms four state-of-the-art methods for disease gene prediction across seven diseases.

Conclusions

In this study, we show that node embeddings learned from PPI networks work well for disease gene prediction, and that integrating node embeddings with other biological annotations further improves the performance of classification models. Moreover, oversampling techniques for imbalance correction further enhance prediction performance. A literature search on the predicted disease genes also demonstrates the effectiveness of the proposed N2VKO framework.
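The imbalance-correction step can be illustrated with naive random oversampling of the minority class; this is a simplified stand-in (N2VKO's actual scheme may be more sophisticated, e.g. SMOTE), applied to a toy feature matrix.

```python
import numpy as np

def random_oversample(X, y, seed=0):
    """Rebalance a binary dataset by resampling the minority class
    with replacement until both classes have equal size."""
    rng = np.random.default_rng(seed)
    X, y = np.asarray(X), np.asarray(y)
    classes, counts = np.unique(y, return_counts=True)
    minority = classes[np.argmin(counts)]
    deficit = counts.max() - counts.min()
    idx = np.flatnonzero(y == minority)
    extra = rng.choice(idx, size=deficit, replace=True)
    return np.vstack([X, X[extra]]), np.concatenate([y, y[extra]])

# Toy example: 2 "causative" vs 6 "non-causative" gene feature vectors
X = np.arange(16, dtype=float).reshape(8, 2)
y = np.array([1, 1, 0, 0, 0, 0, 0, 0])
Xb, yb = random_oversample(X, y)
print(np.bincount(yb))  # balanced class counts: [6 6]
```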

17.

Modeling complex computational applications as large computational workflows has proved an effective means of understanding the intricacies of applications and of determining the best approach to their realization. Scheduling such workflows in the cloud while also considering users' differing quality-of-service requirements is challenging. This paper introduces a new direction based on a divide-and-conquer approach to scheduling these workflows. The proposed Divide-and-conquer Workflow Scheduling algorithm (DQWS) is designed to minimize the cost of workflow execution while respecting its deadline. The critical-path concept inspires the divide-and-conquer process: DQWS finds the critical path, schedules it, removes it from the workflow, and thereby divides the remainder into several mini-workflows. The process continues until only chain-structured workflows, called linear graphs, remain; scheduling these linear graphs is performed in the final phase of the algorithm. Experiments show that DQWS outperforms its competitors, both in meeting deadlines and in minimizing the monetary cost of executing scheduled workflows.

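The critical-path extraction at the heart of DQWS can be sketched on a toy DAG; the task costs and structure below are invented for illustration, and the scheduling, deadline, and pricing logic of the real algorithm is omitted.

```python
import functools

# Toy workflow DAG: task -> (execution cost, successor tasks).
TASKS = {
    "A": (2, ["B", "C"]),
    "B": (4, ["D"]),
    "C": (1, ["D"]),
    "D": (3, []),
}

@functools.lru_cache(maxsize=None)
def longest_from(task):
    """Return (total cost, path) of the costliest path starting at task."""
    cost, succs = TASKS[task]
    best = max((longest_from(s) for s in succs), default=(0, []))
    return (cost + best[0], [task] + best[1])

def critical_path():
    """Find the costliest entry-to-exit path in TASKS."""
    entry = set(TASKS) - {s for _, succs in TASKS.values() for s in succs}
    return max(longest_from(t) for t in entry)

length, path = critical_path()
print(length, path)  # 9 ['A', 'B', 'D']
```

After scheduling this path, DQWS would delete its tasks and recurse on the disconnected remainder (here, just the chain "C").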

18.
Aruna MG, Hasan MK, Islam S, Mohan KG, Sharan P, Hassan R. Cluster Computing. 2022;25(4):2317-2331
The Coronavirus pandemic and the shift to work-from-anywhere have created a move toward cloud-based services. The pandemic is causing an explosion in cloud migration, expected that by...

19.
Co-crystallization of membrane proteins with antibody fragments may emerge as a general tool to facilitate crystal growth and improve crystal quality. The bound antibody fragment enlarges the hydrophilic part of the mostly hydrophobic membrane protein, thereby increasing the interaction area for possible protein-protein contacts in the crystal. Additionally, it may restrain flexible parts or lock the membrane protein in a defined conformational state. For successful co-crystallization trials, the antibody fragments must be stable in detergents during the extended period of crystal growth and must be easily produced in amounts necessary for crystallography. Therefore, we constructed a library of antibody Fab fragments from a framework subset of the HuCAL GOLD library (Morphosys, Munich, Germany). By combining the most stable and well expressed frameworks, VH3 and Vκ3, with the further stabilizing constant domains, a Fab library with the desired properties was obtained in a standard phage display format. As a proof of principle, we selected binders with phage display against the detergent-solubilized citrate transporter CitS of Klebsiella pneumoniae. We describe efficient methods for the immobilization of the membrane protein during selection, for ELISA screening, and for BIAcore evaluation. We demonstrate that the selected Fab fragments form stable complexes with native CitS and recognize conformational epitopes with affinities in the low nanomolar range.

20.
One of the main challenges in the biomedical sciences is the determination of reaction mechanisms that constitute a biochemical pathway. During the last decades, advances have been made in building complex diagrams showing the static interactions of proteins. The challenge for systems biologists is to build realistic models of the dynamical behavior of reactants, intermediates and products. For this purpose, several methods have been recently proposed to deduce the reaction mechanisms or to estimate the kinetic parameters of the elementary reactions that constitute the pathway. One such method is MIKANA: Method to Infer Kinetics And Network Architecture. MIKANA is a computational method to infer both reaction mechanisms and estimate the kinetic parameters of biochemical pathways from time course data. To make it available to the scientific community, we developed a Graphical User Interface (GUI) for MIKANA. Among other features, the GUI validates and processes an input time course data, displays the inferred reactions, generates the differential equations for the chemical species in the pathway and plots the prediction curves on top of the input time course data. We also added a new feature to MIKANA that allows the user to exclude a priori known reactions from the inferred mechanism. This addition improves the performance of the method. In this article, we illustrate the GUI for MIKANA with three examples: an irreversible Michaelis-Menten reaction mechanism; the interaction map of chemical species of the muscle glycolytic pathway; and the glycolytic pathway of Lactococcus lactis. We also describe the code and methods in sufficient detail to allow researchers to further develop the code or reproduce the experiments described. The code for MIKANA is open source, free for academic and non-academic use and is available for download (Information S1).
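The paper's first example, the irreversible Michaelis-Menten mechanism, can be simulated (rather than inferred) with a simple forward-Euler integration of its mass-action equations. The rate constants and concentrations below are arbitrary illustrative values, not MIKANA's inferred parameters.

```python
# Forward-Euler simulation of the irreversible Michaelis-Menten
# mechanism E + S <-> ES -> E + P under mass-action kinetics.
def simulate_mm(k1=1.0, km1=0.5, k2=0.3, E0=1.0, S0=10.0,
                dt=0.001, steps=20000):
    E, S, ES, P = E0, S0, 0.0, 0.0
    for _ in range(steps):
        v_bind = k1 * E * S - km1 * ES   # net flux of E + S <-> ES
        v_cat = k2 * ES                  # catalytic step ES -> E + P
        E += (-v_bind + v_cat) * dt
        S += -v_bind * dt
        ES += (v_bind - v_cat) * dt
        P += v_cat * dt
    return E, S, ES, P

E, S, ES, P = simulate_mm()
# Conservation laws: total substrate S + ES + P stays at S0,
# and total enzyme E + ES stays at E0.
print(round(S + ES + P, 6), round(E + ES, 6))
```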
