Similar articles
1.
In disease screening and prognosis studies, an important task is to determine useful markers for identifying high-risk subgroups. Once such markers are established, they can be incorporated into public health practice to provide appropriate strategies for treatment or disease monitoring based on each individual's predicted risk. In recent years, genetic and biological markers have been examined extensively for their potential to signal progression or risk of disease. In addition to these markers, it has often been argued that short-term outcomes may be helpful in making a better prediction of disease outcomes in clinical practice. In this paper we propose model-free non-parametric procedures to incorporate short-term event information to improve the prediction of a long-term terminal event. We include the optional availability of a single discrete marker measurement and assess the additional information gained by including the short-term outcome. We focus on the semi-competing risk setting where the short-term event is an intermediate event that may be censored by the terminal event while the terminal event is only subject to administrative censoring. Simulation studies suggest that the proposed procedures perform well in finite samples. Our procedures are illustrated using a data set of post-dialysis patients with end-stage renal disease.

2.
The large choice of Distributed Computing Infrastructures (DCIs) available allows users to select and combine their preferred architectures amongst Clusters, Grids, Clouds, Desktop Grids and more. In these hybrid DCIs, elasticity is emerging as a key property. In elastic infrastructures, the resources available to execute applications continuously vary, either because of application requirements or because of constraints on the infrastructure, such as node volatility. In the former case, there is no guarantee that the computing resources will remain available during the entire execution of an application. In this paper, we show that Bag-of-Tasks (BoT) execution on these “Best-Effort” infrastructures suffers from a drop in the task completion rate at the end of the execution. The SpeQuloS service presented in this paper improves the Quality of Service (QoS) of BoT applications executed on hybrid and elastic infrastructures. SpeQuloS monitors the execution of the BoT, and dynamically supplies fast and reliable Cloud resources when the critical part of the BoT is executed. SpeQuloS offers several features to hybrid DCI users, such as estimating completion time and execution speedup. Performance evaluation shows that BoT executions can be accelerated by a factor of 2, while offloading less than 2.5% of the workload to the Cloud. We report on several scenarios where SpeQuloS is deployed on hybrid infrastructures featuring a large variety of infrastructure combinations. In the context of the European Desktop Grid Initiative (EDGI), SpeQuloS is operated to improve the QoS of Desktop Grids using resources from private Clouds. We present a use case where SpeQuloS uses both EC2 regular and spot instances to decrease the cost of computation while preserving a similar QoS level. Finally, in the last scenario, SpeQuloS makes it possible to optimize Grid5000 resource utilization.

3.
4.
Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted with significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has a sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments. © 1994 Wiley-Liss, Inc.
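
The three-state accuracy quoted in this abstract can be illustrated with a small sketch. It assumes the usual convention of one state letter per residue (H for helix, E for strand, C for other); the function name `q3_accuracy` and the example strings are hypothetical and not taken from the paper.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state matches the observed state.

    Both arguments are strings over the assumed alphabet H/E/C,
    one character per residue.
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    observed = "HHHHCCEEEE"   # made-up reference assignment
    predicted = "HHHCCCEEEC"  # made-up prediction, wrong at two positions
    print(q3_accuracy(predicted, observed))
```

Per-residue accuracy of this kind says nothing about whether whole segments are placed correctly, which is why the abstract reports segment placement as a separate strength.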

5.
6.
Hsu JB, Bretaña NA, Lee TY, Huang HD. PLoS ONE, 2011, 6(11): e27567
Regulation of pre-mRNA splicing is achieved through the interaction of RNA sequence elements and a variety of RNA-splicing related proteins (splicing factors). The splicing machinery in humans is not yet fully elucidated, partly because splicing factors in humans have not been exhaustively identified. Furthermore, experimental methods for splicing factor identification are time-consuming and labor-intensive. Although many computational methods have been proposed for the identification of RNA-binding proteins, none so far focuses on the identification of RNA-splicing related proteins. Therefore, we are motivated to design a method that focuses on the identification of human splicing factors using experimentally verified splicing factors. The investigation of amino acid composition reveals that there are remarkable differences between splicing factors and non-splicing proteins. A support vector machine (SVM) is utilized to construct a predictive model, and five-fold cross-validation indicates that the SVM model trained with amino acid composition provides a promising accuracy (80.22%). Another basic feature, amino acid dipeptide composition, is also examined and yields a predictive performance similar to that of amino acid composition. In addition, this work shows that the incorporation of evolutionary information and domain information could improve the predictive performance. The constructed models have been demonstrated to effectively classify (73.65% accuracy) an independent data set of human splicing factors. The result of independent testing indicates that in silico identification could be a feasible means of conducting preliminary analyses of splicing factors and significantly reducing the number of potential targets that require further in vivo or in vitro confirmation.
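
The two basic features named in this abstract can be sketched as fixed-length vectors. This is a minimal illustration assuming the 20 standard amino acids and simple frequency normalization; the function names and the toy sequence are hypothetical, and the abstract does not specify the authors' exact encoding.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]


def amino_acid_composition(sequence: str) -> list[float]:
    """Return a 20-dimensional vector of residue frequencies."""
    counts = Counter(sequence.upper())
    total = sum(counts[aa] for aa in AMINO_ACIDS)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


def dipeptide_composition(sequence: str) -> list[float]:
    """Return a 400-dimensional vector of adjacent-pair frequencies."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    total = sum(counts[dp] for dp in DIPEPTIDES)
    if total == 0:
        raise ValueError("sequence contains no standard dipeptides")
    return [counts[dp] / total for dp in DIPEPTIDES]


if __name__ == "__main__":
    toy = "ACDA"  # made-up sequence, for illustration only
    print(len(amino_acid_composition(toy)), len(dipeptide_composition(toy)))
```

Vectors of this shape could then be passed to any SVM implementation as training features; the classifier itself is left out here to keep the sketch dependency-free.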

7.
Three-dimensional structures of proteins can provide important clues to the efficacy of personalized treatment. We perform a structural analysis of variants within three inherited lysosomal storage disorders, comparing variants responsive to pharmacological chaperone treatment to those unresponsive to such treatment. We find that predicted ΔΔG of mutation is higher on average for variants unresponsive to treatment, in the case of datasets for both Fabry disease and Pompe disease, in line with previous findings. Using both a single decision tree and an advanced machine learning approach based on the larger Fabry dataset, we correctly predict responsiveness of three Gaucher disease variants, and we provide predictions for untested variants. Many variants are predicted to be responsive to treatment, suggesting that drug-based treatments may be effective for a number of variants in Gaucher disease. In our analysis, we observe dependence on a topological feature reporting on contact arrangements which is likely connected to the order of folding of protein residues, and we provide a potential justification for this observation based on steady-state cellular kinetics.

8.
9.
10.
11.
In CAPRI rounds 6-12, RosettaDock successfully predicted 2 of 5 unbound-unbound targets to medium accuracy. Improvement over the previous method was achieved with computational mutagenesis to select decoys that match the energetics of experimentally determined hot spots. In the case of Target 21, Orc1/Sir1, this resulted in a successful docking prediction where RosettaDock alone or with simple site constraints failed. Experimental information also helped limit the interacting region of TolB/Pal, producing a successful prediction of Target 26. In addition, we docked multiple loop conformations for Target 20, and we developed a novel flexible docking algorithm to simultaneously optimize backbone conformation and rigid-body orientation to generate a wide diversity of conformations for Target 24. Continued challenges included docking of homology targets that differ substantially from their template (sequence identity <50%) and accounting for large conformational changes upon binding. Despite a larger number of unbound-unbound and homology model binding targets, Rounds 6-12 reinforced that RosettaDock is a powerful algorithm for predicting bound complex structures, especially when combined with experimental data.

12.
13.
Predicting breed-specific environmental suitability has been problematic in livestock production. Native breeds have low productivity but are thought to be more robust to perform under local conditions than exotic breeds. Attempts to introduce genetically improved exotic breeds are generally unsuccessful, mainly due to the antagonistic environmental conditions. Knowledge of the environmental conditions that are shaping the breed would be needed to determine its suitability to different locations. Here, we present a methodology to predict the suitability of breeds for different agro-ecological zones using Geographic Information Systems tools and predictive habitat distribution models. This methodology was tested on the current distribution of two introduced chicken breeds in Ethiopia: the Koekoek, originally from South Africa, and the Fayoumi, originally from Egypt. Cross-validation results show this methodology to be effective in predicting breed suitability for specific environmental conditions. Furthermore, the model predicts suitable areas of the country where the breeds could be introduced. The specific climatic parameters that explained the potential distribution of each of the breeds were similar to the environment from which the breeds originated. This novel methodology finds application in livestock programs, allowing for a more informed decision when designing breeding programs and introduction programs, and increases our understanding of the role of the environment in livestock productivity.

14.
Selection of protein targets for study is central to structural biology and may be influenced by numerous factors. A key aim is to maximise returns for effort invested by identifying proteins with the balance of biophysical properties that are conducive to success at all stages (e.g. solubility, crystallisation) in the route towards a high resolution structural model. Selected targets can be optimised through construct design (e.g. to minimise protein disorder), switching to a homologous protein, and selection of experimental methodology (e.g. choice of expression system) to prime for efficient progress through the structural proteomics pipeline. Here we discuss computational techniques in target selection and optimisation, with more detailed focus on tools developed within the Scottish Structural Proteomics Facility (SSPF); namely XANNpred, ParCrys, OB-Score (target selection) and TarO (target optimisation). TarO runs a large number of algorithms, searching for homologues and annotating the pool of possible alternative targets. This pool of putative homologues is presented in a ranked, tabulated format and results are also visualised as an automatically generated and annotated multiple sequence alignment. The target selection algorithms each predict the propensity of a selected protein target to progress through the experimental stages leading to diffracting crystals. This single predictor approach has advantages for target selection, when compared with an approach using two or more predictors that each predict for success at a single experimental stage. The tools described here helped SSPF achieve a high (21%) success rate in progressing cloned targets to diffraction-quality crystals.

15.
To facilitate rigorous analysis of molecular motions in proteins, DNA, and RNA, we present a new version of ROTDIF, a program for determining the overall rotational diffusion tensor from single- or multiple-field nuclear magnetic resonance relaxation data. We introduce four major features that expand the program’s versatility and usability. The first feature is the ability to analyze, separately or together, ¹³C and/or ¹⁵N relaxation data collected at a single or multiple fields. A significant improvement in the accuracy compared to direct analysis of R₂/R₁ ratios, especially critical for analysis of ¹³C relaxation data, is achieved by subtracting high-frequency contributions to relaxation rates. The second new feature is an improved method for computing the rotational diffusion tensor in the presence of biased errors, such as large conformational exchange contributions, that significantly enhances the accuracy of the computation. The third new feature is the integration of the domain alignment and docking module for relaxation-based structure determination of multi-domain systems. Finally, to improve accessibility to all the program features, we introduced a graphical user interface that simplifies and speeds up the analysis of the data. Written in Java, the new ROTDIF can run on virtually any computer platform. In addition, the new ROTDIF achieves an order of magnitude speedup over the previous version by implementing a more efficient deterministic minimization algorithm. We not only demonstrate the improvement in accuracy and speed of the new algorithm for synthetic and experimental ¹³C and ¹⁵N relaxation data for several proteins and nucleic acids, but also show that careful analysis required especially for characterizing RNA dynamics allowed us to uncover subtle conformational changes in RNA as a function of temperature that were opaque to previous analysis.

16.

Background

Barrett's esophagus predisposes to esophageal adenocarcinoma. However, the value of endoscopic surveillance in Barrett's esophagus has been debated because of the low incidence of esophageal adenocarcinoma in Barrett's esophagus. Moreover, high inter-observer and sampling-dependent variation in the histologic staging of dysplasia make clinical risk assessment problematic. In this study, we developed a 3-tiered risk stratification strategy, based on systematically selected epigenetic and clinical parameters, to improve Barrett's esophagus surveillance efficiency.

Methods and Findings

We defined high-grade dysplasia as the endpoint of progression, and Barrett's esophagus progressor patients as Barrett's esophagus patients with either no dysplasia or low-grade dysplasia who later developed high-grade dysplasia or esophageal adenocarcinoma. We analyzed 4 epigenetic and 3 clinical parameters in 118 Barrett's esophagus tissues obtained from 35 progressor and 27 non-progressor Barrett's esophagus patients from Baltimore Veterans Affairs Maryland Health Care Systems and Mayo Clinic. Based on 2-year and 4-year prediction models using linear discriminant analysis (area under the receiver operating characteristic (ROC) curve: 0.8386 and 0.7910, respectively), Barrett's esophagus specimens were stratified into high-risk (HR), intermediate-risk (IR), or low-risk (LR) groups. This 3-tiered stratification method retained both the high specificity of the 2-year model and the high sensitivity of the 4-year model. Progression-free survivals differed significantly among the 3 risk groups, with p = 0.0022 (HR vs. IR) and p<0.0001 (HR or IR vs. LR). Incremental value analyses demonstrated that the number of methylated genes contributed most influentially to prediction accuracy.

Conclusions

This 3-tiered risk stratification strategy has the potential to exert a profound impact on Barrett's esophagus surveillance accuracy and efficiency.
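
One way two prediction models could be combined into three tiers is sketched below. The combination rule is an assumption made for illustration: the abstract states only that the stratification keeps the specificity of the 2-year model and the sensitivity of the 4-year model, not how the tiers are assigned. The function name, score scale, and cut-offs are all hypothetical.

```python
def stratify(score_2yr: float, score_4yr: float,
             cut_2yr: float = 0.5, cut_4yr: float = 0.5) -> str:
    """Assign a specimen to HR, IR, or LR from two model scores.

    Assumed rule (not taken from the paper):
      - HR if the more specific 2-year model flags the specimen;
      - LR if the more sensitive 4-year model does not flag it;
      - IR otherwise (flagged by the 4-year model only).
    """
    if score_2yr >= cut_2yr:
        return "HR"
    if score_4yr >= cut_4yr:
        return "IR"
    return "LR"


if __name__ == "__main__":
    # Made-up scores, for illustration only.
    for scores in [(0.9, 0.9), (0.2, 0.8), (0.2, 0.1)]:
        print(scores, stratify(*scores))
```

Under this assumed rule, a specimen is called high-risk only on the evidence of the more specific model and low-risk only when the more sensitive model is also negative, which is one reading of how both properties could be retained.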

17.
Endoplasmic reticulum (ER) stress has considerable impact on cell growth, proliferation, metastasis, invasion, angiogenesis and chemoradiotherapy resistance in various cancers. However, the effect of ER stress on the outcomes of glioma patients remains unclear. In this study, we established an ER stress risk model based on The Cancer Genome Atlas (TCGA) glioma data set to reflect immune characteristics and predict the prognosis of glioma patients. Survival analysis indicated that there were significant differences in the overall survival (OS) of glioma patients with different ER stress-related risk scores. Moreover, the ER stress-related risk signature, which was markedly associated with the clinicopathological properties of glioma patients, could serve as an independent prognostic indicator. Functional enrichment analysis revealed that the risk model correlated with immune and inflammation responses, as well as biosynthesis and degradation. In addition, the ER stress-related risk model indicated an immunosuppressive microenvironment. In conclusion, we present an ER stress risk model that is an independent prognostic factor and indicates the general immune characteristics in the glioma microenvironment.

18.
Kim C, Park D, Seol Y, Hahn J. Bioinformation, 2011, 6(6): 246-247
The National Agricultural Biotechnology Information Center (NABIC) constructed an agricultural biology-based infrastructure and developed a Web based relational database for agricultural plants with biotechnology information. The NABIC has concentrated on functional genomics of major agricultural plants, building an integrated biotechnology database for agro-biotech information that focuses on genomics of major agricultural resources. This genome database provides annotated genome information from 1,039,823 records mapped to rice, Arabidopsis, and Chinese cabbage.

19.
20.
It is well established that different sites within a protein evolve at different rates according to their role within the protein; identification of these correlated mutations can aid in tasks such as ab initio protein structure prediction, structure-function analysis or sequence alignment. Mutual information is a standard measure of coevolution between two sites, but its application is limited by the signal-to-noise ratio. In this work we report a preliminary study to investigate whether larger sequence sets could circumvent this problem by calculating mutual information arrays for two sets of drug-naïve sequences from the HIV gp120 protein for the B and C subtypes. Our results suggest that while the larger sequence sets can improve the signal-to-noise ratio, the gain is offset by the high mutation rate of HIV, which makes it more difficult to achieve consistent alignments. Nevertheless, we were able to predict a number of coevolving sites that were supported by previous experimental studies, as well as a region close to the C terminus of the protein that was highly variable in the C subtype but highly conserved in the B subtype.
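
The mutual information measure mentioned in this abstract can be written out for a pair of alignment columns. The sketch below uses the standard plug-in estimate from observed frequencies, in bits; the function name and the toy columns are hypothetical, and no correction for finite sample size or phylogenetic background (the noise sources the abstract alludes to) is attempted.

```python
import math
from collections import Counter


def mutual_information(col_i: str, col_j: str) -> float:
    """Plug-in mutual information (in bits) between two alignment columns.

    col_i[k] and col_j[k] are the residues that sequence k carries
    at the two sites being compared.
    """
    if len(col_i) != len(col_j) or not col_i:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_i)
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


if __name__ == "__main__":
    # Made-up columns from a four-sequence alignment.
    print(mutual_information("AAGG", "TTCC"))  # sites that change together
    print(mutual_information("AAGG", "TCTC"))  # sites that vary independently
```

A fully conserved column always scores zero against any other column, so this measure only detects coupling between sites that actually vary in the sequence set.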
