Similar articles
 Found 20 similar articles (search time: 0 ms)
1.
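This paper builds an up-to-date dataset of protein submitochondrial localization, containing 1,293 sequences from 4 submitochondrial locations. Features are extracted from the mitochondrial proteins by combining Gene Ontology (GO) information with homology information, and a classifier is built with the support vector machine algorithm. In the jackknife test, the overall prediction accuracy for the 4 submitochondrial locations is 93.27%, and the overall prediction accuracy for 3 of the submitochondrial locations is 94.73%.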
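The jackknife test mentioned in entry 1 is leave-one-out cross-validation: each sequence is held out in turn, the classifier is trained on the rest, and the held-out sequence is predicted. A minimal sketch of that procedure follows. The abstract uses a support vector machine on GO and homology features; here a nearest-centroid classifier and toy two-dimensional features stand in purely as assumptions so the example stays self-contained.

```python
from typing import Callable, List

Vector = List[float]


def nearest_centroid_fit(X: List[Vector], y: List[str]) -> Callable[[Vector], str]:
    """Stand-in classifier (NOT the SVM from the abstract): predict the
    label whose training centroid is closest in squared Euclidean distance."""
    sums: dict = {}
    counts: dict = {}
    for xi, yi in zip(X, y):
        acc = sums.setdefault(yi, [0.0] * len(xi))
        for j, value in enumerate(xi):
            acc[j] += value
        counts[yi] = counts.get(yi, 0) + 1
    centroids = {
        label: [s / counts[label] for s in acc] for label, acc in sums.items()
    }

    def predict(x: Vector) -> str:
        return min(
            centroids,
            key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
        )

    return predict


def jackknife_accuracy(X: List[Vector], y: List[str], fit=nearest_centroid_fit) -> float:
    """Leave-one-out accuracy: train on all samples but one, test on that one."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        predict = fit(train_X, train_y)
        correct += predict(X[i]) == y[i]
    return correct / len(X)


# Toy data with hypothetical location labels.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]]
y = ["inner membrane"] * 3 + ["matrix"] * 3
print(jackknife_accuracy(X, y))
```

Any classifier with the same `fit(X, y) -> predict` shape can be swapped in through the `fit` argument.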

2.
For most proteins, multiple sequence alignments are a viable method to identify functionally and structurally important amino acids, but for most organisms, there is a subset of proteins that are unique or found in only a few closely related organisms. For these proteins, it is not possible to produce sequence alignments that are useful in identifying functionally or structurally important amino acids. We have investigated the relationship between amino acid conservation and five factors (the amino acid’s identity, N-terminal neighbor, C-terminal neighbor, the local hydropathy of surrounding amino acids, and the local expected net charge of the surrounding amino acids based on the primary sequence) in Escherichia coli proteins. For four of the factors examined (all but the amino acid’s identity), there is a significant relationship with conservation for some of the standard 20 amino acids. Using the combination of all five factors, we show that it is possible to calculate a score, based on the primary sequences of a subset of E. coli proteins, that has statistically significant value for predicting conserved amino acids in other E. coli proteins and in Saccharomyces cerevisiae proteins. As these five variables show significant relationships with conservation, we have termed them conservation factors.

3.
The identification of protein kinase targets remains a significant bottleneck for our understanding of signal transduction in normal and diseased cellular states. Kinases recognize their substrates in part through sequence motifs on substrate proteins, which, to date, have most effectively been elucidated using combinatorial peptide library approaches. Here, we present and demonstrate the ProPeL method for easy and accurate discovery of kinase specificity motifs through the use of native bacterial proteomes that serve as in vivo libraries for thousands of simultaneous phosphorylation reactions. Using recombinant kinases expressed in E. coli followed by mass spectrometry, the approach accurately recapitulated the well-established motif preferences of human basophilic (Protein Kinase A) and acidophilic (Casein Kinase II) kinases. These motifs, derived for PKA and CK II using only bacterial sequence data, were then further validated by utilizing them in conjunction with the scan-x software program to computationally predict known human phosphorylation sites with high confidence.

4.
The pair-coupled amino acid composition is introduced to predict the secondary structure contents of a protein. Compared with the existing methods all based on singlewise amino acid composition as defined in a 20D (dimensional) space, this represents a step forward to the consideration of the sequence coupling effect. The test results indicate that the introduction of the pair-coupled amino acid composition can significantly improve the prediction quality. It is anticipated that the concept of the pair-coupled amino acid composition can be used to simplify the formulation of sequence coupling (or sequence order) effects and to study many other features of proteins as well.
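To make the contrast in entry 4 concrete, the sketch below computes both descriptors for a sequence: the singlewise composition (20 values) and a pair-based composition (400 values). The abstract does not spell out its exact definition, so the pair version here is an assumption: the normalized frequency of each ordered pair of adjacent residues. The function names are illustrative only.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def single_composition(seq: str) -> dict[str, float]:
    """Singlewise composition: fraction of each residue type (20-D)."""
    residues = [r for r in seq.upper() if r in AMINO_ACIDS]
    total = len(residues)
    return {a: (residues.count(a) / total if total else 0.0) for a in AMINO_ACIDS}


def pair_coupled_composition(seq: str) -> dict[str, float]:
    """Assumed pair-coupled composition: fraction of each ordered pair of
    adjacent residues (400-D). Pairs with non-standard letters are skipped."""
    seq = seq.upper()
    counts = {a + b: 0 for a, b in product(AMINO_ACIDS, repeat=2)}
    total = 0
    for first, second in zip(seq, seq[1:]):
        pair = first + second
        if pair in counts:
            counts[pair] += 1
            total += 1
    return {pair: (n / total if total else 0.0) for pair, n in counts.items()}


pairs = pair_coupled_composition("ACAC")
print(len(pairs), pairs["AC"], pairs["CA"])
```

Unlike the 20-D vector, the 400-D vector distinguishes "ACAC" from "AACC", which is the sequence coupling effect the abstract refers to.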

5.
The high levels of sequence diversity and rapid rates of evolution of HIV-1 represent the main challenges for developing effective therapies. However, there are constraints imposed by the three-dimensional protein structure that affect the sequence space accessible to the evolution of HIV-1. Here, we present a strategy for predicting the set of possible amino acid replacements in HIV. Our approach is based on the identification of likely amino acid changes in the context of these structural constraints using environment-specific substitution matrices as well as considering the physical constraints imposed by local structure. Assessment of the power of various published algorithms in predicting the evolution of HIV-1 Gag P17 shows that it is possible to use these methods to make accurate predictions of the sequence diversity. Our own method, SubFit, uses knowledge of local structural constraints; it achieves prediction success similar to that of the best-performing methods. We also show that erroneous predictions are largely due to infrequently occurring amino acids that will probably have severe fitness costs for the protein. Future improvements, for example incorporating covariation and immunological constraints, will permit more reliable prediction of viral evolution.

8.

Background

Mortality prediction models generally require clinical data or are derived from information coded at discharge, limiting adjustment for presenting severity of illness in observational studies using administrative data.

Objectives

To develop and validate a mortality prediction model using administrative data available in the first 2 hospital days.

Research Design

After dividing the dataset into derivation and validation sets, we created a hierarchical generalized linear mortality model that included patient demographics, comorbidities, medications, therapies, and diagnostic tests administered in the first 2 hospital days. We then applied the model to the validation set.

Subjects

Patients aged ≥18 years admitted with pneumonia between July 2007 and June 2010 to 347 hospitals in Premier, Inc.’s Perspective database.

Measures

In-hospital mortality.

Results

The derivation cohort included 200,870 patients and the validation cohort had 50,037. Mortality was 7.2%. In the multivariable model, 3 demographic factors, 25 comorbidities, 41 medications, 7 diagnostic tests, and 9 treatments were associated with mortality. Factors that were most strongly associated with mortality included receipt of vasopressors, non-invasive ventilation, and bicarbonate. The model had a c-statistic of 0.85 in both cohorts. In the validation cohort, deciles of predicted risk ranged from 0.3% to 34.3% with observed risk over the same deciles from 0.1% to 33.7%.

Conclusions

A mortality model based on detailed administrative data available in the first 2 hospital days had good discrimination and calibration. The model compares favorably to clinically based prediction models and may be useful in observational studies when clinical data are not available.
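The two properties reported in entry 8 can be illustrated with a short sketch. Discrimination is summarized by the c-statistic: the probability that a randomly chosen patient who died was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. Calibration compares mean predicted risk with observed mortality within risk-ordered groups (deciles in the abstract). The function names and toy numbers below are assumptions for illustration, not the study's code or data.

```python
from typing import List, Sequence, Tuple


def c_statistic(risks: Sequence[float], died: Sequence[int]) -> float:
    """Fraction of (death, survivor) pairs ranked correctly; ties count 0.5."""
    deaths = [r for r, d in zip(risks, died) if d]
    survivors = [r for r, d in zip(risks, died) if not d]
    if not deaths or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for p in deaths:
        for n in survivors:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(deaths) * len(survivors))


def calibration_by_bins(
    risks: Sequence[float], died: Sequence[int], n_bins: int = 10
) -> List[Tuple[float, float]]:
    """Return (mean predicted risk, observed mortality) per risk-ordered bin."""
    order = sorted(range(len(risks)), key=lambda i: risks[i])
    rows = []
    for b in range(n_bins):
        idx = order[b * len(order) // n_bins:(b + 1) * len(order) // n_bins]
        if not idx:
            continue  # fewer patients than bins
        mean_risk = sum(risks[i] for i in idx) / len(idx)
        observed = sum(died[i] for i in idx) / len(idx)
        rows.append((mean_risk, observed))
    return rows


# Toy predictions (hypothetical): two survivors, two deaths.
risks = [0.1, 0.4, 0.35, 0.8]
died = [0, 0, 1, 1]
print(c_statistic(risks, died))
print(calibration_by_bins(risks, died, n_bins=2))
```

The pairwise loop is quadratic in cohort size; for cohorts as large as the one in the abstract, a rank-based computation would be the practical choice.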

13.
The mRNAs for two myelin proteins, myelin basic protein (MBP) and myelin-associated oligodendrocytic basic protein (MOBP)-81A, are uniquely located at sites where myelin sheaths are assembled. Here, we use subcellular fractionation to show that four MOBP mRNAs, like MBP mRNA, are located at sites of myelin sheath assembly, and that three other MOBP mRNAs are located in oligodendrocyte soma. The MOBP-81 protein is found in myelin and in another subcellular fraction, whereas other myelin proteins, including MBP, 2',3'-cyclic nucleotide 3'-phosphodiesterase, and myelin-associated glycoprotein, are largely restricted to myelin. Different MBP mRNAs are generated by alternative splicing. All of them contain an RNA transport sequence (RTS) that directs them to sites in oligodendrocytes, where myelin sheaths are assembled. Consequently, all are enriched in myelin. After fractionation, four MOBP mRNAs, MOBP-71, MOBP-81A, MOBP-99, and MOBP-169 (identified in this study), are enriched in myelin. These mRNAs contain a common exon, exon 8b, which has a nucleotide sequence that is similar to MBP mRNA RTS. This sequence likely directs these mRNAs to sites of myelin sheath assembly. Three other MOBP mRNAs, MOBP-69, MOBP-81B, and MOBP-170, lack this exon. Their subcellular distribution indicates that they are largely retained in oligodendrocyte soma. We conclude that the distribution of MOBPs in oligodendrocytes is strongly influenced by alternative splicing of the corresponding mRNAs.

14.
Prediction of patient-centered outcomes in hospitals is useful for performance benchmarking, resource allocation, and guidance regarding active treatment and withdrawal of care. Yet the use of prediction tools by clinicians is limited by the complexity of available tools and the amount of data required. We propose to use Disjunctive Normal Forms as a novel approach to predict hospital and 90-day mortality from instance-based patient data, comprising demographic, genetic, and physiologic information in a large cohort of patients admitted with severe community-acquired pneumonia. We develop two algorithms to efficiently learn Disjunctive Normal Forms, which yield easy-to-interpret rules that explicitly map data to the outcome of interest. Disjunctive Normal Forms achieve higher prediction performance than a set of state-of-the-art machine learning models and unveil insights unavailable with standard methods. Disjunctive Normal Forms constitute an intuitive set of prediction rules that could be easily implemented to predict outcomes and guide criteria-based clinical decision making and clinical trial execution, and are thus of greater practical usefulness than currently available prediction tools. The Java implementation of the tool JavaDNF will be publicly available.
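A Disjunctive Normal Form, as used in entry 14, is an OR of clauses, each of which is an AND of simple conditions, so a prediction fires when every condition in at least one clause holds. The sketch below only evaluates such a rule; it does not reproduce the learning algorithms of the paper or the JavaDNF tool. The feature names and thresholds are invented for illustration and carry no clinical meaning.

```python
from typing import Callable, Mapping, Sequence

Patient = Mapping[str, float]
Literal = Callable[[Patient], bool]   # one condition on a patient record
Clause = Sequence[Literal]            # conditions joined by AND
DNF = Sequence[Clause]                # clauses joined by OR


def dnf_predict(rule: DNF, patient: Patient) -> bool:
    """True if all literals of at least one clause hold for this patient."""
    return any(all(literal(patient) for literal in clause) for clause in rule)


# Hypothetical rule (invented features and cut-offs, not from the paper):
# (age >= 75 AND systolic_bp < 90) OR (lactate > 4.0)
EXAMPLE_RULE: DNF = [
    [lambda p: p["age"] >= 75, lambda p: p["systolic_bp"] < 90],
    [lambda p: p["lactate"] > 4.0],
]

print(dnf_predict(EXAMPLE_RULE, {"age": 80, "systolic_bp": 85, "lactate": 1.0}))
```

Because each clause reads as a plain checklist, a positive prediction can be explained by naming the clause that fired, which is the interpretability the abstract emphasizes.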

15.
The structural proteins of the extracellular matrix (ECM) form fibers with finely tuned mechanical properties matched to the time scales of cell traction forces. Several proteins such as fibronectin (Fn) and fibrin undergo molecular conformational changes that extend the proteins and are believed to be a major contributor to the extensibility of bulk fibers. The dynamics of these conformational changes have been thoroughly explored since the advent of single molecule force spectroscopy and molecular dynamics simulations but remarkably, these data have not been rigorously applied to the understanding of the time dependent mechanics of bulk ECM fibers. Using measurements of protein density within fibers, we have examined the influence of dynamic molecular conformational changes and the intermolecular arrangement of Fn within fibers on the bulk mechanical properties of Fn fibers. Fibers were simulated as molecular strands with architectures that promote either equal or disparate molecular loading under conditions of constant extension rate. Measurements of protein concentration within micron scale fibers using deep ultraviolet transmission microscopy allowed the simulations to be scaled appropriately for comparison to in vitro measurements of fiber mechanics as well as providing estimates of fiber porosity and water content, suggesting Fn fibers are approximately 75% solute. Comparing the properties predicted by single molecule measurements to in vitro measurements of Fn fibers showed that domain unfolding is sufficient to predict the high extensibility and nonlinear stiffness of Fn fibers with surprising accuracy, with disparately loaded fibers providing the best fit to experiment. This work shows the promise of this microstructural modeling approach for understanding Fn fiber properties, which is generally applicable to other ECM fibers, and could be further expanded to tissue scale by incorporating these simulated fibers into three dimensional network models.  

16.
A method for classifying chemicals with respect to carcinogenic potential based on short-term test results is presented. The method utilizes the logistic regression model to translate results from short-term toxicity assays into predictions of the likelihood that a chemical will be carcinogenic if tested in a long-term bioassay. The proposed method differs from previous approaches in two ways. First, statistical confidence limits on probabilities of cancer rather than central estimates of those probabilities are used for classification. Second, the method does not classify all chemicals in a data base with respect to carcinogenic potential. Instead, it identifies chemicals with highest and lowest likelihood of testing positive for carcinogenicity in the bioassay. A subset of chemicals with intermediate likelihood of being positive remains unclassified, and will require further testing, perhaps in a long-term bioassay. Two data bases of binary short-term and long-term test results from the literature are used to illustrate and evaluate the proposed procedure. A cross-validation analysis of one of the data sets suggests that, for a sufficiently rich data base of chemicals, the development of a robust predictive system to replace the bioassay for some unknown chemicals is a realistic goal.
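The two ideas in entry 16, classifying on confidence limits rather than central estimates and leaving intermediate chemicals unclassified, can be sketched as follows. The abstract does not give the decision rule, so this is one plausible reading under stated assumptions: a confidence interval is formed on the logit scale from the linear predictor and its standard error, then mapped to probabilities. The thresholds `low` and `high` and the multiplier `z` are hypothetical.

```python
import math


def logistic(x: float) -> float:
    """Inverse logit: map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def classify(logit: float, se: float, z: float = 1.645,
             low: float = 0.3, high: float = 0.7) -> str:
    """Classify only when the whole confidence interval clears a threshold.

    logit : fitted linear predictor from a logistic regression
    se    : its standard error
    z, low, high : assumed values, not taken from the paper
    """
    lower = logistic(logit - z * se)
    upper = logistic(logit + z * se)
    if lower > high:
        return "likely positive"      # even the lower limit is high
    if upper < low:
        return "likely negative"      # even the upper limit is low
    return "unclassified"             # needs further testing


print(classify(3.0, 0.5), classify(-3.0, 0.5), classify(0.0, 1.0))
```

A chemical with a high central estimate but a wide interval stays unclassified under this rule, which is the behaviour a rule based on central estimates alone would miss.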


20.
Characterization of solvent preferences of proteins is essential to the understanding of solvent effects on protein structure and stability. Although it is generally believed that solvent preferences at distinct loci of a protein surface may differ, quantitative characterization of local protein solvation has remained elusive. In this study, we show that local solvation preferences can be quantified over the entire protein surface from extended molecular dynamics simulations. By subjecting microsecond trajectories of two proteins (lysozyme and antibody fragment D1.3) in 4 M glycerol to rigorous statistical analyses, solvent preferences of individual protein residues are quantified by local preferential interaction coefficients. Local solvent preferences for glycerol vary widely from residue to residue and may change as a result of protein side-chain motions that are slower than the longest intrinsic solvation timescale of ~10 ns. Differences of local solvent preferences between distinct protein side-chain conformations predict solvent effects on local protein structure in good agreement with experiment. This study extends the application scope of preferential interaction theory and enables molecular understanding of solvent effects on protein structure through comprehensive characterization of local protein solvation.
